
Register model #11

Open
niutech opened this issue Jul 4, 2024 · 2 comments

Comments


niutech commented Jul 4, 2024

Let's allow developers to register a new LLM in a web browser via a web extension, which could then be chosen as described in #8. The model would be in the TFLite FlatBuffers format, compatible with Gemini Nano, so that it could run via MediaPipe LLM Inference as a possible fallback in unsupported browsers.

A web extension could register a custom model by invoking a method like this:

```js
ai.registerModel({
    id: 'phi-3-mini',
    version: '3.0',
    file: 'chrome-extension://azipopnxdpcknwapfrtdedlnjjkmpnao/phi-3-mini.bin',
    loraFile: 'chrome-extension://azipopnxdpcknwapfrtdedlnjjkmpnao/phi-3-mini-lora.bin', // optional
    defaultTemperature: 0.5,
    defaultTopK: 3,
    maxTopK: 10
});
```

Registered models could then be listed by web apps like this:

```js
const models = await ai.listModels(); // ['gemini-nano', 'phi-3-mini']
```

The model metadata could be accessed like this:

```js
const modelInfo = await ai.textModelInfo('phi-3-mini');
// {id: 'phi-3-mini', version: '3.0', defaultTemperature: 0.5, defaultTopK: 3, maxTopK: 10}
```
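Taken together, the proposed surface amounts to a small model registry. Below is a minimal, self-contained sketch of the semantics this implies; the `ai` object here is a plain in-page stand-in for the browser-provided global, not a real API, and every detail beyond the three proposed methods (e.g. hiding file URLs from pages) is an illustrative assumption:

```js
// Stand-in for the browser's internal model store; NOT a real API.
const ai = {
  _models: new Map(),

  // Called by a web extension to register a model (proposed registerModel).
  registerModel({ id, version, file, loraFile, defaultTemperature, defaultTopK, maxTopK }) {
    if (this._models.has(id)) throw new Error(`Model "${id}" is already registered`);
    this._models.set(id, { id, version, file, loraFile, defaultTemperature, defaultTopK, maxTopK });
  },

  // Called by web apps to enumerate available model ids (proposed listModels).
  async listModels() {
    return [...this._models.keys()];
  },

  // Called by web apps to read metadata (proposed textModelInfo). File URLs
  // are stripped here so an extension origin never leaks to the page —
  // an assumption, not part of the original proposal.
  async textModelInfo(id) {
    const m = this._models.get(id);
    if (!m) throw new Error(`Unknown model "${id}"`);
    const { file, loraFile, ...info } = m;
    return info;
  },
};

ai.registerModel({
  id: 'phi-3-mini',
  version: '3.0',
  file: 'chrome-extension://azipopnxdpcknwapfrtdedlnjjkmpnao/phi-3-mini.bin',
  defaultTemperature: 0.5,
  defaultTopK: 3,
  maxTopK: 10,
});

ai.listModels().then(console.log); // logs ['phi-3-mini']
```

One design question this sketch surfaces: whether `textModelInfo` should expose the model's file location at all, or only the tuning metadata, as shown in the expected result above.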
@captainbrosset

This is more or less what VS Code does. See https://code.visualstudio.com/api/extension-guides/language-model and https://code.visualstudio.com/api/references/vscode-api#lm

VS Code extension developers can use LLMs by first choosing a model from a predefined list with selectChatModels.
The LLMs themselves are contributed by other extensions, although I don't think the docs explain how yet.

@KenjiBaheux

Thanks for the detailed proposal.

While enabling developers to register custom LLMs via web extensions offers interesting possibilities, we need to carefully consider the implications of strong identifiers. They might limit flexibility: an explosion of models (or versions of a given model) when what's already available could be sufficient, or being stuck with what was popular at a given time after better options have become available. There is also portability across browsers to consider; for example, a chrome-extension://[id] URL may only make sense in Chrome. Especially with large models, it seems important to minimize over-reliance on a specific version of a model, or on a specific "location" (an origin or an extension ID) for the model.
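The concern about strong identifiers suggests selecting a model by requirements rather than by a hard-coded id. A hedged sketch of that alternative — every name here (`selectModel`, `minTopK`, the sample list) is illustrative and not part of any proposal:

```js
// Illustrative only: pick a model by capability instead of pinning an id,
// so pages keep working as the set of installed models evolves.
function selectModel(available, { minTopK = 1 } = {}) {
  // Take the first listed model that meets the requirements, rather than
  // demanding a specific model id or version.
  return available.find((m) => m.maxTopK >= minTopK) ?? null;
}

// Sample metadata in the shape of the proposed textModelInfo() result.
const available = [
  { id: 'gemini-nano', version: '1.0', maxTopK: 8 },
  { id: 'phi-3-mini', version: '3.0', maxTopK: 10 },
];

console.log(selectModel(available, { minTopK: 10 })?.id); // 'phi-3-mini'
console.log(selectModel(available, { minTopK: 99 }));     // null
```

This mirrors the VS Code pattern mentioned above, where selectChatModels takes a selector and the extension works with whichever matching model is available.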

See the related discussion in issue #5, which goes beyond the built-in AI APIs. We encourage you to engage there to contribute to a more future-proof solution.
