
Register model #11

Open
niutech opened this issue Jul 4, 2024 · 2 comments

Comments


niutech commented Jul 4, 2024

Let's allow developers to register a new LLM in a web browser via a web extension, which could then be chosen as described in #8. The model would be in the TFLite FlatBuffers format, compatible with Gemini Nano, so that it could run via MediaPipe LLM Inference as a possible fallback in unsupported browsers.

A web extension could register a custom model by invoking a method like this:

```js
ai.registerModel({
    id: 'phi-3-mini',
    version: '3.0',
    file: 'chrome-extension://azipopnxdpcknwapfrtdedlnjjkmpnao/phi-3-mini.bin',
    loraFile: 'chrome-extension://azipopnxdpcknwapfrtdedlnjjkmpnao/phi-3-mini-lora.bin', // optional
    defaultTemperature: 0.5,
    defaultTopK: 3,
    maxTopK: 10
});
```

Registered models could then be listed by web apps like this:

```js
const models = await ai.listModels(); // ['gemini-nano', 'phi-3-mini']
```

The model metadata could be accessed like this:

```js
const modelInfo = await ai.textModelInfo('phi-3-mini');
// {id: 'phi-3-mini', version: '3.0', defaultTemperature: 0.5, defaultTopK: 3, maxTopK: 10}
```
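Taken together, the proposed surface amounts to a small model registry. Below is a minimal, self-contained sketch of the semantics this implies; the `ai` object here is a plain in-page stand-in for the browser-provided global, not a real API, and every detail beyond the three proposed methods (e.g. hiding file URLs from pages) is an illustrative assumption:

```js
// Stand-in for the browser's internal model store; NOT a real API.
const ai = {
  _models: new Map(),

  // Called by a web extension to register a model (proposed registerModel).
  registerModel({ id, version, file, loraFile, defaultTemperature, defaultTopK, maxTopK }) {
    if (this._models.has(id)) throw new Error(`Model "${id}" is already registered`);
    this._models.set(id, { id, version, file, loraFile, defaultTemperature, defaultTopK, maxTopK });
  },

  // Called by web apps to enumerate available model ids (proposed listModels).
  async listModels() {
    return [...this._models.keys()];
  },

  // Called by web apps to read metadata (proposed textModelInfo). File URLs
  // are stripped here so an extension origin never leaks to the page —
  // an assumption, not part of the original proposal.
  async textModelInfo(id) {
    const m = this._models.get(id);
    if (!m) throw new Error(`Unknown model "${id}"`);
    const { file, loraFile, ...info } = m;
    return info;
  },
};

ai.registerModel({
  id: 'phi-3-mini',
  version: '3.0',
  file: 'chrome-extension://azipopnxdpcknwapfrtdedlnjjkmpnao/phi-3-mini.bin',
  defaultTemperature: 0.5,
  defaultTopK: 3,
  maxTopK: 10,
});

ai.listModels().then(console.log); // logs ['phi-3-mini']
```

One design question this sketch surfaces: whether `textModelInfo` should expose the model's file location at all, or only the tuning metadata, as shown in the expected result above.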
@captainbrosset

This is more or less what VS Code does. See https://code.visualstudio.com/api/extension-guides/language-model and https://code.visualstudio.com/api/references/vscode-api#lm

VS Code extension developers can use LLMs by first choosing a model from a predefined list with selectChatModels.
The LLMs themselves are contributed by other extensions, although I don't think the docs explain how yet.

@KenjiBaheux

Thanks for the detailed proposal.

While enabling developers to register custom LLMs via web extensions offers interesting possibilities, we need to carefully consider the implications of strong identifiers. They might limit flexibility: an explosion of models (or versions of a given model) when what's already available could be sufficient, or being stuck with what was popular at a given time after better options have become available. There is also portability across browsers to consider; for example, a chrome-extension://[id] URL may only make sense in Chrome. Especially with large models, it seems important to minimize over-reliance on a specific version of a model, or on a specific "location" (an origin or an extension ID) for the model.
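The concern about strong identifiers suggests selecting a model by requirements rather than by a hard-coded id. A hedged sketch of that alternative — every name here (`selectModel`, `minTopK`, the sample list) is illustrative and not part of any proposal:

```js
// Illustrative only: pick a model by capability instead of pinning an id,
// so pages keep working as the set of installed models evolves.
function selectModel(available, { minTopK = 1 } = {}) {
  // Take the first listed model that meets the requirements, rather than
  // demanding a specific model id or version.
  return available.find((m) => m.maxTopK >= minTopK) ?? null;
}

// Sample metadata in the shape of the proposed textModelInfo() result.
const available = [
  { id: 'gemini-nano', version: '1.0', maxTopK: 8 },
  { id: 'phi-3-mini', version: '3.0', maxTopK: 10 },
];

console.log(selectModel(available, { minTopK: 10 })?.id); // 'phi-3-mini'
console.log(selectModel(available, { minTopK: 99 }));     // null
```

This mirrors the VS Code pattern mentioned above, where selectChatModels takes a selector and the extension works with whichever matching model is available.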

See the related discussion in issue #5, which goes beyond the built-in AI APIs. We encourage you to engage there to contribute to a more future-proof solution.
