
> For the current version, we are using a Mistral LLM (Mistral 7B) hosted within Mozilla’s GCP instance.

Based on this I would assume they are using GCP Vertex AI, as that's going to be WAY cheaper than rolling it all themselves and hosting the model on a GCP server instance. I would also assume they'd be using the gcloud SDK for Vertex AI/Model Garden, which I believe means they can't just swap in a different endpoint and payload shape if you had a service hosted elsewhere.

Either way, at the (presumed) scale they'll probably also be using GCP's API management service, so I would expect further abstraction between what the extension is sending and what the model/Vertex AI expects as a payload. This means providing that kind of "bring your own endpoint" experience would require more bespoke build-out time.

BUT who knows? Maybe this is just straight up hitting the out-of-the-box GCP Vertex AI REST API directly from the extension like some hobby project.
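If it were doing that, the client side would look roughly like the sketch below. This is a minimal illustration of Vertex AI's generic `:predict` REST shape, assuming a Model Garden endpoint deployment; the project ID, region, endpoint ID, and instance schema here are all placeholders, not anything confirmed about Mozilla's setup:

```python
import json
import urllib.request


def build_predict_request(project: str, region: str, endpoint_id: str, prompt: str):
    """Build the URL and JSON body for a Vertex AI endpoint :predict call.

    All identifiers are placeholders; a real deployment would also sit
    behind auth and (as noted above) likely an API management layer.
    """
    url = (
        f"https://{region}-aiplatform.googleapis.com/v1/"
        f"projects/{project}/locations/{region}/"
        f"endpoints/{endpoint_id}:predict"
    )
    # Instance schema varies by model; a bare prompt field is a common shape.
    body = {"instances": [{"prompt": prompt}]}
    return url, body


def send(url: str, body: dict, access_token: str) -> dict:
    """POST the request with an OAuth bearer token (requires network access)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


url, body = build_predict_request(
    "my-project", "us-central1", "1234567890", "Summarize this page"
)
```

Shipping that directly in an extension would also mean shipping credentials to the client, which is one more reason a backend in between is the safer bet.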



Would have been cool if you could connect it to a locally running Ollama instance.
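The client side of that would be simple, since Ollama exposes a local HTTP API. A minimal sketch, assuming a default Ollama install listening on localhost:11434 with a Mistral model already pulled:

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_ollama_request(prompt: str, model: str = "mistral") -> dict:
    """JSON body for /api/generate; stream=False returns a single JSON object."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt: str, model: str = "mistral") -> str:
    """Send the prompt to a running local Ollama instance and return its reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_ollama_request(prompt, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]
```

The hard part isn't the request shape, it's the extension letting you point it anywhere other than its baked-in backend.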

You mention GCP Vertex AI, is that something like MS Copilot Studio?


It's Google Cloud Platform's "AI" service[0], so actually more analogous to what is now called Azure AI Foundry[1], and what used to be called Azure OpenAI Studio.

Microsoft Copilot Studio[2] (formerly Power Platform Power Virtual Agents) imho is unique in its enterprise AI offering. I truly think Copilot Studio is going to be Microsoft's "killer app" when it comes to companies utilizing AI at scale internally, and not its Azure service.

0. https://cloud.google.com/vertex-ai

1. https://ai.azure.com/

2. https://copilotstudio.microsoft.com



