It depends. The key for their vibe-cli is actually different: you need a separate key if you have a subscription and don't want to pay API usage prices.
We forked excalidraw a while ago to allow running it without Firebase as a backend. It can already be self-hosted. It needs some love, but it's a good starting point:
In Ollama, how do you set up a larger context, and how do you figure out what settings to use? I've yet to find a good guide, and I'm not sure how to work out what those settings should be for each model.
There's a context length setting, but how does that relate to input length and output length? Should I just make the numbers match, i.e. 32k is 32k? Any pointers?
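Not the author, but a hedged sketch of how this is commonly configured: Ollama takes per-request `options` on its `/api/generate` endpoint, where `num_ctx` is (to my understanding) the *total* token window, shared by the prompt and the generated output, while `num_predict` separately caps output length. So "32k" is one combined budget, not matching input and output numbers. The model name below is just a placeholder.

```python
import json

# Hedged sketch of Ollama's per-request options for /api/generate.
# Assumption: num_ctx is the total token window (prompt + output share
# it), and num_predict optionally caps how many tokens are generated.
payload = {
    "model": "qwen2.5:14b",      # placeholder model name
    "prompt": "Summarize the following ...",
    "options": {
        "num_ctx": 32768,        # total context window in tokens
        "num_predict": 4096,     # optional cap on generated tokens
    },
}
body = json.dumps(payload)
# POST this body to http://localhost:11434/api/generate
```

The same `num_ctx` can be baked into a model permanently with a Modelfile (`PARAMETER num_ctx 32768` after the `FROM` line), which avoids having to pass it on every call.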
Ollama breaks for me: if I manually set the context higher, the next API call from clone resets it back.
And Ollama keeps unloading it from memory every 4 minutes.
LM Studio with MLX on Mac performs perfectly, and I can keep the model in RAM indefinitely.
Ollama's keep_alive is broken in that a new REST API call resets it afterwards. I'm surprised it's this glitchy with longer-running calls and custom context lengths.
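One hedged workaround sketch for the unloading complaint above: `keep_alive` is a per-request field, so any client call that omits it falls back to the default (around five minutes) and restarts the unload timer. Passing `keep_alive` on every request, or sending an empty-prompt "ping" like the one below, should keep the model resident; `-1` means keep it loaded indefinitely. Model name is a placeholder.

```python
import json

# Hedged sketch: keep_alive is per-request in Ollama, so a call that
# omits it resets the unload timer to the default (~5 minutes). An
# empty prompt loads/keeps the model without generating anything.
ping = {
    "model": "qwen2.5:14b",   # placeholder model name
    "prompt": "",             # empty prompt: just (re)load the model
    "keep_alive": -1,         # -1 = keep the model in memory indefinitely
}
body = json.dumps(ping)
# POST this body to http://localhost:11434/api/generate
```

The catch, as noted above, is that this only holds as long as every intervening client call also sets `keep_alive`; one request without it reverts to the default.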
> In a Discord voice chat on March 13th, 2025, Michel Becker revealed that a documentary film explaining the solution would be shown in cinemas around France on May 2nd, 2025. He hoped a broadcast in other countries would follow.