I hope you are kidding. How is that a test of any capability? It's a miracle that any model can spell "strawberry" at all, because it cannot see the actual characters, and the word is also likely misspelled a lot in the corpus. I've been playing with this model and I'm pleasantly surprised; it certainly knows a lot, quite a lot for 1.1G.
Does asking it to think step by step, or character by character, improve the answer? It might be tokenization plus unawareness of its own tokenization shortcomings.
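For reference, the task itself is trivial once the characters are actually visible; a small Python sketch of what the "character by character" prompt is trying to get the model to do:

```python
def count_letter(word: str, letter: str) -> int:
    """Count occurrences of a letter, one character at a time --
    the view a subword-tokenized model never gets directly."""
    return sum(1 for ch in word.lower() if ch == letter.lower())

# Spelling the word out first is the same trick a step-by-step
# prompt tries to induce: "s t r a w b e r r y" -> count the r's.
spelled = " ".join("strawberry")
print(spelled)                           # s t r a w b e r r y
print(count_letter("strawberry", "r"))   # 3
```

A model that sees the token pieces `straw` + `berry` has to have memorized the spelling; one prompted to spell first effectively reconstructs this loop in text.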
Not today, but the architecture isn't fundamentally incompatible. The page grouping and seekable compression would translate well to browser fetch + range GETs. It would need a new storage backend targeting OPFS/fetch instead of S3/disk. I'm happy to discuss more if you'd like to open a GitHub issue; abstracting the storage API seems like a decent idea in itself.
Correct, it will be handled via a backend that just proxies requests to R2 (it's S3-compatible). I already have something working with vhttpfs, but you have implemented some great ideas, like optimizing request count instead of byte efficiency. I wanted something like that in vhttpfs, but it would become another project for me.
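The "optimize request count instead of byte efficiency" idea can be sketched as merging nearby byte ranges before issuing range GETs, deliberately over-fetching a few bytes to save round trips. Names and the gap threshold here are illustrative, not the actual vhttpfs or parent project API:

```python
def coalesce_ranges(ranges, max_gap=64 * 1024):
    """Merge byte ranges whose gap is <= max_gap so that one
    Range GET covers several pages: fewer requests at the cost
    of some wasted bytes in between."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(r) for r in merged]

# Two nearby page reads collapse into one request; the distant
# one stays separate.
print(coalesce_ranges([(0, 100), (150, 300), (500_000, 500_100)]))
# [(0, 300), (500000, 500100)]
```

Tuning `max_gap` is the whole trade-off: raise it and request count drops while wasted bytes grow, which is usually the right trade over high-latency HTTP.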
I think it can be a great frontend tool as well, since you have decoupled the storage from the DB engine.
I wonder what the CAPTCHA equivalent is for AI bots? Ask about taboo topics to rule out commercial models, and ask specific reasoning questions that trip up AI, like walking vs. driving to the car wash? Or your own set?
I am looking for something like this, but for cheap used phones I can give to kids without internet: all the books, offline maps, Wikipedia, and a basic LLM. They would have a complete environment to explore depending on their curiosity. Is there something like this? Otherwise I am thinking of creating my own collection and open-sourcing it.
Impressive. I wish someone would take a stab at using this technique on mobile GPUs; even if it doesn't use storage, it would still be a win. I am running llama.cpp on an Adreno 830 with OpenCL and I'm getting a pathetic 2-3 t/s for output tokens.
Do you use the same accounts? How do you make sure that ChatGPT/Gemini etc. don't personalize the queries when used with the same account? Also, responses change based on location and IP (residential IPs are treated differently).
This is actually a fundamental limitation of prompt-monitoring approaches — personalization, location variance, account history all introduce noise that's hard to control.
One alternative is page-level structural analysis: instead of asking ChatGPT "do you cite this site?", you analyze the page directly for the signals that predict citation — source density, answer structure, fluency, statistics. No account needed, no IP variance, fully reproducible.
That's the approach I took with writeseo.vercel.app/geo-check — based on the Princeton KDD research (same paper vincko linked above). Different layer of the problem, but more stable as a diagnostic.
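A toy version of that page-level scoring might look like the following; the signals are the ones named above (source density, statistics, answer structure), but the regexes and the exact heuristics are made up for illustration, not the actual geo-check implementation:

```python
import re

def citation_signals(text: str) -> dict:
    """Rough proxies for 'citation-worthiness' signals on a page:
    quoted sources, statistics, and sentence count as a stand-in
    for answer structure. Heuristics are illustrative only."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
    stats = re.findall(r"\d+(?:\.\d+)?%?", text)
    quotes = re.findall(r'"[^"]+"', text)
    return {
        "sentences": len(sentences),
        "statistics": len(stats),
        "quotations": len(quotes),
    }

page = 'Adoption grew 42% in 2023. "It changed our workflow," one user said.'
print(citation_signals(page))
```

The appeal of this layer is exactly what the comment claims: the function is deterministic over the page content, so there is no account, IP, or personalization noise to control for.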
Indeed. What is worse is the expectation, created by rich people, that whatever little value you did create should be given away for free! I see it frequently on HN with product launches, where people demand that the product be open source with a liberal license, which effectively means it should be free.
Nice project, but it would be good if you could make other codecs like H.264 optional; it would increase adoption and help battle-test the entire framework.
Thanks for the suggestion. H.264 Baseline/High profile patents are largely expiring between 2027–2028, with a few stragglers potentially lasting until 2030. We're tracking this closely — once the patent landscape is clear, we plan to offer H.264/H.265 as an optional feature. In the meantime, AV1 matches or exceeds H.265 in compression efficiency, so for greenfield projects there's little reason to reach for the legacy codecs.
The problem is that users will still have files in the MP4+H.264 combo, and we cannot dictate that. But I think I can work around it by using the system's default codecs via another library.
EasyAnalytica.com
It lets you view all your dashboards in one place. Dashboard creation is three steps: point to a file, confirm the data source, choose a template, and done. Supports CSV/JSON files, local/remote URLs, Google Sheets, and APIs with bearer auth.
I have also started experimenting with the Qwen3.5 0.8B model. My goal is to create agents with small models that are as robust as their commercial counterparts for specialized tasks; currently I'm trying it for file editing.