I hope you are kidding. How is that a test of any capability? It's a miracle that any model can spell "strawberry" at all, because it cannot see the actual characters, and the word is also likely misspelled a lot in the corpus. I've been playing with this model and I'm pleasantly surprised; it certainly knows a lot, quite a lot for 1.1G.
Does asking it to think step by step, or character by character, improve the answer? It might be tokenization plus unawareness of its own tokenization shortcomings.
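For reference, the task itself is trivial once the characters are actually visible; a small Python sketch of what the "character by character" prompt is trying to get the model to do:

```python
def count_letter(word: str, letter: str) -> int:
    """Count occurrences of a letter, one character at a time --
    the view a subword-tokenized model never gets directly."""
    return sum(1 for ch in word.lower() if ch == letter.lower())

# Spelling the word out first is the same trick a step-by-step
# prompt tries to induce: "s t r a w b e r r y" -> count the r's.
spelled = " ".join("strawberry")
print(spelled)                           # s t r a w b e r r y
print(count_letter("strawberry", "r"))   # 3
```

A model that sees the token pieces `straw` + `berry` has to have memorized the spelling; one prompted to spell first effectively reconstructs this loop in text.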
Not today, but the architecture isn't fundamentally incompatible. The page grouping and seekable compression would translate well to browser fetch + range GETs. It would need a new storage backend targeting OPFS/fetch instead of S3/disk. I'm happy to discuss more if you'd like to open a GitHub issue; abstracting the storage API seems like a decent idea in itself.
Correct, it will be handled via a backend that just proxies requests to R2 (it's S3-compatible). I already have something working with vhttpfs, but you have implemented some great ideas, like optimizing request count instead of byte efficiency. I wanted something like that in vhttpfs, but it would become another project for me.
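The "optimize request count instead of byte efficiency" idea can be sketched as merging nearby byte ranges before issuing range GETs, deliberately over-fetching a few bytes to save round trips. Names and the gap threshold here are illustrative, not the actual vhttpfs or parent project API:

```python
def coalesce_ranges(ranges, max_gap=64 * 1024):
    """Merge byte ranges whose gap is <= max_gap so that one
    Range GET covers several pages: fewer requests at the cost
    of some wasted bytes in between."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(r) for r in merged]

# Two nearby page reads collapse into one request; the distant
# one stays separate.
print(coalesce_ranges([(0, 100), (150, 300), (500_000, 500_100)]))
# [(0, 300), (500000, 500100)]
```

Tuning `max_gap` is the whole trade-off: raise it and request count drops while wasted bytes grow, which is usually the right trade over high-latency HTTP.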
I think it can be a great frontend tool as well, since you have decoupled the storage from the DB engine.
I wonder what the CAPTCHA equivalent is for AI bots? Ask about taboo topics to rule out commercial models, and ask specific reasoning questions that trip up AI, like walking vs. driving to the car wash? Or your own set?
I am looking for something like this, but for cheap used phones I can give to kids without internet: all the books, offline maps, Wikipedia, and a basic LLM. They would have a complete environment to explore depending on their curiosity. Is there something like this? Otherwise I am thinking of creating my own collection and open-sourcing it.
Impressive. I wish someone would take a stab at using this technique on mobile GPUs; even if it doesn't use storage, it would still be a win. I am running llama.cpp on an Adreno 830 with OpenCL and I'm getting a pathetic 2-3 t/s for output tokens.
Do you use the same accounts? How do you make sure that ChatGPT/Gemini etc. don't personalize the queries when used with the same account? Also, responses change based on location and IP (residential IPs are treated differently).
This is actually a fundamental limitation of prompt-monitoring approaches — personalization, location variance, account history all introduce noise that's hard to control.
One alternative is page-level structural analysis: instead of asking ChatGPT "do you cite this site?", you analyze the page directly for the signals that predict citation — source density, answer structure, fluency, statistics. No account needed, no IP variance, fully reproducible.
That's the approach I took with writeseo.vercel.app/geo-check — based on the Princeton KDD research (same paper vincko linked above). Different layer of the problem, but more stable as a diagnostic.
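A toy version of that page-level scoring might look like the following; the signals are the ones named above (source density, statistics, answer structure), but the regexes and the exact heuristics are made up for illustration, not the actual geo-check implementation:

```python
import re

def citation_signals(text: str) -> dict:
    """Rough proxies for 'citation-worthiness' signals on a page:
    quoted sources, statistics, and sentence count as a stand-in
    for answer structure. Heuristics are illustrative only."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
    stats = re.findall(r"\d+(?:\.\d+)?%?", text)
    quotes = re.findall(r'"[^"]+"', text)
    return {
        "sentences": len(sentences),
        "statistics": len(stats),
        "quotations": len(quotes),
    }

page = 'Adoption grew 42% in 2023. "It changed our workflow," one user said.'
print(citation_signals(page))
```

The appeal of this layer is exactly what the comment claims: the function is deterministic over the page content, so there is no account, IP, or personalization noise to control for.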
Indeed. What is worse is the expectation, created by rich people, that whatever little value you did create should be given away for free! I see it frequently on HN with product launches, where people demand that the product be open source with a liberal license, which effectively means it should be free.
Nice project, but it would be good if you could make other codecs like H.264 optional; it would increase adoption and help battle-test the entire framework.
Thanks for the suggestion. H.264 Baseline/High profile patents are largely expiring between 2027–2028, with a few stragglers potentially lasting until 2030. We're tracking this closely — once the patent landscape is clear, we plan to offer H.264/H.265 as an optional feature. In the meantime, AV1 matches or exceeds H.265 in compression efficiency, so for greenfield projects there's little reason to reach for the legacy codecs.
The problem is that users will still have files in the MP4+H.264 combo, and we cannot dictate that. But I think I can work around it by using the system's default codecs via another library.
EasyAnalytica.com
It lets you view all your dashboards in one place. Dashboard creation is three steps: point to a file, confirm the data source, choose a template, and done. Supports CSV/JSON files, local/remote URLs, Google Sheets, and APIs with bearer auth.
I have also started experimenting with the Qwen3.5 0.8B model. My goal is to create agents with small models that are as robust as their commercial counterparts for specialized tasks; currently I'm trying it for file editing.