Hacker News | cphoover's comments

Previously I made a Chrome extension that removes Shorts from the web UI, but I haven't updated it in a while. It basically just inspects the HTML/CSS patterns of the Shorts components and removes them from the page. You could probably code/vibe-code a similar extension in 10 minutes.

A 5-10% drop in accuracy is like the difference between a usable model and an unusable one.

Definitely could be, but in the time I spent talking to the 4-bit models compared to the 16-bit original, they seemed surprisingly capable. I do recommend benchmarking quantized models on the specific tasks you care about.

Yes, but the difference between one model and one 4x larger is usually a lot more than that.

It's not a question of whether to run Qwen 8B at bf16 or a quantized version. It's more a question of whether to run Qwen 8B at full precision or a quantized version of Qwen 27B.

You will find that you are usually better off with the larger model.
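To make that tradeoff concrete, here's a back-of-the-envelope sketch of weight memory at the two precisions. The parameter counts and bit-widths are illustrative assumptions; real quantized formats add some overhead for scales and group metadata, and this ignores KV cache and activations entirely.

```python
def weight_memory_gb(n_params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone."""
    bytes_total = n_params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# An 8B model at bf16 (16 bits/weight) vs. a 27B model at 4-bit.
small_bf16 = weight_memory_gb(8, 16)
large_q4 = weight_memory_gb(27, 4)

print(f"8B  @ bf16 : {small_bf16:.1f} GB")  # ~16.0 GB
print(f"27B @ 4-bit: {large_q4:.1f} GB")    # ~13.5 GB
```

The quantized 27B model actually fits in less memory than the full-precision 8B one, which is why "larger model, quantized" often wins at a fixed memory budget.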


Yes, I was wondering why they mentioned those numbers without mentioning their practical significance.

Block Dude was my favorite.

I'm no expert, but to me, what's particularly silly about "breaking encryption" is that it does nothing to prevent user agents from employing their own encryption layers on top of any messaging system, e.g. GPG/PGP or others. So this does nothing to stop someone who is intent on hiding illegal content, and it decreases security and privacy for the average user.


Is all audio up to “Alexa” still processed on device?


It should, but as far as I know you have zero guarantees about that. I just hope there are privacy organizations and/or hackers that continuously verify these claims. Of course, Amazon can push an update at any time to change this, at which point it'll be too late to think "hmm, 1984 warned us about this".


The insane cost to process all this is the guarantee you need.


“They” have no constitutional authority to just “get rid of income tax.” We live in a republic, NOT a monarchy.


Are you sure about that?


Isn’t that patch just converting inline strings to localized/internationalized strings?


No, it's not changing anything from an internationalization perspective. It's introducing a change to the content, gated by a 'firefox-tou' feature flag, and tagging the old text with an expiration date of 25-04-2025. The effect is that bits like "and we don’t sell your personal data" will be going away.


God this comment thread is a series of unsubstantiated subjective opinions trashing languages that people don’t like and ignoring the faults of the languages they do….

Why must we rehash this type of post every few months?


Why isn’t there a distributed, decentralized, or open index that all of these startups can use? I understand that these startups are each focusing on different problem areas, but doesn’t it make sense to have something like OpenStreetMap, so that all of these companies can pool their compute resources to maintain something competitive with the big guys? Even if it’s not fully decentralized, these startups teaming up to build a bigger shared index makes a lot of sense to me.

I have no knowledge of this field, but something like that seems like it would make sense.


YaCy is still around. While I wouldn't want to disrupt its decentralized/P2P nature, I think there's a case to be made for a community-managed central aggregation server to help seed the index at various snapshots. I might even be interested in helping run such a thing.


A shared index would surely be nice (Common Crawl is perhaps an example of one that could be used), but say you had 10 search engines running from it. One decides a page is very important and updates constantly, so it should be fetched every 30 minutes. Another search engine decides the same page is spam and doesn't need to be recrawled. There are backend choices that affect the shape and crawl directions of the index.

Then there are questions like whether the crawler should render the page (using the final DOM content rather than the original source), whether it does any tokenisation of the content, stores other metrics, etc., or whether that needs to be done by the end search engines.

Also, there are issues with crawling Reddit, sites behind Cloudflare, etc. that others have gone into in more detail elsewhere on this comment page.
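A minimal sketch of the divergence described above: two engines reading the same shared index but attaching incompatible scheduling decisions to it. The engine names, URLs, and intervals are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CrawlPolicy:
    """One engine's opinion about a URL in the shared index.
    recrawl_minutes=None means 'judged spam, never recrawl'."""
    recrawl_minutes: Optional[int]

shared_index = ["https://example.com/news", "https://example.com/old-page"]

# Engine A thinks the news page is hot and the old page is spam.
engine_a = {
    "https://example.com/news": CrawlPolicy(recrawl_minutes=30),
    "https://example.com/old-page": CrawlPolicy(recrawl_minutes=None),
}
# Engine B disagrees on both counts.
engine_b = {
    "https://example.com/news": CrawlPolicy(recrawl_minutes=24 * 60),
    "https://example.com/old-page": CrawlPolicy(recrawl_minutes=7 * 24 * 60),
}

for url in shared_index:
    print(url, "A:", engine_a[url].recrawl_minutes, "B:", engine_b[url].recrawl_minutes)
```

Because the policies live outside the shared index, either the index ends up crawled to the union of everyone's schedules (expensive) or each engine still needs its own crawler for the pages it cares about most.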


Pretty much exactly what I have been thinking lately. Wrote about it recently here: https://nadh.in/blog/decentralised-open-indexes/


Wouldn't something like snapshot testing from a scheduled probe be more effective and reliable than using an LLM?

Every X hours, test the endpoints and validate that the types and field names are consistent... If they change, trigger some kind of alerting mechanism for the user.
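A minimal sketch of that idea in Python. The JSON payloads are made up; a real probe would fetch the endpoint on a schedule and persist the snapshot somewhere durable.

```python
import json

def schema_of(value):
    """Recursively reduce a JSON value to a structural snapshot:
    field names and type names, with the actual values discarded."""
    if isinstance(value, dict):
        return {k: schema_of(v) for k, v in sorted(value.items())}
    if isinstance(value, list):
        return [schema_of(value[0])] if value else []
    return type(value).__name__

def snapshot_changed(saved_snapshot: str, response_body: str) -> bool:
    """Compare a stored snapshot against a fresh probe of the endpoint."""
    return json.loads(saved_snapshot) != schema_of(json.loads(response_body))

# First probe: store the structural snapshot.
baseline = json.dumps(schema_of(json.loads('{"id": 1, "name": "a", "tags": ["x"]}')))

# Later probe: a field was renamed, so the schema differs -> alert.
print(snapshot_changed(baseline, '{"id": 1, "full_name": "a", "tags": ["x"]}'))  # True
# New values under the same shape do not trigger an alert.
print(snapshot_changed(baseline, '{"id": 2, "name": "b", "tags": ["y"]}'))  # False
```

Note this only catches structural drift (names and types); a value that silently changes meaning while keeping its shape would slip through, which is the gap the reply below points at.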


If the types and field names change, our parsing script should be able to detect that, so that case is covered. I was talking about handling the subtle changes that are undetectable by checking field types and names.

