Hacker News

Agreed and not to mention that most people won't use anything close to the 50GB, combine that with the deduplication plans they have and you have a very different scenario.


They say all files will be client-side encrypted, which shouldn't allow them to "deduplicate" anything.


There was a discussion the other day about how this is possible: a key would be derived from a hash of the file, which would allow for both encryption and deduplication. Let me see if I can find it.


I understand this would be possible, but then they are essentially lying about their privacy claims.


Doesn't seem like an unreasonable assumption to me.


Dropbox used to say: "All files stored on Dropbox servers are encrypted (AES256) and are inaccessible without your account password." even though the files were accessible without the account password.

Mega might be doing the same thing: saying one thing to attract early adopters, and changing the marketing language once it gets broader adoption by people who don't care about that attribute. It's dishonest, but it would hardly be shocking to learn that a multiple felon was being dishonest.


Just because a file is encrypted doesn't mean that a small block can't match another encrypted block, even if the two come from different files.

This would allow them to do deduplication at the block level (see ZFS for example).


Think about that one for a second. An encrypted block is supposed to look like random data. That is, if two people encrypt the same file with different keys, you shouldn't be able to tell that they're the same file (or your encryption sucks). So your block-level de-duping would depend on incidental matches between random data.

What's the probability of two 4KB (or whatever) blocks of random data being identical? Basically zero even with petabytes of data.
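The "basically zero" claim can be checked with a birthday-bound estimate: among n uniformly random b-bit blocks, the expected number of colliding pairs is roughly n²/2^(b+1). A quick stdlib-only sketch for 4 KB blocks over a petabyte of data (block size and data volume are illustrative assumptions):

```python
import math

BLOCK_BITS = 4096 * 8           # one 4 KB block, in bits
PETABYTE = 2 ** 50              # bytes
n_blocks = PETABYTE // 4096     # number of 4 KB blocks in a petabyte

# Birthday bound: expected colliding pairs among n random b-bit blocks
# is about n^2 / 2^(b+1). Work in log2 to avoid astronomical numbers.
log2_expected = 2 * math.log2(n_blocks) - (BLOCK_BITS + 1)
print(f"log2(expected collisions) ~ {log2_expected:.0f}")
```

The result is on the order of 2^-32693 expected collisions, i.e. no accidental matches will ever occur, supporting the point that deduplicating properly encrypted blocks is hopeless.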


see: convergent encryption

The encryption on the client doesn't use a random key. The key is a hash of the unencrypted contents of the file.
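A minimal sketch of that idea (stdlib-only; a toy SHA-256 XOR keystream stands in for a real cipher, so this is not production crypto): because the key is a hash of the plaintext, two independent uploads of the same file produce byte-identical ciphertexts that the server can deduplicate without reading either copy.

```python
import hashlib

def convergent_encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """Toy convergent encryption: the key is a hash of the plaintext,
    so identical files always yield identical ciphertexts.
    (Sketch only -- a real system would use AES, not an XOR keystream.)"""
    key = hashlib.sha256(plaintext).digest()
    stream = b""
    counter = 0
    while len(stream) < len(plaintext):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    return key, ciphertext

# Two users encrypting the same file independently derive the same key
# and the same ciphertext, so the server can store it once.
k1, c1 = convergent_encrypt(b"the same episode rip")
k2, c2 = convergent_encrypt(b"the same episode rip")
assert k1 == k2 and c1 == c2
```

The trade-off is exactly what the thread goes on to discuss: determinism is what makes deduplication work, and also what leaks information.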


Reading list-ed. Interesting.


The issue is that you don't get the full benefits of encryption.

If you upload the map to the rayiner family treasure, which only you have seen, you're good. No one else will be able to read it.

But if you upload the latest episode of Modern Family and Disney gets ahold of the same rip you used, they can (if they can get a government to help them out) see what you did and charge you with copyright infringement (or whatever the appropriate offense would be).
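That confirmation risk follows directly from determinism: the deduplication tag is a function of the file alone (the exact tag construction below is an illustrative assumption), so anyone who holds the same rip derives the same tag and can ask the provider who stored it.

```python
import hashlib

# Under convergent encryption, the dedup tag depends only on the file's
# contents, so anyone with an identical copy can recompute it.
def dedup_tag(contents: bytes) -> str:
    return hashlib.sha256(contents).hexdigest()

# Hypothetical rip held by both the uploader and the rights holder.
rip = b"...identical video bytes..."
servers_tag = dedup_tag(rip)     # what the provider indexes for dedup

# The rights holder derives the same tag from its own copy and can
# demand to know which accounts stored that tag.
studio_tag = dedup_tag(rip)
assert studio_tag == servers_tag
```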


After 3-4 years of high-profile CPA-2 attacks on TLS, .NET, Java, and other systems, you'd think we'd all be a lot more skeeved out by cryptosystems that demand known plaintexts. There's already an obvious conceptual attack (beyond file confirmation) on naive "convergent encryption": you can leverage small amounts of known plaintext to learn unknown plaintext.
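That known-plaintext leverage can be sketched as an offline guessing attack: if most of a document is known and the unknown part is low-entropy, an attacker enumerates candidates and compares convergent tags, with no decryption required (the document template and values below are hypothetical):

```python
import hashlib

def dedup_tag(doc: bytes) -> str:
    # convergent scheme: the tag (and key) depend only on the plaintext
    return hashlib.sha256(doc).hexdigest()

# A mostly known document with one low-entropy secret field.
secret_doc = b"Offer letter: salary = $83,000"
observed_tag = dedup_tag(secret_doc)   # visible to the storage provider

# Anyone who knows the template just enumerates the unknown part and
# compares tags offline until one matches.
recovered = None
for k in range(50, 200):
    guess = f"Offer letter: salary = ${k},000".encode()
    if dedup_tag(guess) == observed_tag:
        recovered = guess
        break
```

This is why later convergent-encryption designs mix in a per-user or per-group secret, at the cost of losing cross-user deduplication for exactly those files.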


In practice, any strong encryption (i.e., not AES-ECB) is indistinguishable from random noise, and that's by design. Even trying to deduplicate 4 KB blocks of random noise would be a completely fruitless task. If it were possible, storage would probably still be cheaper than the CPU time needed to find matching blocks.


I don't think this is true. 50 GB of free storage per user would be cost-prohibitive without (at least) block-level deduplication.


Some kind of homomorphic encryption scheme maybe?



