I keep envisioning schemes where your last-resort backup of your media archive is just a list of file names and content hashes, and if you lose your copies you can just use a cloud service to retrieve new files with those hashes.
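To make the "list of file names and content hashes" idea concrete, here is a minimal sketch of building that master list. The choice of SHA-256, the JSON output, and the helper names (`file_sha256`, `build_manifest`, `manifest_to_json`) are my own assumptions for illustration, not part of any existing service.

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large media files are never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    return {
        path.relative_to(root).as_posix(): file_sha256(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def manifest_to_json(manifest: dict[str, str]) -> str:
    """Serialize the manifest; this small text file is the last-resort backup."""
    return json.dumps(manifest, indent=2, sort_keys=True)
```

Something like `print(manifest_to_json(build_manifest(Path("media"))))` would then give you the tiny file you actually need to keep safe. Retrieving the content by hash is the part that needs the network, and that is not sketched here.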
Suggest it to the folks who run The Pirate Bay.
They probably know about it already. I think the eDonkey network is pretty much what I envision. The problem is that the network needs to be both very comprehensive and long-lived before you can reliably expect it to find someone’s copy of most of the obscure downloads you want to hang on to, and things that people try to sue into oblivion whenever they get too big have trouble being either. There’s also the matter of agreeing on the hash function to use, since hash functions come with a shelf life. A system made in the 90s that uses MD5 might be vulnerable nowadays to bots that use hash collision attacks to serve garbage matching the known hashes. (eDonkey uses MD4, which seems to be vulnerable in much the same way as MD5.)
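One way to live with the shelf-life problem is to store the algorithm name next to each digest, so the list can move to a newer hash later and anything fetched under a retired one gets refused. A rough sketch, assuming a made-up `algorithm:hexdigest` string format and hypothetical helper names (`tagged_hash`, `matches`); this is not how eDonkey itself works:

```python
import hashlib

# Algorithms this sketch is willing to trust; MD4 and MD5 are deliberately absent.
ALLOWED_ALGORITHMS = ("sha256", "sha512")


def tagged_hash(data: bytes, algorithm: str = "sha256") -> str:
    """Return a digest prefixed with its algorithm name, e.g. 'sha256:ab12...'."""
    return f"{algorithm}:{hashlib.new(algorithm, data).hexdigest()}"


def matches(data: bytes, expected: str, allowed=ALLOWED_ALGORITHMS) -> bool:
    """Check retrieved bytes against a tagged hash, rejecting retired algorithms."""
    algorithm, _, hex_digest = expected.partition(":")
    if algorithm not in allowed:
        return False  # never trust content vouched for only by a retired hash
    actual = hashlib.new(algorithm, data).hexdigest()
    return actual == hex_digest.lower()
```

A download is only accepted when `matches(downloaded_bytes, entry_from_list)` returns true, so garbage served by a bot is thrown away as long as the hash function in use still holds up.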
There’s an entire field called named data networking that deals with similar ideas.
There are probably parts of the problem that are cultural rather than technical, though. People aren’t in the mindset of treating their media archive as a tiny master list of hash metadata, with the actual files as cached representations, so there isn’t the demand or the network-effect potential for a widely used system that does that.
Zooko did this: Tahoe-LAFS
You can safely use it for private files too; just don’t lose your pre-encryption hashes.