Re: RealBay: A searchable torrent index
Posted: Fri Jan 23, 2015 7:13 pm
Where can I grab the source to this, or what build should I start from to try a similar approach? I haven't run the numbers yet, but I could probably process the entire blockchain, since it's actually pretty small, and see how large a bloom filter lookup would be. I know there is work in Namecoin to replicate some of the bloom filter stuff happening in Bitcoin, but I don't know the status of that work, and to be frank the Bitcoin work is still not in the reference client.
What you're talking about is known as a "FN36 (full node 36 kiloblock) client" (at least that's what we're calling it now; expect a blog post formalizing this terminology soon). It's only necessary to process 36 kiloblocks of history, since name outputs (except name_new) are always spent after at most 36 kiloblocks. I don't think anyone's developed this yet, but it's probably not incredibly hard. I have a mostly working "FN36-ABR (full node 36 kiloblock all block receive) client", which has to download the full chain but only stores the most recent 36 kiloblocks. Once I've had some time to finish it up, assuming that no showstoppers are found, I'll release it along with a blog post explaining how it works. The storage used by the FN36-ABR client is about 107 MiB.
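To make the "download everything, store only the last 36 kiloblocks" idea concrete, here is a rough Python sketch of just the storage side. It is not the FN36-ABR client's actual design (that has not been released); the class name, the dict-shaped blocks, and the tiny demo window are all made up for illustration, and a real client would also have to validate blocks and track unspent name outputs.

```python
from collections import deque

WINDOW = 36_000  # 36 kiloblocks, the retention window discussed above


class RollingBlockStore:
    """Hypothetical store: sees every block in order, keeps only the newest ones."""

    def __init__(self, window=WINDOW):
        # A deque with maxlen silently drops the oldest entry when full.
        self.blocks = deque(maxlen=window)
        self.height = -1  # height of the newest block received so far

    def receive(self, block):
        """Accept the next block in the chain (validation is omitted here)."""
        self.height += 1
        self.blocks.append(block)

    def oldest_height(self):
        """Height of the oldest block still kept on hand."""
        return self.height - len(self.blocks) + 1


# Demo with a window of 3 instead of 36,000 so the effect is visible.
store = RollingBlockStore(window=3)
for h in range(5):
    store.receive({"height": h})
print(store.oldest_height(), [b["height"] for b in store.blocks])
```

With the demo window of 3, only the three newest blocks remain after five have been received; everything older has been discarded even though it was downloaded.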
That's understandable. If possible, I want to present two options to my users: one for users who are concerned about being tracked and would opt to run a full Namecoin client, and one for users who are less concerned about being tracked but upset that their favorite torrent site is down.
However, for full anonymity of lookups, you would need at least a FN36 client.
Does Namecoin have any indexing happening on its own? I am interested in coming up with ways to make it faster to search for various use cases, even if it means keeping the data out of band. I think a bloom filter lookup table could be constructed without too much trouble, which would at least help you find which blocks have which text in them.
I think name_filter is doing regexp operations on every name. Needless to say, that's very inefficient. I wouldn't object to a simple prefix search if it speeds things up a lot (which I assume it would). Depending on what you're using name_filter for, it's also plausible that an index of namespaces could help.
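As a sketch of why a prefix search can be much cheaper than running a regexp over every name: if the names are kept sorted, a binary search finds the first possible match and the scan stops at the first name that no longer has the prefix. This is plain Python using the standard bisect module, not anything from the Namecoin codebase; the function name and the sample names are made up for illustration.

```python
import bisect


def prefix_search(sorted_names, prefix):
    """Return all names starting with `prefix` from an already sorted list."""
    # Every name with this prefix sorts at or after the prefix itself,
    # and all such names sit next to each other in sorted order.
    i = bisect.bisect_left(sorted_names, prefix)
    matches = []
    while i < len(sorted_names) and sorted_names[i].startswith(prefix):
        matches.append(sorted_names[i])
        i += 1
    return matches


# Hypothetical sample data; the list must be sorted once up front.
names = sorted(["d/example", "id/alice", "d/namecoin", "id/bob", "d/realbay"])

print(prefix_search(names, "d/"))
print(prefix_search(names, "id/a"))
```

Searching for a whole namespace such as "d/" is just a prefix search on the namespace string, so the same sorted list doubles as a simple namespace index.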
I think there is a solution to this. I think if all the clients could keep a bloom filter lookup table, they would get what they want from the network without having to keep the entire Namecoin chain on disk. That leads to the problem that there are now clients that don't have the full chain to service the network. It could be possible to ensure that N clients distribute the index between each other, much like torrent seeding. Each user would still not keep much of the Namecoin chain on disk, would still get fast lookups on domains from the bloom lookup, etc.
The downside is that those clients cannot reliably service the network.
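For what it's worth, a bloom filter lookup table of this kind could look roughly like the Python sketch below: one small filter per block, built from the names in that block, so a client can ask "which blocks might mention this name?" without holding the blocks themselves. This is a generic textbook bloom filter, not the one used by Bitcoin or any Namecoin client; the sizes, the hashing scheme, and the sample data are assumptions picked for illustration.

```python
import hashlib


class BloomFilter:
    """Minimal bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=4):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def build_index(names_by_height):
    """Build one filter per block from a {height: [names]} mapping."""
    index = {}
    for height, names in names_by_height.items():
        bf = BloomFilter()
        for name in names:
            bf.add(name)
        index[height] = bf
    return index


def candidate_blocks(index, name):
    """Heights whose filter says the name *may* be present."""
    return [h for h, bf in index.items() if name in bf]


# Hypothetical sample data: block height -> names touched in that block.
index = build_index({100: ["d/example"], 101: ["d/realbay", "id/alice"]})
print(candidate_blocks(index, "d/realbay"))
```

A hit only means the block is worth fetching and checking, since a filter can report a name that is not actually there; a miss is definitive. At the default size above each filter is 128 bytes, so the table stays far smaller than the blocks it summarizes.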
Thanks for helping me, everyone. I look forward to your replies and ideas.