Design choices for SPV name lookups

Post Reply
biolizard89
Posts: 2001
Joined: Tue Jun 05, 2012 6:25 am
os: linux

Design choices for SPV name lookups

Post by biolizard89 »

So I've been tinkering around with name lookups using SPV (coded with BitcoinJ). At the moment, I can look up the value of a name, with what I believe is the same security level that currency lookups have in BitcoinJ for Bitcoin, under the condition that the user provides both the name and the hash of the block where it can be found. (It's probably also feasible to use the block height instead of the block hash; I haven't yet tinkered with this.) The code needs optimization and more testing, but I believe that it is a workable approach.

Unfortunately, that means that we need to, somehow, obtain the block hash or height where the most recent name_anyupdate operation for a given name can be found. There are 4 ways of doing this that I can think of.

1. Ask a semi-trusted party who possesses a full node and who permits such queries. This could be a block explorer, a Namecoin Core node's REST port (behind a reverse HTTP proxy), or some other API.
2. Add a wire protocol command along the lines of Bitcoin XT's "getutxo", which allows lookups by name rather than transaction hash. Note that we don't need this command to reply with any part of the transaction; just the block height or block hash should be sufficient.
3. On initial syncup, use the wire protocol to retrieve all name_anyupdate transactions from the last 36 kiloblocks. Immediately discard the transactions, but maintain an index mapping names to block hashes/heights.
4. On initial syncup, use the wire protocol to retrieve the last 36 kiloblocks in full. Immediately discard the transactions, but maintain an index mapping names to block hashes/heights. (Note: if we wanted, we could modify this method to store the name values, or the scriptPubKey, or the transaction output, or the entire transaction. Any of these would eliminate the need for network traffic on lookup operations, but would use additional storage.)

All 4 of these methods can most likely be run over Tor with reasonable anonymity, as long as the API lookup client or wire protocol client is capable of stream isolation.

All of these methods except for (4) rely on a trusted party to tell us which block contains the latest name_anyupdate for a given name. In (1), the trusted party is the API server. In (2) and (3), the trusted party is the download peer on the Namecoin P2P network. If the trusted party lies, she can undetectably trick us into accepting a name_anyupdate operation which has already been spent but has not expired. She cannot trick us into accepting an expired name_anyupdate, and she cannot trick us into accepting a name_anyupdate which doesn't appear in the blockchain that has the most work. If we ask multiple parties, and at least 1 is honest, it is easy to tell which one is honest. (4) doesn't need a trusted party, but its bandwidth usage is going to be high. (I haven't calculated how high.)

Unfortunately, tricking us into accepting a name_anyupdate which has been spent is really bad. For an example, if Keith steals the keys to Glenn's HTTPS server, and Edward uses an SPV client, Keith then has 8 months where he can MITM Edward's communication with Glenn's server, even though Glenn revoked those keys immediately.

The "right" way to deal with this (i.e. low bandwidth usage and low trust) is a softfork that backs the UTXO set (or UNO set) with PoW. However, the engineering involved is complex. This is also mostly in the scope of Bitcoin rather than Namecoin. (Things that are specific to names are in our scope, but it's still probably a good idea to see what Bitcoin does.)

So, one of the above 4 options will presumably have to do for now.

Some observations about these options:

(1) is relatively easy to implement in the simplest case (calling an HTTP-based API that returns HTML/XML/JSON). I could probably rig this up without much trouble.
(2) requires changes to the wire protocol. Also, getutxo's implementation is disliked by most of the Bitcoin developers.
(3) Uses more storage and bandwidth, since it needs to download some transactions (not just block headers) and store the index.
(4) Uses comparable storage and more bandwidth than (3). According to https://namecoin.webbtc.com/graphs/block_size.png , Namecoin blocks seem by eye to be around 10 kB, so 36 kiloblocks would be circa 360 MB of full blocks that have to be downloaded on syncup, plus an average of 1 kB per minute while the client is running.

There's another major security difference between (1) versus (2) and (3).. Since the P2P network is unauthenticated, it's trivial to Sybil. That means it's a very bad idea to rely on any data obtained from the P2P network which isn't backed by PoW. So it's okay to use the P2P network for looking up names if we already know what block to look in, because that information is backed by PoW (via Merkle proofs). Checking the spentness of a name transaction isn't backed by PoW (meaning that (2) is broken), and neither is verifying that a name transaction is nonexistent (meaning that (3) is broken). For a proof of concept of this attack, see https://github.com/petertodd/bitcoin/co ... e8977bf66c .

In contrast, an API server that uses TLS or .onion cannot be Sybiled, nor easily impersonated or MITMed. If Keith wants to pwn Edward and Edward is using an API server run by Roger, then Keith's only option is to pwn Roger. If Edward is using 5 API servers, run by Roger, Jake, Mike, Alison, and Shari, then Keith must pwn Roger, Jake, Mike, Alison, and Shari before he can pwn Edward. Even better, if the API server uses non-deniable signatures on its responses (e.g. PGP), then it would be possible to publish independently verifiable proof if Roger, Jake, Mike, Alison, or Shari ever provided incorrect information. (Note that if the wire protocol is tunneled through either TLS or .onion, then options (2) and (3) would work just as well as a TLS or .onion API server.)

Put another way, it's generally safer to trust non-anonymous, authenticated entities whose misconduct can be proven, than to put the same level of trust in anonymous entities who are not accountable and who are trivial to impersonate and Sybil. (Obviously, it's even better to not trust any entity. But we don't have that option until PoW-backed proofs of unspentness are a thing.)

This is the same reason that Tor is more resistant to Sybils than I2P is. Tor's directory authorities provide an authentication mechanism for relays, whereas it's much harder for networks like I2P to prevent Sybils.

This is also a reason why Bitcoin client/server systems like Electrum are often considered more secure and private than BitcoinJ (e.g. Tails uses Electrum), although in that case you're trusting the server / P2P nodes with your financial record linkability rather than with the ability to MITM your browser.

So -- my preference is to use method (1) or (4). I can probably produce a working proof of concept of (1) that uses a block explorer without much trouble. I can also probably produce a working proof of concept of (4), although it's going to be more work. I prefer (4) over (1), although I think there are probably some users who can't handle the bandwidth usage of (4). I haven't run any serious testing to see how much extra overhead (4) would have.

I'm curious if others agree with my analysis. Opinions? What should take priority, particularly if a bounty is set up?
Jeremy Rand, Lead Namecoin Application Engineer
NameID: id/jeremy
DyName: Dynamic DNS update client for .bit domains.

Donations: BTC 1EcUWRa9H6ZuWPkF3BDj6k4k1vCgv41ab8 ; NMC NFqbaS7ReiQ9MBmsowwcDSmp4iDznjmEh5

hla
Posts: 46
Joined: Mon Nov 10, 2014 12:01 am
os: linux
Contact:

Re: Design choices for SPV name lookups

Post by hla »

I'm fine with (1), but I think we should get (2) out the door as soon as possible. We shouldn't tie ourselves to the rate of consensus and implementation in Bitcoin Core for the implementation of functionality critical to the advancement of the Namecoin project. The rebasing of Namecoin Core on Bitcoin Core in such a way that the upstream can be continuously tracked was absolutely the right thing to do, but it doesn't mean that Namecoin Core should become an excercise in code golf (or rather, diff golf) for its own sake. It's a nice to have, but is ultimately subservient to the advancement of Namecoin Core to meet the ends of the Namecoin project. So we shouldn't hang around because Bitcoin Core might implement getutxo at some point and we might be able to reuse some code in it for a get-by-name command.

The design I would use is to have a request command (flags, name) where flags is a 0/ignore bitfield specifying the requested response fields. Additional trailing data is ignored. The response takes the form (flags, name, transaction), where flags is a 0/ignore bitfield specifying the provided response fields (which are a subset of those requested).

biolizard89
Posts: 2001
Joined: Tue Jun 05, 2012 6:25 am
os: linux

Re: Design choices for SPV name lookups

Post by biolizard89 »

Hugo and I discussed this topic further on IRC after his post above; I'm reposting with his consent since it's relevant.
#namecoin-dev wrote:May 27 05:45:32 <Jeremy_Rand> hl: I'm curious if you think there's a reason to use (2) (which uses the unauthenticated wire protocol) compared to (1) (which uses an authenticated API server), given the reasons I laid out for avoiding unauthenticated lookups of unverifiable data. Or are you talking about a hypothetical getutxos command that uses UTXO set commitments (which wasn't what I meant by (2) )?
May 27 05:46:18 <Jeremy_Rand> hl: also I'm curious if you have opinions on (1) versus (4), given that (4) should be safe without needing UTXO commitments
May 27 05:46:52 <qpm> freenode:<hl> well
May 27 05:47:14 <qpm> freenode:<hl> The wire protocol extension I propose should be easily extended to UTXO commitments
May 27 05:48:02 <qpm> freenode:<hl> Since (4) requires no protocol-level work, I don't see as exactly relevant; it's nice, but it's not something that needs protocol changes to do. It's nothing new, essentially
May 27 05:48:28 <qpm> freenode:<hl> Hmm
May 27 05:48:52 <qpm> freenode:<hl> Certainly for the time being (1) (or maybe 1+2?) is fine
May 27 05:49:48 <Jeremy_Rand> hl: ignoring the technical debt issues that come with duplicating Bitcoin's work on UTXO commitments, yes, I think a hypothetical getutxos command can be extended to use UTXO commitments later
May 27 05:50:12 <Jeremy_Rand> hl: What's the definition of protocol-level work?
May 27 05:50:12 <qpm> freenode:<hl> Basically 4 is fine, but I see 2 as getting us closer to our end goal, so I'd love to get the infrastructure in place as soon as possible
May 27 05:50:32 <qpm> freenode:<hl> Basically I consider it more important for Namecoin to prioritise work that only it can do, than work that anyone can do, if you see what I'm saying
May 27 05:51:17 <qpm> freenode:<hl> Though the logicality of that may be arguable, given that there are few others doing work on Namecoin without involvement in the project itself
May 27 05:51:36 <Jeremy_Rand> hl: 4 is probably going to have better privacy than 2. Or at least, 4's privacy is going to be a lot easier to reason about than 2's.
May 27 05:51:44 <qpm> freenode:<hl> Also responded to BDB lock limit thread
May 27 05:52:11 <Jeremy_Rand> Err, 4 will have better privacy assuming it saves the name values
May 27 05:52:15 <Jeremy_Rand> Which is relatively easy
May 27 05:52:21 <qpm> freenode:<hl> (4) is fine for people who want to use it.
May 27 05:53:35 <qpm> freenode:<hl> If you're really okay with (1), though, that seems like a good move, since that's very lightweight and easy to implement
May 27 05:54:20 <Jeremy_Rand> hl: yes, I believe that (1) is the most secure thing we can do with the current consensus rules unless we download full blocks
May 27 05:54:26 <Jeremy_Rand> also it's very easy to implement
May 27 05:55:05 <qpm> freenode:<hl> I'd be very happy to see (1) happen, because it would make lightweight resolvers in the short time highly feasible.
May 27 05:56:32 <Jeremy_Rand> hl: so let's say that NMDF or Namecoin Bountysource were to fund work on (1) and/or (4). Should only one of them get funding (if so, which), or should both?
May 27 05:56:49 <qpm> freenode:<hl> (1), definitely
Jeremy Rand, Lead Namecoin Application Engineer
NameID: id/jeremy
DyName: Dynamic DNS update client for .bit domains.

Donations: BTC 1EcUWRa9H6ZuWPkF3BDj6k4k1vCgv41ab8 ; NMC NFqbaS7ReiQ9MBmsowwcDSmp4iDznjmEh5

biolizard89
Posts: 2001
Joined: Tue Jun 05, 2012 6:25 am
os: linux

Re: Design choices for SPV name lookups

Post by biolizard89 »

Does anyone else have opinions about the merits of the various suggestions, particularly in the context of NMDF or Namecoin Bountysource funding?

Also, does it make sense to use BitcoinJ for the verification part of (1)? Doing so allows us to reuse a lot of code if we work on (4) later, and also can probably be reused heavily for a hypothetical getutxos with UTXO commitments later. On the other hand, using Electrum for the verification part of (1) might be slightly cleaner (but also means more work because I have BitcoinJ already working on the Namecoin network and with name scripts, whereas I don't know how much effort would be involved in doing it with Electrum). Electrum also doesn't make sense for UTXO commitments or (4), so it's less reusable than BitcoinJ in my opinion.
Jeremy Rand, Lead Namecoin Application Engineer
NameID: id/jeremy
DyName: Dynamic DNS update client for .bit domains.

Donations: BTC 1EcUWRa9H6ZuWPkF3BDj6k4k1vCgv41ab8 ; NMC NFqbaS7ReiQ9MBmsowwcDSmp4iDznjmEh5

somename
Posts: 80
Joined: Mon Sep 15, 2014 3:12 pm
os: windows

Re: Design choices for SPV name lookups

Post by somename »

biolizard89 wrote:Does anyone else have opinions about the merits of the various suggestions, particularly in the context of NMDF or Namecoin Bountysource funding?

Also, does it make sense to use BitcoinJ for the verification part of (1)? Doing so allows us to reuse a lot of code if we work on (4) later, and also can probably be reused heavily for a hypothetical getutxos with UTXO commitments later. On the other hand, using Electrum for the verification part of (1) might be slightly cleaner (but also means more work because I have BitcoinJ already working on the Namecoin network and with name scripts, whereas I don't know how much effort would be involved in doing it with Electrum). Electrum also doesn't make sense for UTXO commitments or (4), so it's less reusable than BitcoinJ in my opinion.
IMO, as a non-expert: 1 & 4 seem best, but 1 seems significantly easier and faster.

I would not mind to use 4 myself, but I think folks who strongly prefer 4 can run a full node (which I do).

Based on your explanation under 1), it can be made to make use of a local (=trusted) instance of Namecoin, so that would indirectly be similar to 4). One more plus for 1)!

biolizard89
Posts: 2001
Joined: Tue Jun 05, 2012 6:25 am
os: linux

Re: Design choices for SPV name lookups

Post by biolizard89 »

somename wrote:
biolizard89 wrote:Does anyone else have opinions about the merits of the various suggestions, particularly in the context of NMDF or Namecoin Bountysource funding?

Also, does it make sense to use BitcoinJ for the verification part of (1)? Doing so allows us to reuse a lot of code if we work on (4) later, and also can probably be reused heavily for a hypothetical getutxos with UTXO commitments later. On the other hand, using Electrum for the verification part of (1) might be slightly cleaner (but also means more work because I have BitcoinJ already working on the Namecoin network and with name scripts, whereas I don't know how much effort would be involved in doing it with Electrum). Electrum also doesn't make sense for UTXO commitments or (4), so it's less reusable than BitcoinJ in my opinion.
IMO, as a non-expert: 1 & 4 seem best, but 1 seems significantly easier and faster.

I would not mind to use 4 myself, but I think folks who strongly prefer 4 can run a full node (which I do).

Based on your explanation under 1), it can be made to make use of a local (=trusted) instance of Namecoin, so that would indirectly be similar to 4). One more plus for 1)!
I got (1) working today. I'll work on (4) later. Code will be released once it's cleaned up a bit (won't be for at least 1-2 weeks).
Jeremy Rand, Lead Namecoin Application Engineer
NameID: id/jeremy
DyName: Dynamic DNS update client for .bit domains.

Donations: BTC 1EcUWRa9H6ZuWPkF3BDj6k4k1vCgv41ab8 ; NMC NFqbaS7ReiQ9MBmsowwcDSmp4iDznjmEh5

Post Reply