UNO Commitments

domob
Posts: 1129
Joined: Mon Jun 24, 2013 11:27 am
Contact:

UNO Commitments

Post by domob »

By committing a hash of the UNO set ("unspent name outputs" -- basically the
name database) to all block headers, SPV clients gain a way to verify
that an untrusted node or server answers name queries truthfully.
This is very useful for building "one-click install"-style
resolvers and other tools on top of Namecoin. I've investigated
a bit how such commitments could be done based on a trie
data structure.

What is a trie?

See also Wikipedia. Basically, a trie is a special kind of search tree that
is particularly well suited when the index set is a set of "strings". A simple
example of such a trie, containing the strings "d/a", "d/x" and "d/ab":

Code:

d: null
  /: null
    a: {...}
      b: {...}
    x: {...}
(I hope the meaning is clear; I don't want to attempt a proper ASCII
tree drawing and fail at it.) Here, {...} stands for some data
associated with a name (as per name_show, for instance) and null means that
the particular name has no data.
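
To make this more concrete, here is a minimal Python sketch of such a
character-level trie. This is only an illustration of the idea; it is not the
data structure used in my branch.

Code:

class TrieNode:
    def __init__(self):
        self.children = {}   # one child node per character
        self.data = None     # value of the name ending here, or None ("null")

def trie_insert(root, name, value):
    node = root
    for ch in name:
        node = node.children.setdefault(ch, TrieNode())
    node.data = value

def trie_lookup(root, name):
    node = root
    for ch in name:
        node = node.children.get(ch)
        if node is None:
            return None      # no node for this prefix: the name does not exist
    return node.data

# The example from above:
root = TrieNode()
for n in ("d/a", "d/x", "d/ab"):
    trie_insert(root, n, {"value": "..."})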

As you can see, one can locate, insert and remove data in a trie relatively
efficiently (as with other search trees). Furthermore, each node in the
trie corresponds to an entire "sub-trie" of names that all start with the same
prefix. This is, IMHO, a very useful property (see also below).

How does the commitment work?

We can now define a root hash of such a trie: compute a hash over
a node's data as well as all its children, but first replace each child
sub-trie by its own hash. This construction is similar to a Merkle tree.
The top hash is then committed to the blockchain somehow (e.g., in the
coinbase input), which could be done by a softfork and should be mandatory
(and verified by full nodes).
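
As a sketch of this construction (continuing the Python illustration above,
with SHA-256 over an ad-hoc serialisation -- the serialisation and hash
actually used by Namecore may differ):

Code:

import hashlib

def node_hash(node):
    # Hash this node's data together with (character, child hash) pairs,
    # i.e. each child sub-trie is replaced by its hash, as in a Merkle tree.
    h = hashlib.sha256()
    data = b"" if node.data is None else repr(node.data).encode()
    h.update(len(data).to_bytes(4, "big") + data)
    for ch in sorted(node.children):
        h.update(ch.encode())
        h.update(node_hash(node.children[ch]))
    return h.digest()

root_hash = node_hash(root)   # this digest is what would be committed per block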

A client can now query an untrusted server for a name, and the server
can return the "reduced nodes" (with children replaced by their hashes)
for all trie nodes on the path from the root down to the requested name
(one per character of the name). The client can then use these
to independently compute the root hash and compare it with the blockchain.
Since the returned data for the requested name is part of the hash, the
SPV client knows that the answer is correct if the root hashes match.
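
A sketch of the client-side check, assuming each reduced node is handed to the
client as a pair (data, {character: child hash}) and using the same ad-hoc
hashing as above (the actual wire format is of course still to be defined):

Code:

def reduced_hash(data, child_hashes):
    # Same hashing rule as node_hash(), applied to an already-reduced node.
    h = hashlib.sha256()
    blob = b"" if data is None else repr(data).encode()
    h.update(len(blob).to_bytes(4, "big") + blob)
    for ch in sorted(child_hashes):
        h.update(ch.encode())
        h.update(child_hashes[ch])
    return h.digest()

def verify_proof(name, path, committed_root_hash):
    # path[i] is the reduced node for the prefix name[:i], root (empty prefix)
    # first.  Recompute hashes bottom-up, splicing each result into its parent.
    current = None
    for depth in range(len(path) - 1, -1, -1):
        data, child_hashes = path[depth]
        child_hashes = dict(child_hashes)
        if current is not None:
            child_hashes[name[depth]] = current
        current = reduced_hash(data, child_hashes)
    return current == committed_root_hash

# If verify_proof() succeeds, path[-1][0] is the authenticated data for 'name'.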

It is also possible to prove that a name does not exist: just return
the reduced nodes for all prefixes of the requested name that do
exist. Then either the full string is represented by them but the final
node has no data, or only a prefix can be found, and it has no
child for the next character of the name. In both cases the hash can be
verified, and the client can be sure that it got a truthful negative
answer.
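
The corresponding check for a negative answer could look like this (again just
a sketch building on verify_proof() above; here the path only covers those
prefixes of the name that actually exist):

Code:

def verify_absence(name, path, committed_root_hash):
    # The hashes must chain up to the committed root just as for a positive
    # answer.
    if not verify_proof(name, path, committed_root_hash):
        return False
    last_data, last_children = path[-1]
    if len(path) == len(name) + 1:
        # The full name is represented, but its node carries no data.
        return last_data is None
    # Only a proper prefix exists; it must have no child for the next character.
    return name[len(path) - 1] not in last_children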

Why not use Merkle trees?

Of course, one can also construct a Merkle tree of all names and use that
as the commitment. However, I believe that a trie has the following advantages:
  • It can be kept from one block to the next; only changed names then
    need to be updated in the trie. With a Merkle tree, the whole tree has to
    be rebuilt whenever the set of names changes. (At least I think so.)
  • With the trie, a client can request a full sub-trie corresponding
    to a prefix and verify that as well. This makes it possible to query, for
    instance, for an offline database covering a full namespace. Or the client
    can query for a prefix sub-trie in order to hide, for privacy reasons, the
    actual name it is interested in.

Performance results

I've implemented building such tries in the "uno-trie" branch
of my Namecore fork on GitHub. A quick result for mainnet on my laptop:

Building the trie and serialising it to compute the size takes around
19 seconds. Just building the trie for the hash should take a little less,
although I don't think the difference will be much. Memory usage of the
process increases from 110 MiB to 400 MiB, but the serialised trie
(corresponding to the full name database, including all values,
name addresses and outpoints) is only around 50 MB.

I believe that this is not a prohibitive performance cost for this feature.
Note that miners and full nodes only need to either keep the trie in
memory all the time (then updates should be very fast) or rebuild the
trie for each block (then the memory can be freed immediately again, and
probably one can even build the hash without keeping the full trie in memory
all at once).

It may be beneficial to require that each block contain the hash of the
previous block's UNO set. That way, the root hash needs to be computed
only once per block and need not be updated when more transactions are added
to the block. (It only changes when the previous block being built upon changes.)
BTC: 1domobKsPZ5cWk2kXssD8p8ES1qffGUCm | NMC: NCdomobcmcmVdxC5yxMitojQ4tvAtv99pY
BM-GtQnWM3vcdorfqpKXsmfHQ4rVYPG5pKS
Use your Namecoin identity as OpenID: https://nameid.org/

biolizard89
Posts: 2001
Joined: Tue Jun 05, 2012 6:25 am
os: linux

Re: UNO Commitments

Post by biolizard89 »

Nice work. I don't have anything substantive to say in terms of review, but thanks for working on this.
Jeremy Rand, Lead Namecoin Application Engineer
NameID: id/jeremy
DyName: Dynamic DNS update client for .bit domains.

Donations: BTC 1EcUWRa9H6ZuWPkF3BDj6k4k1vCgv41ab8 ; NMC NFqbaS7ReiQ9MBmsowwcDSmp4iDznjmEh5

domob
Posts: 1129
Joined: Mon Jun 24, 2013 11:27 am
Contact:

Re: UNO Commitments

Post by domob »

Any other opinions? Is it worth spending more time on this, e.g. implementing code to keep the UNO tries in memory and update them as blocks come in? That would basically be "all" that is required to perform a softfork requiring miners to do a commitment at each block. Is the required 300 MiB of extra memory worth the feature or not?

(Note that non-miners could opt to not check the UNO commitment, which would slightly degrade their security. They would end up doing SPV + checking all signatures, which seems to be still "pretty good" security.)
BTC: 1domobKsPZ5cWk2kXssD8p8ES1qffGUCm | NMC: NCdomobcmcmVdxC5yxMitojQ4tvAtv99pY
BM-GtQnWM3vcdorfqpKXsmfHQ4rVYPG5pKS
Use your Namecoin identity as OpenID: https://nameid.org/

biolizard89
Posts: 2001
Joined: Tue Jun 05, 2012 6:25 am
os: linux

Re: UNO Commitments

Post by biolizard89 »

domob wrote:Any other opinions? Is it worth spending more time on this, e.g. implementing code to keep the UNO tries in memory and update them as blocks come in? That would basically be "all" that is required to perform a softfork requiring miners to do a commitment at each block. Is the required 300 MiB of extra memory worth the feature or not?

(Note that non-miners could opt to not check the UNO commitment, which would slightly degrade their security. They would end up doing SPV + checking all signatures, which seems to be still "pretty good" security.)
I'm curious, what is the memory footprint of the proposed UTXO commitment feature in Bitcoin? Is it comparable to the 300MiB that you're proposing? 300MiB isn't a huge amount (I think Zerocash clients need circa 700MiB for generating transactions), but it would be good to have some more data points for context.

EDIT:
domob wrote:I believe that this is not a prohibitive performance cost for this feature.
Note that miners and full nodes only need to either keep the trie in
memory all the time (then updates should be very fast) or rebuild the
trie for each block (then the memory can be freed immediately again, and
probably one can even build the hash without keeping the full trie in memory
all at once).
How feasible would it be to implement these 3 modes for comparison of speed, and CPU/memory usage?
Jeremy Rand, Lead Namecoin Application Engineer
NameID: id/jeremy
DyName: Dynamic DNS update client for .bit domains.

Donations: BTC 1EcUWRa9H6ZuWPkF3BDj6k4k1vCgv41ab8 ; NMC NFqbaS7ReiQ9MBmsowwcDSmp4iDznjmEh5

domob
Posts: 1129
Joined: Mon Jun 24, 2013 11:27 am
Contact:

Re: UNO Commitments

Post by domob »

biolizard89 wrote:How feasible would it be to implement these 3 modes for comparison of speed, and CPU/memory usage?
Which three modes do you mean? I only see two at the moment, namely keeping the trie in memory all the time or rebuilding it from scratch for each block. I'll work on the update-in-memory mode when I have time. (Of course, just not validating the hash is a possible third mode. That would be as it is now, but is not possible for miners.)

The "ultimate" solution could be to store the trie in an update-able format on disk instead of memory. This, however, would probably require us to basically rewrite what things like LevelDB are doing - i. e., implementing an efficient way to store a dynamic data structure in an efficient way in a file on disk. Not sure how worthwhile that would be - I agree that 300 MiB isn't a too large amount. (That's what the client used before my auxpow-in-CBlockIndex patch, I think.)
BTC: 1domobKsPZ5cWk2kXssD8p8ES1qffGUCm | NMC: NCdomobcmcmVdxC5yxMitojQ4tvAtv99pY
BM-GtQnWM3vcdorfqpKXsmfHQ4rVYPG5pKS
Use your Namecoin identity as OpenID: https://nameid.org/

biolizard89
Posts: 2001
Joined: Tue Jun 05, 2012 6:25 am
os: linux

Re: UNO Commitments

Post by biolizard89 »

domob wrote:
biolizard89 wrote:How feasible would it be to implement these 3 modes for comparison of speed, and CPU/memory usage?
Which three modes do you mean? I only see two at the moment, namely keeping the trie in memory all the time or rebuilding it from scratch for each block. I'll work on the update-in-memory mode when I have time. (Of course, just not validating the hash is a possible third mode. That would be as it is now, but is not possible for miners.)

The "ultimate" solution could be to store the trie in an update-able format on disk instead of memory. This, however, would probably require us to basically rewrite what things like LevelDB are doing - i. e., implementing an efficient way to store a dynamic data structure in an efficient way in a file on disk. Not sure how worthwhile that would be - I agree that 300 MiB isn't a too large amount. (That's what the client used before my auxpow-in-CBlockIndex patch, I think.)
The 3 modes I meant were:
  1. Keep in memory all the time
  2. Rebuild for each block, but store entire thing in memory during rebuild
  3. Rebuild for each block without storing the entire thing in memory all at once
300 MiB is actually a lot for some systems -- e.g. a Raspberry Pi probably won't handle it well.
Jeremy Rand, Lead Namecoin Application Engineer
NameID: id/jeremy
DyName: Dynamic DNS update client for .bit domains.

Donations: BTC 1EcUWRa9H6ZuWPkF3BDj6k4k1vCgv41ab8 ; NMC NFqbaS7ReiQ9MBmsowwcDSmp4iDznjmEh5

domob
Posts: 1129
Joined: Mon Jun 24, 2013 11:27 am
Contact:

Re: UNO Commitments

Post by domob »

biolizard89 wrote:The 3 modes I meant were:
  1. Keep in memory all the time
  2. Rebuild for each block, but store entire thing in memory during rebuild
  3. Rebuild for each block without storing the entire thing in memory all at once
Ok, I see. The second one is what I've currently implemented (although building happens on-demand with an RPC call, not for each block - but that's easy to do). I'll work on the first mode next.

For the last one: I think it needs a way to traverse all names in sorted order - lexicographically. Currently, LevelDB sorts names such that short names always come before long names (because the length is prefixed to the key). I think we can change this by fiddling with the code that translates a name into a LevelDB key. Are there any objections to doing that? (We already talked about this in relation to libcoin, which also sorts lexicographically and not in the same way as namecoind. There were no objections, just confirming this now.) This would affect the order in which names are returned by name_scan and name_filter, but otherwise I don't see any technical issues. Once this is done, I think I know how to rebuild the root hash without keeping the whole trie in memory at once.
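
To illustrate the difference in ordering (the key formats here are made up and
not the actual Namecore key layout):

Code:

# Length-prefixed keys: shorter names always sort before longer ones.
def key_length_prefixed(name: bytes) -> bytes:
    return len(name).to_bytes(4, "big") + name

# Plain keys: iteration order becomes lexicographic, which is what a
# prefix (sub-trie) traversal needs.
def key_lexicographic(name: bytes) -> bytes:
    return name

names = [b"d/ab", b"d/a", b"d/x"]
print(sorted(names, key=key_length_prefixed))   # [b'd/a', b'd/x', b'd/ab']
print(sorted(names, key=key_lexicographic))     # [b'd/a', b'd/ab', b'd/x']
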
BTC: 1domobKsPZ5cWk2kXssD8p8ES1qffGUCm | NMC: NCdomobcmcmVdxC5yxMitojQ4tvAtv99pY
BM-GtQnWM3vcdorfqpKXsmfHQ4rVYPG5pKS
Use your Namecoin identity as OpenID: https://nameid.org/

biolizard89
Posts: 2001
Joined: Tue Jun 05, 2012 6:25 am
os: linux

Re: UNO Commitments

Post by biolizard89 »

domob wrote:
biolizard89 wrote:The 3 modes I meant were:
  1. Keep in memory all the time
  2. Rebuild for each block, but store entire thing in memory during rebuild
  3. Rebuild for each block without storing the entire thing in memory all at once
Ok, I see. The second one is what I've currently implemented (although building happens on-demand with an RPC call, not for each block - but that's easy to do). I'll work on the first mode next.

For the last one: I think it needs a way to traverse all names in sorted order - lexicographically. Currently, LevelDB sorts names such that short names always come before long names (because the length is prefixed to the key). I think we can change this by fiddling with the code that translates a name into a LevelDB key. Are there any objections to doing that? (We already talked about this in relation to libcoin, which also sorts lexicographically and not in the same way as namecoind. There were no objections, just confirming this now.) This would affect the order in which names are returned by name_scan and name_filter, but otherwise I don't see any technical issues. Once this is done, I think I know how to rebuild the root hash without keeping the whole trie in memory at once.
Sounds good to me. I have no objections to sorting lexicographically.
Jeremy Rand, Lead Namecoin Application Engineer
NameID: id/jeremy
DyName: Dynamic DNS update client for .bit domains.

Donations: BTC 1EcUWRa9H6ZuWPkF3BDj6k4k1vCgv41ab8 ; NMC NFqbaS7ReiQ9MBmsowwcDSmp4iDznjmEh5

domob
Posts: 1129
Joined: Mon Jun 24, 2013 11:27 am
Contact:

Re: UNO Commitments

Post by domob »

Some more thoughts: Someone on Bitcointalk pointed out that Bitcoin's UTXO commitment proposal used radix tries instead of plain tries. (Although that proposal seems to have stalled - I'm not sure about the details.) This seems like a good idea for us as well. It is a bit more complicated to implement, but I'll give it a shot - it will probably save a good chunk of space (we'll see).
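
Roughly, the difference is that chains of single-child nodes are collapsed
into one edge labelled with the whole substring. A sketch of this path
compression on top of the TrieNode illustration from the first post:

Code:

def compress(node, label=""):
    # Merge runs of data-less single-child nodes into one labelled edge and
    # return a (label, data, children) view of the compressed (radix) trie.
    while node.data is None and len(node.children) == 1:
        ch, child = next(iter(node.children.items()))
        label += ch
        node = child
    children = [compress(c, ch) for ch, c in sorted(node.children.items())]
    return (label, node.data, children)

# For the example trie, the root collapses to a single edge "d/" with the
# two children "a" (which in turn has child "b") and "x".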

Furthermore, I've been thinking about ways to remove the memory burden. I think that it will be possible to store the trie for efficient updating in LevelDB (when done the right way). This should give us the benefit of quick updates without the need to store everything in memory - if it works out the way I think it will, this would be the "perfect" solution.

Thinking a step further: In order to make use of this in a light client, the client needs to know the block headers. Is there any code out there that implements a client with P2P networking and PoW checking to track the Namecoin block headers? Presumably there already exists Python code that does this for Bitcoin - it should be easy to adapt. Anyone interested in working on it? Even without the UNO commitments, this could already be used to track Merkle branches of name transactions to a block header. Not fully secure, but much better than an unsecured API server, and also better than the "PoW secured queries" idea.
BTC: 1domobKsPZ5cWk2kXssD8p8ES1qffGUCm | NMC: NCdomobcmcmVdxC5yxMitojQ4tvAtv99pY
BM-GtQnWM3vcdorfqpKXsmfHQ4rVYPG5pKS
Use your Namecoin identity as OpenID: https://nameid.org/

biolizard89
Posts: 2001
Joined: Tue Jun 05, 2012 6:25 am
os: linux

Re: UNO Commitments

Post by biolizard89 »

domob wrote:Some more thoughts: Someone on Bitcointalk pointed out that Bitcoin's UTXO commitment proposal used radix tries instead of plain tries. (Although that proposal seems to have stalled - I'm not sure about the details.) This seems like a good idea for us as well. It is a bit more complicated to implement, but I'll give it a shot - it will probably save a good chunk of space (we'll see).

Furthermore, I've been thinking about ways to remove the memory burden. I think that it will be possible to store the trie for efficient updating in LevelDB (when done the right way). This should give us the benefit of quick updates without the need to store everything in memory - if it works out the way I think it will, this would be the "perfect" solution.
Sounds good, thanks for your work on this.
domob wrote:Thinking a step further: In order to make use of this in a light client, the client needs to know the block headers. Is there any code out there that implements a client with P2P networking and PoW checking to track the Namecoin block headers? Presumably there already exists Python code that does this for Bitcoin - it should be easy to adapt. Anyone interested in working on it? Even without the UNO commitments, this could already be used to track Merkle branches of name transactions to a block header. Not fully secure, but much better than an unsecured API server, and also better than the "PoW secured queries" idea.
You might try looking at NamecoinJ by HashEngineering. It's based on BitcoinJ, and is the basis of the Namecoin Android Wallet (which uses SPV). Joseph was looking at porting Multibit to Namecoin using NamecoinJ as well.
Jeremy Rand, Lead Namecoin Application Engineer
NameID: id/jeremy
DyName: Dynamic DNS update client for .bit domains.

Donations: BTC 1EcUWRa9H6ZuWPkF3BDj6k4k1vCgv41ab8 ; NMC NFqbaS7ReiQ9MBmsowwcDSmp4iDznjmEh5
