"import" specification

phelix
Posts: 1634
Joined: Thu Aug 18, 2011 6:59 am

Re: "import" specification

Post by phelix »

domob wrote:Sounds good, and along the lines of my own thoughts (although you have taken more time to work it out nicely :)). I'm for 2 or 4, and think that should be fine. We can even define a kind of "escape syntax" that is used to translate keys after processing the generic fields. E. g., "@import" (or whatever) does importing, while "@@import" gets translated to the "@import" key in the dict. This way, we get full flexibility also for use-cases that may need those keys without too much complication.
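As a rough illustration of the escape syntax described in the quote above: this is a sketch only. The "@" prefix is just one of the candidates listed below, the helper name split_directives is made up, and only the top level of a value is handled.

```python
IMPORT_KEY = "@import"  # assumed directive name; the thread has not settled on a prefix


def split_directives(value):
    """Separate the import directive from ordinary keys and unescape "@@" keys."""
    directive = None
    data = {}
    for key, item in value.items():
        if key == IMPORT_KEY:
            directive = item          # consumed by import processing
        elif key.startswith("@@"):
            data[key[1:]] = item      # "@@import" becomes the literal key "@import"
        else:
            data[key] = item
    return directive, data


directive, data = split_directives(
    {"@import": "d/example", "@@import": "literal", "ip": "1.2.3.4"}
)
```

Here the directive is taken out of the dict and the escaped key survives as plain data, so an application that really needs a key called "@import" can still have one.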
"Sounds good, and along the lines of my own thoughts (although you have taken more time to work it out nicely :))" :mrgreen:

Probably we should accept plain "import" until at least blockheight 300k.


@import
§import
%import
&import
=import
+import (I like this one)
~import
*import
#import
_import (mmmh as in Python)

nx.bit - some namecoin stats
nf.bit - shortcut to this forum

hla
Posts: 46
Joined: Mon Nov 10, 2014 12:01 am
os: linux

Re: "import" specification

Post by hla »

biolizard89 wrote:So, I see 4 ways to approach the "import subdomain" issue.
  1. Have entirely separate import logic per namespace, e.g. implying the traversal of the "map" field. This appears to be what Hugo is suggesting.
  2. Have a namespace-specific set of contexts where "import" can appear.
  3. Have a namespace-specific name for the "import" field. For d/ this could be something that isn't a valid domain label.
  4. Make "import" global regardless of namespace, with a name that doesn't collide with anything.
These are in decreasing complexity of implementation.

Option 2 seems to be uniformly simpler than Option 1, and I don't see any problem with it.

Option 3 will impose the restriction that no field in a namespace spec can accept arbitrary keys for dicts. In my opinion it doesn't make sense for a namespace to accept arbitrary data in any form, since base64 is better suited for that.

Option 4 will impose a restriction on the character set of all namespaces. To be honest, I don't see a problem here either. There are a few specific applications that Namecoin targets, and very few of them, if any, need the full character set. For example, the following rules could be used:
  • Don't reserve anything that has a meaning in a DNS name or a URL.
  • Don't reserve anything that has a meaning in common identity formats, e.g. real names, email addresses, etc.
  • Don't reserve anything that has a meaning in base64 encoding.
A more thorough list could (and should) be put together. From these rules, find a specific prefix to reserve for JSON operations, such as importing, decrypting, expiring, etc. Any namespace that actually needs that prefix is clearly a weird use case, and can use base64 to encode it.

Thoughts on this? Phelix, Hugo, Daniel?
Firstly, the case for generic import is substantially diminished by the lack of namespaces. Namecoin currently has two official namespaces. We're not facing an excess of namespaces such that a generic import functionality is urgently necessary.

Secondly, this discussion regards the specification, not any implementation of it. So if an implementor finds that they can reuse code for import processing for several domains, nothing stops them. But each namespace specification should have the discretion of expressing its own import processing rules and semantics. To do otherwise necessarily imposes semantics on future namespaces, which may negatively impact the adaptability of Namecoin to new problem spaces.

Moreover, I can't see any situation in which import needs to be processed by an entity which doesn't understand the semantics of the namespace. All import processing will already be done by a piece of software that comprehends the namespace. So generic import isn't necessary from an ecosystem perspective, and may introduce security considerations. If you add generic import to Namecoin value semantics, Namecoin values aren't really JSON anymore; they're a sort of super-JSON, but people will expect it to function like JSON, which may lead to injection attacks.

The id/ namespace specifies the following syntax:

Code: Select all

  "email":
    {
      "default": "khal@dot-bit.org",
      "business": "khal@example.com"
    }
There may be all sorts of applications which generate values programmatically, either in a user-initiated or automated fashion. What if the user specifies a key of 'import'? Yes, you can introduce an escaping rule, but this emphasises the fact that you're not dealing with JSON anymore, but a sort of modified JSON which (dangerously) can be formed by a JSON serializer, but not necessarily with the same semantics. This essentially introduces the potential for injection vulnerabilities. Given the ridiculous prevalence of injection vulnerabilities even today, this seems a foolish move.
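To make the concern concrete, here is a minimal sketch of how a perfectly ordinary JSON serializer can produce something a generic processor would treat as a directive. The helper names and the "d/attacker" target are invented for illustration, and the key-scanning processor is an assumption about how a naive generic import might look, not a description of any existing implementation.

```python
import json


def build_email_field(addresses):
    """Build the id/ "email" map from user-supplied labels (hypothetical helper)."""
    return {"email": dict(addresses)}


# A user labels one address "import"; the serializer emits it without complaint.
user_input = [("default", "khal@dot-bit.org"), ("import", "d/attacker")]
value = json.loads(json.dumps(build_email_field(user_input)))


def find_import_keys(node, path=()):
    """Return the paths at which a naive generic processor would see "import"."""
    hits = []
    if isinstance(node, dict):
        for key, child in node.items():
            if key == "import":
                hits.append(path + (key,))
            hits.extend(find_import_keys(child, path + (key,)))
    return hits


hits = find_import_keys(value)
```

The value round-trips through JSON unchanged, yet the user's label now sits at a path where a processor that honours "import" everywhere would act on it.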

Relying on the premise that people won't specify "unusual" key names like "@$()!>,?#import^*" isn't sufficient, because the assumption that key names are always structural (something like an XML tag name) rather than constituting content is false (which doesn't necessarily mean the content isn't textual, which makes the use of base64 silly). If there exists a JSON format specification x today, or tomorrow, and it models something which would be useful in Namecoin (identity is such a complex issue it's easy to imagine formats which would be useful with id/), it's potentially unusable, because it's specified in JSON and not the subset of it to which Namecoin assigns neutral semantics. So you then end up having to modify that specification in a Namecoin-specific way, such as the proposal to base64-encode keys... which is terrible, since it makes the format machine unreadable, requires changes to the format, and introduces an element of context (did this structure come from Namecoin so I need to de-frobnicate the keys, or should it be taken as is?). This sort of context is undesirable and increases complexity. If the "came from Namecoin" bit gets set wrong (due to a bug or whatever) this could cause misinterpretation of the data, which could have security implications. It is also very likely that some people setting up keys will fail to do the frobnication, or do it wrong, which will encourage implementations (however inadvisably) to try to 'guess' whether they need to defrobnicate the keys or not, which is even more dangerous.

Moreover, completely independent of the method of activation for generic import, its operation obliges any format or subformat used with any namespace to be semantically compatible with the merge rules it defines. It is logically impossible for such merge rules to comply with all possible JSON formats (unless you make import so complicated it starts looking like a generic transformation language ala XSLT, which is a road we surely don't want to go down.)

Case in point. Here is format A:

Code: Select all

{
  "administrative-contact": {
    "name": "John Smith",
    "email": "jsmith@example.com",
    "tel": "+1.222.3333333"
  }
}
This is imported by another value:

Code: Select all

{
  "administrative-contact": {
    "name": "Jane Doe",
    "email": "jdoe@example.com",
    "fax": "+1.333.4444444"
  },
  "import": "..."
}
So what is the correct result? If we merge maps in a generic occluding manner, we might get this:

Code: Select all

{
  "administrative-contact": {
    "name": "Jane Doe",
    "email": "jdoe@example.com",
    "tel": "+1.222.3333333",
    "fax": "+1.333.4444444"
  }
}
Oops - if I try and call Jane Doe, I'm going to end up calling John Smith instead.
Occlusion via explicitly specifying 'null' is one possible means of resolution, but this is impractical for formats which list a large number of possible item types.
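A minimal sketch of the generic occluding merge discussed above, with null-based occlusion included. The function name merge_occluding and its exact rules are assumptions made for illustration, since no merge rule has been specified in this thread; the "import" key itself is left out of the importing value for brevity.

```python
def merge_occluding(imported, importer):
    """Recursively merge two dicts; keys in `importer` win, None deletes a key."""
    result = dict(imported)
    for key, item in importer.items():
        if item is None:
            result.pop(key, None)  # explicit null occludes the imported key
        elif isinstance(item, dict) and isinstance(result.get(key), dict):
            result[key] = merge_occluding(result[key], item)
        else:
            result[key] = item
    return result


format_a = {
    "administrative-contact": {
        "name": "John Smith",
        "email": "jsmith@example.com",
        "tel": "+1.222.3333333",
    }
}
importer = {
    "administrative-contact": {
        "name": "Jane Doe",
        "email": "jdoe@example.com",
        "fax": "+1.333.4444444",
    }
}

# Plain merge: John Smith's "tel" leaks into Jane Doe's record.
merged = merge_occluding(format_a, importer)

# Occluding with an explicit null removes it, but only if the author remembers to.
importer_with_null = {
    "administrative-contact": dict(importer["administrative-contact"], tel=None)
}
fixed = merge_occluding(format_a, importer_with_null)
```

The second call shows why null occlusion scales badly: the importing value has to null out every item type the imported value might carry.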

If you make maps get wholly replaced with one another, that fixes the above but breaks many useful d/ cases.

Ultimately generic import is complicated. It might be worth considering, despite its complexity and issues, if we were facing dozens of namespaces awaiting realization. But we have only two. The complexity of import exceeds the gains you make by it. And again, this purely concerns the standard; people are free to factor their implementation as they see fit, so long as they conform with the standard.

phelix
Posts: 1634
Joined: Thu Aug 18, 2011 6:59 am

Re: "import" specification

Post by phelix »

hla wrote: Firstly, the case for generic import is substantially diminished by the lack of namespaces. Namecoin currently has two official namespaces. We're not facing an excess of namespaces such that a generic import functionality is urgently necessary.
There is also u/ (currently using "next" instead of "import") and there are two more that I would like to work on if I had the time/resources.
But each namespace specification should have the discretion of expressing its own import processing rules and semantics.
Yes, and we can't stop anybody from doing so.
All import processing will already be done by a piece of software that comprehends the namespace. So generic import isn't necessary from an ecosystem perspective, and may introduce security considerations.
I think this premise is wrong. Currently NMControl handles "import" even for namespaces it knows nothing about. The same goes for the experimental API server. It makes requests faster.
If you add generic import to Namecoin value semantics, Namecoin values aren't really JSON anymore; they're a sort of super-JSON, but people will expect it to function like JSON, which may lead to injection attacks.
I don't see a way around this risk. Escaping may not be nice but I think we can handle it. With namespace specific "import" we will even have to deal with it several times.
Moreover, completely independent of the method of activation for generic import, its operation obliges any format or subformat used with any namespace to be semantically compatible with the merge rules it defines. It is logically impossible for such merge rules to comply with all possible JSON formats (unless you make import so complicated it starts looking like a generic transformation language ala XSLT, which is a road we surely don't want to go down.)
If I understand you correctly you say that imports can lead to wrong data. As long as it works deterministically the users should be able to deal with it.


I really don't want to end up having to maintain several slightly different import variants in NMControl, the API server, nameGUI, a potentially upcoming SPV client, etc...

domob
Posts: 1129
Joined: Mon Jun 24, 2013 11:27 am

Re: "import" specification

Post by domob »

phelix wrote:I really don't want to end up having to maintain several slightly different import variants in NMControl, the API server, nameGUI, a potentially upcoming SPV client, etc...
This, plus the specs and having to know about them as a user.
BTC: 1domobKsPZ5cWk2kXssD8p8ES1qffGUCm | NMC: NCdomobcmcmVdxC5yxMitojQ4tvAtv99pY
BM-GtQnWM3vcdorfqpKXsmfHQ4rVYPG5pKS
Use your Namecoin identity as OpenID: https://nameid.org/

hla
Posts: 46
Joined: Mon Nov 10, 2014 12:01 am
os: linux

Re: "import" specification

Post by hla »

phelix wrote:
hla wrote: Firstly, the case for generic import is substantially diminished by the lack of namespaces. Namecoin currently has two official namespaces. We're not facing an excess of namespaces such that a generic import functionality is urgently necessary.
There is also u/ (currently using "next" instead of "import") and there are two more that I would like to work on if I had the time/resources.
Do please elaborate. What objects do these namespaces model? And I'm under the impression that the reputation of u/ is not good, since it essentially duplicates id/.
But each namespace specification should have the discretion of expressing its own import processing rules and semantics.
Yes, and we can't stop anybody from doing so.
All import processing will already be done by a piece of software that comprehends the namespace. So generic import isn't necessary from an ecosystem perspective, and may introduce security considerations.
I think this premise is wrong. Currently NMControl handles "import" even for namespaces it knows nothing about. The same goes for the experimental API server. It makes requests faster.
This is contradictory. Each namespace specification can express its own rules, but software is entitled to apply generic import processing? Essentially you're saying that namespaces can make up their own rules, but they'll be frequently ignored. This makes namespace-specific rules unusable in practice, as well as a disaster for the security and predictability of the outcomes.

I also don't understand how NMControl is supposed to make any useful use of a namespace which it doesn't understand. The only possible thing it can usefully (or really, rather uselessly) do in that circumstance is pass the buck to some downstream consumer... in which case the consumer can apply the import processing itself, as indeed it is the only entity that can safely do so.
If you add generic import to Namecoin value semantics, Namecoin values aren't really JSON anymore; they're a sort of super-JSON, but people will expect it to function like JSON, which may lead to injection attacks.
I don't see a way around this risk. Escaping may not be nice but I think we can handle it. With namespace specific "import" we will even have to deal with it several times.
Please explain how the current d/ import proposal in ifa-wg/proposals requires escaping?
Moreover, completely independent of the method of activation for generic import, its operation obliges any format or subformat used with any namespace to be semantically compatible with the merge rules it defines. It is logically impossible for such merge rules to comply with all possible JSON formats (unless you make import so complicated it starts looking like a generic transformation language ala XSLT, which is a road we surely don't want to go down.)
If I understand you correctly you say that imports can lead to wrong data. As long as it works deterministically the users should be able to deal with it.
Um, the point is that's impossible unless you change the format to accommodate generic import's rules, which means you're dictating design rules for current and all future namespaces, and any subschema they may incorporate (which may not even be specified by the Namecoin project).
I really don't want to end up having to maintain several slightly different import variants in NMControl, the API server, nameGUI, a potentially upcoming SPV client, etc...
Um... what? This makes no sense. We're discussing the variation of import on a per-namespace basis. There's no reason the rules applied by different pieces of software processing the same namespace would need to be different; indeed, they must not be. Or if you're discussing the general desire to keep the number of combinatorial multipliers down, where 'number of implementations' and 'number of namespaces' are such multipliers, then the number of namespaces will be a far more dominant factor in the work involved per implementation.

Moreover, as I said, if in practice several namespaces end up using similar or identical import rules, by their choice, nothing precludes an implementation relating to all of those namespaces from factoring its own code internally. The only consideration here is what the specification says.

phelix
Posts: 1634
Joined: Thu Aug 18, 2011 6:59 am

Re: "import" specification

Post by phelix »

hla wrote:
phelix wrote:
hla wrote: Firstly, the case for generic import is substantially diminished by the lack of namespaces. Namecoin currently has two official namespaces. We're not facing an excess of namespaces such that a generic import functionality is urgently necessary.
There is also u/ (currently using "next" instead of "import") and there are two more that I would like to work on if I had the time/resources.
Do please elaborate. What objects do these namespaces model?
There were discussions about file signing and something like a web of trust. Also I can imagine links to (encrypted) torrents / server hosted files.
And I'm under the impression that the reputation of u/ is not good, since it essentially duplicates id/.
:roll:
But each namespace specification should have the discretion of expressing its own import processing rules and semantics.
Yes, and we can't stop anybody from doing so.
All import processing will already be done by a piece of software that comprehends the namespace. So generic import isn't necessary from an ecosystem perspective, and may introduce security considerations.
I think this premise is wrong. Currently NMControl handles "import" even for namespaces it knows nothing about. The same goes for the experimental API server. It makes requests faster.
This is contradictory. Each namespace specification can express its own rules, but software is entitled to apply generic import processing? Essentially you're saying that namespaces can make up their own rules, but they'll be frequently ignored. This makes namespace-specific rules unusable in practice, as well as a disaster for the security and predictability of the outcomes.
I also don't understand how NMControl is supposed to make any useful use of a namespace which it doesn't understand. The only possible thing it can usefully (or really, rather uselessly) do in that circumstance is pass the buck to some downstream consumer... in which case the consumer can apply the import processing itself, as indeed it is the only entity that can safely do so.
It is a fact that we can't stop people from creating their own namespaces with their own rules. With our thinly stretched dev capacity it would be nice to only have a single import. Domob made a point in that it is also easier for users. The value size is currently very limited and it is convenient and fast to do imports in NMControl / a remote server / an SPV client.
If you add generic import to Namecoin value semantics, Namecoin values aren't really JSON anymore; they're a sort of super-JSON, but people will expect it to function like JSON, which may lead to injection attacks.
I don't see a way around this risk. Escaping may not be nice but I think we can handle it. With namespace specific "import" we will even have to deal with it several times.
Please explain how the current d/ import proposal in ifa-wg/proposals requires escaping?
biolizard89's post above sounded like that. If you don't see a problem with "import" then it's fine with me.
Moreover, completely independent of the method of activation for generic import, its operation obliges any format or subformat used with any namespace to be semantically compatible with the merge rules it defines. It is logically impossible for such merge rules to comply with all possible JSON formats (unless you make import so complicated it starts looking like a generic transformation language ala XSLT, which is a road we surely don't want to go down.)
If I understand you correctly you say that imports can lead to wrong data. As long as it works deterministically the users should be able to deal with it.
Um, the point is that's impossible unless you change the format to accommodate generic import's rules, which means you're dictating design rules for current and all future namespaces, and any subschema they may incorporate (which may not even be specified by the Namecoin project).
Well, nobody forces anyone to use the generic import. As I wrote way above namespaces are free to create whatever rules they like besides being restricted to not use "import" as a keyword (escaping would allow that, too, but would force proper escaping).
I really don't want to end up having to maintain several slightly different import variants in NMControl, the API server, nameGUI, a potentially upcoming SPV client, etc...
Um... what? This makes no sense. We're discussing the variation of import on a per-namespace basis. There's no reason the rules applied by different pieces of software processing the same namespace would need to be different; indeed, they must not be. Or if you're discussing the general desire to keep the number of combinatorial multipliers down, where 'number of implementations' and 'number of namespaces' are such multipliers, then the number of namespaces will be a far more dominant factor in the work involved per implementation.
Why do you think so? Are you aware that implementations could be in different languages?
Moreover, as I said, if in practice several namespaces end up using similar or identical import rules, by their choice, nothing precludes an implementation relating to all of those namespaces from factoring its own code internally. The only consideration here is what the specification says.
Well, the implementation will have to follow the spec....

IMHO the simple generic solution is good enough for 99% of practical use cases and another 0.9% can probably be solved as well.

hla
Posts: 46
Joined: Mon Nov 10, 2014 12:01 am
os: linux

Re: "import" specification

Post by hla »

At this point you're ignoring half of everything I write.
:roll:
This is not a response.
It is a fact that we can't stop people from creating their own namespaces with their own rules.
I have no idea what your position is here. We can't stop people from doing their own thing with their own namespaces, but we reserve the right to transmute their values in arbitrary and wrong fashions? That seems like a pretty effective way of stopping people from creating their own namespaces with their own rules. But that, presumably, is not the objective of generic import. Ergo, the impact generic import has on this use case is a problem.
Well, nobody forces anyone to use the generic import. As I wrote way above namespaces are free to create whatever rules they like besides being restricted to not use "import" as a keyword (escaping would allow that, too, but would force proper escaping).
Yes, and that restriction is not okay for the reasons I've highlighted above repeatedly.

phelix
Posts: 1634
Joined: Thu Aug 18, 2011 6:59 am

Re: "import" specification

Post by phelix »

hla wrote:At this point you're ignoring half of everything I write.
I'm sorry you feel this way. I tried to answer everything. For what it's worth I fixed my quotes.
:roll:
This is not a response.
Yes it is. You can look it up. In this particular case I meant: "yes, unfortunately. I am sad that Onename does things like they do"
It is a fact that we can't stop people from creating their own namespaces with their own rules.
I have no idea what your position is here.
No need for a position as we can't change it.
We can't stop people from doing their own thing with their own namespaces, but we reserve the right to transmute their values in arbitrary and wrong fashions? That seems like a pretty effective way of stopping people from creating their own namespaces with their own rules. But that, presumably, is not the objective of generic import. Ergo, the impact generic import has on this use case is a problem.
If it helps: there is always the possibility of working on raw data (without "import" or anything processed). So the user/application can decide whether it wants to make use of it or not.
Well, nobody forces anyone to use the generic import. As I wrote way above namespaces are free to create whatever rules they like besides being restricted to not use "import" as a keyword (escaping would allow that, too, but would force proper escaping).
Yes, and that restriction is not okay for the reasons I've highlighted above repeatedly.
What I wrote above is not correct because of the raw data option.

I think I finally understood you :) Let me rephrase in my own words: We should not force generic processing rules on namespaces because there is a risk of errors and attacks. Particularly with automatically created values there is risk of injection attacks.

Do you think we can really work around this risk with namespace specific imports? Several different import variants could also mean more attack vectors and a higher risk of bugs in one of the implementations.

The specific import biolizard suggested seems to protect against attacks pretty well.

hla
Posts: 46
Joined: Mon Nov 10, 2014 12:01 am
os: linux

Re: "import" specification

Post by hla »

Yes, I think we're on the same page now.

I think anything generating values which understands the semantics of the namespace can do so in a safe fashion, yes. For example, in the case of d/ there are only a finite number of item types, and a GUI, say, is likely to correspond to this. Importantly, 'import' is not recognised inside a map object (although of course it is recognised inside its values).

Escaping would work. But there are several issues:
- Whatever the import operation actually does may not (indeed, cannot) be suitable for all possible schemas (unless you make it Turing-complete).
- People may assume that such an import operation is suitable, which results in unpredictable, wrong or unsafe outcomes.
- There will inevitably be some crummy implementation which doesn't escape properly. This, in turn, will lead to other implementations having to hedge and try to 'guess' whether the value was escaped or not, leading to nondeterministic behaviour. (See how IE's "X-Content-Type-Options: nosniff" is now a recommended security practice.)

Moreover, I realise that we're now conflating two issues:
- The method of activation of generic import.
- The transformation expressed by generic import and its suitability, and whether it should exist at all.

Here I am discussing the ambiguity of the method of activation. One possible compromise here would be to specify a generic transformation, and allow namespace-specific specifications to reference it. This could be as simple as "the import directive as specified in ... is allowed here." This would permit full code reuse, and a small amount of namespace-specific logic to decide when to invoke it (in the case of d/, for example, only in the object itself and objects inside map objects).
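A rough sketch of what that compromise could look like in code: one shared transformation, plus a small d/-specific wrapper that decides where "import" is honoured. Everything here is assumed for illustration: the function names, the merge rule, the recursion limit, and the idea that the import value is a plain name passed to a caller-supplied resolve function.

```python
MAX_DEPTH = 4  # arbitrary recursion limit chosen for this sketch


def generic_merge(imported, importer):
    """The shared transformation: a recursive merge in which `importer` wins."""
    result = dict(imported)
    for key, item in importer.items():
        if isinstance(item, dict) and isinstance(result.get(key), dict):
            result[key] = generic_merge(result[key], item)
        else:
            result[key] = item
    return result


def apply_import_d(value, resolve, depth=0):
    """d/-style activation: honour "import" only in the object itself and in
    the objects that are values of a "map"; never as a key of the map itself."""
    if depth > MAX_DEPTH:
        raise ValueError("import chain too deep")
    result = {}
    for key, item in value.items():
        if key == "import":
            continue  # handled after the loop
        if key == "map" and isinstance(item, dict):
            # Keys of "map" are labels, so a label named "import" is plain data.
            result[key] = {
                label: apply_import_d(sub, resolve, depth) if isinstance(sub, dict) else sub
                for label, sub in item.items()
            }
        else:
            result[key] = item  # any other field is left untouched
    if "import" in value:
        imported = apply_import_d(resolve(value["import"]), resolve, depth + 1)
        result = generic_merge(imported, result)
    return result


names = {"d/base": {"ip": "1.2.3.4", "map": {"www": {"ip": "5.6.7.8"}}}}
value = {
    "import": "d/base",
    "map": {"import": {"ip": "9.9.9.9"}, "ftp": {"import": "d/base"}},
    "info": {"import": "untouched"},
}
result = apply_import_d(value, names.__getitem__)
```

In this example the subdomain labelled "import" and the "import" key inside "info" are both treated as ordinary data, while the top-level directive and the one inside the "ftp" map entry are processed. Another namespace could reuse generic_merge with a different wrapper, or not reference it at all.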

Of course I have issues with the generality of the transformation as well; even if the method of activation issue is resolved, there are going to exist JSON formats that aren't suitable for use with generic import (this is completely separate from the escaping issue). Some of these formats may exist today, as part of the microformat ecosystem. Since modelling identity is complex, we shouldn't deny id/ the ability to take advantage of such formats. Forcing a generic import mechanism on namespaces could actually risk more duplication, since you could end up with a generic import which a namespace has to attach a big "don't use this" sign to, and then come up with its own variant.

On a procedural note, this is the discussion on d/. Obviously a cross-namespace importation mechanism can't be put in the d/ spec. So to do generic import we'd have to make a new spec, be happy with it, and reference it from d/.
