Well, yes. That’s more or less why I expect it to never, ever happen. I did say I’m a crank with no serious hopes. ;-)
It’s a pity that whatever energy exists on LW for discussing technological changes to the diaspora is exhausted on a non-serious proposal.
When you argue for something you don’t expect to be accepted, you lose any reason to make reasonable compromises, lowering the chances of finding a mutually beneficial solution.
While I don’t object in theory to a new protocol, JSON over HTTP specifically is a paradigm I would like to destroy.
I may share your feelings. But if you want an API to be accessible to Web clients, it pretty much has to be JSON over HTTP. Any other format you support will have to be in addition, not instead of that.
JSON isn’t actually bad as a slightly-structured, self-describing, human-readable format. Maybe you prefer YAML or something, but I don’t feel there’s a lot of difference to be had. Certainly it’s far better than non-self-describing, non-textual formats unless you really need to optimize for parsing performance or for size on the wire. And I’d argue you don’t, for this use case.
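To make that concrete, here’s what a forum message might look like as JSON. This is a sketch; the field names are hypothetical illustrations, not a proposal:

```python
import json

# A hypothetical forum message; none of these field names come from any
# actual spec. The point is what "slightly-structured, self-describing,
# human-readable" buys you.
message = {
    "id": "a1b2c3",
    "parent": None,  # top-level post: no parent
    "author": "alice@example.org",
    "posted": "2016-09-01T12:00:00Z",
    "tags": ["meta", "discussion"],
    "body": "Lorem ipsum...",
}
print(json.dumps(message, indent=2))
```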
HTTP is horrible (and I say this as someone who wrote a lot of low-level HTTP middleware, even a parser once). Using maybe 50% of the semantics, and pretty much 0% of the syntax, and adding the features of HTTP/2 and some others they couldn’t fit into the spec, would be wonderful. But we don’t really have that option; we’re stuck with it as something we can’t change or avoid using. And I too hate having to do that.
But you know what? The same is true of TCP, and of IPv4, and of the BSD socket API, and a thousand other fundamental API designs that have won in the marketplace. At some point we have to acknowledge reality to write useful software. A forum/discussion protocol doesn’t conflict with JSON over HTTP (much). We need to focus on designing a good API, whatever it runs on.
If it helps, you can put the HTTP/JSON encoding in a separate specification, and be the happy user of a different-but-compatible encoding over a gateway.
I think this is our core disagreement. I find web forum software worse even after penalizing NNTP for everything you mention.
You don’t address the point I feel is most important: the NNTP model (distributed immutable messages, users not tied to servers, mod powers and karma not in the spec, …) just isn’t the one we use and want to keep using on discussion forums.
It turns out links to netnews posts also exist.
But they don’t work with supersedes, because they link to immutable message IDs. So the server has to dereference the link, has to have kept all the old (superseded) versions, and has to prove the validity of the supersedes chain to the client in the case of signed messages. This is just unnecessarily ugly.
Besides, they are URIs, not URLs. That’s not something the web can handle too well. You can include a server in the link, making a URL, but NNTP doesn’t have a concept of an authoritative host (origin), so once again, why use NNTP if you’re not going to move messages between servers, which is the whole point of the protocol? If you just want to store them at a single place, it would make as much sense to use shared IMAP. (Which is to say, not much.)
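Here’s a minimal sketch of the dereferencing problem just described: to serve a link to an old message ID, the server has to keep every superseded version around and chase the chain. The store layout and function names are hypothetical:

```python
# Hypothetical store mapping message_id -> (article, superseded_by).
# Sketch only: shows why a server must retain every old version just
# to resolve a link to an immutable message ID.
def resolve(store, message_id):
    """Follow the supersedes chain from message_id to the latest version."""
    seen = set()
    while message_id in store:
        article, superseded_by = store[message_id]
        if superseded_by is None:
            return article            # current version
        if superseded_by in seen:     # defend against a supersedes cycle
            raise ValueError("supersedes cycle at %s" % message_id)
        seen.add(message_id)
        message_id = superseded_by
    raise KeyError("message (or a superseding version) was expired")
```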
Before we get deep into protocols, is there any kind of a spec sheet anywhere?
Saying you want better software for discussions is… horribly vague. I have a strong feeling that we should figure out things like lists of desirable features, lists of undesirable misfeatures, choices of how one list will be traded off against the other list, etc. before we focus all the energy on stomping JSON into tiny little pieces.
Here’s my shortlist of requirements:
Basic architecture: network of sites sharing an API (not an interface). A site can have a web client as part of the site (or several), but at least some clients can be written independently of a site. Users can choose to use different/customizable clients, and in particular, aggregate and cross-link content and users across sites. It should be possible, at least in theory, to write a non-web cross-site client with lots of custom features and use it as one’s only interface to all discussion forums without any loss of functionality.
We need at least feature parity with LW, which is the most feature-full of diaspora blogs and forums; other sites tend to have subsets of the same features, so they should be able to disable e.g. private messages if they want to. So: top-level posts with trees of comments, both of which can be edited or retracted; posts have special status (tags, categories, permissions required to post, etc.); authenticated users (unless the site allows anonymous or pseudonymous comments), so a user’s comments can be collated; permalinks to posts and comments; RSS feeds of various things; etc.
Users should follow the user@host pattern, so they can be followed across sites. Different authentication methods can be integrated (Local/Google/Facebook/OpenID/...) but the spec doesn’t concern itself with that. User permissions should be stored at each site, and be powerful enough to allow different configurations, mod and admin powers, etc. Posts and messages should allow pubkey signatures, and users should be able to configure a signing key as part of their account, because some people really enjoy that. (See the signing sketch after this list.)
In the LW 2.0 discussions, people proposed different variations on karma. The API should include the concept of a user’s karma(s) on a site, but for voting etc. it should probably limit itself to storing and querying data, and let the implementation decide how to use it. So e.g. the server implementation could disallow posting by a user with insufficient karma, or the client implementation could hide downvoted comments. The API would specify the mechanism, not the policy. (See the karma sketch after this list.)
Finally, there need to be implementations that are pain-free and cost-free for site admins to install. At the very least, it should not involve running completely custom server software, or completely rewriting existing web clients and their UX. Ideally, there would be easy adapters/plugins/… for existing client and/or server software.
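A minimal sketch of the signing-key item above, assuming the third-party cryptography package and Ed25519 keys; none of this comes from any actual spec:

```python
# Sketch of per-account signing keys. Assumes the third-party
# 'cryptography' package (pip install cryptography).
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
)
from cryptography.exceptions import InvalidSignature

signing_key = Ed25519PrivateKey.generate()  # configured once per account
public_key = signing_key.public_key()       # published in the user profile

body = "post body as canonical bytes".encode("utf-8")
signature = signing_key.sign(body)          # attached to the post

try:
    public_key.verify(signature, body)      # any client can check this
    print("signature ok")
except InvalidSignature:
    print("post was altered or mis-attributed")
```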
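And a sketch of the mechanism-vs-policy split for karma. All names here are hypothetical; the point is that the API only stores and queries votes, while policy lives in the implementations:

```python
# Mechanism, not policy: the API stores and queries votes; what the
# numbers *mean* is left to each server or client implementation.
from collections import defaultdict

class VoteStore:
    def __init__(self):
        self._votes = defaultdict(dict)  # target_id -> {user: +1 or -1}

    def cast(self, target_id, user, value):
        self._votes[target_id][user] = value          # storing: mechanism

    def score(self, target_id):
        return sum(self._votes[target_id].values())   # querying: mechanism

# Policy lives elsewhere. A server might refuse posts below a threshold:
def may_post(store, user_post_ids, threshold=-10):
    karma = sum(store.score(p) for p in user_post_ids)
    return karma >= threshold
# ...while a client might merely hide comments whose score is below -3.
```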
I agree with most of this, with the exception that top-level posts should not have any special status at the protocol level other than not having a parent. Clients are free to present them specially, though, including whatever ‘default’ interface each site has. Whatever moderation layer exists may do the same.
I also dislike private messaging systems—not so much because they shouldn’t exist, but because they should be implemented as email accounts that only deliver mail among local users, so you can handle them in your regular email client if you want.
[Edit: Note that tags and a lot of other post metadata could be implemented as extra headers in a news article. Not karma, though.]
Your description of basic architecture in particular is an excellent summary of what I want out of a discussion protocol.
top-level posts should not have any special status at the protocol level other than not having a parent.
Those are implementation details. The point is that top-level or parent-less posts have a special semantic status: they start a new conversation.
I also dislike private messaging systems—not so much because they shouldn’t exist, but because they should be implemented as email accounts that only deliver mail among local users, so you can handle them in your regular email client if you want.
It’s a matter of integration: I want the same settings, and client software, that you use for the rest of the forum to apply to privmsgs. For instance, blocking a user’s messages, sending privmsgs as replies to forum threads (and displaying that correctly in the client), …
And I don’t want to have to use two different client applications at the same time (email & forum) for private vs public messages.
And most people only use webmail, and you can’t tell gmail.com to display messages that live on the lesswrong.com IMAP server, if that’s what you intended.
It’s a matter of integration: I want the same settings, and client software, that you use for the rest of the forum to apply to privmsgs.
I don’t share the preference, but I don’t think this represents a conflict. There’s no reason a web client couldn’t present one UI to its users while doing two different things on the back end, IMAP for PMs and whatever else for the forum. Newsreaders do exactly that to support reply-by-email, and it works fine from what I’ve seen.
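A sketch of that two-backends-one-UI idea, using the stdlib imaplib module for the PM side; the forum-API half is hypothetical:

```python
# One UI, two back ends: PMs come from an IMAP mailbox, public posts
# from the forum API. 'forum_client' is hypothetical; imaplib is stdlib.
import imaplib

def fetch_private_messages(host, user, password):
    with imaplib.IMAP4_SSL(host) as imap:
        imap.login(user, password)
        imap.select("INBOX")
        _typ, data = imap.search(None, "UNSEEN")
        return data[0].split()  # message sequence numbers

# The web client would merge both sources into a single view, e.g.:
# threads = forum_client.threads() + fetch_private_messages(...)
```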
I’m very willing to engage in this. (And I described what I want in some of my other comments). I’ll post my spec sheet (which I think includes most of Error’s) in a separate reply. But first, before we get deep into feature lists and spec sheets:
Suppose we agree on a protocol (or whatever). Suppose it’s so good that we can convince most people it’s technologically and socially superior to existing solutions—not counting the unavoidable costs of using custom software and of changing things, which are significant.
Given all that, how likely are we to 1) write all the code needed, to the quality of a production project (actually, multiple ones), and provide support etc. for the foreseeable future (or convince others to help us do so); 2) convince enough diaspora site admins, and readers/commenters/users if applicable, to switch over?
Obviously this depends on how much our proposal improves (or promises to improve) on what we have now.
See my answer to Error, but for the “how likely” question the only possible answer that I can see is “One step at a time”.
First you need an idea that’s both exciting and gelled enough to have some shape which survives shaking and poking.
If enough people are enthusiastic about the idea, you write a white paper.
If enough people (or the right people) are enthusiastic about the white paper, you write a spec sheet for software.
If enough people continue to be enthusiastic about the idea, the white paper, and the spec sheet, you start coding.
If you get this far, you can start thinking about OSS projects, startups, and all these kinds of things. :-)
P.S. Oh, and you shouldn’t think of this project as “How do we reanimate LW and keep it shambling for a bit longer”. You should think about it as “What kind of a new discussion framework can we bestow on the soon-to-be-grateful world” :-)
I get the impression most projects do that backwards, and that that’s a large part of how we got into this giant mess of incompatible discussion APIs.
Somewhere later in this sequence I’m going to address the social problem of convincing people to buy back in. The very short version is: Make it more powerful than what they’ve got, so they have an incentive to move. Make sure they are still running their own shows, because status and sovereignty matter. And make it more convenient to migrate than to manage what they’ve got, because convenience is everything.
Once you get three or four diasporists back, network effects do the rest. But it needs to be an improvement to the individual migrant even if nobody else moves, otherwise the coordination problem involved is incredibly hard to beat.
You should think about it as “What kind of a new discussion framework can we bestow on the soon-to-be-grateful world” :-)
Sometimes I think the best way to promote my ideas would be to start an NNTP-backed forum hosting service. I know it’s within my capabilities.
Then I realize that 1. that would be a lot of work, and I have a day job, 2. nobody cares except me, and 3. I would be competing with Reddit.
I had a list of...not features, exactly, but desirable elements, in the first post. I intended to update it from comments but didn’t.
I want higher and deeper X-)
Higher in the sense of specifying desirables from some set of more-or-less terminal goals. For example, you say “centralized from the user perspective”—and why do we want this? What is the end result you’re trying to achieve?
Deeper in the sense of talking about base concepts. Will there be “posts” and “comments” as very different things? If so, will the trees be shallow (lots of posts, mostly with few comments, no necroing) or deep (few posts, mostly with lots of comments, necroing is encouraged)?
Will there be a “forum”? “Subforums”, maybe? Or will there be a pile of tagged pieces of text from which everyone assembles their own set to read? Will a concept like “follow an author” exist? How centralised or decentralised will things be? Who will exercise control, and what kind of powers will they have?
That’s not a complete set of questions at all, just a pointer at the level which will have to be decided on and set in stone before you start discussing protocols.
When you argue for something you don’t expect to be accepted, you lose any reason to make reasonable compromises, lowering the chances of finding a mutually beneficial solution.
If it helps, any compromises I make or don’t make are irrelevant to anything that will actually happen. I don’t think anyone in a position to define LW2.0 is even participating in the threads, though I do hope they’re reading them.
I figure the best I can hope for is to be understood. I appreciate your arguments against more than you may realize—because I can tell you’re arguing from the position of someone who does understand, even if you don’t agree.
Maybe you prefer YAML or something
YAML’s the least-bad structured format I’m aware of, though that may say more about what formats I’m aware of than anything else. It’s certainly easier to read and write than JSON; you could conceivably talk YAML over a telnet session without it being a major hassle.
I agree that non-textual formats are bad for most cases, including this one.
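For comparison, the same hypothetical message from the JSON example earlier, printed as YAML. This assumes the third-party PyYAML package:

```python
# Same hypothetical message as the JSON example above, printed as YAML.
# Assumes the third-party PyYAML package (pip install pyyaml).
import yaml

message = {
    "id": "a1b2c3",
    "parent": None,
    "author": "alice@example.org",
    "body": "Lorem ipsum...",
}
print(yaml.safe_dump(message, sort_keys=False))
# id: a1b2c3
# parent: null
# author: alice@example.org
# body: Lorem ipsum...
```

No quotes or braces to balance, which is why typing it by hand over telnet is plausible in a way JSON isn’t.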
If it helps, you can put the HTTP/JSON encoding in a separate specification, and be the happy user of a different-but-compatible encoding over a gateway.
I wouldn’t object to that, as long as 1. the specs evolved in tandem, and 2. the gateway was from http/json to (NNTP2?), rather than the other way around.
That’s intended to avoid the temptation for devs to respond to demands for ponies by kludging them into the http/json spec without considering whether they can be meaningfully translated through a gateway without lossage.
But they don’t work with supersedes, because they link to immutable message IDs.
This...might trip me up, actually. I was under the impression that requests for a previous message ID would return the superseding message instead. I appear to have gotten that from here, but I can’t find the corresponding reference in the RFCs. It’s certainly the way it should work, but, well, should.
I need to spin up INN and test it.
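Something like the following is the test I have in mind, using the legacy stdlib nntplib module (deprecated in Python 3.11 and removed in 3.13). Untested sketch; it assumes a local INN that accepts unauthenticated posts:

```python
# Rough test plan against a local INN. Untested sketch.
import io
import nntplib

ORIGINAL = b"""\
From: error@example.org
Newsgroups: local.test
Subject: supersede test
Message-ID: <v1@example.org>

first version
"""

REPLACEMENT = b"""\
From: error@example.org
Newsgroups: local.test
Subject: supersede test (edited)
Message-ID: <v2@example.org>
Supersedes: <v1@example.org>

second version
"""

with nntplib.NNTP("localhost") as nntp:
    nntp.post(io.BytesIO(ORIGINAL))
    nntp.post(io.BytesIO(REPLACEMENT))
    # The open question: does asking for <v1> return v2, an error, or v1?
    print(nntp.article("<v1@example.org>"))
```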
You don’t address the point I feel is most important: the NNTP model (distributed immutable messages, users not tied to servers, mod powers and karma not in the spec, …) just isn’t the one we use and want to keep using on discussion forums.
We either disagree on the desirable model or else on what the model actually is. I’m ambivalent about distributed architecture as long as interoperability is maintained. Mod powers not in the spec seems like a plus to me, not a minus. Today, as I understand it, posts to moderated groups get sent to an email address, which may have whatever moderation software you like behind it. Which is fine by me. Users not being tied to a particular server seems like a plus to me too. [edit: but I may misunderstand what you mean by that]
Karma’s a legitimately hard problem. I don’t feel like I need it, but I’m not terribly confident in that. To me its main benefit is to make it easier to sort through overly large threads for the stuff that’s worth reading; having a functioning ‘next unread post’ key serves me just as well or better. To others...well, others may get other things out of it, which is why I’m not confident it’s not needed.
I’ll have to get back to you on immutability after experimenting with INN’s response to supersedes.
If it helps, any compromises I make or don’t make are irrelevant to anything that will actually happen.
That depends on how much you’re willing to compromise before you see it as wasted effort to participate. Somewhere in the space of ideas there might be a proposal that everyone would accept as an improvement on the status quo.
I don’t think anyone in a position to define LW2.0 is even participating in the threads, though I do hope they’re reading them.
Someone is upvoting your posts besides me. This one is at +19.
I wouldn’t object to that, as long as 1. the specs evolved in tandem, and 2. the gateway was from http/json to (NNTP2?), rather than the other way around.
I meant we could have one spec chapter describing types, messages, requests and responses, and then one or more ‘encoding’ chapters describing how these messages are represented in JSON over HTTP, or in… something else. So all encodings would be equal; there could be gateways, but there could also be servers supporting different encodings.
I don’t think this is necessary, but if you insist on non-json/http encodings, it’s probably better to do it this way rather than by translation.
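A toy illustration of the chapters idea: one abstract message definition, two interchangeable encodings. Every name here is a hypothetical illustration, not a real spec:

```python
# One abstract definition, multiple encodings: the 'types and messages'
# chapter defines GetThread; encoding chapters map it to concrete bytes.
import json
from dataclasses import dataclass

@dataclass
class GetThread:          # defined once, in the abstract chapter
    thread_id: str
    max_depth: int

def encode_json(req: GetThread) -> bytes:        # encoding chapter 1
    return json.dumps({"op": "get-thread",
                       "thread": req.thread_id,
                       "depth": req.max_depth}).encode()

def encode_textline(req: GetThread) -> bytes:    # encoding chapter 2
    return f"GETTHREAD {req.thread_id} {req.max_depth}\r\n".encode()

# A server may speak either; a gateway just re-encodes:
req = GetThread("a1b2c3", 10)
assert encode_json(req) != encode_textline(req)  # same semantics, two wires
```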
I’m ambivalent about distributed architecture as long as interoperability is maintained.
A distributed system necessarily has fewer features and weaker guarantees or semantics than a non-distributed one. Distributed systems can also be much harder to implement. (NNTP is easy to implement, because it has very few features: messages are immutable, users are not authenticated...) So if you don’t need a true distributed system, you shouldn’t use one.
Mod powers not in the spec seems like a plus to me, not a minus.
As long as comments are stored on private servers, then mods (=admins) can delete them. A spec without mod powers has to store data where no-one but the poster can remove or change it. We’re getting into distributed system design again.
Well, actually, there are ways around that. We could put all comments into a blockchain, which clients would verify, and you can’t retroactively remove a block without clients at least knowing something was removed, and anyone with a copy of the missing block could prove it was the real one. But why?
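For what it’s worth, here’s a minimal hash-chain sketch of the tamper-evidence property (not an endorsement of the idea):

```python
# Minimal hash-chain sketch: removing or editing a comment breaks every
# later link, so clients at least *know* something changed.
import hashlib

def chain(comments):
    prev, out = b"", []
    for body in comments:
        digest = hashlib.sha256(prev + body.encode()).hexdigest()
        out.append((body, digest))
        prev = digest.encode()
    return out

def verify(chained):
    prev = b""
    for body, digest in chained:
        if hashlib.sha256(prev + body.encode()).hexdigest() != digest:
            return False                  # tampering detected here
        prev = digest.encode()
    return True

log = chain(["first", "second", "third"])
assert verify(log)
log[1] = ("second (silently edited)", log[1][1])
assert not verify(log)
```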
Today, as I understand it, posts to moderated groups get sent to an email address, which may have whatever moderation software you like behind it.
We’re talking about two different schemes. You’re describing moderated mailing lists; messages need to be approved by mods before other members see them. I’m talking about the LW model: mods can retroactively remove (or, in theory, edit) messages. This too stems from the basic difference between systems with and without mutable messages. In a mailing list or an NNTP group, once clients got their copies of a post, there’s no way for a mod to force them to forget it if they don’t want to.
Users not being tied to a particular server seems like a plus to me too.
By “tied to a server” I mean authentication tied to the DNS name. To authenticate someone as foo@gmail.com using Google login or OpenID or an actual email-based auth system, you talk to gmail.com. The gmail.com admin can manipulate or revoke the foo account. And there’s only one foo@gmail.com around.
Whereas in NNTP, if I understand correctly, I can put any string I like in the From: field. (Just like in classical email.) I might say I’m foo@gmail.com, but NNTP software won’t talk to gmail.com to confirm that.
Someone is upvoting your posts besides me. This one is at +19.
Touché. It’s kind of a shame that Main is out of commission, or I’d be earning a gazillion karma for this.
I meant we could have one spec chapter spec describing types, messages, requests and responses, and then one or more ‘encoding’ chapters describing how these messages are represented in JSON over HTTP, or in… something else.
Hrm. I actually really like this idea; it fits right in with my separate-form-from-function philosophy, and I think standardizing semantics is much more important than standardizing the format of messages over the wire (even though I do have strong preferences about the latter). You’d have to be awfully careful about what went into the spec, though, to allow for a wide range of representations. E.g., if you have a data structure that’s an arbitrarily nested dictionary, you’re limiting yourself to formats that can represent such a type; otherwise you have the same sort of potential lossage you’d get through a gateway.
But in principle I like it.
[edit: If you were really careful about the spec, you might even be able to get an NNTP-compatible representation “for free”]
Whereas in NNTP, if I understand correctly, I can put any string I like in the From: field.
True with respect to the protocol. I was going to write about this in a future post but maybe it’s better to talk about it now, if only to expose and (hopefully) repair flaws beforehand.
Yes, you can forge From headers, or mod approval headers, or anything really. But the forged message has to enter the network through a server on which you have an account, and that system knows who you are and can refuse to originate messages where the From header doesn’t match the authenticated user. On Usenet this is ineffective; the network is too large. But in a small private network it’s possible for node owners to collectively agree “none of us will allow our users to forge From headers.”
Moderated groups theoretically work like the mailing lists you describe; articles get redirected to a moderation email address. Direct posts are only accepted by the moderator. The address can be (probably is) monitored by software rather than a person, and that software can enforce a policy like “reject posts by users on the Banned list, reject posts with no parent from users not on the Local Sovereign list, accept all other posts.”
As I understand it, cancels and supersedes are also posts in their own right and go through the same moderation queue, so you can extend that policy with “accept cancels or supersedes by the same user as the original post, accept cancels or supersedes by users on the Moderator list, reject all other cancels or supersedes.”
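That policy, as a sketch of what the software behind the moderation address might run. The list names are the ones from the text above; the article representation is hypothetical:

```python
# Sketch of the moderation-address policy described above.
BANNED = {"spammer@example.org"}
LOCAL_SOVEREIGNS = {"admin@lesswrong.com"}
MODERATORS = {"mod@lesswrong.com"}

def accept(article, original_author=None):
    """article: hypothetical dict of parsed headers."""
    author = article["from"]
    if author in BANNED:
        return False
    if article.get("supersedes") or article.get("control") == "cancel":
        # only the original poster or a moderator may cancel/supersede
        return author == original_author or author in MODERATORS
    if article.get("parent") is None and author not in LOCAL_SOVEREIGNS:
        return False                  # top-level posts are restricted
    return True
```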
I think this works as long as the From header can be trusted—and, as above, that can be arranged on a closed network (and only on a closed network).
I probably haven’t covered all bases on this; how would you sneak a forged message through such a setup?
In a mailing list or an NNTP group, once clients got their copies of a post, there’s no way for a mod to force them to forget it if they don’t want to.
I consider that a feature, not a bug, but I think I’m misunderstanding you here; no system that permits local caching can prevent clients from keeping old versions of posts around. And web pages are certainly cached (and archived). So I don’t think you mean what it sounds like you mean.