Filling the remaining gap between WebSocket, WebRTC and WebTransport

Existing web platform facilities for network communication include WebSocket and WebRTC. Each mandates a specific protocol, ensures TLS, and respects the Same Origin Policy. The proposed WebTransport is similar in these respects.

Unfortunately, these APIs are impractical when the developer has no control over the endpoint. In general, you can’t create a web app that talks to servers and devices that have their own protocols incompatible with what’s available on the web, e.g. SSH, RDP or printer protocols. Another example is that you can’t create a web app that needs to talk to a legacy system. Requiring users to change or replace said system would be hard as it could be deployed on-premise by countless parties. Similarly, devices often have their own protocols, last for decades, and are expensive to replace.

Native platforms universally provide APIs to access local network devices and information systems over TCP or UDP. For example, Windows applications can use WinSock, and Android applications can use android.net. Applications depending on these facilities cannot be ported to the web platform without a rewrite of the servers they connect to (which may range from easy to impossible). Nevertheless, there are risks in adding such a powerful capability to the web platform, for users and their organizations, which need to be carefully considered.

A general API like a prior proposal from 2015 (TCP and UDP Socket API) would allow listening for incoming connections, but this may not be necessary to satisfy the main use cases, so a more restrictive API may be more feasible. The only way to know for sure would be to identify the key use cases that need to be solved.

So, we would like to first explore the use cases. Once interest is proven and the use cases are understood, a threat model will have to be defined in order to clearly identify and tackle the ensuing security and privacy challenges.

Do folks have use cases that relate to this gap? Is there interest in exploring an API proposal?

9 Likes

Hi Eric! I definitely think this would be a great addition to the platform. There was a small amount of discussion in a previous thread, but unfortunately, that didn’t go very far.

I was first disheartened by this limitation when I had the idea to build a mail client. This would require implementing IMAP and SMTP, which, of course, isn’t currently possible on the web. To list a few more use cases: it’d be great to build an IRC client! Or something to communicate with IoT devices on your network.

I understand why this proposal will be scary to some people - but I really want to push past that, and get even closer to feature parity with native apps :crossed_fingers:

3 Likes

I’ll mention one “use case” that actually unlocks dozens if not hundreds of different applications. Specifically, Distributed Hash Tables.

DHTs are a fundamental building block of almost every P2P system. They are the primary routing layer in WebTorrent, IPFS, Dat, and other decentralized protocols. Almost every decentralized/distributed system needs to provide a lookup service similar to a key-value hash table. Using this distributed data structure, it’s possible to build higher-level capabilities like peer-finding without a central signaling infrastructure.

Limitations of WebRTC Data Channels:

It’s not possible to build a DHT on top of a WebRTC connection model. We need the ability to store the “contact information” for a peer, close the connection to that peer, and then re-connect to that peer at some point in the future (if they’re still online). This is the way that DHT routing tables get built up over time. With TCP/UDP, this is quite easy to accomplish – simply store the ip:port (12.34.56.78:9000) and try to connect. If the peer is still online, it will just work. However, in WebRTC, the offer/answer connection model makes it impossible to store a peer’s “contact information” and attempt to connect again later. A WebRTC connection offer or answer is one-time use.
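To make the "store the contact information, close, re-dial later" pattern concrete, here is a minimal Python sketch of how it looks with plain TCP on a native platform. It is only an illustration under stated assumptions: `start_listener`, `dial` and `routing_table` are hypothetical names, the peer runs on localhost, and real DHTs add peer IDs, timeouts and eviction on top of this.

```python
import socket
import threading


def start_listener(host="127.0.0.1"):
    """Start a throwaway TCP peer that greets every caller; return its (ip, port)."""
    server = socket.create_server((host, 0))  # port 0: let the OS pick a free port

    def serve():
        while True:
            try:
                conn, _ = server.accept()
            except OSError:
                return  # listening socket was closed
            with conn:
                conn.sendall(b"hello")

    threading.Thread(target=serve, daemon=True).start()
    return server.getsockname()[:2]


def dial(contact, timeout=2.0):
    """Connect to a stored (ip, port); return what the peer sent, or None if offline."""
    try:
        with socket.create_connection(contact, timeout=timeout) as conn:
            chunks = []
            while chunk := conn.recv(1024):  # read until the peer closes
                chunks.append(chunk)
            return b"".join(chunks)
    except OSError:
        return None


# A routing table is just long-lived contact information keyed by peer ID.
routing_table = {"peer-a": start_listener()}

first = dial(routing_table["peer-a"])   # connect once; the socket is closed afterwards
second = dial(routing_table["peer-a"])  # later: re-dial using only the stored address
```

Nothing about the first connection has to be kept alive for the second one to work, which is exactly what a one-time WebRTC offer/answer cannot provide.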

Required features in a solution:

I think that rectifying this requires these features currently lacking in WebRTC:

  1. Peers need the ability to “listen” for incoming connections.

  2. Peers also need the ability to publish some kind of “reusable offer/answer” that multiple peers can use to connect to listening peers (currently an offer/answer is usable only once). In normal TCP/UDP, this peer contact information is simply an IP address and port (i.e. 12.34.56.78:9000). We don’t strictly need support for connecting using an IP address and port, but some kind of long-lived “contact information” for a listening peer is required.

  3. Lighter-weight connections. It seems that the current WebRTC spec is very complex. This leads to a situation where it’s really hard to open more than a dozen connections at once without the browser process hitting 100% CPU utilization. This prevents applications that would like to open 20+ connections from doing so, which makes Data Channels a lot less useful for use cases like peer-assisted delivery, peer-assisted live streaming, WebTorrent, DHTs, in-browser cryptocurrency networks, etc.
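For comparison, the three points above can be sketched with plain UDP sockets in Python. This is an assumption-laden illustration rather than a proposal: `open_peer` is a hypothetical helper, everything runs on localhost, and raw UDP gives no encryption, reliability or NAT traversal by itself.

```python
import socket


def open_peer(host="127.0.0.1"):
    """Bind a UDP socket; its (ip, port) is reusable contact information."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))      # port 0: let the OS pick a free port
    sock.settimeout(2.0)      # fail instead of hanging if a datagram is lost
    return sock


listener = open_peer()
contact = listener.getsockname()  # publish once; any number of peers may use it

# Twenty-five peers, each a single unconnected socket, all use the same contact.
peers = [open_peer() for _ in range(25)]
for i, peer in enumerate(peers):
    peer.sendto(f"ping {i}".encode(), contact)

# The listener answers every sender from one socket, with no per-peer handshake.
seen = set()
for _ in peers:
    data, sender = listener.recvfrom(64)
    seen.add(sender)
    listener.sendto(b"pong", sender)

replies = [peer.recvfrom(64)[0] for peer in peers]

for sock in [listener, *peers]:
    sock.close()
```

The listener holds no per-peer connection state here, which is why 20+ peers cost little more than one; the trade-off is that ordering, retransmission and authentication become the application's job.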

10 Likes

Just want to say this would be amazing and liberate so much decentralized software from technical infeasibility. Like the other people said, web-first mail clients and DHTs are two huge use cases.

1 Like

Our primary use-cases are application resilience, direct and multi-party collaboration, and archiving.

Resilience: We want connectivity options that enable more resilient web applications for users in low/unpredictable bandwidth scenarios, when servers are down (operator error, cert expiration, etc.), and when the connection is blocked by internet shutdowns (Myanmar, Kashmir, Ethiopia and more) or censorship (China, Turkey, Thailand, Indonesia and more).

Collaboration: We want to enable users to collaborate in scenarios not well-supported by existing web platform connectivity options. Examples are publishing, editing, sharing and discovering documents on disconnected networks such as LAN, BLE+WifiDirect connections, and overlays which span hardware combinations and transitions.

Archiving: We want users to be able to revisit content and applications, as well as to ensure access to information in the state needed by those users and under challenging conditions. Archiving features enable data to survive over time, combat mis/disinformation and fake news, and give users more control over their experiences on the web.

We address these use-cases with the IPFS protocol, which is built on libp2p, a modular system of protocols, specifications and libraries that enable the development of peer-to-peer network applications. Libp2p is a standalone project, used by many applications aside from IPFS.

The WebRTC challenges related to DHTs in web content are well defined by @feross in the previous comment. All those points are valid in the context of our IPFS/libp2p use-cases.

Building web applications with js-ipfs and js-libp2p in web content means our transport options are limited to HTTP, WebSockets and WebRTC, each with challenges:

HTTP and WebSockets

  • Centralized and brittle: both require hardcoding the address of an arbitrary service that is assumed to always work (a single point of failure)
  • Use in Secure Contexts requires a TLS setup that relies on PKI

WebRTC

  • No permanent address that can be dialed later for quick resume
  • Uses DTLS (a bit better, no PKI), but there is no way to use a custom keypair and no way to pass any additional metadata like we can with EncryptedExtensions in TLS 1.3
  • Does not work in Service Workers, which means additional overhead due to postMessage, duplicated instances

This translates into high level gaps:

  • Nodes running in a web browser are unable to build and participate efficiently in DHTs because they cannot discover and connect directly to each other.
  • Need to bootstrap connections using WebSockets or HTTP each time a web application starts or a WebRTC connection is initialized, which is slow and sometimes not possible (see resilience notes earlier in post)
  • No way to directly reconnect after initial connections have been made.
  • No discovery/connectivity in offline environments or on the local network. The web doesn’t have an mDNS-like local discovery method for web browsers to find peers on the same network and/or on the same web Origin

5 Likes

Re-iterating IoT as a use case: CoAP (UDP), MQTT (TCP) and other protocols would be awesome to enable in a PWA. Local, privacy-first controllers and an easier developer experience are very compelling!
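To show how little is needed once UDP is available, here is a rough Python sketch that encodes a token-less confirmable CoAP GET following the RFC 7252 message layout and sends it as a single datagram. The helper names (`encode_coap_get`, `send_datagram`) and the example path are hypothetical, and the sketch deliberately leaves out tokens, extended option lengths, retransmission and response parsing.

```python
import socket
import struct

COAP_VERSION = 1
TYPE_CONFIRMABLE = 0
CODE_GET = 0x01        # request code 0.01
OPTION_URI_PATH = 11   # option number of Uri-Path


def encode_coap_get(message_id, path_segments):
    """Encode a token-less confirmable CoAP GET with one Uri-Path option per segment."""
    header = struct.pack(
        "!BBH",
        (COAP_VERSION << 6) | (TYPE_CONFIRMABLE << 4),  # version, type, token length 0
        CODE_GET,
        message_id,
    )
    options = bytearray()
    previous = 0  # option numbers are delta-encoded against the previous option
    for segment in path_segments:
        value = segment.encode("utf-8")
        if len(value) > 12:
            raise ValueError("this sketch omits extended option lengths")
        delta = OPTION_URI_PATH - previous
        options.append((delta << 4) | len(value))
        options += value
        previous = OPTION_URI_PATH
    return header + bytes(options)


def send_datagram(payload, host, port):
    """Send one UDP datagram; this is the step web content cannot perform today."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


packet = encode_coap_get(0x1234, ["sensors", "temp"])
```

With a real device, `send_datagram(packet, host, 5683)` would target the default CoAP port; the host and the `sensors/temp` path are placeholders, not a known endpoint.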

I for one would absolutely love it if this ability were added to the web platform. I’ve had a number of real-world use cases over the years that would have benefited from direct network access. Some are my own personal projects, some are projects I’ve worked on for various companies, and some pertain to projects I’ve worked with in the past where I thought “if only this could connect to normal TCP or UDP sockets, it would be so much better”.

  • SMTP client: https://github.com/jbaicoianu/elation-component-mail - I built this 10 years ago and had to implement IMAP talking to a custom backend speaking SMTP over REST. Direct TCP would allow users to securely access their own mail services without having to relay through a server I have to host.

  • IRC client: https://github.com/jbaicoianu/elation-irc - 7 years ago, implemented with a custom WebSocket backend to allow connecting multiple clients to multiple servers. Same as above - this arrangement decreases user security and reliability, because the relay could be sniffing traffic, or the relay could go down.

  • Emscripten SDL2_net: https://github.com/emscripten-ports/SDL2_net - allows us to compile SDL2 apps to WASM with networking support, which Emscripten can proxy through either WebSockets or WebRTC. Normally requires additional server-side software to bridge with specific TCP or UDP services (commonly either websockify or custom server software)

  • PPPoW: https://github.com/jbaicoianu/PPPoW-server - Dialup over WebSockets. Implements full TCP/UDP support by running pppd and a WebSocket relay in an AWS instance, allowing WASM-compiled native apps running in the browser to speak to it as if it were a dial-up ISP. Used for Win3.11 in VR demo: https://assets.metacade.com/emulators/win311vr.html

  • Internet Archive Emularity networking project: https://docs.google.com/document/d/16Vc-PxHxbVo5WG20tuF4zDHHa3oWO5DcEs-w3teNEYM - a work in progress, this is a proposal for extending Archive.org’s emulated software collections to support multiplayer games and internet-connected applications. This would add multiplayer support for historical games which used earlier networking technologies like null modem, dial-up, or IPX. Current experiments are implemented with a WebSocket relay which bridges to a TCP server (existing open source project).

  • JanusVR Presence Server: https://github.com/janusvr/janus-server - A presence server for web-based virtual worlds. Allows connections from native app users over TCP and web users with WebSockets on the same port - server autodetects type, and relays between both.

  • High Fidelity WebRTC Relay: https://github.com/janusvr/hifi_webrtc_relay - Bridges the High Fidelity virtual world networking protocol to the web, using a custom WebRTC DataChannel relay to let web users connect to the same servers as the existing native app, and participate in voice calls and 3d world building from a lightweight web interface.

  • libp2p: https://github.com/libp2p - a cross-language library for building p2p apps. Already covered in much greater detail by @lidel above.

  • WebTorrent: already covered in much greater detail by @feross above

All of these use cases would have been significantly easier to implement without involving a server-side relay component, and in many cases requiring a relay actually decreased the security of the end user.
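As an aside on the same-port arrangement mentioned for the presence server above, here is one way such autodetection could be sketched in Python. This is an assumption about the general technique, not the actual janus-server logic: `classify_client` is a hypothetical helper that checks whether the first bytes look like the HTTP upgrade request that opens a WebSocket handshake.

```python
def classify_client(first_bytes):
    """Guess whether a new connection is a WebSocket handshake or a raw TCP client."""
    request_line, _, rest = first_bytes.partition(b"\r\n")
    looks_like_http_get = (
        request_line.startswith(b"GET ") and request_line.endswith(b"HTTP/1.1")
    )
    # Simplification: expects the common "Upgrade: websocket" spelling in the
    # headers and assumes they arrived in the same read as the request line.
    if looks_like_http_get and b"upgrade: websocket" in rest.lower():
        return "websocket"
    return "tcp"


handshake = (
    b"GET /chat HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Upgrade: websocket\r\n"
    b"Connection: Upgrade\r\n\r\n"
)
native = b'{"method": "logon"}\n'  # made-up payload standing in for a native client

kinds = (classify_client(handshake), classify_client(native))
```

Every server that wants both kinds of client needs a shim like this plus the relay behind it; with direct sockets in the browser, the native code path would be enough.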

Direct TCP and UDP support for PWAs would open up a new wave of interoperability between web and native platforms, and give web app developers access to decades of software and services that were just never written with the web in mind. It would allow native app developers to open up parts of their systems to web users without needing to do full rewrites of the entire networking stack to accommodate the web.

Plenty of security concerns involved, of course, but I’ll hold off on those until we get to that part of the conversation :smiley:

3 Likes

Given the level of expressed support, a repo has been transferred in: https://github.com/WICG/raw-sockets.