Filling the remaining gap between WebSocket, WebRTC and WebTransport

ericwilligers · 2020-04-08

Existing web platform facilities for network communication include WebSocket and WebRTC. Each mandates a specific protocol, ensures TLS, and respects the Same Origin Policy. The proposed WebTransport is similar in these respects.

Unfortunately, these APIs are impractical when the developer has no control over the endpoint. In general, you can’t create a web app that talks to servers and devices that have their own protocols incompatible with what’s available on the web, e.g. SSH, RDP or printer protocols. Another example is that you can’t create a web app that needs to talk to a legacy system. Requiring users to change or replace said system would be hard as it could be deployed on-premise by countless parties. Similarly, devices often have their own protocols, last for decades, and are expensive to replace.

Native platforms universally provide APIs to access local network devices and information systems over TCP or UDP. For example, Windows applications can use WinSock, and Android applications can use android.net. Applications depending on these facilities cannot be ported to the web platform without a rewrite of the servers they connect to (which may range from easy to impossible). Nevertheless, there are risks in adding such a powerful capability to the web platform, for users and their organizations, which need to be carefully considered.

A general API like a prior proposal from 2015 (TCP and UDP Socket API) would allow listening for incoming connections, but this may not be necessary to satisfy the main use cases, so a more restrictive API may be more feasible. The only way to know for sure would be to identify the key use cases that need to be solved.

So, we would like to first explore the use cases. Once interest is proven and the use cases are understood, a threat model will have to be defined in order to clearly identify and tackle the ensuing security and privacy challenges.

Do folks have use cases that relate to this gap? Is there interest in exploring an API proposal?

oliverdunk · 2020-04-08

Hi Eric! I definitely think this would be a great addition to the platform. There was a small amount of discussion in a previous thread, but unfortunately, that didn’t go very far.

I was first disheartened by this limitation when I had the idea to build a mail client. This would require implementing IMAP and SMTP, which of course isn’t currently possible on the web. To list a few more use cases: it’d be great to build an IRC client! Or something to communicate with IoT devices on your network.

I understand why this proposal will be scary to some people, but I really want to push past that and get even closer to feature parity with native apps.

feross · 2020-04-17

I’ll mention one “use case” that actually unlocks dozens if not hundreds of different applications. Specifically, Distributed Hash Tables.

DHTs are a fundamental building block of almost every P2P system. They are the primary routing layer in WebTorrent, IPFS, Dat, and other decentralized protocols. Almost every decentralized/distributed system needs to provide a lookup service similar to a key-value hash table. Using this distributed data structure, it’s possible to build higher-level capabilities like peer-finding without a central signaling infrastructure.

Limitations of WebRTC Data Channels:

It’s not possible to build a DHT on top of a WebRTC connection model. We need the ability to store the “contact information” for a peer, close the connection to that peer, and then re-connect to that peer at some point in the future (if they’re still online). This is the way that DHT routing tables get built up over time. With TCP/UDP, this is quite easy to accomplish – simply store the ip:port (12.34.56.78:9000) and try to connect. If the peer is still online, it will just work. However, in WebRTC, the offer/answer connection model makes it impossible to store a peer’s “contact information” and attempt to connect again later. A WebRTC connection offer or answer is one-time use.
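The store-and-reconnect pattern described above can be sketched as follows. This is only an illustration: `PeerContact`, `parseContact`, `RoutingTable` and the injected `Dial` function are hypothetical names, not part of any browser API or of an existing DHT library. The dial function is passed in so the sketch does not assume any particular socket API.

```typescript
// Hypothetical types and names, for illustration only.
interface PeerContact {
  host: string;
  port: number;
}

// Parse "12.34.56.78:9000" into a contact; returns null for malformed input.
function parseContact(text: string): PeerContact | null {
  const idx = text.lastIndexOf(":");
  if (idx <= 0) return null;
  const host = text.slice(0, idx);
  const portText = text.slice(idx + 1);
  if (!/^\d{1,5}$/.test(portText)) return null;
  const port = Number(portText);
  if (port < 1 || port > 65535) return null;
  return { host, port };
}

function formatContact(contact: PeerContact): string {
  return `${contact.host}:${contact.port}`;
}

// Resolves to true if the peer answered. Injected by the caller so the
// sketch stays independent of the transport.
type Dial = (contact: PeerContact) => Promise<boolean>;

class RoutingTable {
  private contacts = new Map<string, PeerContact>();

  // Store contact information so the connection itself can be closed.
  remember(text: string): boolean {
    const contact = parseContact(text);
    if (contact === null) return false;
    this.contacts.set(formatContact(contact), contact);
    return true;
  }

  // Later: try every stored contact again and drop the ones that are gone.
  async redial(dial: Dial): Promise<string[]> {
    const alive: string[] = [];
    for (const [key, contact] of Array.from(this.contacts)) {
      if (await dial(contact)) alive.push(key);
      else this.contacts.delete(key);
    }
    return alive;
  }

  size(): number {
    return this.contacts.size;
  }
}
```

The point of the sketch is that the stored value is a plain string that stays valid after the connection closes, which is exactly what a one-time WebRTC offer/answer cannot provide.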

Required features in a solution:

I think that rectifying this requires these features currently lacking in WebRTC:

Peers need the ability to “listen” for incoming connections.
Peers also need the ability to publish some kind of “reusable offer/answer” that multiple peers can use to connect to listening peers (currently an offer/answer is usable only once). In normal TCP/UDP, this peer contact information is simply an IP address and port (i.e. 12.34.56.78:9000). We don’t strictly need support for connecting using an IP address and port, but some kind of long-lived “contact information” for a listening peer is required.
Lighter-weight connections. It seems that the current WebRTC spec is very complex. This leads to a situation where it’s really hard to open more than a dozen connections at once without the browser process hitting 100% CPU utilization. This prevents applications that would like to open 20+ connections from doing so, which makes Data Channels a lot less useful for use cases like peer-assisted delivery, peer-assisted live streaming, WebTorrent, DHTs, in-browser cryptocurrency networks, etc.

chr15m · 2020-04-19

Just want to say this would be amazing and liberate so much decentralized software from technical infeasibility. Like the other people said, web-first mail clients and DHTs are two huge use-cases.

lidel · 2020-04-20

Our primary use-cases are application resilience, direct and multi-party collaboration, and archiving.

Resilience: We want connectivity options that enable more resilient web applications for users in low/unpredictable bandwidth scenarios, when servers are down (operator error, cert expiration, etc), and when the connection is blocked from internet shutdowns (Myanmar, Kashmir, Ethiopia and more) or censorship (China, Turkey, Thailand, Indonesia and more).

Collaboration: We want to enable users to collaborate in scenarios not well-supported by existing web platform connectivity options. Examples are publishing, editing, sharing and discovering documents on disconnected networks such as LAN, BLE+WifiDirect connections, and overlays which span hardware combinations and transitions.

Archiving: We want users to be able to revisit content and applications, as well as to ensure access to information in the state needed by those users and under challenging conditions. Archiving features enable data to survive over time, combat mis/disinformation and fake news, and give users more control over their experiences on the web.

We address these use-cases with the IPFS protocol, which is built on libp2p, a modular system of protocols, specifications and libraries that enable the development of peer-to-peer network applications. Libp2p is a standalone project, used by many applications aside from IPFS.

The WebRTC challenges related to DHTs in web content are well defined by @feross in the previous comment. All those points are valid in the context of our IPFS/libp2p use-cases.

Building web applications with js-ipfs and js-libp2p in web content means our transport options are limited to HTTP, WebSockets and WebRTC, each with challenges:

HTTP and WebSockets

Centralized and brittle: both require hardcoding the address of an arbitrary service that is assumed to always work (a single point of failure)
Use in Secure Contexts requires a TLS setup that relies on PKI

WebRTC

No permanent address that can be dialed later for quick resume
Uses DTLS (a bit better, no PKI), but there is no way to use a custom keypair and no way to pass any additional metadata like we can with EncryptedExtensions in TLS 1.3
Does not work in Service Workers, which means additional overhead due to postMessage, duplicated instances

This translates into high level gaps:

Nodes running in a web browser are unable to build and participate efficiently in DHTs because they cannot discover and connect directly to each other.
Need to bootstrap connections using WebSockets or HTTP each time a web application starts or a WebRTC connection is initialized, which is slow and sometimes not possible (see resilience notes earlier in post).
No way to directly reconnect after initial connections have been made.
No discovery/connectivity in offline environments or on the local network. The web doesn’t have an mDNS-like local discovery method for web browsers to find peers on the same network and/or on the same web Origin.

jonathanberi · 2020-04-20

Re-iterating IoT as a use case - CoAP (UDP) and MQTT (TCP) and other protocols would be awesome to enable in a PWA. Local/privacy-first controllers and ease dev exp are very compelling!
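To make the CoAP case concrete, here is a minimal sketch of the 4-byte fixed header that starts every CoAP message, following the layout in RFC 7252 (2-bit version, 2-bit type, 4-bit token length, 8-bit code, 16-bit message ID). The function and constant names are made up for this example, and it assumes no token, options or payload. A web app would still need some way to send these bytes as a UDP datagram, which is the gap this thread is about.

```typescript
// Message types and the GET method code, as defined in RFC 7252.
const COAP_TYPE_CONFIRMABLE = 0;
const COAP_TYPE_NON_CONFIRMABLE = 1;
const COAP_CODE_GET = 0x01; // code 0.01

// Hypothetical helper: encodes only the fixed header with a zero-length token.
function encodeCoapHeader(type: number, code: number, messageId: number): Uint8Array {
  const version = 1; // CoAP version is always 1
  const tokenLength = 0; // this sketch sends no token
  const header = new Uint8Array(4);
  header[0] = (version << 6) | ((type & 0x03) << 4) | tokenLength;
  header[1] = code & 0xff;
  header[2] = (messageId >> 8) & 0xff; // message ID, big-endian
  header[3] = messageId & 0xff;
  return header;
}
```

For example, a confirmable GET with message ID 0x1234 encodes to the bytes 0x40 0x01 0x12 0x34.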

jbaicoianu · 2020-04-20

I for one would absolutely love it if this ability were added to the web platform. I’ve had a number of real-world use cases over the years that would have benefited from direct network access. Some are my own personal projects, some are projects I’ve worked on for various companies, and some pertain to projects I’ve worked with in the past where I thought “if only this could connect to normal TCP or UDP sockets, it would be so much better”.

SMTP client: https://github.com/jbaicoianu/elation-component-mail - I built this 10 years ago and had to implement IMAP talking to a custom backend speaking SMTP over REST. Direct TCP would allow users to securely access their own mail services without having to relay through a server I have to host.
IRC client: https://github.com/jbaicoianu/elation-irc - 7 years ago, implemented with a custom WebSocket backend to allow connecting multiple clients to multiple servers. Same as above - this arrangement decreases user security and reliability, because the relay could be sniffing traffic, or the relay could go down.
Emscripten SDL2_net: https://github.com/emscripten-ports/SDL2_net - allows us to compile SDL2 apps to WASM with networking support, which Emscripten can proxy through either WebSockets or WebRTC. Normally requires additional server-side software to bridge with specific TCP or UDP services (commonly either websockify or custom server software)
PPPoW: https://github.com/jbaicoianu/PPPoW-server - Dialup over WebSockets. Implements full TCP/UDP support by running pppd and a WebSocket relay in an AWS instance, allowing WASM-compiled native apps running in the browser to speak to it as if it were a dial-up ISP. Used for Win3.11 in VR demo: https://assets.metacade.com/emulators/win311vr.html
Internet Archive Emularity networking project: https://docs.google.com/document/d/16Vc-PxHxbVo5WG20tuF4zDHHa3oWO5DcEs-w3teNEYM - a work in progress, this is a proposal for extending Archive.org’s emulated software collections to support multiplayer games and internet-connected applications. This would add multiplayer support for historical games which used earlier networking technologies like null modem, dial-up, or IPX. Current experiments are implemented with a WebSocket relay which bridges to a TCP server (existing open source project).
JanusVR Presence Server: https://github.com/janusvr/janus-server - A presence server for web-based virtual worlds. Allows connections from native app users over TCP and web users with WebSockets on the same port - server autodetects type, and relays between both.
High Fidelity WebRTC Relay: https://github.com/janusvr/hifi_webrtc_relay - Bridges the High Fidelity virtual world networking protocol to the web, using a custom WebRTC DataChannel relay to let web users connect to the same servers as the existing native app, and participate in voice calls and 3d world building from a lightweight web interface.
libp2p: https://github.com/libp2p - a cross-language library for building p2p apps. Already covered in much greater detail by @lidel above.
WebTorrent: already covered in much greater detail by @feross above

All of these use cases would have been significantly easier to implement without involving a server-side relay component, and in many cases requiring a relay actually decreased the security of the end user.

Direct TCP and UDP support for PWAs would open up a new wave of interoperability between web and native platforms, and give web app developers access to decades of software and services that were just never written with the web in mind. It would allow native app developers to open up parts of their systems to web users without needing to do full rewrites of the entire networking stack to accommodate the web.

Plenty of security concerns involved, of course, but I’ll hold off on those until we get to that part of the conversation

cwilso · 2020-07-23

Given the level of expressed support, have transferred in a repo: https://github.com/WICG/raw-sockets.

trusktr · 2021-03-05

Chrome shows intent to implement this: Chromium issue 1119620 (Monorail).

9pfs · 2022-01-07

Do you have any ideas for using TCP/UDP in Bugout?

ShortFuse · 2022-01-10

Are there any plans for listening for TCP connections? Perhaps I’m missing it in the API.

Still, the use case would be the ability to use what are known as “Webhooks”, which are event callbacks from providers when things change.

For example, with Cisco’s Broadworks API for phones, you can create a virtual extension and connect over WebRTC just fine. But to get the actual events you need to do some long polling (Comet) or use an “HTTP Contact”. Being able to run a tiny HTTP server that receives a POST coded in JS can be extremely useful instead of having it hit a server somewhere else and then bounce that information over to the web client. It eliminates the “proxying” needed.
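As a rough sketch of what the “tiny HTTP server” logic could look like once bytes arrive from a listening socket: the parser below handles a single, complete HTTP/1.1 request held in a string. All names (`WebhookRequest`, `parseWebhookRequest`, `handleWebhook`) are hypothetical. It assumes an ASCII body so that `Content-Length` in bytes matches the string length, and it ignores chunked encoding, keep-alive and everything else a real server needs.

```typescript
// Hypothetical shape of a parsed webhook call.
interface WebhookRequest {
  method: string;
  path: string;
  headers: Map<string, string>;
  body: string;
}

// Returns null if the request is malformed or not fully received yet.
function parseWebhookRequest(raw: string): WebhookRequest | null {
  const headEnd = raw.indexOf("\r\n\r\n");
  if (headEnd < 0) return null;
  const lines = raw.slice(0, headEnd).split("\r\n");
  const requestLine = lines[0].split(" ");
  if (requestLine.length !== 3) return null;
  const [method, path] = requestLine;

  const headers = new Map<string, string>();
  for (const line of lines.slice(1)) {
    const colon = line.indexOf(":");
    if (colon <= 0) return null;
    // Header names are case-insensitive, so normalise them to lower case.
    headers.set(line.slice(0, colon).trim().toLowerCase(), line.slice(colon + 1).trim());
  }

  const length = Number(headers.get("content-length") ?? "0");
  if (!Number.isInteger(length) || length < 0) return null;
  const body = raw.slice(headEnd + 4, headEnd + 4 + length);
  if (body.length < length) return null; // body not complete
  return { method, path, headers, body };
}

// Produce the raw response to write back before closing the connection.
function handleWebhook(raw: string): string {
  const request = parseWebhookRequest(raw);
  if (request === null) return "HTTP/1.1 400 Bad Request\r\nContent-Length: 0\r\n\r\n";
  if (request.method !== "POST") return "HTTP/1.1 405 Method Not Allowed\r\nContent-Length: 0\r\n\r\n";
  return "HTTP/1.1 204 No Content\r\n\r\n";
}
```

The connection is disposable here: one request in, one response out, then close, which is the “single point of contact” model rather than a long-lived channel.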

It doesn’t even need to be HTTPS really. And it’s a lot “better” than using and maintaining a WebSocket connection for data. I know Asterisk uses WebSocket for events/callbacks, but that requires a constant connection, which is a step above long polling with Comet. The other alternative is Server-Sent Events (SSE), but you have the HTTP connection limits (even with HTTP/2). That also has the same polling and keep-alive mechanism as the others. Instead, a single point of contact with a disposable connection is easier to maintain. Yes, there is also using a UDP endpoint, and that can work too technically, but Webhooks are already somewhat popular and these companies are unlikely to move away from HTTP POST.

Other providers that use Webhooks include Twilio, GitHub, Facebook, and Stripe.

Edit: There’s also the possibility of using headless Chrome/Firefox instead of Node or Deno on server environments. All I personally need is the ability to run an HTTP server, though I’m not expecting much in terms of performance. And HTTP/2 or HTTP/3 would be prohibitively slow unless we’re considering WebAssembly or Web Workers to help. Still, neat.

easrng · 2022-01-26

That wouldn’t be possible: most clients are behind NAT, so webhooks from a server would never make it past the router, at least on IPv4. There are hole-punching options that don’t require preexisting knowledge of client IPs to act as a server, like pwnat, but they aren’t super reliable, so you’d still need a relay server for some connections. Another option is modifying WebPush to be able to silently push to an open page, but push services might not be happy about that.

Distortions81 · 2022-01-31

I think the way to get this adopted is this:

1: Easily disabled by admins

2: Only allow connections to same domain and cert as HTTPS site.

3: Anything else requires a prompt, probably only on accounts with sudo/admin privileges (password prompt)
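The three rules above can be read as a small decision function. This is only a sketch of that reading: the names are hypothetical, and whether the target presents the same certificate as the HTTPS site is assumed to be determined elsewhere (for example by the browser’s TLS stack) and passed in as a boolean.

```typescript
type SocketDecision = "blocked" | "allow" | "prompt";

// Hypothetical description of a connection attempt.
interface ConnectionAttempt {
  pageHost: string; // host of the HTTPS site making the request
  targetHost: string; // host the page wants to open a socket to
  sameCertificate: boolean; // assumed to be checked by the caller
}

function decideConnection(attempt: ConnectionAttempt, disabledByAdmin: boolean): SocketDecision {
  // Rule 1: an admin switch overrides everything else.
  if (disabledByAdmin) return "blocked";
  // Rule 2: same domain and same certificate is allowed silently.
  const sameHost = attempt.pageHost.toLowerCase() === attempt.targetHost.toLowerCase();
  if (sameHost && attempt.sameCertificate) return "allow";
  // Rule 3: anything else needs an explicit (possibly privileged) prompt.
  return "prompt";
}
```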