[Idea] Local devices API (LAN services)

The local network is no longer a first class citizen of the web world. It’s easier to trust a faraway server than your NAS, TV or thermostat sitting right next to you. This is sad and could be different.

Before committing to writing a full spec I want to pitch the general idea. This is due to the sheer number of seemingly stalled efforts in this space (Network Service Discovery API, FlyWeb, raw-sockets, TCP and UDP sockets, and numerous threads on this forum). My goal is to gather initial insight and discuss potential security implications before continuing towards a formal spec.

Security

To ensure strict security, this effort aims to treat the LAN in the same way as the web at large. We don’t trust any devices by default and all security efforts, such as CORS and CSP, should apply.

Trust

The biggest missing piece for connecting to local devices is handling trust in a user-friendly way. One can host a service on a LAN with a self-signed certificate and add it to the certificate store, but this is far beyond the capabilities of a regular user. Chromecast and similar technologies show that this can be done in a way that is both user friendly and secure.

The proposed API would allow a browser to initiate a connection to a device on the LAN. The application can narrow down the devices it wants to talk to; the exact identification or filtering system can be decided later. The request instructs the user agent to scan the local network and give the user an overview of compatible devices. The user picks a device from the list, which prompts the user agent to start a handshake procedure. The first time this handshake takes place, the user has to manually confirm the authentication in a ‘PIN verification’ step, as used by Chromecast or Bluetooth pairing. This is done to avoid MITM attacks.

From that point forward, trust has been established between the user agent and that local network service. In practice this likely means that a self-signed TLS certificate is now trusted for this device or service. I see this ‘trust’ or security context between the user agent and a device/service on the local network as orthogonal to the protocols that use it. Below I specify two potential uses: message passing and local HTTPS.
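To make the flow above a bit more concrete, here is a minimal TypeScript sketch of the bookkeeping a user agent might do. All names (`TrustStore`, `DiscoveredDevice`, `confirmPin`) are hypothetical and not part of any existing browser API, and the PIN exchange itself is reduced to a callback that reports whether the user confirmed it.

```typescript
// Hypothetical types for illustration only; not an existing browser API.
type Fingerprint = string; // e.g. a hex-encoded hash of the device certificate

interface DiscoveredDevice {
  id: string;
  name: string;
  fingerprint: Fingerprint;
}

// Returns true if the user confirmed the PIN shown by the device.
type PinCheck = (device: DiscoveredDevice) => boolean;

type ConnectResult = "paired" | "reconnected" | "rejected";

class TrustStore {
  // Device id -> certificate fingerprint pinned during pairing.
  private readonly pinned = new Map<string, Fingerprint>();

  connect(device: DiscoveredDevice, confirmPin: PinCheck): ConnectResult {
    const known = this.pinned.get(device.id);
    if (known === undefined) {
      // First contact: require explicit user confirmation before pinning.
      if (!confirmPin(device)) return "rejected";
      this.pinned.set(device.id, device.fingerprint);
      return "paired";
    }
    // Known device: no PIN prompt, but the certificate must match the pin.
    return known === device.fingerprint ? "reconnected" : "rejected";
  }
}
```

In this sketch a device that later presents a different certificate is rejected outright rather than re-paired silently, which is what keeps the PIN step meaningful against MITM attempts.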

Offline-first

To be truly useful for IoT use cases, the entire setup should work without internet access. This means no cloud and no certificate authorities, only what is available on the LAN. No one likes it when their vacuum stops working because it can’t phone home to the cloud.

Known services/devices

If I wanted to build a PWA that serves as a remote control for my TV, it would be useless if I had to go through this consent flow every time. Therefore, the user agent can remember devices on two levels. First, when the user agent has a ‘trust relation’ with a service, it can re-establish the connection without the ‘PIN verification’ step. Second, the user agent can track access permissions per origin, as is done for getUserMedia. Finally, the user agent may choose to synchronize these settings across devices.
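The two levels could be tracked independently, roughly as in the sketch below. This is only an assumption about how a user agent might organize the state; `KnownDevices`, `NextStep`, and the method names are made up for illustration.

```typescript
// Hypothetical sketch: device trust is shared by the whole user agent,
// while access grants are tracked per origin.
type NextStep = "connect" | "ask-consent" | "pair";

class KnownDevices {
  private readonly trusted = new Set<string>(); // paired device ids
  private readonly grants = new Map<string, Set<string>>(); // origin -> device ids

  // Called once the PIN verification step has succeeded for a device.
  markTrusted(deviceId: string): void {
    this.trusted.add(deviceId);
  }

  // Called once the user has allowed this origin to use the device.
  grant(origin: string, deviceId: string): void {
    const devices = this.grants.get(origin) ?? new Set<string>();
    devices.add(deviceId);
    this.grants.set(origin, devices);
  }

  // What the user agent still has to do before the origin can connect.
  nextStep(origin: string, deviceId: string): NextStep {
    if (!this.trusted.has(deviceId)) return "pair";
    const allowed = this.grants.get(origin)?.has(deviceId) ?? false;
    return allowed ? "connect" : "ask-consent";
  }
}
```

With this split, a second origin asking for an already paired TV only triggers the consent prompt, not another PIN verification.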

Protocol

Giving access to low-level protocols raises a lot of security concerns. This proposal explicitly avoids that by letting the browser manage the trust relation with a local service or device. However, this means a new tailored protocol must be introduced. Just like Chromecast, it would use existing protocols under the hood, e.g. SSDP/UPnP for service discovery and TLS for the handshake. Bridges can be built for backwards compatibility until devices adopt the new standard.

Message passing

For use cases in IoT I suggest a simple API that allows bidirectional message passing with the LAN device. So as not to reinvent the wheel, this may be the WebTransport API, using the security context provided by this proposal. Use cases include web-based IoT hubs, remote controls, …

[Addendum] Local HTTPS

This approach would allow users to securely connect to services on their local network. Once trust is established with a local service, the resulting certificate can be used for an HTTPS connection. A variation of the above approach may open up an opportunity to combine HTTPS with mDNS, potentially allowing you to navigate directly to an HTTPS mDNS URL, e.g. https://device.local, and letting the browser automatically trigger the verification and consent flow described above. Use cases include finally being able to connect directly to a local NAS in a user-friendly and secure context.

AFAIK, WebRTC can connect two devices over a LAN that can then communicate over a DataChannel. The main problem is the signalling, i.e. how to exchange the offer and answer between devices on the LAN. Commonly this is done over the Internet, e.g. a WebSocket connection to a signalling server. There are also some interesting approaches using things like QR codes to exchange data locally. However it would be useful if the LAN itself could act as a signalling service and use the actual local network to exchange the offer and answer and establish a local connection, without having to make the user manage some kind of manual exchange.

Perhaps it’s also the case that the most minimal and simple way to solve LAN networking with web technologies is a small API that solely serves as LAN signalling for WebRTC. Then apps can set up a DataChannel (and/or video/audio streams) and go from there.

This would be good for us for LAN-based multiplayer with offline web games. It could also simplify deployment as you could set up local multiplayer without having to also set up a signalling service, contact STUN/TURN servers, etc.

Hi AshleyScirra,

Yes, WebRTC signaling would be a great application as well. I like the idea of reviving LAN games. I also imagined connecting to a doorbell video feed without requiring signaling via the cloud. It could be as seamless as casting video to your TV today.

In the router, a DHCP reservation ties a device’s MAC address to a fixed IP address. Using that along with a random port gives a fairly good level of security. It is also possible to pass a secret along, e.g. in the URI string.

This isn’t a user-friendly process but IMO only because of paternalistic UX design. On my Telus router (Canadian telecom) for instance, DHCP options are hidden away, behind an ominous warning screen.

This would be huge! These are the use cases that excite me.

Signaling is the most insecure part of a WebRTC connection

You put a lot of faith in the remote server to not modify your offer/answer. If an attacker could intercept your message they can MITM you. If I am transferring a file to someone in my LAN why should I have to connect up to a remote server?

I don’t want my devices connecting to the internet

To echo what @backkem-gh said, it is a shame that my NAS, TV, security camera, etc. need to be exposed to the world. I would really prefer that my phone be able to connect to them directly!

I want a vendor agnostic protocol for casting

I am frustrated that I can’t cast to my smart TV from my Linux laptop. I have tried some of the stuff out there, but it is frustrating. I want something that is standardized and I can easily send Pion/ffmpeg/GStreamer/$x to it.

I want P2P in air gapped networks


@AshleyScirra you can connect two WebRTC peers without signaling in a LAN! It requires pre-configuration and exploits some undefined behavior, but it works. Check out this repo pion/offline-browser-communication.

I also wrote webrtc-uri, but I am not as excited about it after feedback from others. One interesting idea I heard is that instead of storing details in the URI you should broadcast them in mDNS. A WebRTC Agent could announce its ICE details, and then we just need to have control of the certificate the agent uses.

I don’t know where to take any of these ideas from here. I am interested and happy to make the changes happen in multiple WebRTC implementations.

Hi @backkem-gh and all,

I note that the W3C Second Screen Working Group is currently developing the Open Screen Protocol, actually a suite of network protocols, to implement the Presentation API and the Remote Playback API in an interoperable fashion.

As the name suggests, the Open Screen Protocol is currently restricted to actual screens connected to the LAN (e.g. a TV set, a set-top box or a Chromecast key) and the Second Screen Working Group is not chartered to go beyond that. However, it should be relatively easy to extend that protocol to other types of devices (and the group is making sure that this will be doable).

The suite includes protocols to handle discovery (using DNS-SD over mDNS), transport (using QUIC), authentication (using SPAKE2), and more specific application protocols for the Presentation API and the Remote Playback API. Discovery, transport and authentication in particular should be easy to reuse in other contexts.

The protocol is neither finalized yet nor shipped in browsers, but I note that Google is working on an open-source Open Screen Protocol library.

This is great! I’m putting the protocol spec on my reading list.

I suggest we try to write an absolutely minimal spec that opens up these protocols for use in a broader context. Once we have that, we can try to get broader support within the community.

Hi all, here is my attempt at writing a proposal based on the Open Screen Protocol spec:

Anyone can comment on the doc. Please do, we need your feedback! In addition, feel free to request write access here.

I forgot to post it here, but I gave a lightning talk about this proposal at the Second Screen WG/CG - 2021 Q1 virtual meeting. You can also find my slides here. I’d like to thank them for the fun session and great reception. The main points of feedback gathered:

  1. Having a generic list of local devices leads to some problems:
    • Applications are only able to list all devices since there is no obvious way to filter them. This doesn’t provide a very user friendly device overview.
    • Users don’t have much control to limit the access of an origin to specific device types. E.g.: a game should only get access to peers or controllers. A remote control app should only have access to TVs. Neither should have access to your NAS or a doorbell. This filter should be clear in the user consent dialog.
    • → We may need to introduce some form of ‘device type’ to amend this.
  2. Turns out there is a draft Go implementation of the OSP. This may be of interest to you @Sean-Der .
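The ‘device type’ idea from the first feedback point could be sketched as follows. The type names and both functions are purely hypothetical; the proposal does not define a device taxonomy yet.

```typescript
// Hypothetical device taxonomy for illustration only.
type DeviceType = "tv" | "nas" | "doorbell" | "game-peer" | "controller";

interface LanDevice {
  name: string;
  type: DeviceType;
}

// Only devices whose type the origin asked for are ever listed to it.
function visibleDevices(
  all: readonly LanDevice[],
  requested: readonly DeviceType[],
): LanDevice[] {
  const allowed = new Set<DeviceType>(requested);
  return all.filter((device) => allowed.has(device.type));
}

// Text for the consent dialog, so the filter is visible to the user.
function consentPrompt(origin: string, requested: readonly DeviceType[]): string {
  return `${origin} wants to connect to devices of type: ${requested.join(", ")}`;
}
```

A remote control app would then request `["tv"]` and never see the NAS or the doorbell, while a game would request `["game-peer", "controller"]`.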

PS: I’d also like to draw some attention to the other lightning talks given, as they embody the spirit of this discussion IMO:

Coming from Plex (where we’ve got quite a bit of experience building web apps that connect to LAN services), I’ve got a few thoughts here:

  • TLS is a powerful tool for LAN connections
  • At the same time, it can be a double-edged sword: with current browser tech, when DNS is unavailable (ISP outage or overzealous router filtering), TLS is useless
  • There’s a proposal to tighten security when communicating with LAN devices at GitHub - WICG/private-network-access
  • But that proposal does nothing to help people who want to build robust, secure LAN services actually do so, and those capabilities are badly needed; see There is still no complete replacement for LAN plaintext connections · Issue #23 · WICG/private-network-access · GitHub
  • While companies like Plex have the infrastructure to hand out millions of browser-trusted TLS certificates to user devices (via Let’s Encrypt), that’s not always going to be practical for everyone, so a way to communicate with devices using private PKI would greatly improve adoption for LAN cases
  • Those capabilities are currently available via WebRTC (which has limits as discussed in this thread, and is far too heavyweight and complex to practically expect LAN service developers to implement), and will soon be via WebTransport (which is too lightweight, and requires the bulk of an HTTP stack to be built on top of it)
  • The obvious solution to handle these cases is to combine the private-network-access spec (for server-side cross-origin secure-by-default policy) with an extension to Fetch and other core browser networking tools (e.g. media loading/playback) that allows connecting by IP address with a custom set of expected (D)TLS certificate fingerprints

The only argument against this solution I’ve seen is the one that came up in that issue thread: “allowing custom fingerprints would allow developers to build insecure apps (by unsafely reusing keys) without triggering mixed-content restrictions”. This is, in my opinion, an extremely weak argument against an extremely useful feature.

If you buy into it, you have to also acknowledge that the exact same issues exist for WebTransport and WebRTC (the latter of which is already widely deployed!); they have the same security properties while just being harder to use.

I’d very much like to discuss this issue (and our experience in this area in general) further with anyone in a position to move forward with the concept.

Hi rcombs,

Thanks for all these pointers. I didn’t come across the private-network-access proposal yet. If I read it correctly, they are bringing CORS to the LAN. This ‘LAN devices’ proposal is aimed at solving their ‘non-goal’:

Provide a secure mechanism for initiating HTTPS connections to services running on the local network or the user’s machine.

One of our design goals is to be offline-first with zero reliance on central infrastructure, including DNS or CAs. To that end we use self-signed certificates that are exchanged in the ‘pairing’ step, just like Bluetooth or Chromecast.

That being said, I do see this proposal as orthogonal to any protocol running on top of TLS. If a user agent could consider the self-signed certificates obtained during the pairing step as part of its certificate store, they may be usable by all existing protocols running over TLS: HTTPS (fetch), WebSocket, WebTransport, … From reading your references, it seems apparent that a certificate rotation strategy may be missing from the current proposal. I’d like to hear your thoughts.

It would also be interesting to hear what you think of such a pairing-based approach. Would this make sense in the Plex world? Are people used to this from other products like Chromecast?

In addition, I’m definitely open to explore how we can use the offline-first foundation that we aim to provide while increasing user-friendliness using external infrastructure. Maybe we can allow the use of pre-shared keys or cloud-based signaling to make the pairing step transparent to the user. This seems feasible to me in your setup.

Right, my concern here is that if private-network-access goes forward, it will block the cases where connecting to LAN devices without DNS or a CA-signed cert currently works (HTTP-loaded pages), so this case being unsolved should be a blocker on that change advancing.

This would be excellent, though note that for our use-case it’d be extremely helpful to also support HTTPS requests made by the browser engine itself rather than via fetch, e.g. audio/video.

So, in Plex, we issue browser-trusted CA-signed certs and rotate them every 60 days, which works fine normally; the problem is with the client not immediately knowing about the new fingerprint during rotation. With normal DNS-name-based cert trust, this isn’t an issue, since identity is based on DNS name (which is constant across rotations) rather than e.g. a hash of the cert itself. But I recognize that that’s not a great solution for everybody, since setting up centralized public-CA-based issuance infrastructure isn’t always practical, so I imagined a couple possible solutions:

  • My favorite option is to have the server hold onto older certs for some period after renewing (e.g. for Let’s Encrypt certs, keep the cert for the full 90 days of validity when renewing at 60), and having the client signal the fingerprint it’s expecting via a TLS extension (SNI-style); this is detailed in the GHI thread
  • Alternatively, the server could renew early, but wait to rotate to its new cert until expiration is imminent, giving the client enough time to find out about the new cert’s fingerprint (either directly from the server or from a centralized service) and annotate both currently-valid ones. I don’t like this approach much as it requires substantially more infrastructure work (and would probably result in plenty of subtle implementation bugs), but it’s doable.
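The first option could look roughly like this on the server side. This is a sketch under assumptions: the fingerprint hint stands in for the proposed TLS extension, which does not exist today, and `ServerCert` and `selectCert` are made-up names.

```typescript
// Hypothetical sketch of cert selection with an overlap window.
interface ServerCert {
  fingerprint: string;
  notAfter: number; // expiry, in milliseconds since the epoch
}

// `expected` models the fingerprint a client would signal in the
// (not yet existing) TLS extension; undefined means no hint was sent.
function selectCert(
  certs: readonly ServerCert[],
  expected: string | undefined,
  now: number,
): ServerCert | undefined {
  const valid = certs.filter((cert) => cert.notAfter > now);
  if (valid.length === 0) return undefined;
  // Serve the cert the client already trusts while it is still valid.
  const match = valid.find((cert) => cert.fingerprint === expected);
  if (match !== undefined) return match;
  // Otherwise fall back to the cert that expires last (the newest one).
  return valid.reduce((a, b) => (b.notAfter > a.notAfter ? b : a));
}
```

With 90-day certs renewed at day 60, a client that still expects the old fingerprint keeps working until day 90, which gives it a 30-day window to learn the new one.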

My biggest concern is that often “pairing” ends up just meaning “trust on first use”, which is better than no authentication at all but still somewhat lacking. It can be an option for devices with absolutely no other way to authenticate (display-less bluetooth devices come to mind), but any pairing protocol should provide a safe way for the parties to confirm each other’s identities (without relying on the developer to design something secure themselves).

Yeah, we exchange device identities (DNS names, potentially cert fingerprints) and user auth data via a centralized HTTPS service, so the pairing concept would only apply to us in initial-setup cases; we definitely wouldn’t want to force the user through that process for every device they own.

FYI, in the IEEE we made a standard for network service discovery, 802.21, and the 802.11 team also made their own.

I created an initial repository for this proposal: GitHub - backkem/local-devices-api: Securely connect browsers and devices on the LAN. I tried to capture the remarks made in this thread into issues.

@MichaelGlennWilliams IEEE 802.21 refers to “Media Independent Handoff”. I couldn’t quite figure out how that relates to service discovery. Is it a small piece of that larger spec? If so, do you have any reference material on that?

This is from their tutorial on the 802.21 website. It is called the MIH information service.