shouldYield - enabling script to yield to user input


#1

Explainer here, and pasted below.

shouldYield

The Problem

Today, developers must trade off how fast they can accomplish large blocks of work against how responsive they are to user input. For example, during page load, there may be a set of components and scripts to initialize. These are often ordered by priority: first installing event handlers on primary buttons, then a search box, then a messaging widget, and then finally moving on to analytics scripts and ads.

In this example, a developer can either optimize for:

  • Completing all work as fast as possible.
    • For example, we want the messaging widget to be initialized by the time the user interacts with it.
  • Responding to user input as fast as possible.
    • For example, when the user taps one of our primary buttons, we don’t want to block until our whole page is ready to go before responding.

Completing all work as fast as possible is easy: the developer can simply execute all the work in the minimal number of tasks.

Responding to user input as fast as possible today requires paying some performance overhead, and minimizing that performance overhead is quite complicated. The most straightforward approach is to perform a unit of work, and then setTimeout(..., 0) or setImmediate(...) to continue the work.

The performance overhead of this approach comes from a few sources:

  • Inherent overhead in posting a task.
  • Task throttling from the browser.
  • Execution of other browser work, such as rendering, postponing task execution.
    • Sometimes this is desirable, to get something visible on screen sooner. If we’re executing script which is required to display our app however, this isn’t desirable.
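
The yield-after-every-unit approach described above could be sketched roughly as follows. This is only an illustration: the function name runOneUnitPerTask and the injectable schedule parameter are hypothetical, added so the sketch can be exercised without real timers.

```javascript
// Sketch: run one unit of work per task, posting a new task between units.
// `schedule` defaults to setTimeout(fn, 0); it is a hypothetical hook for testing.
function runOneUnitPerTask(units, schedule = (fn) => setTimeout(fn, 0)) {
  let postedTasks = 0;

  function step() {
    const unit = units.shift();
    if (unit === undefined) return; // nothing left to do
    unit();
    if (units.length > 0) {
      // Every remaining unit costs one more posted task, which is where
      // the overhead listed above comes from.
      postedTasks++;
      schedule(step);
    }
  }

  step();
  // Returns a getter so callers can see how many tasks were posted so far.
  return () => postedTasks;
}
```

With N units of work, this posts N - 1 extra tasks whether or not any input ever arrives.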

Proposal

In order to enable developers to complete their work as fast as possible if the user isn’t interacting, but respond to user input as fast as possible if input occurs, we propose adding a new window.shouldYield() API, which returns true if there is user input pending. To keep misbehaving script from blocking rendering for multiple seconds, user agents will also be permitted to return true if script has been executing for too long.
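
For illustration only, the "executing for too long" half of this behavior can be approximated in userland with a time budget. The sketch below is not the proposed API and cannot see pending input; createBudgetedShouldYield, budgetMs, and the injectable now clock are all hypothetical names.

```javascript
// Sketch: a time-budget stand-in that returns true once the current slice of
// work has run for `budgetMs`. It knows nothing about pending user input.
function createBudgetedShouldYield(budgetMs, now = () => performance.now()) {
  let sliceStart = now();
  return {
    // True once the current slice has used up its budget.
    shouldYield: () => now() - sliceStart >= budgetMs,
    // Call this when work resumes after yielding.
    reset: () => {
      sliceStart = now();
    },
  };
}
```

A browser-provided shouldYield() would combine a check like this with real knowledge of the input queue, which script cannot observe today.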

Example

Using shouldYield requires having some way to schedule tasks. We anticipate most adoption coming from frameworks and large sites. However, if you have a list of tasks that need executing, adoption is very simple.

let taskQueue = [task1, task2, ...];

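function doWork() {
  let task;
  // Run queued tasks until the queue is empty or the browser asks us to yield.
  while ((task = taskQueue.pop())) {
    task.execute();
    if (shouldYield()) {
      // Only now do we pay for posting a task to continue the remaining work.
      setTimeout(doWork, 0);
      break;
    }
  }
}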

doWork();

Constraints

Ideally, a solution to this problem would meet the following constraints:

  1. Enable stopping JS execution when input arrives.
  2. Enable efficiently resuming JS execution once input has been processed.
  3. Provide the browser a way to stop the JS from executing when input isn’t pending, in case the browser decides it really needs to draw something.
  4. Not require paying the cost of posting a task unless the browser has pending high priority work.
  5. Prevent other JS from running between when the script stops and resumes JS execution.
    • This excludes JS associated with browser work like event processing, rAF callbacks, etc.

This proposal focuses on constraints 1, 3 & 4, and ignores 2 and 5, which will be addressed independently.

The fifth constraint is interesting: in order for work which is incentivized to finish quickly (e.g., ads script) to be able to adopt this API and improve responsiveness to input, we need some way to prevent arbitrary JavaScript from executing between when script yields and when it is rescheduled.


#2

Isn’t this what requestIdleCallback is for?


#3

Great question: rIC is for low priority work, and will be postponed by a variety of other things the browser could be doing - rendering an animation for example.

In particular, from the spec:

The user agent SHOULD choose deadline to ensure that no time-critical tasks will be delayed even if a callback runs for the whole time period from now to deadline. As such, it should be set to the minimum of: the closest timeout in the list of active timers as set via setTimeout and setInterval; the scheduled runtime for pending animation callbacks posted via requestAnimationFrame; pending internal timeouts such as deadlines to start rendering the next frame, process audio or any other internal task the user agent deems important.

rIC is for scheduling low priority tasks, but this API enables keeping work scheduled unless an extremely high priority task (i.e., an input related task) comes in.
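
For comparison, a typical rIC work loop looks roughly like the sketch below. requestIdleCallback and IdleDeadline.timeRemaining() are the real APIs; the function name drainWhenIdle and the injectable requestIdle parameter are hypothetical, added so the sketch can run outside a browser.

```javascript
// Sketch: drain a queue of functions during idle periods only.
// `requestIdle` defaults to the browser's requestIdleCallback.
function drainWhenIdle(queue, requestIdle = globalThis.requestIdleCallback) {
  requestIdle(function onIdle(deadline) {
    // Work only while the browser says this idle period has time left.
    while (queue.length > 0 && deadline.timeRemaining() > 0) {
      queue.shift()();
    }
    // Anything left waits for the next idle period, which rendering, timers,
    // and other browser work can postpone.
    if (queue.length > 0) requestIdle(onIdle);
  });
}
```

The work here only runs when the browser has nothing more important to do, which is the opposite of the loading scenario, where the work should keep running unless input shows up.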

In the loading example I outlined, we don’t want to consider initializing a search box to be low priority, we just want to prioritize input higher.


#4

So this is basically rIC but with a higher priority? In that case why not reduce the API surface by making it a priority parameter to rIC?

Even if you do that I think it will be tough to unambiguously define each priority level, and of all the possible events the browser could handle, which count as being allowed to interrupt which priority levels.


#5

If you want to respond to input as quickly as possible using some kind of “high priority rIC”, you need to break your tasks into tiny pieces, such that input could be processed between consecutive pieces of work. Breaking the tasks into tiny pieces means you’ll end up paying the overhead of posting a task many times, which is too expensive.

See constraint #4. We want a solution that does “Not require paying the cost of posting a task unless the browser has pending high priority work”.

This proposal attempts to avoid talking about task priorities. The only signal we need to define is what it means for user input to currently be pending, which should be reasonably straightforward.


#6

But your own example uses setTimeout, which will post a task to invoke the callback after the specified timeout. It’s the same as rIC: you simply work for as long as it is safe to (50ms to meet the threshold of perception of instant response), then post a task to do the next chunk. The overhead of posting a task should be negligible compared to 50ms of useful work.


#7

In my example, we only post a task if we receive input.

We’re trying to eliminate the 50ms of latency introduced by a rIC type approach. One solution would be to break the work up into (for example) 1ms tasks, but then the task posting overhead becomes prohibitive.

You’re right that if we’re willing to accept 50ms of added latency, we could get away with breaking the work into 50ms chunks. If we want to eliminate that latency, we need a new API.


#8

It’s not 50ms of latency. It’s 50ms of useful work until you have to post a new task.


#9

Sorry, not quite following. The situation here is that we have some work that needs to get done, which isn’t related to input processing. In the example in the explainer, suppose we’re done installing event handlers on primary buttons, and are now working on initializing the search box, when a user taps on a primary button.

If we’re initializing the search box in a series of 50ms tasks, then if the user taps at the beginning of a 50ms task, we get an extra 50ms of latency in processing that tap.

On the other hand, if we initialize the search box in a series of 1ms tasks, then if the user taps at the beginning of a 1ms task, we get an extra 1ms of latency in processing the tap. However, initializing the search box will take much longer due to the overhead of posting a ton of tasks.

Finally, if we use shouldYield(), then we can frequently check if there’s pending input while initializing the search box. Suppose we check once every 1ms. Then we get a maximum of 1ms of added latency, and only pay the overhead of posting a task if user input shows up.
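
The tradeoff between the two chunking strategies can be made concrete with some back-of-the-envelope arithmetic. The sketch below is illustrative only: chunkingCost is a hypothetical helper, and the per-task posting overhead is an assumed input, not a measured number.

```javascript
// Sketch: cost of splitting `totalWorkMs` of work into fixed-size tasks.
// `postOverheadMs` is an assumed per-task cost of posting a task.
function chunkingCost(totalWorkMs, chunkMs, postOverheadMs) {
  // The first chunk runs in the current task; every later chunk is posted.
  const tasksPosted = Math.ceil(totalWorkMs / chunkMs) - 1;
  return {
    // A tap at the start of a chunk waits for the whole chunk to finish.
    worstCaseAddedLatencyMs: chunkMs,
    tasksPosted,
    totalOverheadMs: tasksPosted * postOverheadMs,
  };
}
```

Assuming 200ms of work and a purely illustrative 0.5ms of overhead per posted task, chunkingCost(200, 50, 0.5) gives 3 posted tasks and 1.5ms of overhead with up to 50ms of added latency, while chunkingCost(200, 1, 0.5) gives 199 posted tasks and 99.5ms of overhead with up to 1ms of added latency. Polling shouldYield() every 1ms keeps the 1ms latency bound while posting a task only when input actually arrives.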


#10

That’s within the threshold of perception of instantaneous. So nobody will notice.

Also I’m not sure if browsers are allowed to change the rIC deadline dynamically, but if they are, they could adjust it in response to incoming input events so it bails out earlier.


#11

An event which is handled in 50ms is perceived as instantaneous in some cases.

Keep in mind that there’s also the event handler duration: an event with 50ms of queueing time and a 100ms event handler will feel noticeably more sluggish than an event that just has a 100ms event handler.

There are also many types of input for which 50ms feels nowhere near instantaneous. Typing, for example. Compare typing on this page vs this page.

One of these pages has (almost) back to back 50ms long tasks, the other doesn’t. It’s pretty easy to tell which is which.

Touch dragging is even more latency sensitive, but touch dragging that blocks on script execution is fairly rare during page load.

Making a higher priority rIC as we discussed above, and then also enabling it to update its deadline by changing the result of IdleDeadline.timeRemaining() would be another possible solution. It feels more complicated and less ergonomic to me though. What should the default deadline be for a high priority rIC? There are certainly times where executing more than 50ms of work is fine during page load. What does “high priority” mean in the context of an “idle callback”? The two sound somewhat contradictory.

The high priority rIC approach would also prevent us from addressing constraint #5 in the future, as it forces a specific way of rescheduling yourself, whereas shouldYield allows rescheduling yourself any way you want.


Aside: The literature doesn’t claim that there’s no value to responding in < 100ms. If you respond in < 100ms then users feel like they’re directly manipulating objects in the UI. This is fairly subjective. More rigorous research has shown that users can identify differences in latency down to 6ms (and some participants in this study down to <3ms).


#12

This api would be incredibly useful for Facebook!

Looking at our data for the time between when we handle an event and the event start time (in Chrome only, since in Chrome event start time is when the browser got the event, not when it was dispatched), it looks like this could improve many interactions by 50-150ms on average for us. This would be very impactful, as an improvement like that tends to positively affect downstream metrics in large ways.

We would use this API in places where we wouldn’t normally like to take the perf hit of deferring work via requestIdleCallback, but we would like to stop what we’re doing if we know that there’s higher pri work. A great example here is loading the site. Normally we want to totally focus on finishing loading the current page; however, if there’s a click event, we’d rather start focusing on that. shouldYield would let us know we need to pause running the scripts involved in loading the page so we can get the click event and change our priorities based on the event.


#13

Are you referring to Event.timeStamp? Firefox also uses the timestamp from when the event was generated, not when it was dispatched (as of Firefox 33, about 4 years ago).


#14

This would be useful for our RUM library, Boomerang, as well.

Boomerang is often loaded as a third-party library before onload, so any work we’re doing can affect responsiveness and possibly trigger long tasks. We have two main phases of work before onload (“initialization” and when we receive “config”), and through our experimentation and observation[1], we can bump against or exceed the 50ms LongTask threshold, especially on lower-end devices.

We’ve been considering how we can break apart the work we’re doing during these phases so we’re not executing solid chunks of work, potentially getting in the way of page responsiveness. We’re planning on using setTimeout() / setImmediate(), but when we have a lot of little chunks of work, it would be nice to have something like shouldYield() to signal we can run serially.

One thought is how shouldYield() would interact with synthetic and RUM Time to Interactive measurements – if a developer is using shouldYield() and they know work is not pending, it would be good to incorporate this so TTI calculations aren’t extended.

[1] https://nicj.net/an-audit-of-boomerangs-performance/


#15

Good point regarding impact on metrics. We’ve considered this for some of our internal metrics, but haven’t thought about it for the long tasks API.

Should a 60ms task with a shouldYield call at 30ms fire a long task event? Probably not. We likely want to consider shouldYield() calls to be the equivalent of ending the current task and immediately starting another.
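
One way to picture that accounting is the sketch below, which treats each shouldYield() call as a task boundary and compares the longest resulting segment against the 50ms long task threshold. This is an assumption about how it might work, not spec behavior; longestSegmentMs and countsAsLongTask are hypothetical helpers.

```javascript
const LONG_TASK_THRESHOLD_MS = 50;

// Sketch: split a task at each shouldYield() call and return the longest piece.
// `yieldCheckTimes` must be sorted and lie between taskStart and taskEnd.
function longestSegmentMs(taskStart, taskEnd, yieldCheckTimes) {
  const boundaries = [taskStart, ...yieldCheckTimes, taskEnd];
  let longest = 0;
  for (let i = 1; i < boundaries.length; i++) {
    longest = Math.max(longest, boundaries[i] - boundaries[i - 1]);
  }
  return longest;
}

// Under this hypothetical rule, a task is long only if some uninterrupted
// segment between shouldYield() calls exceeds the threshold.
function countsAsLongTask(taskStart, taskEnd, yieldCheckTimes) {
  return longestSegmentMs(taskStart, taskEnd, yieldCheckTimes) > LONG_TASK_THRESHOLD_MS;
}
```

Under this rule the 60ms task with a check at 30ms splits into two 30ms segments and is not reported, while the same task with no checks still is.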


#16

I’ve been thinking a fair amount about this. My first thought is that previous task queuing systems I’ve been part of designing have chosen not to expose this type of bit because all tasks should self-manage. Tim and I have discussed and I’m willing to agree that the single-threaded nature of the web may make it different and thus perhaps this type of API is needed. (Note that my preferred method for improving UI thread availability is continuing to improve the ability for JavaScript to do important work off of the UI thread.)

I wonder if exposing the oldestInputQueuedTime would be a better primitive? This allows the user code to determine how many ms is important. High speed games may want to use 16 ms while other applications may want to use something as high as 200 ms. This also couples the API to input rather than making it a heuristic that browsers can fiddle with after shouldYield becomes used. This helps to ensure that user agents behave in a similar way as the primitive of exposing the oldest input event’s time is a crisp primitive.
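
A rough sketch of how user code might consume such a primitive is below. The shape of the API is an assumption: here the oldest queued input time is passed in as a timestamp in milliseconds, with null meaning no input is pending, and inputHasWaitedTooLong is a hypothetical helper.

```javascript
// Sketch: decide whether to yield based on how long the oldest input has waited.
// `oldestInputQueuedTime` is assumed to be a timestamp in ms, or null if no
// input is queued; `now` is the current time on the same clock.
function inputHasWaitedTooLong(oldestInputQueuedTime, thresholdMs, now) {
  if (oldestInputQueuedTime == null) return false; // nothing pending
  return now - oldestInputQueuedTime >= thresholdMs;
}
```

A game might call this with a 16ms threshold and another application with 200ms, so each page chooses its own latency budget rather than relying on a browser heuristic.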


#17

I 100% agree that ideally we’d move more work off the UI thread, but in cases where that isn’t possible (and until we’ve built sufficient tooling to make that easy), I think a primitive like this makes sense.

One of my concerns with this API is that developers will avoid yielding for multiple seconds if no input shows up, preventing any rendering at all during page load, for example. That’s the only reason I opted for shouldYield over hasPendingUserInput or similar - to give the browser an escape hatch if the page is blocking for extremely long periods of time.

If others don’t share that concern, then I’m certainly happy focusing on just the user input case.

My thinking previously was that developers would want to yield as soon as input showed up in all cases, but it’s true that there are cases where input is lower priority. This could be a bit of a slippery slope though. Knowing how long the event has been queued is helpful, but maybe you also want to respond more quickly to touchmoves than to clicks! Maybe clicks in some locations are higher priority than clicks in other locations! Pretty soon we’ve invented a polling based event system.

Just exposing the oldest input timestamp does feel like a reasonable position along that continuum though. I’m happy to rephrase either as a boolean hasPendingUserInput or oldestInputQueuedTime.