[Proposal] Long Task notifications in Performance Observer

panicker · 2016-07-26

Problem description: certain tasks can take a long time (multiple frames), locking up the UI thread and blocking other critical tasks. To the user this may manifest as delays in responding to input (tap, click, wheel), jank in animation or scrolling, etc. See the detailed problem description in the explainer. Long tasks are a major source of bad user experiences on the web today.

Proposal: We’d like to propose a performance API that enables applications to detect the presence of such “long tasks”, which monopolize the UI thread for extended periods of time and block critical tasks.

The explainer in this Git repo has more details.

PhilipTellis · 2016-07-26

If a long task is running and blocks the UI thread, won’t this also block any Performance Observer events from firing, or is this something that can be pushed off to a WebWorker?

panicker · 2016-07-26

It is a non-goal to deliver the notification synchronously when the long task happens. Instead, notifications will be queued and delivered periodically (probably with the next begin-frame render cycle). Hope that clarifies the proposal?

The proposal focuses on reporting-only, although as described the API does not preclude the host app from taking action on misbehaving nested iframes.
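To make the queued, reporting-only delivery concrete, here is a minimal sketch of how a page might consume these notifications. It assumes the entry type is named "longtask" (the thread does not fix a name) and that entries carry the usual PerformanceEntry fields. `observeLongTasks` is a hypothetical helper, and the constructor is injectable only so the sketch can run outside a browser.

```typescript
// Minimal structural types so the sketch type-checks without DOM typings.
interface LongTaskEntry {
  entryType: string;
  startTime: number; // ms since the time origin
  duration: number; // ms the task occupied the UI thread
}
interface EntryList {
  getEntries(): LongTaskEntry[];
}
interface ObserverLike {
  observe(options: { entryTypes: string[] }): void;
  disconnect(): void;
}
type ObserverCtor = new (callback: (list: EntryList) => void) => ObserverLike;

// Hypothetical helper: registers an observer and forwards each queued batch.
// The callback runs after the long tasks have finished, never during them.
function observeLongTasks(
  onBatch: (entries: LongTaskEntry[]) => void,
  // Assumption: in a browser this resolves to the real PerformanceObserver.
  Ctor: ObserverCtor = (globalThis as any).PerformanceObserver
): ObserverLike {
  const observer = new Ctor((list) => onBatch(list.getEntries()));
  // Assumption: "longtask" is the entry type name.
  observer.observe({ entryTypes: ["longtask"] });
  return observer;
}
```

In a page, calling `observeLongTasks(entries => { /* record or beacon */ })` would be enough. Nothing in the callback can interrupt a long task, which matches the reporting-only scope.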

igrigorik · 2016-07-26

+1 to @panicker’s comment.

Also, to expand that: the same statement is true for all PerformanceObserver notifications.

“The performance timeline task queue is a low priority queue that, if possible, should be processed by the user agent during idle periods to minimize impact of performance monitoring code.” - performance timeline

PhilipTellis · 2016-07-26

This may not be as useful, since users tend to leave if a page locks up, and then it’s too late to beacon or do anything about it.

panicker · 2016-07-26

A completely unusable page that leads to user abandonment is the extreme case. This proposal intends to address a much broader problem space: for instance, delays in interacting with the page right after load (time to interaction), delays in responding to clicks (e.g. the user interacting with a widget on the page that should update in response to a click), noticeable jank in animation or scrolling, etc. I’ve clarified the text in the post. This is also explained in more detail in the explainer.

Nic_Jansma · 2016-08-02

I think this proposal is a fantastic idea and would definitely be something we (a RUM vendor) would be interested in capturing for our customers if available. Attribution of potential performance issues seen in first- and third-party scripts is on the minds of a lot of our customers.

The name attribution proposal seems reasonable.

I’m assuming this wouldn’t be tracked proactively, but only “captured” when turned on via PerformanceObserver, correct? ResourceTiming is different: I would argue it is more important there to capture everything even if no one is listening (up to the 150-entry limit, or everything before onload), so that a complete waterfall can be generated without a RUM script having to be sync-loaded in the HEAD. Long tasks, by contrast, seem like something that should only be captured if the page specifically requests it, with long tasks that happened before it was turned on simply lost.

While I think Philip’s note that a completely locked-up page might not be able to deliver these events is an important point, I think that’s an extreme case, and maybe it could be solved by something similar to Network Error Logging (“Page Freeze Logging”).

The data we would get out of this proposal from all of the various little Long Tasks on a page will be very useful.

panicker · 2016-08-03

Hey Nic,

Thanks for the response. Yes this will only be captured when a PerformanceObserver is registered. We wouldn’t surface or hold on to long tasks that occurred prior to PerformanceObserver being registered.

panicker · 2016-08-05

Here’s some preliminary data and a brief writeup from surfacing Long Tasks on some popular sites:

Main take-aways from this exercise:

The API is very promising as an indicator of how performance-tuned the site is, and the user’s experience on it.
The proposed API is not too difficult to implement (in Chrome)
The heuristic of considering only scripts provides very good coverage (~90%) for frame context attribution

Comments and feedback are welcome.

igrigorik · 2016-08-17

FYI, blink intent to implement: https://groups.google.com/a/chromium.org/d/msg/blink-dev/A_sM-fu6u50/ao9mnO8SAQAJ

toddreifsteck · 2016-08-24

Nic, would your product be likely to enable this early in the page lifecycle (in HEAD) or later after page load?

Nic_Jansma · 2016-08-25

@Todd - We’d want to enable it as early in the page load process as possible to ensure we have a full picture. Our script (boomerang.js) loads async, and we’d want to turn it on as soon as it’s loaded, but we’d also probably provide a snippet to our customers to turn it on in their HEAD so we can hand the data over to boomerang.js once it has loaded.
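The HEAD-snippet handoff described above could look roughly like the sketch below. It is only an illustration: `EarlyLongTaskBuffer`, `add` and `attach` are hypothetical names, not part of boomerang.js or of the proposed API. The snippet would feed records into the buffer from its observer callback, and the async script would attach itself once loaded.

```typescript
// Only the fields this sketch needs; assumed to mirror PerformanceEntry.
interface LongTaskRecord {
  startTime: number; // ms since the time origin
  duration: number; // ms
}

// Hypothetical helper: holds records until a consumer attaches.
class EarlyLongTaskBuffer {
  private pending: LongTaskRecord[] = [];
  private consumer: ((record: LongTaskRecord) => void) | null = null;

  // Called by the early (HEAD) observer callback for every record.
  add(record: LongTaskRecord): void {
    if (this.consumer) {
      this.consumer(record); // consumer already attached: pass straight through
    } else {
      this.pending.push(record); // otherwise keep it for later
    }
  }

  // Called once by the async script when it has finished loading.
  attach(consumer: (record: LongTaskRecord) => void): void {
    this.consumer = consumer;
    const backlog = this.pending;
    this.pending = [];
    backlog.forEach((record) => consumer(record)); // replay in arrival order
  }
}
```

Records that arrive before `attach` are replayed in order, and later ones pass through directly, so the late-loading script sees one continuous stream. A production version would probably also cap the backlog size.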

We also might keep it running post-page-load depending on the scenario and what the customer wants to capture. In the case of SPAs, it might make sense to turn it on/off during soft navigations.
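Turning collection on and off around soft navigations might be sketched as follows. `LongTaskToggle` is a hypothetical wrapper, and the sketch assumes both that the entry type is named "longtask" and that an observer can be re-registered with `observe()` after `disconnect()`.

```typescript
// Structural stand-in for PerformanceObserver so the sketch runs anywhere.
interface ObserverLike {
  observe(options: { entryTypes: string[] }): void;
  disconnect(): void;
}

// Hypothetical wrapper that makes start/stop idempotent.
class LongTaskToggle {
  private observer: ObserverLike;
  private active = false;

  constructor(observer: ObserverLike) {
    this.observer = observer;
  }

  // Call when a soft navigation begins.
  start(): void {
    if (this.active) return; // already collecting
    // Assumption: "longtask" is the entry type name.
    this.observer.observe({ entryTypes: ["longtask"] });
    this.active = true;
  }

  // Call when the soft navigation is considered complete.
  stop(): void {
    if (!this.active) return; // nothing to stop
    this.observer.disconnect();
    this.active = false;
  }

  isActive(): boolean {
    return this.active;
  }
}
```

An SPA router hook could call `start()` when a route change begins and `stop()` once the new view has settled. Long tasks that occur while collection is stopped are not reported.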

panicker · 2016-10-27

The Long Tasks spec will be moved to WICG; we have ample agreement (from TPAC). [Consider this the Intent to Move]

yoavweiss · 2016-10-27

Repo is now under https://github.com/WICG/longtasks !!