[Proposal] Tasklets


#11

If I understand your question correctly: Each call to addModule() will create a new instance of that tasklet. Which thread it ends up on is an implementation detail and left up to the UA.


#12

What’s important to me is that bad APIs are blacklisted rather than good APIs be whitelisted. If this API goes through I’m assuming that would essentially deprecate Web Workers, so Tasklets need to be at least as powerful. It’s already a problem that Workers are often forgotten about (ex. FormData not being supported in Safari workers). But it sounds like you’re taking the right approach here, so I’m happy.


#13

I’m not sure that is a direct consequence. WebWorkers still have a use-case for extremely heavy synchronous work (re-encoding videos?! idk). But your concern is definitely something to keep in mind :slight_smile:


#14

By default (in my head at least :slight_smile:) there is one TaskletGlobalScope associated with the window.tasklet.

Calling tasklet.addModule('thing.js'); will load the script inside the same global by default.

The key here is that by default libraries & user code share the same global and thread instead of having multiple globals & threads, which have a higher memory cost.

Potentially tasklets on different origins could live on the same thread, but this would be up to each user-agent.


#15

That section of the explainer is basically saying that we don’t want sync XHRs and similar things inside the TaskletGlobalScope. Nearly every API available in a worker would be available here. :slight_smile:

Sync APIs inside workers are ok, as a new Worker script assumes a separate thread for processing, e.g. it is fine to have a worker which does:

while(true) {
  self.postMessage({prime: calculateNextPrime()});
}

… and never yields. With tasklets, we want different pieces of code to share the same thread for background processing, so code that never yields would block everything else on that thread.
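
To illustrate the difference, here is a minimal sketch of the same prime generator written so that it yields between chunks of work, which is roughly what code sharing a thread would need to do. calculateNextPrime is borrowed from the snippet above (its body here is made up); emitPrimes, chunkSize, and the setTimeout-based yield are assumptions for illustration, not part of the proposal.

```javascript
// Hypothetical implementation: returns the smallest prime greater than `n`.
function calculateNextPrime(n) {
  let candidate = n + 1;
  for (;;) {
    let isPrime = candidate > 1;
    for (let d = 2; d * d <= candidate; d++) {
      if (candidate % d === 0) { isPrime = false; break; }
    }
    if (isPrime) return candidate;
    candidate++;
  }
}

// Emits `count` primes via `onPrime`, yielding to the event loop after every
// `chunkSize` primes so that other code on the same thread gets a turn.
async function emitPrimes(count, onPrime, chunkSize = 100) {
  let current = 1;
  for (let i = 0; i < count; i++) {
    current = calculateNextPrime(current);
    onPrime(current);
    if ((i + 1) % chunkSize === 0) {
      await new Promise((resolve) => setTimeout(resolve, 0));
    }
  }
}
```

In a worker-like scope, onPrime could be something like (prime) => self.postMessage({prime}).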


#16

It should also be noted that this isn’t a “new” idea.

There are lots of different libraries on top of workers today which do things in a similar spirit:

E.g.

…we think it’d be great if we could get a solution which we can have by default on the platform :grinning:


#17

Unless I’m missing something, that doesn’t feel quite right.

It suggests that…

tasklet.addModule('foo.js');
tasklet.addModule('bar.js');
tasklet.addModule('//other-origin/script.js');

…would share a global. But then multiple calls to tasklet.addModule('//other-origin/script.js'); from other origins couldn’t share the same global.


#18

The same tasklet could be instantiated in the same global multiple times, couldn’t it? That’s how I thought about it so far.


#19

That doesn’t mean “one TaskletGlobalScope associated with the window.tasklet” then.


#20

My bad. I was confusing scope with underlying thread. I think each tasklet would be isolated in terms of scope.

@iank should confirm/correct me here.


#21

Just to confirm:

// tasklet.js
let i = 0;

export function increment() {
  return ++i;
}

// main thread
const t = await tasklet.addModule('tasklet.js');
await t.increment(); // 1
await t.increment(); // is this always going to be 2? (assuming no other calls to increment)

const t2 = await tasklet.addModule('tasklet.js');
await t2.increment(); // 1 or 3?

#22

In my mental model:

const t = await tasklet.addModule('tasklet.js');
await t.increment(); // 1
await t.increment(); // 2

const t2 = await tasklet.addModule('tasklet.js');
await t2.increment(); // 1 

#23

Ok, so a TaskletGlobalScope per whatever-addModule-resolves-to. Although multiple globals can run in the same thread.

It’s probably too early for bikeshedding, but the “module” naming seems wrong. With modules, two ‘imports’ of the same module name will return the same instance, but that doesn’t seem to be what’s proposed here.
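
A minimal user-space sketch of the semantics described in #21 to #23, assuming each addModule() call evaluates the module body afresh. fakeTasklet and moduleFactories are hypothetical stand-ins (a real implementation would fetch the script and run it off the main thread); only the per-call isolation is being illustrated.

```javascript
// Hypothetical registry: module URL -> factory that evaluates the module body.
const moduleFactories = {
  'tasklet.js': () => {
    let i = 0; // module-level state, private to this evaluation
    return { increment() { return ++i; } };
  },
};

const fakeTasklet = {
  // Each call evaluates the module again, so state is not shared between
  // the objects returned by separate calls.
  async addModule(url) {
    const exports = moduleFactories[url]();
    const proxy = {};
    for (const [name, fn] of Object.entries(exports)) {
      // Every exported function becomes async, as if called across threads.
      proxy[name] = async (...args) => fn(...args);
    }
    return proxy;
  },
};

async function demo() {
  const t = await fakeTasklet.addModule('tasklet.js');
  const a = await t.increment();
  const b = await t.increment();
  const t2 = await fakeTasklet.addModule('tasklet.js');
  const c = await t2.increment();
  return [a, b, c]; // [1, 2, 1] under this model
}
```

By contrast, two import() calls for the same specifier resolve to the same module namespace object, which is the mismatch with the “module” name pointed out above.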


#25

I’m very much in favor of this and I like the API. My only feedback is regarding constructors being synchronous, and the example given of the error thrown in the constructor not being thrown in the main thread.

If I understand correctly, this is how it would work with the current proposal.

//tasklet.js
export class ThingDoer {
  constructor({id,target}) {
    if(!id) { throw new Error('Thing doers need an ID') }
    if(!target) { throw new Error('Thing doers need a target') }
  }

  doThing() {
    if(!canDoThing) { throw new Error('Cannot do thing') }
    // do the thing
  }
}

// main thread
const module = await tasklet.addModule('tasklet.js')
const ThingDoer = module.ThingDoer
let doer;
try {
  doer = new ThingDoer()
}
catch(constructorError) {
  // will never happen
}
try {
  const result = await doer.doThing()
}
catch(resultError) {
  // this is either a constructor error or the error thrown by the function
}

I think it would be better if it was possible to catch errors thrown by the constructor somehow. Maybe something like this:

const doer = new ThingDoer()
try {
  await tasklet.whenConstructed(doer)
}
catch(error) {
  // one of the two types of errors that could be thrown by the constructor
}

#26

So, if I understand you correctly – you want a whenConstructed() so you know when the constructor failed. But what would whenConstructed() do? You could actually implement the behavior you are proposing already:

//tasklet.js
export class ThingDoer {
  constructor({id,target}) {
    if(!id) { throw new Error('Thing doers need an ID') }
    if(!target) { throw new Error('Thing doers need a target') }
  }
  whenConstructed() { /* literally does nothing */ }
  doThing() { /* ... */ }
}

That feels pretty awkward. I can see how it might be important to know whether it’s the constructor that failed or the actual method you invoked. What do you think about this? (@iank, what do you think?)

const module = await tasklet.addModule('tasklet.js')
try {
  const thingDoer = new module.ThingDoer();
  await thingDoer.doThing()
}
catch(error) {
  // error.constructorError === true
}
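
To make the behaviour under discussion concrete, here is a rough user-space simulation of a class whose constructor runs asynchronously “elsewhere”. remoteClass is a hypothetical helper, not part of the proposal; it tags deferred constructor failures with the constructorError flag suggested above. ThingDoer is a simplified copy of the class from #25, with a default argument added so that calling it without arguments reaches the ID check.

```javascript
// Hypothetical helper: wraps a class so that `new` returns immediately and
// the real constructor runs later, as it would on another thread.
function remoteClass(Class) {
  return function RemoteStub(...args) {
    // Construction is deferred to a microtask; a failure is kept in the promise.
    const instance = Promise.resolve().then(() => {
      try {
        return new Class(...args);
      } catch (error) {
        if (error instanceof Error) error.constructorError = true;
        throw error;
      }
    });
    instance.catch(() => {}); // avoid an unhandled rejection if no method is called
    // Any string-named property becomes an async method forwarding to the instance.
    return new Proxy({}, {
      get(_target, name) {
        if (typeof name !== 'string' || name === 'then') return undefined;
        return async (...callArgs) => (await instance)[name](...callArgs);
      },
    });
  };
}

class ThingDoer {
  constructor({ id, target } = {}) {
    if (!id) throw new Error('Thing doers need an ID');
    if (!target) throw new Error('Thing doers need a target');
  }
  doThing() { return 'done'; }
}

async function demo() {
  const RemoteThingDoer = remoteClass(ThingDoer);
  const doer = new RemoteThingDoer(); // does not throw here
  try {
    return await doer.doThing();
  } catch (error) {
    // The constructor error only surfaces on the first method call.
    return { message: error.message, constructorError: error.constructorError === true };
  }
}
```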

#27

I don’t want to know just that an error was thrown by the constructor, I want to know about the error that was thrown by the constructor. Having error.constructorError is better, but it isn’t enough: in my example, an error can be thrown for either an invalid id or an invalid target, and a developer might want to do two different things depending on which of the two errors is thrown.

Also, when the error is thrown can be important.

Consider this example, written so it could work both as a tasklet and as a regular module:

// tasklet.js
export class Adder {
  constructor(number) {
    if(typeof number != 'number' || isNaN(number)) { throw new Error('Invalid number') }
    this.number = number
  }
  async addOne() {
    return this.number + 1
  }
}

We don’t expect addOne to throw the error because it’s a very simple function, and validation of what’s passed to the object is done in the constructor, so we probably won’t use a try/catch when we’re calling addOne.

// as a module
import {Adder} from './tasklet.js' // Adder is a named export, not the default
// or as a tasklet
const {Adder} = await tasklet.addModule('tasklet.js')

try {
  const adder = new Adder(parseInt(document.getElementById('number-input').value))
  // If the constructor throws an error when loaded as a module, the stuff after this will not happen
  // If it's a tasklet, the next lines will run.
  const statusElement = document.getElementById('status')
  statusElement.classList.add('has-result')
  // Even if the input has "foobar" instead of a number, the `has-result` class has already been added and the next line will fail
  const result = await adder.addOne()
  statusElement.innerHTML = `Result: ${result}`
}
catch(error) {
  // If loaded as a module, this happens right after the constructor throws the error
  // If loaded as a tasklet, this happens after the `has-result` class has been added.
  alert(error.message)
}

I can imagine having weird problems in situations like that, for example when migrating an existing module to a tasklet. It could take a lot of time to find bugs like that and refactor. Since constructor errors aren’t thrown until after instance functions are called, we have to expect that any instance function can throw an error, which I think is probably not good.

I threw out whenConstructed just to have some kind of promise-based thing that could give the error thrown by the constructor. I just borrowed the idea from customElements.whenDefined() because it’s kind of similar to what I want, in that it’s a promise-based thing that waits for something - in CE’s case, some JS to load; in tasklet’s case, something on another thread.

I thought about it and something like this would make it so we can get the error in basically the same place without changing much code by wrapping constructors in a promise-based thing.

// ... load the tasklet
try {
  const adder = await tasklet.constructObject(Adder, parseInt(document.getElementById('number-input').value))
  // if the constructor throws an error, it will throw here as a module OR as a tasklet
  const statusElement = document.getElementById('status')
  statusElement.classList.add('has-result')
  const result = await adder.addOne()
  statusElement.innerHTML = `Result: ${result}`
}
catch(error) {
  alert(error.message)
}

async tasklet.constructObject(someClass,[arguments]) just resolves with the constructed object, or throws the error thrown by the constructor. I think it’s also easier to polyfill than making constructors throw errors asynchronously. What do you think?
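
For a plain (non-tasklet) class, the helper could be polyfilled in a few lines. This is only a sketch of the idea; constructObject here is a free function standing in for the hypothetical tasklet.constructObject, addOneTo is a made-up wrapper for the example, and a tasklet-backed version would have to wait for the remote constructor instead.

```javascript
// Sketch of the hypothetical tasklet.constructObject for local classes:
// resolves with the instance, or rejects with whatever the constructor threw.
async function constructObject(SomeClass, ...args) {
  return new SomeClass(...args);
}

class Adder {
  constructor(number) {
    if (typeof number !== 'number' || Number.isNaN(number)) {
      throw new Error('Invalid number');
    }
    this.number = number;
  }
  async addOne() {
    return this.number + 1;
  }
}

async function addOneTo(input) {
  try {
    const adder = await constructObject(Adder, parseInt(input, 10));
    // Only reached if the constructor succeeded.
    return await adder.addOne();
  } catch (error) {
    // Constructor errors land here before any later statement in `try` runs.
    return error.message;
  }
}
```

With this shape, the caller sees the constructor error at the await, in the same place a synchronous new would have thrown.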


#28

I am not convinced this proposal solves a real problem.

We build a large PWA (~250k LOC) and make extensive use of Web Workers for asynchronous processing. We have a small framework which does the following:

  • creates a “job worker” per navigator.hardwareConcurrency (i.e. one worker per hardware thread)
  • wraps the postMessage() bridge in a simple call, e.g. JobScheduler.Do("expensiveJob"), returning a promise

Creating a new task to be run asynchronously - even a small one - simply involves adding an extra worker import script with a function to do the work, and then calling Do("jobName") from the main thread. Then you get a promise that resolves when the work is done.

This appears to invalidate the starting assumptions of this API:

  • postMessage() is not used by any callers, so its “clunkiness” doesn’t matter
  • a fixed number of workers are used, regardless of how many types of jobs are defined, so the per-worker overhead doesn’t matter

We’ve also done a lot of work to intelligently schedule jobs between workers, hold the queue off the main thread so workers can pick up new jobs even when the main thread is busy, etc. Providing enough work is submitted, we can comfortably max out all CPU cores on a device and achieve the highest possible throughput for processing that work, which is useful for certain batch jobs like importing a folder of audio files.

It’s not obvious to me that tasklets add anything useful beyond this, so it seems that what developers really need is a good framework, not a new API surface.
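
For readers unfamiliar with this kind of framework, a stripped-down sketch of the pattern might look like the following. It is not the framework described in this post: the message shape ({id, jobName, args} and {id, result, error}), the round-robin assignment, and the createWorker factory parameter are all assumptions made to keep the example short.

```javascript
// Sketch of a promise-based job scheduler over a fixed pool of workers.
// `createWorker` must return an object with postMessage() and an assignable
// `onmessage` handler, like a Web Worker.
class JobScheduler {
  constructor(createWorker, poolSize) {
    this.nextId = 0;
    this.nextWorker = 0;
    this.pending = new Map(); // job id -> { resolve, reject }
    this.workers = Array.from({ length: poolSize }, () => {
      const worker = createWorker();
      worker.onmessage = (event) => this.settle(event.data);
      return worker;
    });
  }

  // Posts the job to the next worker (round-robin) and returns a promise
  // that settles when that worker replies.
  Do(jobName, ...args) {
    const id = this.nextId++;
    const worker = this.workers[this.nextWorker];
    this.nextWorker = (this.nextWorker + 1) % this.workers.length;
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      worker.postMessage({ id, jobName, args });
    });
  }

  // Matches a reply to its pending job and resolves or rejects the promise.
  settle({ id, result, error }) {
    const job = this.pending.get(id);
    if (!job) return;
    this.pending.delete(id);
    if (error !== undefined) job.reject(new Error(error));
    else job.resolve(result);
  }
}
```

In a browser this could be constructed as new JobScheduler(() => new Worker('job-worker.js'), navigator.hardwareConcurrency), where job-worker.js is a hypothetical script that looks up jobName and posts back a reply in the shape above.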


#29

Yes, this API is first and foremost about developer convenience (and opening up opportunities for further performance optimizations). The proposed API could be implemented in user space right now.

However: While postMessage can be abstracted, by default it is not. It’s up to the developer to either roll their own RPC protocol (like you did) or pull in some 3rd party library like those in @iank’s post.

As a result, WebWorkers are there but scarcely used. That is one of the problems this API is trying to solve: lower the barrier of entry to off-main-thread work so that architectural patterns like those native platforms use can be adopted on the web. As a result, I hope to see reduced jank on the main thread in general.

Again, the main reason why this is proposed as a platform API is to allow browsers to spin up only one thread for tasklets of different sites (something that is not possible with WebWorkers). Additionally I could see fast-paths being added for certain types that get passed around from Tasklets to other threads.


#30

That is a good point.

I have to admit I am hesitant to add a special function for constructing objects, as it also strays away from normal JS, but I do see the necessity. I am wondering if that needs to be baked into the API.

I have turned your question/suggestion into an issue on the repo, as I don’t have a good answer myself right now.


#31

Generally I am against any spec that can be implemented in user space. Why not focus on something that isn’t already possible, rather than something that can be done already, even if it takes a small framework?

On what basis do you think tasklets will actually change this? If the real problem is something else - e.g. the fact async code is just harder to reason about in general - then tasklets won’t necessarily fix that problem, either. Another way of approaching that same issue would be for Google to develop an official web worker framework that does it, then use Google’s considerable developer evangelism resources to promote it. Specs aren’t the way to solve every problem!

Maybe it’s not even a problem - what if most websites just don’t need to do a lot of background processing?

Is this really out of the question for web workers? Why can’t web workers from two different origins share a thread, particularly if one is from a page in the background?