We’ve had several feature requests for adding the ability to resize the heap of an asm.js module. Right now, the asm.js rules force you to pick a fixed heap size ahead of time and this is undesirable when the module needs to manipulate a dynamic amount of data and run reliably on 32-bit devices. We have a proposal to extend asm.js to allow changing heaps and we’d appreciate any feedback. This has not been implemented in Firefox yet, so we’re definitely open to changes.
The proposal is to allow a special heap-changing function in asm.js that allows the caller to pass in a new ArrayBuffer that replaces the current heap encapsulated in the asm.js module’s closure state. This heap-changing function would have to take a very specific form so that AOT compilation cannot possibly be invalidated by pathological callers. The specific form proposed is:
function asmModule(stdlib, foreign, buffer) {
    "use asm";
    ...
    var Int32Array = stdlib.Int32Array;
    ...
    var byteLength = stdlib.byteLength;
    ...
    function changeHeap(newBuffer)
    {
        if (byteLength(newBuffer) & 0xffffff ||
            byteLength(newBuffer) <= 0xffffff ||
            byteLength(newBuffer) > 0x80000000)
            return false;
        heap32 = new Int32Array(newBuffer);
        ...
        buffer = newBuffer;
        return true;
    }
    ...
In particular, the byteLength and changeHeap definitions must be exactly the same as the above code, modulo:
- comments, whitespace, ASI, empty statements and unnecessary parens/curlies (as implied by Section 4)
- the literals in the condition must be asm.js integer literals (so base 16, base 10, and exponential forms are acceptable)
- all names except the stdlib ‘byteLength’ property name (more on this in a second)
- the numeric literals: the first literal must include bits 0xffffff; the second literal must be >= 0xffffff; the third literal must be <= 0x80000000.
Notes:
- Additionally, the changeHeap function must be the first function in the asm.js module (if it is present at all). The byteLength import can occur one or more times anywhere in the global imports section.
- All views in the global scope must be replaced (in the order they were imported); it is not possible to simultaneously have views into two separate heaps.
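To make these notes concrete, here is a minimal sketch of a module with two views that follows the proposed form. The names heap8, heap32, store32 and load32, and the form of the global view declarations, are assumptions for illustration only. Since the proposal has not been implemented, an engine would be expected to run this as ordinary JavaScript rather than AOT-compile it.

```javascript
function asmModule(stdlib, foreign, buffer) {
    "use asm";
    var Int8Array = stdlib.Int8Array;
    var Int32Array = stdlib.Int32Array;
    var byteLength = stdlib.byteLength;
    // Assumed form of the global view declarations, in import order.
    var heap8 = new Int8Array(buffer);
    var heap32 = new Int32Array(buffer);

    // changeHeap is the first function in the module.
    function changeHeap(newBuffer) {
        if (byteLength(newBuffer) & 0xffffff ||
            byteLength(newBuffer) <= 0xffffff ||
            byteLength(newBuffer) > 0x80000000)
            return false;
        // Every view is replaced, in the same order as the declarations.
        heap8 = new Int8Array(newBuffer);
        heap32 = new Int32Array(newBuffer);
        buffer = newBuffer;
        return true;
    }

    function store32(ptr, value) {
        ptr = ptr | 0;
        value = value | 0;
        heap32[ptr >> 2] = value;
    }

    function load32(ptr) {
        ptr = ptr | 0;
        return heap32[ptr >> 2] | 0;
    }

    return { changeHeap: changeHeap, store32: store32, load32: load32 };
}

// The caller builds the stdlib object, including the byteLength import.
var stdlib = {
    Int8Array: Int8Array,
    Int32Array: Int32Array,
    byteLength: Function.prototype.call.bind(
        Object.getOwnPropertyDescriptor(ArrayBuffer.prototype, 'byteLength').get)
};
var mod = asmModule(stdlib, {}, new ArrayBuffer(0x1000000));  // 16 MB initial heap
mod.store32(0, 123);
var accepted = mod.changeHeap(new ArrayBuffer(0x2000000));    // 32 MB: passes the checks
var rejected = mod.changeHeap(new ArrayBuffer(0x10000));      // 64 KB: fails the checks
```

After the accepted call, both views point at the new 32 MB buffer; the rejected call leaves them untouched.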
So what’s the deal with ‘byteLength’? We need the ability to
- verify that ‘newBuffer’ is a real ArrayBuffer,
- get its byteLength.
The ArrayBuffer.prototype.byteLength accessor property’s getter achieves both these effects (it is a non-generic getter, so it throws if ‘this’ isn’t an ArrayBuffer) but we can’t just call ‘newBuffer.byteLength’ since … JS. Instead, we require the caller to pass in the result of:
Function.prototype.call.bind(Object.getOwnPropertyDescriptor(ArrayBuffer.prototype, 'byteLength').get)
which is validated at link-time. If this validates, we have a foolproof way to meet both requirements above. Note: on Safari and IE, ArrayBuffer.prototype.byteLength is currently a data property. For these, a polyfill would simply be:
function byteLength(buf) { return buf.byteLength }
Should these JS engines want to take advantage of asm.js for AOT, though, they’d need to implement byteLength according to 24.1.4.1 in the current ES draft.
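A caller could pick between the two forms with a feature test. This is only a sketch: makeByteLength is a hypothetical helper name, and the fallback branch does not verify that its argument is a real ArrayBuffer, so it would only be suitable where the module runs as plain JS.

```javascript
// Hypothetical helper: returns a byteLength function to pass in via stdlib.
function makeByteLength() {
    var desc = Object.getOwnPropertyDescriptor(ArrayBuffer.prototype, 'byteLength');
    if (desc && typeof desc.get === 'function') {
        // Accessor property: bind the non-generic getter so it can be
        // called as byteLength(buf) and throws for non-ArrayBuffers.
        return Function.prototype.call.bind(desc.get);
    }
    // Data property (or missing descriptor): fall back to the polyfill.
    return function byteLength(buf) { return buf.byteLength; };
}

var byteLength = makeByteLength();
var len = byteLength(new ArrayBuffer(16));
```

With the accessor form, calling byteLength on an ordinary object that merely has a byteLength property throws a TypeError instead of returning that property.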
So why do we need to check those three byteLength conditions?
- A non-multiple-of-16mb size may inhibit bounds-checking optimizations on different platforms (see this topic).
- The <= 0xffffff check ensures two things. First, since it is possible to have a zero-byte-length buffer, it prevents shrinking the heap below the link-time minimum heap size. Second, since the literal can be any integer greater than or equal to 0xffffff, it gives AOT compilation a guarantee that bounds checks can be removed for constant heap accesses below that length.
- The > 0x80000000 check ensures that an asm.js module always has a full view of the heap, even if the engine allows ArrayBuffers larger than 2gb.
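The effect of the three conditions can be tried out in plain JavaScript. The function below is a hypothetical stand-alone mirror of the changeHeap condition (taking a length rather than a buffer), using the literals from the listing above.

```javascript
// Hypothetical mirror of the changeHeap condition, applied to a byte length.
function isAcceptableHeapLength(len) {
    if (len & 0xffffff ||      // not a multiple of 16 MB
        len <= 0xffffff ||     // below the minimum (this also rejects 0)
        len > 0x80000000)      // above 2 GB
        return false;
    return true;
}

var results = {
    zero:        isAcceptableHeapLength(0),           // passes the mask, fails the minimum
    sixteenMB:   isAcceptableHeapLength(0x1000000),
    twentyFour:  isAcceptableHeapLength(0x1800000),   // 24 MB, not a multiple of 16 MB
    twoGB:       isAcceptableHeapLength(0x80000000),
    fourGB:      isAcceptableHeapLength(0x100000000)  // passes the mask, fails the maximum
};
```

Note that 0 and 4 GB both pass the mask test on its own, which is why the minimum and maximum checks are needed in addition to it.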
In particular, the min/max length requirements in change-heap establish minimum and maximum heap-length requirements for the module:
- Constant heap accesses must be below this limit to validate.
- Link-time validation fails if the initial ArrayBuffer is below the min or above the max.
In addition to the validation-time and link-time requirements listed above, there is one more validation-time requirement: non-builtin function calls are not allowed in any nested expression of a MemberExpression (see ValidateHeapAccess). The reason for this is that the order-of-evaluation rules of JS require that, in both of:
HEAP32[f()] = 0
HEAP32[i>>2] = f()
HEAP32 is evaluated before f(). Thus, if calling f() ultimately calls changeHeap, HEAP32 must refer to the original heap. Cutting out function calls rules out this corner case. This limitation applies to FFI calls, internal (normal) calls, and function-pointer calls, but not builtin calls (since none of them can reenter). Note: to avoid invalidating existing asm.js, this validation rule would only apply if a changeHeap function was defined.
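The evaluation order can be observed in ordinary JavaScript. In this sketch, swapHeapAndReturn is a hypothetical stand-in for a call that ends up in changeHeap; it simply rebinds HEAP32 to a view on a fresh buffer.

```javascript
var HEAP32 = new Int32Array(new ArrayBuffer(16));
var oldView = HEAP32;

// Hypothetical stand-in for f(): replaces the heap view, then returns a value.
function swapHeapAndReturn(value) {
    HEAP32 = new Int32Array(new ArrayBuffer(16));
    return value;
}

// Case 1: call in the index. HEAP32 is read before the call runs,
// so the store goes to the original view, not the replacement.
HEAP32[swapHeapAndReturn(1)] = 42;
var viewAfterFirst = HEAP32;

// Case 2: call on the right-hand side. Again the store targets the view
// that HEAP32 referred to before the call.
HEAP32[2] = swapHeapAndReturn(7);
var viewAfterSecond = HEAP32;
```

In both cases the write lands in the heap that was current before the call, which is the behavior an AOT compiler would have to reproduce if such calls were allowed.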
Lastly, how would changeHeap be used in practice?
- If an asm.js heap needs to grow (e.g., malloc() runs out of buffer): allocate a new ArrayBuffer, copy over the contents of the old ArrayBuffer, and changeHeap to the new ArrayBuffer. This can be done from an FFI called by the asm.js malloc() implementation.
- If TC39 accepts the ArrayBuffer.transfer proposal, ArrayBuffer.transfer could be used instead of copying. Until ArrayBuffer.transfer is ubiquitous, applications would be advised to feature-test for ArrayBuffer.transfer and fall back on copying.
- changeHeap can be called with any ArrayBuffer, not just resized-from-previous heaps. This allows things like passing heaps around between workers (via postMessage-transfer) without having to re-link an asm.js module each time.
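The grow path described above might look roughly like the following. This is a sketch under assumptions: growBuffer and makeGrowHeap are hypothetical names, the ArrayBuffer.transfer(oldBuffer, newByteLength) signature is assumed from the proposal and is only used if the feature test passes, and the module object is a stand-in for a linked asm.js module.

```javascript
// Hypothetical helper: returns a buffer of newByteLength bytes holding the old contents.
function growBuffer(oldBuffer, newByteLength) {
    // Feature-test for the proposed API (assumed signature); otherwise copy.
    if (typeof ArrayBuffer.transfer === 'function')
        return ArrayBuffer.transfer(oldBuffer, newByteLength);
    var newBuffer = new ArrayBuffer(newByteLength);
    new Uint8Array(newBuffer).set(new Uint8Array(oldBuffer));
    return newBuffer;
}

// Hypothetical FFI factory: the returned function would be passed in via
// `foreign` and called by the asm.js malloc() when it runs out of space.
function makeGrowHeap(state) {
    return function growHeap(minByteLength) {
        // Doubling keeps the length a multiple of 16 MB if it started as one.
        var newLength = state.buffer.byteLength * 2;
        while (newLength < minByteLength) newLength *= 2;
        var newBuffer = growBuffer(state.buffer, newLength);
        if (!state.module.changeHeap(newBuffer)) return 0;
        state.buffer = newBuffer;
        return 1;
    };
}

var state = { buffer: new ArrayBuffer(0x1000000), module: null };  // 16 MB
new Int32Array(state.buffer)[0] = 99;
// Stand-in for a linked module; a real one would come from asmModule(...).
state.module = { changeHeap: function (buf) { return buf.byteLength >= 0x1000000; } };
var growHeap = makeGrowHeap(state);
var ok = growHeap(0x1800000);  // ask for at least 24 MB
```

The state object exists because the FFI has to be created before the module is linked, so the module reference is filled in afterwards.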