[JS] Array method to append another array in-place

We have experience writing a high-performance game engine in JavaScript. One problem we’ve come across is finding the most performant way to append one array to the end of another in place. (arr1.concat(arr2) returns a new array; this is specifically for an in-place append that modifies arr1.)

The obvious solution is arr1.push(...arr2), but this has a nasty gotcha: the spread passes the array’s elements as arguments on the stack, which means that if arr2 is long enough, you hit implementation-defined maximum argument count limits. Commonly this is around 65k, but it varies from browser to browser. Worse, since the limits are fairly high, this kind of code looks like it works at first, and you only find out it’s broken later on, once your arrays grow long enough to reach the limit (worse still, a non-standard limit).
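A minimal illustration of the gotcha (the exact threshold and failure mode are engine-specific assumptions, not specified anywhere):

```javascript
// Illustrative only: spreading a very large array into push() passes every
// element as a separate argument, which can exceed the engine's
// implementation-defined argument limit and throw a RangeError.
const arr1 = [];
const arr2 = new Array(1_000_000).fill(0);
let failed = false;
try {
  arr1.push(...arr2); // may succeed or may throw, depending on the engine
} catch (e) {
  failed = e instanceof RangeError; // stack/argument overflows surface as RangeError
}
// Typical result: works for "short enough" arrays, throws past some
// non-standard length - exactly the trap described above.
```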

The next obvious solution is to use a loop to call push(), e.g.:

for (let i = 0, len = arr2.length; i < len; ++i)
  arr1.push(arr2[i]);

The problem with this is that it’s significantly slower, to the point that we found a helper method using exactly this approach showing up high in performance profiles. It makes sense why: the optimizer probably can’t tell how long arr1 will end up, so as the loop keeps pushing, the engine has to keep reallocating the backing store to grow the array’s capacity, and every time it does that it has to copy the entire array.

Switching to arr1.push(...arr2) measurably improved performance, but then we have to impose an arbitrary hard-coded limit on how long arr2 can be to use that fast path, falling back to the slow push() loop otherwise. That means we end up on the slow path in exactly the most performance-intensive case (long arrays).
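The awkward hybrid described above might look something like this sketch (the chunk size is a pure guess, since there is no standard way to query the real argument limit):

```javascript
// Sketch of the fast-path/fallback hybrid. CHUNK is arbitrary; it just
// needs to (hopefully) stay below every engine's argument limit.
const CHUNK = 16384;

function appendInPlace(arr1, arr2) {
  if (arr2.length <= CHUNK) {
    arr1.push(...arr2); // fast path: single spread call
    return arr1;
  }
  // Fallback: spread one chunk at a time to stay under the argument limit.
  // This still makes many calls and temporary slices for long arrays.
  for (let i = 0; i < arr2.length; i += CHUNK) {
    arr1.push(...arr2.slice(i, i + CHUNK));
  }
  return arr1;
}
```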

This is all pretty awkward and still doesn’t achieve maximum performance with long arrays. It could be solved by adding a new array method, e.g.:

// Appends all of arr2 to the end of arr1 in-place
arr1.append(arr2); // (illustrative name)

Edit: the discussion below helped clarify that this method can meet all of the following requirements:

  1. Modify an existing array, instead of returning a new array
  2. Resize the backing store once, since the final size is known
  3. Preserve the internal element kinds (providing the appended array has a compatible internal type)
  4. Append any size array, without arbitrary/non-standard limits

As far as I am aware, it is not possible to achieve all those requirements using any other JavaScript workaround.

Hmm, would using something like a binary chop algorithm work for your use case? It sounds like we know how many items are in arr1 and arr2 and we know what the max limit is. So shouldn’t splitting the arrays and only pushing what we know won’t exceed the limit help?

I’m confused - I mentioned that the max limit is non-standard so isn’t known in advance, but you seemed to say that it was known in advance. How? Even if it was known, having to write a helper method will still not be as fast and involves more complexity to achieve the straightforward operation of appending arrays.

Ah sorry I misunderstood. I’m not too up to date on the spec around max argument limits in browsers but I misread and thought there was a way to know what the limit is in advance (Is the spec even definitive about this?)

But you’re right. Even if my suggestion was possible, it would still be tedious to do. Having it handled natively by adding a new method would make it easier. Just not sure if enough devs have to deal with processing and manipulating such large arrays on the client to warrant adding another Array method to the spec. But will wait to see what others think.

Two options would be to utilize

  • generator function
  • ReadableStream and WritableStream or TransformStream

which do not impose a maximum limit on input or output.

Queue the Nth input array to be read when the first or previous array read/write completes.

Obviously there are a bunch of possible workarounds. The point of the proposal is there should be an easy and obvious way to append long arrays in-place, without having to use complicated workarounds.

What does the term “in-place” intend to mean relevant to Array?

Why is using a generator function or TransformStream not an option?

If arr1 and arr2 exist independently why does arr2 need to be concatenated to arr1?

Cannot N arrays be iterated in succession without concatenating N arrays to read the same input?

Haven’t I already addressed those points? Basically it comes down to a way to do arr1.push(...arr2) without a non-standard limit on how long arr2 is allowed to be.


It is not clear why arr2 needs to be pushed to arr1.

Basically it comes down to a way to do arr1.push(...arr2) without a non-standard limit on how long arr2 is allowed to be.

Why cannot a generator function be used instead of pushing arr2 to arr1?

What is the use case?

Isn’t concatenating two arrays obvious enough to not require explaining the use cases?

It’s the same as all the reasons concat exists, but in-place, since that’s faster (no need to throw away the old array).

Just to be clear, the whole purpose of this proposal is to be able to append arrays in-place without needing to resort to complicated, performance-degrading workarounds. So if anyone is about to suggest another workaround, I don’t think that’s helpful.

No. If two arrays exist each can be iterated in succession to avoid the concatenation operation. What is the compelling need for arrays to be concatenated?

It is not clear what “in-place” is intended to mean.

Is the original array created with length set to the initial elements and the elements expected to be concatenated to the original array?

Again, from perspective here, the term “in-place” is not defined technically in this proposal.

To substantiate “performance-degrading” requires minimal, complete, verifiable examples; benchmarks; publications of tests demonstrating what the issue is with existing Array and TypedArray methods that does not meet the requirement, relevant to “performance”, which too, needs to be clearly defined. Less code? Memory usage? Time? What code is being compared?

Since the proposal appears to be about JavaScript Array, I would suggest posting this proposal at https://esdiscuss.org/1 (see https://www.ecma-international.org/memento/tc39-m.htm).

What is the compelling need for arrays to be concatenated?

To combine multiple arrays to pass to an API that expects a single contiguous array.

It is not clear what “in-place” is intended to mean.

It means modifying the array it is called on, instead of returning a new array. This is similar to how splice() modifies the array it is called on in-place - and MDN uses the terminology ‘in place’ to describe that. In-place methods are faster since they don’t need to create a copy of the entire array.
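A minimal illustration of splice()’s in-place behavior, for comparison:

```javascript
// splice() mutates the array it is called on rather than returning a
// modified copy - this is what "in place" means here.
const a = [1, 2, 3, 4];
const removed = a.splice(1, 2); // remove two elements starting at index 1
// a is now [1, 4] (the same array object, modified in place)
// removed is [2, 3] (the elements that were taken out)
```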

If I gather the requirement correctly, one pattern that can be used is to set the originalArray.length property to originalArray.length + NthArray.length, then use a for loop to set each value directly via originalArray[originalArrayIndex] = NthArray[NthArrayIndex]:

const appendToArray = (array, ...arrays) => {
  for (const arr of arrays) {
    const n = arr.length;
    const len = (array.length += n);
    for (let i = len - n, j = 0; i < len; i++, j++) {
      array[i] = arr[j];
    }
  }
  return array;
};


let a = [1,2,3];
let b = [4,5,6];
let c = [7,8,9];
let res = appendToArray(a, b, c); // [1, 2, 3, 4, 5, 6, 7, 8, 9]

That looks like just another way to do the push workaround, which has performance pitfalls.

“performance pitfalls” compared to exactly what existing code that does not exhibit the observable characteristics described?

By what means is the conjecture that “performance pitfalls” exist in that pattern or using push() verifiable?

I already covered this - relative to arr1.push(...arr2), but that has non-standard argument size limits.

Oh right, missed that. But assigning the length will downgrade the elements kinds of the array, which will have a major performance impact. For example if the arrays were storing integers, assigning the length will downgrade it all the way from packed SMIs to holey generic, as it becomes a sparse array returning undefined from unfilled slots. So again this is not a reasonable approach if you need high performance. The reason I originally referred to using push() is that providing you always push the same element kind, it preserves the internal typing of the array. I also assumed there was no particular difference between push(x) and arr[arr.length] = x, although the former indicates the intent more clearly, so whether or not it involves array methods like push is moot.
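The hole-creating behavior of the length assignment can be seen directly (the elements-kind terminology is V8-specific, but the holes themselves are observable in any engine):

```javascript
// Growing an array by assigning length creates holes: the new slots read
// back as undefined, but they are missing properties, not stored values.
// (In V8 terms, the array's elements kind is downgraded to "holey".)
const a = [1, 2, 3];
a.length = 5;
// a[3] reads as undefined, but index 3 is a hole: the property does not exist
```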

So - the proposed append-in-place method would be able to:

  1. Resize the backing store once
  2. Preserve the internal packing and element kinds (providing the appended array has a compatible internal type)
  3. Append any size array, without arbitrary/non-standard limits

AFAIK there is no other way in JavaScript to meet all three requirements - hence the case for the proposal.

I am curious if this proposal can be universal for all implementations other than V8 cited in the post, but FWIW here are my 2 cents:

  1. I believe it’s already possible for an implementation to implement the arrays in such a way that arr1.concat(arr2) is no more than constructing a new structure like class ConsArray { Array* first; Array* second; } with array methods implemented to give you an illusion that the structure is flattened - this is how V8 implements string concatenation, which trades space for speed. I sense that this kind of memory layout might be what the proposal is implying, but mandating it does not necessarily help implementations. This layout may not be optimal for generic arrays since array concatenations are less common than string concatenations (note that cons strings require flattening if the user tries to access individual characters, and flattening is slow). And that is the nature of things - implementations need to optimize for the most common use cases, and sometimes that means other less common use cases can be slower. Specifying a new method with more implementation details may get in the way of optimizing for other use cases or at least make things more complicated.
  2. Concatenating two arrays in-place with a resized backing store is not necessarily guaranteed to be faster than returning a new array with elements copied over, depending on how the implementation chooses its memory management and GC strategy, and also depending on the actual arrays being concatenated. It’s also possible that once the size of a concatenated array reaches a certain threshold, the implementation has to create a new array with copied elements underneath instead of simply reusing the existing storage. Again, this all very much depends on the specific choices that implementations make and what use cases they prioritize.
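The cons-array layout mentioned in point 1 could be sketched in userland like this (purely illustrative; engines do this internally, not via a visible class):

```javascript
// Illustrative sketch of a rope/cons structure: "concatenation" is O(1)
// because no elements are copied; indexed access dispatches to one half.
// A real engine (cf. V8's cons strings) would also flatten lazily.
class ConsArray {
  constructor(first, second) {
    this.first = first;
    this.second = second;
  }
  get length() {
    return this.first.length + this.second.length;
  }
  at(i) {
    return i < this.first.length
      ? this.first[i]
      : this.second[i - this.first.length];
  }
}
```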

However, I think it’s possible for an implementation to guarantee predictable performance for in-place concatenations - even though it might not be the best performance. That could also be a desirable characteristic if specified, but another alternative that comes to mind is TypedArray.prototype.set, since in general typed arrays have more predictable performance (with the caveat that the elements need to be numbers and the lengths must be fixed).
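The TypedArray.prototype.set approach looks like this - note it is preallocate-and-copy rather than a true in-place append, precisely because typed array lengths are fixed:

```javascript
// TypedArray.prototype.set copies a source array into the target at a
// given offset. Since typed arrays have fixed length, "appending" means
// preallocating the combined buffer once and copying both halves in.
const a = Float64Array.of(1, 2, 3);
const b = Float64Array.of(4, 5, 6);

const combined = new Float64Array(a.length + b.length);
combined.set(a, 0);        // copy a into the front
combined.set(b, a.length); // copy b starting where a ends
```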


While I think it is fine to post this kind of thing here early on too, it’s not wrong that there will be more actual TC39 member eyeballs on something run by/for TC39 specifically. Note also that TC39 recently started an effort similar to this one at https://es.discourse.group/

In either case, what you’re really looking for is whether you can explain and articulate a use case that some others can agree with, help flesh out a little, and are interested in potentially helping submit for early-stage incubation.
