This PR is for discussion only and should not be merged. Once we have reached consensus on the renumbering, the existing documents should be updated to reflect the new opcodes.

The goals of this renumbering are:

1. To maintain consistency with the ordering of MVP operations.
2. To organize sets of similar operations into tables such that the offset from an operation for one type to the same operation for the next type is constant for all operations in that set.
3. To use round hexadecimal numbers for offsets where not too wasteful.
proposals/simd/NewOpcodes.md
| i8x16.sub            | 0x71 | i16x8.sub            | 0x91 | i32x4.sub         | 0xb1 | i64x2.sub | 0xd1 |
| i8x16.sub_saturate_s | 0x72 | i16x8.sub_saturate_s | 0x92 | ---- sub_sat ---- | 0xb2 | ----      | 0xd2 |
| i8x16.sub_saturate_u | 0x73 | i16x8.sub_saturate_u | 0x93 | ---- sub_sat ---- | 0xb3 | ----      | 0xd3 |
| ---- dot ----        | 0x74 | ---- dot ----        | 0x94 | i32x4.dot_i16x8_s | 0xb4 | ----      | 0xd4 |
What is this op? Is it #20 ?
proposals/simd/NewOpcodes.md
| Conversion Op           | opcode |
| ----------------------- | ------ |
| i32x4.trunc_sat_f32x4_s | 0x100  |
nit: can this be written as 0x0100? It makes it slightly clearer that it's 2 bytes.
It's unclear to me what utility this renumbering has; the goals that are stated have uncertain value and are to some extent circular. Would anyone be using a decoder based on these constant offsets? Would anyone do a reverse lookup from opcode into a table to find the meaning of the opcode instead of just searching a text file for the opcode value? Edit: My own answer to both of my questions is that I wouldn't.
Here are the practical problems that I foresee with this:
Here's an alternative possibility:
It would be good to understand the pros that this numbering has - my mental model has certainly been that the encoding adjustments would be very minor. If the decoder needs to classify opcodes, it's straightforward to build a table (right now the table can be of size 256!) that maps opcode to any information desired by the decoder, which can be arbitrarily complex and not limited by the proposed grouping.
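As a minimal illustration of the table-driven classification described above (made-up names; only a few entries filled in, taken from the tables in this PR), a decoder-side lookup might look like:

```cpp
#include <array>
#include <cstdint>
#include <cstdio>

// What kind of immediate follows the opcode (illustrative, not exhaustive).
enum class Operands : uint8_t { None, MemArg, LaneIdx, ImmBytes16 };

struct OpcodeInfo {
  const char* name;
  Operands operands;
};

// One entry per 1-byte SIMD opcode; unlisted slots stay "unknown".
// The three entries below come from the tables in this PR.
std::array<OpcodeInfo, 256> MakeSimdTable() {
  std::array<OpcodeInfo, 256> table;
  table.fill({"<unknown>", Operands::None});
  table[0x0b] = {"v128.store", Operands::MemArg};
  table[0x0c] = {"v128.const", Operands::ImmBytes16};
  table[0x0f] = {"i8x16.splat", Operands::None};
  return table;
}

int main() {
  const auto table = MakeSimdTable();
  uint8_t opcode = 0x0c;  // the value following the 0xfd prefix
  std::printf("0x%02x is %s\n", opcode, table[opcode].name);
  return 0;
}
```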
Is there a concern about renumbering in general, or about this specific renumbering? At the very least I'd like to see some opcode grouping in the encoding, if only because it's easy to miss which instructions are even there in the current ordering. For example, I'd like to have the load splat and load extend instructions grouped with the other load instructions. I agree it's not important for decoders (we'll all just use a switch statement or table), but I think there is value for people understanding the spec. If only for folks looking at opcode tables (which I do fairly often, but perhaps I'm an outlier 😄)
I think my concern is entirely general - there doesn't seem to be any good justification for doing this. Clearly for documentation & exposition we can group operations logically, as the existing overview does very well; the mapping to the encoding doesn't seem to matter though. Having had a lot to do with the encoding of the existing non-SIMD instructions, all I can say is that any logical grouping of the opcodes is not something I pay any attention to.
In terms of understanding the spec, I feel like we can just sort the opcode table not according to the opcode number, but according to the implied easy-to-understand order. You could imagine multiple orders that make sense and it's not clear to me which one is best - for example, is it more useful to group by type or group by operation? (do you consolidate all shifts for all types or not) Because it's hard to pick one, it's not clear to me that we need to reflect this order in the opcode sort. There's something to be said about consistency of course, but it just doesn't seem that opcode numbers are very important? x64 has a crazy encoding but it's not because the numbers aren't sequenced, it's because there's a billion hard-to-understand modes, and it doesn't seem to be truly important whether 0xE9 is a CALL or a JUMP.
Just to be clear, my main concern with this is that this is going to cause churn for everyone involved - browser vendors, toolchain providers, application developers (early adopters) - with potentially hard-to-understand validation errors, and I'm expecting that it will take a few months to fully resolve this, because of delivery lag times in various components.
Those are fair arguments, and if we stay with the current opcode ordering it probably won't affect much. Though I will say we have some precedent for a change like this. We renumbered the opcodes in the 0xd version of wasm too: see WebAssembly/design#826, which references a spreadsheet describing the ordering.
We would probably want to base this on the organization of the v1 opcode encoding. In particular, the instructions are categorized by kind (control, parametric, binary, etc.), then type (i32, f32, etc.) then operation.
Agreed, but this was churn that was originally planned back in October, in issue #129. It didn't seem like there was much disagreement then; are the Chrome release plans (and COVID stuff) the main thing that has changed? I was at least under the assumption that newer opcodes were added a bit more haphazardly since we knew that we'd reorder them later anyway. It seems a shame to leave them as-is given that.
There is also precedent for this in the SIMD proposal itself, particularly in #48. When we did that reorganization we were careful to leave holes for opcodes that had not yet been merged into the proposal. We thought that that would be the final renumbering, and we did not anticipate all the new opcodes that have been added in the last year and a half.
I strongly agree with this. We've been tacking opcodes onto the end in the order they are merged to the proposal, resulting in an ordering that is completely inconsistent with the ordering of preexisting opcodes. If we hadn't been planning on a final renumbering, we would have done this differently by leaving lots of opcode space available for future opcodes to fit into the existing organization.
As described in #129, the plan is to give implementors enough time to implement the renumbering then try to synchronize the update across the ecosystem. There will be lag from projects having multiple distribution channels that update at different times, but I believe we will be able to synchronize the release of many projects on their primary early-adopter distribution channels to keep the churn to about a week.
I personally care about the numbering to the extent that having constant offsets between the same operations on different types makes the LLVM instruction selection code waaaay simpler. I would also be sad if the opcodes were not in order in the spec as an aesthetic matter.
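For illustration only, and not the actual LLVM instruction selection code: with a constant per-type stride, one small formula covers every lane type. The values below come from the `sub` row in the proposed table.

```cpp
#include <cstdint>
#include <cstdio>

// Lane types in the order they appear in the proposed tables.
enum LaneType : uint32_t { kI8x16 = 0, kI16x8 = 1, kI32x4 = 2, kI64x2 = 3 };

// From the sub row: i8x16.sub = 0x71, i16x8.sub = 0x91, i32x4.sub = 0xb1,
// i64x2.sub = 0xd1, i.e. a constant stride of 0x20 per lane type.
constexpr uint32_t kTypeStride = 0x20;
constexpr uint32_t kSubBase = 0x71;  // i8x16.sub

// One formula instead of a per-type special case.
constexpr uint32_t SimdOpcode(uint32_t i8x16_base, LaneType lane_type) {
  return i8x16_base + kTypeStride * static_cast<uint32_t>(lane_type);
}

static_assert(SimdOpcode(kSubBase, kI32x4) == 0xb1, "matches the table");

int main() {
  std::printf("i64x2.sub opcode: 0x%x\n", SimdOpcode(kSubBase, kI64x2));  // 0xd1
  return 0;
}
```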
It wasn't clear from that issue what the extent of renumbering was, I guess. It could be as innocuous as "keep status quo" and as extensive as, well, this proposal :) We can still decide it's the right thing to do but I feel like we should be very explicit about what we gain and why it's not a documentation-only issue.
If we can synchronize the release window to be short then I don't have any other concerns. But I'm wondering how practical this is - e.g. node.js ships a stale version of v8 even in the nightlies.
Strongly agree with this. I think that as a general approach for standardizing binary formats (thinking ahead to what we'll do for Interface Types), we should incrementally add opcodes as a proposal evolves. This will mean things are scattered and unordered, but should reduce incremental churn. Then, before finalizing the proposal, renumber things to be clean, organized, and non-ugly. This matches what we did for wasm itself.

The core question is: does it matter that the opcodes be numbered in a "non-ugly" way? I don't feel strongly about that as an absolute good, but I see it as a matter of quality. This is, necessarily, a highly non-technical argument. I think we should want proposal creators to feel a strong sense of pride in their work, and I think that necessitates leaving some room for aesthetics and taste. Technical concerns trump style, but I think this is a place where any encoding is equally good technically, so we might as well do something pleasant here.

I am also of the opinion that anyone using experimental features does so at their own peril. We can't break the open web. If we can't break experimental features past a certain point, that puts undue pressure on people to get things right ahead of time, rather than getting something working and iterating. Concretely, I can imagine people taking the time to fully map out all their possible opcodes, coming up with a sensible scheme for them, and leaving room for things they plan to add later. That is extra up-front work that would be better served by being informed by end-to-end practical experience.
Oh hey this was commented while I was writing it up. Which, yeah that.
That's a good point. What's the plan for future additions, though? By necessity I'd expect we'll need to evolve the spec over the next few years, by shipping extra additions, e.g. fast instructions - is that going to use a separate prefix, or just appending to the existing set?
Projects like node that have long release cycles are always somewhat stale compared to upstream V8, but that's already a problem for using newly-implemented instructions, so I don't think the extra pain from the renumbering is a huge issue there.
I would expect future additions would either use a new prefix or use subsequent opcode space with the same prefix. That choice would depend on the actual content of the future proposals. But I'm not worried about trying to consider future proposals in this ordering; I expect the contents of future proposals would be different enough from the contents of this proposal that they would warrant their own groupings.
While this is certainly true, I just shudder at the implications of this. Maybe that's what it is. Today I can use Wasm simd with Chrome, node.js and Emscripten. If there's a period of time where Chrome Canary / node.js nightly / Emscripten don't agree on the encoding subset, it's going to be very hard to do any SIMD related work. If this period is going to last a month or more, it feels like a significant step back from the status quo where a large subset of the spec works, can be tested, etc. I'm fine with using latest nightly builds of various tools to get stuff done, but I'm worried about having to wait for months for everything to get back to where we are today (which is "things basically work").
The entire -msimd subset works on node today as far as I'm aware, as well as a few instructions that are in -munimplemented-simd. So this is currently not a big problem in practice. I guess it would be feasible for existing users (like, uh, me, which is why I'm worried!) to keep using the older Emscripten version until node/Chrome catch up. Edit: It occurred to me that it's also possible to write a JS wasm->wasm transcoder that fixes the opcodes around, if the encoding isn't too painful to do... so maybe this is less of an issue than I'm afraid it might be with that.
proposals/simd/NewOpcodes.md
| i32x4.trunc_sat_f32x4_s | 0x0100 |
| i32x4.trunc_sat_f32x4_u | 0x0101 |
| f32x4.convert_i32x4_s   | 0x0102 |
| f32x4.convert_i32x4_u   | 0x0103 |
Is there any way to put this section somewhere else to avoid double-byte encoding? It feels like it would make encoding more uniform, would make opcode-based lookup tables have a small fixed size without extra range checks, and would make it possible to patch the bytecode in place vs the old encoding for compatibility purposes.
Ohh, or is it the case that right now the opcode is actually not encoded as a byte, since LEB128 encoding uses two bytes for any opcodes >=0x80? :-/
Yes, all the bytecodes are LEB128 so we already have opcodes that take 2 bytes in the current numbering.
This is unfortunate. The existing Wasm spec doesn't do this as far as I understand. We have 170 opcodes right now. I'm not sure why the LEB128 encoding was chosen; is there a precedent for this elsewhere in Wasm?
I had the same question in #46. TL;DR: this is how all proposals work since 2017, but SIMD is the only one where you can tell the difference. Notably, V8 was mistakenly using bytes instead of LEB128s until a couple weeks ago: https://bugs.chromium.org/p/v8/issues/detail?id=10258.
Not the entire ecosystem, AFAIK; only V8 was assuming the SIMD opcodes were `0xfd` + byte. The toolchain and wabt correctly encode and decode using `0xfd` + LEB128. This has been the way since the beginning of SIMD (afaict).

Why does all this even work? Glad you asked. This V8 bug https://bugs.chromium.org/p/v8/issues/detail?id=10258 describes why.

In short, the toolchain emits `0xfd 0x85 0x01` for the 0x85-th opcode. V8 decodes this as:

- `0xfd`, SIMD prefix
- `0x85`, the 0x85-th opcode (use this to index into our table of opcodes)
- `0x01`, nop (this is the key: V8 does nothing, which is correct)

So, what doesn't work? When you have an opcode numbered >= 0x80 that takes a memarg. Which is how we encountered that bug.

(Also, the fix is not in yet, will land it soon.)
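A minimal sketch of the decoding described above, assuming a hand-rolled `DecodeULEB128` helper; it shows why `0xfd 0x85 0x01` carries opcode `0x85`, and why a byte-at-a-time decoder kept working until an opcode `>= 0x80` took a memarg:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Decode an unsigned LEB128 value starting at *pos and advance *pos past it.
uint32_t DecodeULEB128(const std::vector<uint8_t>& bytes, size_t* pos) {
  uint32_t result = 0;
  int shift = 0;
  uint8_t byte;
  do {
    byte = bytes[(*pos)++];
    result |= static_cast<uint32_t>(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return result;
}

int main() {
  // The toolchain's encoding of the 0x85-th SIMD instruction.
  std::vector<uint8_t> body = {0xfd, 0x85, 0x01};
  size_t pos = 0;
  uint8_t prefix = body[pos++];                 // 0xfd, SIMD prefix
  uint32_t opcode = DecodeULEB128(body, &pos);  // 0x85, consuming two bytes
  std::printf("prefix 0x%x, opcode 0x%x, %zu bytes consumed\n",
              prefix, opcode, pos);
  // A byte-at-a-time decoder would instead read opcode 0x85 and then treat
  // the leftover 0x01 as the next instruction (nop), which is why the
  // mismatch only surfaced for opcodes >= 0x80 that take a memarg.
  return 0;
}
```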
Also, prefixes are being allocated in decreasing order from 0xfe. 0xff is reserved as an escape hatch for a future when we run out of opcode/prefix space.
Dear god. Thanks, this makes more sense now!
> I had the same question in #46. TL;DR: this is how all proposals work since 2017, but SIMD is the only one where you can tell the difference. Notably, V8 was mistakenly using bytes instead of LEB128s until a couple weeks ago: https://bugs.chromium.org/p/v8/issues/detail?id=10258.
Same bug in Firefox, found it when I implemented SIMD :-}
Looking at this again as I'm updating engine opcodes, I would also like to express a preference for the decoded opcodes to be 1-byte because we have the space. The reason is that on the engine side, this will add some bloat to opcode/signature tables which assume 1-byte opcodes. It's not terrible to update, but considering we have the space, and it's just the 4 opcodes that are spilling over I'd like to avoid it as it looks like it's possible in this case.
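To make the trade-off concrete, a rough sketch (hypothetical names; table entries elided) of the extra range check that the four spilled-over conversion opcodes force onto an otherwise byte-indexed table:

```cpp
#include <cstdint>
#include <cstdio>

// The four conversion opcodes land at 0x0100-0x0103 in this proposal;
// everything else still fits a 256-entry, byte-indexed table.
const char* SimdOpcodeName(uint32_t opcode) {
  static const char* kByteIndexed[256] = {};  // existing table, entries elided
  static const char* kConversions[4] = {
      "i32x4.trunc_sat_f32x4_s", "i32x4.trunc_sat_f32x4_u",
      "f32x4.convert_i32x4_s", "f32x4.convert_i32x4_u"};
  if (opcode < 0x100) return kByteIndexed[opcode];  // the fast, fixed-size path
  if (opcode - 0x100 < 4) return kConversions[opcode - 0x100];
  return nullptr;  // unknown opcode
}

int main() {
  std::printf("0x0102 is %s\n", SimdOpcodeName(0x0102));
  return 0;
}
```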
Maybe this has already been mentioned before, but one of the motivating reasons was that, while this has no bearing on applications and engines, not having consistent offsets makes the tools implementation non-ideal. Perhaps @tlively already mentioned this and I missed it. Especially as SIMD has many regular operations spread across different types, making this coherent and consistent was also a factor here.

We could have done this from the beginning, but given that there were many applications relying on experimental support to get quick feedback, at the time it made most sense to tack on operations at the end so applications aren't affected by toolchain/engine changes, as they just won't be using those opcodes. Also, renumbering opcodes at different intervals seemed like unnecessary churn (ex: swizzle opcode) when we could do it in a more planned way once we have a good idea of the operations being introduced and have frozen the addition of new opcodes, so we can renumber them with a one-time overhead. Any future additions as a part of a separate proposal will have their own prefix, so they shouldn't affect this set of opcodes.

This PR is mainly proposing the new opcodes, but the actual implementation can be co-ordinated with dates we all agree on, post COVID release freezes, to minimize any churn that applications have to deal with. Very specifically for Chrome Canary / node.js nightly / Emscripten: the plan on our end is to co-ordinate this tightly so that the engine change will land in V8 as soon as the Emscripten change lands. Chrome Canary and Emscripten should be in sync within 48 hours; for node.js nightly I'm not very clear on the schedule for different platforms, so that's a little uncertain, but hopefully I should have an answer for this soon. Even though this is still experimental, the intent was to do this in a way that we can plan it in conjunction with other tools/engines to minimize user pain.

I'm for this if we can minimize the churn 👍

I wrote a polyfill for this :D https://gist.github.com/zeux/10fa3a3c85262ac87979887cb02e0c66 (the polyfill doesn't have real remapping tables yet, but it should be straightforward to fix this code to either remap old=>new or new=>old by just implementing the remap)
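The linked polyfill is JavaScript and is not reproduced here; as a rough C++ sketch of the remapping step such a tool needs (with a placeholder table, not the real old-to-new mapping):

```cpp
#include <cstdint>
#include <cstdio>
#include <unordered_map>
#include <vector>

// Placeholder mapping; a real transcoder would carry the full old->new table.
const std::unordered_map<uint32_t, uint32_t> kOldToNew = {
    {0xc1, 0x62},  // hypothetical entry only, not taken from the real mapping
};

// Append value to out as unsigned LEB128.
void EncodeULEB128(uint32_t value, std::vector<uint8_t>* out) {
  do {
    uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0) byte |= 0x80;
    out->push_back(byte);
  } while (value != 0);
}

int main() {
  uint32_t old_opcode = 0xc1;
  auto it = kOldToNew.find(old_opcode);
  uint32_t new_opcode = it != kOldToNew.end() ? it->second : old_opcode;

  std::vector<uint8_t> instr = {0xfd};  // the SIMD prefix byte is unchanged
  EncodeULEB128(new_opcode, &instr);
  // Because opcodes are LEB128, remapping can change the instruction length,
  // so the surrounding function body and section sizes must be re-encoded too.
  std::printf("remapped 0x%x -> 0x%x (%zu byte(s) after the prefix)\n",
              old_opcode, new_opcode, instr.size() - 1);
  return 0;
}
```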
proposals/simd/NewOpcodes.md
| v128.const    | 0x0c |
| v8x16.shuffle | 0x0d |
| v8x16.swizzle | 0x0e |
| i8x16.splat   | 0x0f |
The ordering here is mildly unsatisfying for decoding purposes. Similarly to the existing spec, this interleaves splat/extract/replace lane, and they have different encoding formats. Thoughts on grouping these by instruction type instead? (splats followed by lane operations)
Yeah I think that sounds reasonable 👍
@zeux is the latest commit the change you had in mind?
Yup 👍
Since
@lars-t-hansen Thanks for the suggestion. I've fixed that and also marked the PR as draft to make it more obvious that it won't be merged. The other docs that already exist in the repo will need to be updated instead.
* [WebAssembly] Enable recently implemented SIMD operations Summary: Moves a batch of instructions from unimplemented-simd128 to simd128 because they have recently become available in V8. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73926
* [WebAssembly] Simplify extract_vector lowering Summary: Removes patterns that were not doing useful work, changes the default extract instructions to be the unsigned versions now that they are enabled by default, fixes PR44988, and adds tests for sext_inreg lowering. Reviewers: aheejin Reviewed By: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75005
* [WebAssembly] Renumber SIMD opcodes Summary: As described in WebAssembly/simd#209. This is the final reorganization of the SIMD opcode space before standardization. It has been landed in concert with corresponding changes in other projects in the WebAssembly SIMD ecosystem. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79224
I think we pretty much have consensus on this renumbering, and a lot of tooling has been updated, so can we make the changes to the docs?
Sure, I can update them today.
@dtig I think this can be merged as-is now that the other docs are updated on this branch. Keeping NewOpcodes.md around will be useful for remembering where new operations can be slotted in.
I don't see how this updates the spec tests, which would be pretty useful (fixing https://github.com/bytecodealliance/wasm-tools/blob/ede81acc5f594b2efb3cd3c62aad63c6d20da023/tests/roundtrip.rs#L652-L661 and https://github.com/bytecodealliance/wasmtime/blob/master/build.rs#L203). Is there some magic to get the opcodes out of the tables, or do the spec tests need to be changed as well?
As an example, this test still uses the old encoding of
@dtig sounds good. Want to push that commit to this branch?
Yeah, I'd like to do that - mostly because a few tools pull spec tests directly from this repository, so having it updated here hopefully reduces some of that confusion, and a WAVM update can follow.
Changes still lgtm, with a comment about the shuffle/swizzle opcodes.
proposals/simd/BinarySIMD.md
| `v64x2.load_splat` | `0x0a` | m:memarg      |
| `v128.store`       | `0x0b` | m:memarg      |
| `v128.const`       | `0x0c` | i:ImmByte[16] |
| `v8x16.swizzle`    | `0x0d` | -             |
It looks like the shuffle/swizzle opcodes have swapped opcode numbers.
Oops! Thanks for catching that :)
Since simd opcode has been renumbered at WebAssembly/simd#209
Summary: As described in WebAssembly/simd#209. This is the final reorganization of the SIMD opcode space before standardization. It has been landed in concert with corresponding changes in other projects in the WebAssembly SIMD ecosystem. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79224