A New Bytecode Format for JavaScriptCore (webkit.org)
211 points by stablemap on June 21, 2019 | hide | past | favorite | 89 comments


I'm unclear, why transform to a bytecode as a first step? Wouldn't it be simpler to instead transform to an AST and work against that for everything? Wouldn't it make sense to generate the bytecode as a sort of last step before doing heavy-duty optimizations?

Seems like with JS you would be constantly transforming from bytecode -> AST -> bytecode -> machine code, such as every time a method adds a new field to an object or some optimization assumption is violated. It doesn't seem like it would be easier or faster to work with. I'm guessing there wouldn't be a whole lot of memory benefits either.

Would someone mind enlightening me on this design choice? (Granted, I've not taken a compilers course, so feel free to call me an idiot for not knowing something basic about how compilers work.)


* Adding a new field or method to an object at runtime doesn't change the AST.

* You don't translate from bytecode to AST.

* In tiered JIT architectures--typical for dynamically-typed languages like JavaScript where as you point out effective types can mutate at runtime--most code is executed by an interpreter, not as compiled machine code.[1] Interpreters are more efficient when executing a bytecode than walking an AST.

* Translating an AST straight to machine code is much more difficult than translating a low-level intermediate representation, and such representations are often easy to express as simple bytecodes. Compilers like GCC and LLVM use intermediate representations: RTL and LLVM IR, respectively. Both are effectively assembly languages, and like all assembly languages, trivial to express as bytecode. GCC also has GIMPLE, a higher-level intermediate representation earlier in the pipeline, derived from its generic AST. The fact that GIMPLE and RTL coexist drives home the point that you don't want to be translating tree-like representations straight to machine code.

* You don't need to take my word for this, or the word of anyone else. Every programmer should have experience writing parsers, compilers, and interpreters, regardless of whether they took a class or even went to university. By doing this, most of the reasons for the way things are done will become immediately clear to you. Note that splitting strings with regular expressions does not count as parsing for these purposes. A good project, however, would be to write a regular expression parser, compiler, and interpreter. There are many examples to follow and you can copy their designs exactly; just don't copy and paste: actually going through the motions and understanding the complexities inherent in the implementations is how everything will become clear.

[1] How this mutation is handled and why tiered architectures are preferable is another topic entirely.


Definitely agree that most programmers would benefit from studying these topics. They have broader applications than many realize.

You can use tools like jscodeshift [0] to refactor code! I've previously written codemods to migrate away from bad design decisions. The React team has also used codemods to introduce breaking changes. AST Explorer [1] is the best tool I've found for writing codemods since it gives you nearly instant feedback and allows for a good amount of exploration.

[0] https://github.com/facebook/jscodeshift/

[1] https://astexplorer.net/


Swift and Rust also have their own higher-level intermediate representations that they translate their AST to before even going to LLVM IR (SIL and MIR respectively). Rust used to translate their AST directly into LLVM IR but they switched to going through MIR first because it unlocks a lot of optimizations as well as improved user-facing functionality such as non-lexical lifetimes in the borrow checker.


Rust has AST -> HIR -> MIR -> LLVM-IR. Two internal IRs and a third external one!


Good point, I forgot about HIR, though isn't HIR largely just a desugared representation of the AST?


Yes, but also post macro expansion and name resolution.


And CompCert has at least seven different IRs, several of which are ASTs. http://compcert.inria.fr/diagram.png

That being said, consistency of internal data structures is probably one of those areas that separates a JIT from an offline compiler. I can see the benefit of switching to a bytecode representation pretty early in that circumstance.


> Every programmer should have experience writing parsers, compilers, and interpreters, regardless of whether they took a class or even went to university.

The majority of developers, whether they're doing yet another software-as-a-service CRUD app or are your typical “dark matter developer” writing software that will never see the light of day outside the company they work for, would never benefit from writing a compiler or interpreter.


If they feel like calling themselves engineers, then they must know this stuff, as it is part of any proper engineering degree.

Now, if they followed another path into programming, then by all means.


See “No True Scotsman”....

What next they aren’t a “true engineer” unless they can reverse a b-tree on the whiteboard?


For sure they aren't one if they don't have a degree from a university accredited by the Engineers Order.

Which most likely will have a good class in algorithms and data structures.

So while I don't expect every engineer to actually be able to do it, I expect him/her to be able to explain how to attack the problem.


And you personally have studied the curriculum of every single computer science/computer engineering degree and can say that every single one required writing a compiler or interpreter and that no person who hasn’t written one can call themselves an engineer?


The engineering Order does that job twice: when it certifies the university, and then when it administers the professional-titles exam.


So they “certify universities” worldwide?


Each one certifies the universities of the respective country, naturally.

I am well aware that the US doesn't seem to value proper engineering education, and that everyone is allowed to call themselves an engineer.

Thankfully not every country sees it that way.


Yes I guess that’s why the US is such a backwater when it comes to innovation and why we have so few successful technology startups....


Should we do an exercise on how much US tech was actually researched by immigrants with a proper engineering degree, and how much of that tech is actually built in the US?


Should we do an exercise in how much knowing how to write compilers helps in the day to day life of your average SAAS CRUD developer or bespoke app developer - my original statement?


It depends on whether the bespoke developer ever needs to parse some kind of data file to import into their CRUD application, provide customization points to their customers, or write a configuration-file parser.

Then there is the whole question of whether HR is hiring actual engineers.


There are a slew of problems.

For a very long time JSC did use an AST, and interpreting and executing an AST is expensive. The very first version of the bytecode interpreter (all hail SquirrelFish) was on the order of twice the speed of the AST interpreter, and the AST had been extensively micro-optimized at that point, to the extent that the dispatch cost itself (a single indirect call) was more than 10% of execution time.

There were also a bunch of correctness issues:

* Handling exceptions required manually checking call results everywhere, and there was an unending stream of minor correctness bugs caused by this. Looping semantics were similarly exciting. Because every call technically required those checks there was a significant cost, and yet there were still numerous cases where they were missed.

* Changes in language semantics required duplicate changes in many different places

* New language features required a lot of very similar behaviour, with minor variations, which resulted in duplicated code and introduced new places to accidentally miss exception and loop checks.

* Memory overhead was massive, both in terms of actual usage, and also allocation costs.

* Significant reductions in maximum recursion depth. Back in the days of the AST interpreter JSC had a hard limit of 500 levels of recursion, with plenty of checks, and it would still periodically blow the system stack and crash.

The cost of producing the bytecode in the first place is super low, and because the bytecode consists essentially of the most primitive JS operations, new language features generally don't require the optimizing passes to change, because they're mostly just syntax around existing primitive operations.


An AST is less transformable. We transform our bytecode as if it were just an IR.

ASTs are a pretty annoying form to use as a source of shared truth for OSR. JSC has 4 tiers so we need a convenient-to-use shared truth IR. That’s what bytecode is for. For example, bytecode offers high-scalability answers to questions like “where should I exit” and “what is live when I get there”.


Isn't bytecode essentially a machine language for a simplified VM? Why would it be more transformable than a tree?


Because it’s not a tree.

Compiler IRs tend to have the property that operands are not themselves expressions; they can only be variables or maybe constants. That makes it easy to identify sequence points, which makes it easier to insert code or reshape the control flow graph.


A bytecode is easier to keep stable than an AST is; an AST changes as the language changes, but that doesn't mean you have to change the bytecode, only change the transformation from the AST to the bytecode.

I have no idea if this is the reason, but it's a possible one.


It's a bit of a micro-optimisation, but ASTs are also usually represented fairly sparsely, so processing bytecode can be much faster due to more efficient cache utilisation: some benefits come from memory locality, and some from bytecode usually being smaller in memory.

(I don't know how JSC does it, so that could be wrong.)


That’s a pretty good reason also.


Besides the already-mentioned benefits, the new bytecode is serializable, so it is very easy to cache.

A serializable AST format is a hard problem. WASM was supposed to be like that, but in the final design it became a stack machine.


Kudos to Tadeu Zagallo and the JSC team for landing this awesome patch!

I've worked on WebKit memory performance in the past, so I'm well aware that this isn't low-hanging fruit we're talking about. The type safety bonus features look great too. :)


Re: Direct vs. indirect threading (aka ordinary dispatch). I had the impression that recent Intel x86 chips did enough trace caching to render any performance distinction basically irrelevant. Is that right?

(Of course, Apple has their own ARM implementations to consider.)

EDIT: Here’s the paper I was thinking of in this regard:

https://hal.inria.fr/hal-01100647/document


My first thought when they said they switched from direct threading to indirect threading was that it may have been the worst possible time to do that given Spectre.

The negligible performance difference on modern Intel chips between direct and indirect threading is largely a result of their deep, heavily buffered branch predictors. Spectre mitigations are going to increase the performance differences. Most userspace applications will forgo mitigations in favor of keeping performance, but the browser was the big exception as it has to be concerned about in-process side-channels.


Haven't all the browser engines already implemented their own mitigations for Spectre/Meltdown for JavaScript code?


I can't speak to their existing mitigations directly, but more generally Spectre will be an ongoing saga for years to come.

I seriously doubt the mitigations in place, whatever they are, are comprehensive even for existing proven Spectre channels. The engines haven't even solved RowHammer. It's a very difficult problem domain, particularly for an application JIT'ing random, untrusted code from the Internet. In many respects they're (IMHO) quietly punting because there just aren't satisfactory solutions. Browsers are really stuck between a rock and a hard place, more so than VM providers like AWS EC2.

One of the more general solutions is simply to stop trusting the browser, if not your entire operating system. Keep your most sensitive secrets inaccessible to software to the greatest degree possible. Use smartcards and other hardware tokens for authentication, for example.


RowHammer is particularly frustrating because it's a combination of hardware/firmware/OS/software vulnerabilities; you need multiple levels of protection and it's hard to count on having them all.

I remember having a discussion about rowhammer on the v8 team when the original paper first came out - most of us thought it wouldn't be possible to exploit in JS but I suspected it would. Unfortunately I was right :/

I suspect at some point we'll discover WebGL can be used for rowhammering somehow too.


My point is you can't say "oh this approach may have poor behavior once Spectre mitigations are put in place", because the browser has already implemented Spectre mitigations.


There's more than one mitigation, and not all at the browser level. The kernel may also flush the branch cache on syscall entry, which is going to hurt performance if you depend on predictions being fast. Though as far as I know pretty much nobody uses that mitigation precisely because it's too slow.


Direct threading and indirect threading both rely heavily on the indirect predictor. You might even say that direct threading relies more heavily on indirect-predictor state, and in particular on the contents of the IBTB: it has so many more branches, and therefore more opportunity to store per-location history for particular instructions, and hence more to lose if the IBTB is flushed.

Indirect threading relies more on strong and deep history pattern matching at a single location, and probably uses less state overall.


They’ve tried a bunch of things and basically landed on “we’ll do process isolation for Spectre but for Meltdown/Fallout/ZombieLoad/Other MDS vulnerabilities we’re basically fucked, so it’s on lower levels”


Impressive improvements.

I wonder how performance and memory use compare nowadays between V8, Spidermonkey and JavascriptCore.

Does anyone have a link to recent trustworthy and thorough benchmarks? (Google doesn't really spit out anything noteworthy...)


I believe that JSC is in the lead when it comes to throughput and latency. But that is at least partly based on benchmarks that I had a hand in designing.

I don’t know how the VMs stand against each other on memory. JSC is improving in this area a lot recently but I don’t know if it’s just catching up or leaping ahead or whatever.


It's first-party, but I've found that the benchmark results frequently posted in their blog do indeed translate into real-world results. https://v8.dev/blog


When they talked about getting rid of their threaded interpreter, I was reminded of this article from 2008 about writing fast interpreters, if that's interesting to this audience:

https://news.ycombinator.com/item?id=2593095 [hn link]


This is interesting! If anyone from WebKit is in the comments, can you provide link(s) to the bugs discussing the bytecode caching API? I'd love to take a look but my Bugzilla search skills are evidently weak.


There are a few bugs related to caching, but the main ones are:

* The initial implementation of the underlying infrastructure: https://bugs.webkit.org/show_bug.cgi?id=192782

* Initial C++ and Obj-C APIs: https://bugs.webkit.org/show_bug.cgi?id=193401

* WIP bug to integrate the cache with WebKit: https://bugs.webkit.org/show_bug.cgi?id=194047


Not much discussion in the bug report itself:

https://bugs.webkit.org/show_bug.cgi?id=187373


Do they have a different bytecode and runtime from WASM? Why not unify everything to web assembly byte code?


WASM byte code is structured into a tree, which means an interpreter loop wouldn't perform well. You'd really want to flatten it into a simpler bytecode before interpreting it -- and you'd want to do other transforms on it before optimizing it.

I don't think WASM bytecode is a good format for execution, and it's only mediocre as a compilation target.


Then why not abandon wasm, and make this byte code a target for llvm?


LLVM bitcode is CPU and ABI dependent, and isn't even stable between LLVM releases.


Bad question on my part. More relevant: Why wasm? From what I gather, js seems to already have a bytecode for their JIT runtime. Why not expose that so that we can have C++ in the web?


> js seems to already have a bytecode for their JIT runtime

This is kind of a nonsensical statement. JS itself doesn't have a bytecode, it's just a specification of the language you write scripts in.

Each implementation has its own intermediate representations, including bytecode formats.

WASM is a vendor-neutral standard that is designed as a compilation target and designed with safety in mind.

SpiderMonkey/JSC/V8 internal bytecode formats are just that, compiler internals.


I asked the same question, see below for interesting thread


sigh. I think the modern web was a mistake: hacks from the '90s meant for sharing documents. Can we please get a fresh restart from scratch?


This isn’t a standard.


They are different.

Unifying is not a bad idea.


Now please expose a loadFromByteCode api so that we can target bytecode instead of transpiling to js.


Surprised no one has posted https://blog.cloudflare.com/binary-ast/ . It's not bytecode, but it's still a great idea.


Oh wow!! In a way it's even better than bytecode because it allows marking unused functions for lazy loading, giving speedups of 97% (in theoretical settings;)

Must read.

https://github.com/binast/binjs-ref from below is part of this!


WASM's (abandoned, because brotli is good enough) Binary Encoding proposal was essentially a binary AST compression system that worked for both JS trees (esprima) and wasm ASTs.

You get pretty good improvements by converting ASTs into SoA representation, interleaving them, and then compressing them. Fast to decode, too.

https://github.com/WebAssembly/design/issues/1180


The bytecode is an implementation detail that can, and does, change. Making it an API would necessarily make changes like this one impossible.


Some standardization wouldn't hurt. Then we could do loadFromByteCodeV1 and later loadFromByteCodeV2.


No, it absolutely would.

Because then any changes require duplicating the byte code parsers, and adding translation logic to manage those changes.

You must understand: any change would be an API break, and APIs cannot change. Deprecating an API takes years. You also can't change or add API in security updates, which means that if a security fix required changing some aspect of the bytecode, that change couldn't be made.

Let's say you change the numbering of registers? That's an API change.

Let's say you change the number of opcodes? API change.

Any new API in macOS and iOS requires weeks of review cycle to ensure:

* It solves a problem in a generic manner: e.g. it doesn't solve a very specific version of a general issue

* It can be kept stable: e.g. it doesn't expose implementation details that may change

* It is ABI stable: direct memory access APIs cannot expose layout that can vary based on any internal details.

Shipping API is incredibly difficult if you care about ABI stability for software.


OK got it, thanks.

So hopefully https://blog.cloudflare.com/binary-ast/ will become a standard.

Already part of Firefox Nightly! about:config, unrestricted BinaryAST


binary-ast doesn't give any significant benefits over JS: potentially a small reduction in size vs. standard gzipped responses, but I've yet to see compelling evidence that binast offers any significant time savings to first execution.

It's important to realize: binary-ast is no more readily executable than plain JS, it is no more trustable than plain JS, and even if it were, parsing hasn't been a significant part of time-to-first-execution in JSC for a long time. It's the interpreter codegen that eats up time. JSC executed an AST directly for a looonng time, and the final builds of that interpreter beat the contemporary bytecode interpreters of other engines at the point it was replaced; even so, executing an AST is so much slower than a bytecode interpreter that it takes only a tiny amount of JS before the cost of codegen-plus-execution beats just executing the AST.


JSC makes breaking changes to the bytecode format on a regular basis. It’s an internal implementation detail and isn’t going to be exposed to standards.

Also the bytecode format has no memory safety guarantees so standardizing it wouldn’t be a great idea. It’s only trusted to obey basic VM invariants if we know that we generated it ourselves.


Don't the existing plans to expose GC [1] and ECMAScript objects [2] to WebAssembly cover this?

[1] - https://github.com/WebAssembly/proposals/issues/16 [2] - https://github.com/WebAssembly/webidl-bindings/blob/master/p...


I was thinking along the same lines as you, but kannanvijayan helpfully pointed out to me that it isn't that simple:

https://news.ycombinator.com/item?id=19966100


Possibly. I'm a bit discouraged that refs are opaque and cannot be mapped to wasm memory, but maybe I missed something.


This will never happen because it turns internal IR (subject to change) into a formalized format that people will depend on every detail of, including bugs.

If you want bytecode, use wasm.


Chrome and Facebook are testing such a thing. I’ll look for the reference link. Maybe one of them can pipe up.


This is, give or take, what web assembly is supposed to do, no?


Wasm is a bytecode, but not a JS bytecode. In fact, compiling JS to wasm is very nontrivial and requires bundling a whole VM.


WebKit2 Qt when? Also, full QNetworkConnection support. That would be great.


Is it possible to pregenerate that bytecode?

For example for hybrid desktop/mobile apps.


No, because that would require exposing implementation details. E.g. the first API version of the bytecode would become the fixed API, and any future changes would require a translation layer for converting from one version of the bytecode to the other.

The bytecode is also an internal format, so it is considered trusted; once you expose it, you have to worry about invalid bytecode.


And what about caching the bytecode so that it only happens once?


JSC already does that internally.


For webviews too?


Yes.


If you're going to precompile your JS, why would you want to emit this bytecode instead of just going to WASM?


Because wasm still doesn't support many features that JS does. For example, accessing the DOM.


Bytecode is a broad category. The design of a bytecode suitable for interpreting JS has little in common with a bytecode for AOT compilation of static languages without tracing GC, other than that they're both made of bytes.



BinAST is/was an attempt to make a more compact version of JS that was easier to parse.

But parsing JS isn't a significant bottleneck, it's the subsequent work to produce the stuff that actually runs. IIRC proponents also claimed it meant you didn't have to worry as much about validation, which isn't true because it's untrusted content that comes from the internet. It must be validated.

binast also is not designed to be any more readily executable than JS, so even if it were supported, step one would still be "produce a bytecode that is fast".


And it already shipped in Safari 12.1 (which means many are already using it).

Time and time again, it seems WebKit is the only team that wants to make the Web better for web pages with JavaScript; all the others seem to want the Web to be fat apps.

In the hope that someone on the Safari team is reading this:

Please make Tab Overview cache the thumbnails, or offer a list format; currently, pressing Tab Overview reloads all the tabs in the background. I don't know if this is for generating thumbnails or some other reason, but it kills my machine when I have 300 tabs, most of them "cold" and not loaded.


...

"The new bytecode format uses approximately 50% less memory, which means a reduction of 10% in overall memory usage for JavaScript-heavy websites, such as Facebook or Reddit."

This is squarely to help webapps be faster and more memory efficient. This change is great, and doesn't align with the cargo cult of, "Make web pages great again."


> Time and Time again it seems WebKit are the only team that wants to make the Web with better Web Pages ( And we are far from perfecting it ), All the others seems to want the Web to be Apps.

I don’t see how this blog post supports that view, since it’s talking about optimizing JavaScript.


By web page I mean one with minimal JavaScript: making initial JS loading faster, lower latency, lower memory, and in general a much better UX. To me, Chrome seems to be optimising for the wrong things: maximum throughput, WASM, being super fast in compute-intensive usage, but eating memory like crazy.


https://v8.dev/blog/ignition-interpreter

"V8 team has built a new JavaScript interpreter, called Ignition, which can replace V8’s baseline compiler, executing code with less memory overhead and paving the way for a simpler script execution pipeline."

https://v8.dev/blog/preparser

"Lazy parsing speeds up startup and reduces memory overhead of applications that ship more code than they need."

https://v8.dev/blog/embedded-builtins

"V8 built-in functions (builtins) consume memory in every instance of V8. The builtin count, average size, and the number of V8 instances per Chrome browser tab have been growing significantly. This blog post describes how we reduced the median V8 heap size per website by 19% over the past year."

https://v8.dev/blog/improved-code-caching

"V8 uses code caching to cache the generated code for frequently-used scripts. Starting with Chrome 66, we are caching more code by generating the cache after top-level execution. This leads to a 20-40% reduction in parse and compilation time during the initial load."

Etc, etc.



