
There are two competing currents here. On the one hand, Julia is extremely powerful and dynamic, so it has very high expressivity for any possible differentiable programming you could think of. It also has a fairly simple core, so as long as you know how to properly transform that core, you can get a mathematically correct derivative. On the other hand, the reason Julia works so well is that the compiler is able to understand and eliminate all the dynamism and complexity for you at run time and generate very tight code as a result. When you add AD into the mix, the generated code becomes much, much more complicated, so the compiler has to work a lot harder. That can sometimes be a source of frustration for us, because we get something really cool working with absolutely zero effort, but then need to go back and make the compiler better to reclaim some performance. (Static languages have the opposite problem: you need to improve the type system in order to allow the thing to happen at all, but once you have described it in a specific enough type system, you usually also get good performance - if your language is any good, that is.)
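To make the "zero effort" point concrete, here is a minimal sketch (assuming the Zygote.jl package, one of the Julia AD libraries, is installed; the function names f and df are just illustrative):

```julia
# Differentiate an ordinary, dynamic Julia function with Zygote.jl
# (a reverse-mode AD package). Nothing about f is AD-specific --
# it's plain Julia with ordinary control flow.
using Zygote

function f(x)
    y = one(x)
    for _ in 1:3          # an ordinary loop, no annotations needed
        y *= sin(x) + x
    end
    return y              # f(x) == (sin(x) + x)^3
end

# Reverse-mode derivative of f at x; gradient returns a tuple,
# one entry per argument, so we take the first.
df(x) = Zygote.gradient(f, x)[1]

# Analytically, f'(x) = 3 * (sin(x) + x)^2 * (cos(x) + 1)
df(1.0)
```

The AD transform handles the loop and the generic `one(x)` without any user effort; whether the resulting code is as tight as the hand-written derivative is exactly the compiler question described above.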

Overall I think Julia is a pretty great language in which to implement AD (as evidenced, I'd say, by the 15 or so different AD packages that people wrote for individual use cases before the latest push for a unified one), but it is still a very powerful language, so if you want your AD to handle the whole language (as we do), then you're gonna have to do a bit of work.



Thanks! So it sounds like what you're saying is that, as with other languages that have complex optimizers, there can be performance cliffs when you get too far off the beaten path?

For someone not working on the Julia compiler, how tricky is it to figure out what to do to improve performance?



