
"OpenCL implementation is also free to effectively merge the stages like in Halide."

I gave up relying on magic compilers a long time ago. And having worked in this domain for a long time, I'm actually offended by people who write off the problem as simply a matter of a good enough optimizer. That attitude has significantly held back both performance and portability in imaging. (And likely other areas.)

Halide is not magic; it is just a better slicing of the problem, backed up by a good implementation. As always, there is no free lunch, but when it comes down to actually shipping this kind of code across a wide variety of platforms with great performance, it is a lot more productive than anything else out there.
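To make "merging the stages" concrete, here is a minimal sketch in plain Python. It is not Halide or OpenCL code, and the names (`blur3`, `pipeline_unfused`, `pipeline_fused`) are made up for illustration. It shows one two-stage algorithm evaluated two ways: once with the first stage written out to an intermediate buffer, and once with the first stage computed inline where the second stage consumes it.

```python
# Hypothetical two-stage 1-D pipeline: a 3-tap box blur followed by a gain.
# All names are illustrative; none of this is Halide or OpenCL API.

def blur3(src, i):
    """Stage 1: average of src[i-1], src[i], src[i+1] with clamped edges."""
    n = len(src)
    left = src[max(i - 1, 0)]
    right = src[min(i + 1, n - 1)]
    return (left + src[i] + right) / 3.0

def pipeline_unfused(src, gain):
    # Stage 1 is fully materialized in an intermediate buffer...
    tmp = [blur3(src, i) for i in range(len(src))]
    # ...and stage 2 then makes a second pass over that buffer.
    return [gain * v for v in tmp]

def pipeline_fused(src, gain):
    # Same algorithm, different evaluation order: stage 1 is computed
    # inline at the point of use, so no intermediate buffer exists.
    return [gain * blur3(src, i) for i in range(len(src))]

src = [0.0, 3.0, 6.0, 9.0]
assert pipeline_unfused(src, 2.0) == pipeline_fused(src, 2.0)
```

Both functions return the same values; only the order of evaluation and the storage differ. Choosing between them by hand, per platform, is the kind of decision the comment above says should not be left to hoping an optimizer finds it.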



I do agree that relying on compiler optimizations is a waste of time. They never seem to appear. I simply wished to point out that you can also write Halide-like code in OpenCL. And I'd love to see an implementation that attempts the same style of optimizations that Halide allows.

Halide is extremely domain specific, which is a good thing: it allows them to focus on the problem at hand, namely how to easily write image processing filters that can be made performant with relative ease. However, I would not wish to write a bitonic sort or anything like that in Halide.


Andrew Adams did write bitonic sort: https://github.com/halide/Halide/blob/master/test/performanc...

As I wrote in another comment, the domain of problems for which Halide works is broader than imaging. I usually present it as "data parallel problems." In fact, I'd say the difference in domain between what Halide is good at and what OpenCL and CUDA are good at is not that significant in practice because those languages are basically C/C++ outside of kernel parallelism. (They are each adding some task parallelism facilities as well.)
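For readers who have not seen why bitonic sort counts as a data parallel problem, here is a rough sketch in plain Python. It is not the Halide test linked above, and the function name is made up. It uses the textbook compare-exchange network: within each pass, every pair of elements is disjoint from every other pair, so the inner loop could in principle run as a single parallel kernel.

```python
def bitonic_sort(values):
    """Sort a list whose length is a power of two, in ascending order."""
    a = list(values)
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    k = 2
    while k <= n:        # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:    # distance between the two compared elements
            # Each index i is paired with exactly one partner (i ^ j), and
            # the pairs are disjoint, so this pass has no dependencies
            # between iterations.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a

print(bitonic_sort([7, 3, 6, 0, 5, 1, 4, 2]))
```

The two outer loops are sequential, but each one only selects which fixed pattern of independent compare-exchanges to apply next, which is why the algorithm maps onto kernel-style parallelism at all.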



