
Unfortunately I don't. I started to learn Unicode, then stopped once I realized how complicated it is to do right and that nobody really cares as long as it works almost all the time.

As Joel below demonstrates, you can get away with 29 languages by treating code points as characters and without knowing about grapheme clusters and other stuff.

https://www.joelonsoftware.com/2003/10/08/the-absolute-minim...

>When CityDesk publishes the web page, it converts it to UTF-8 encoding, which has been well supported by web browsers for many years. That’s the way all 29 language versions of Joel on Software are encoded and I have not yet heard a single person who has had any trouble viewing them.



Not really relevant. That just demonstrates that displaying those languages works adequately; it doesn't show anything about other processing that your software might care about (sorting, searching, case conversion, keyboard input, selection and editing, and so on).
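To make a few of those concrete, here is a minimal Python sketch using only the standard library. The sample strings and the `same_text` helper are my own illustrations, not anything from Joel's article, and NFC normalization is just one reasonable way to compare text, not the full answer to searching or collation.

```python
import unicodedata

# Two strings that display identically but use different code points.
composed = "caf\u00e9"      # "é" as a single code point
decomposed = "cafe\u0301"   # "e" followed by a combining acute accent

def same_text(a: str, b: str) -> bool:
    """Hypothetical helper: compare after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# Searching / equality: a naive comparison misses the match.
print(composed == decomposed)           # False
print(same_text(composed, decomposed))  # True

# Case conversion: one code point can become two.
print("straße".upper())                       # STRASSE
print(len("straße"), len("straße".upper()))   # 6 7

# Sorting: plain sorted() orders by code point, so "école" lands after "zebra".
print(sorted(["zebra", "école", "apple"]))    # ['apple', 'zebra', 'école']
```

None of this affects whether the page renders, which is why display-only testing doesn't surface it.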


> As Joel below demonstrates, you can get away with 29 languages by treating code points as characters and without knowing about grapheme clusters and other stuff.

If you treat text as completely opaque, it does work fine. Issues crop up when you want or need to manipulate said text, either to extract information or to modify it.
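A small Python sketch of what "manipulate" can mean here. The sample word and the `rough_clusters` helper are assumptions of mine for illustration; the helper only glues combining marks onto the preceding character, which is a rough approximation and not the real grapheme cluster rules from UAX #29.

```python
import unicodedata

# "noël" written with a combining diaeresis: 4 visible characters, 5 code points.
word = "noe\u0308l"
print(len(word))  # 5

# Reversing by code point moves the diaeresis onto the "l".
naive_reversed = word[::-1]   # "l\u0308eon"

# Truncating by code point silently drops the accent.
naive_truncated = word[:3]    # "noe"

def rough_clusters(text: str) -> list[str]:
    """Hypothetical helper: attach combining marks to the preceding character.

    This is only an approximation of grapheme clusters; it ignores emoji
    sequences, Hangul jamo, and the other cases UAX #29 covers.
    """
    clusters: list[str] = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

clusters = rough_clusters(word)
print(len(clusters))  # 4

# Reversing cluster by cluster keeps the diaeresis on the "e".
better_reversed = "".join(reversed(clusters))  # "le\u0308on"
```

As long as the string is only stored and passed along, none of these operations happen and the code point view is harmless.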



