
I wonder how much time is collectively 'wasted' due to kanji/Chinese characters.

Each of these characters also has a certain order in which it should be drawn, and from what direction. And at least in Japanese, each one has at least 2 readings (and sometimes many more), a Chinese reading and a Japanese reading, but which one is used doesn't always follow the 'rules'.

A lot of this is likely learned through simple exposure for listening and speaking, but it makes reading and especially writing rediculous.



In my experience with Japanese, kanji is more work to learn how to write, but it makes it easier for others to read - not just compared to romaji but even hiragana and katakana. I'm not really sure it's time "wasted".

> rediculous

In a lot of ways, it's like spelling in English ;)


>kanji is more work to learn how to write, but it makes it easier for others to read - not just compared to romaji but even hiragana and katakana.

The reason for this, which I think most westerners won't immediately understand, is because Japanese (and Chinese scripts in general) doesn't use spaces nor capitalization.

What do I mean by "doesn't use spaces nor capitalization"? Think of English, but without either: canyoureadthiseasily?ididn'tthinksoeither.nobodywriteslikethisinenglish. A sentence written purely in hiragana reads like that.

Kanji serve the same function that spaces and capitalization do for us: they distinguish individual words from each other.

Some words are written entirely in hiragana for brevity or style, but generally speaking the purpose of hiragana is to string the kanji together and add contextual information so the sentence flows well.

On that note, all foreign words are written in katakana, and katakana is also used for emphasis, similar to ALL CAPS or bold writing in English.


An even greater reason for the use of kanji is that the Japanese language has an enormous number of homophones, words that are pronounced the same way but have completely different meanings.

When talking, Japanese speakers can usually tell homophones apart based on the context. In written form, kanji make it much faster to understand what a given word means: for anything more complex than children’s books, it is actually quite hard to understand Japanese text written in a purely phonetic system (kana or latin).


> canyoureadthiseasily?ididn'tthinksoeither.nobodywriteslikethisinenglish

This is very legible for a native English speaker.


It’s legible, but it takes a bit of active effort which makes it tiring to read more than a few sentences in a single sitting.


Spaces weren't always used. If you look at old Greek/Roman engravings, everything is jammed together without spaces. I think you could eventually become quite good at reading spaceless text, even though spaces definitely aid comprehension.


There are still several languages that don't use spaces. Like Thai and Cambodian. Both of these languages are very analytic - they use short words, which makes it easier.

The actual problem with not using spaces is not that humans can't read it, it's that computers can't read it (without at least a complete dictionary and maybe some AI help). GNU aspell, for instance, does not support languages that don't use spaces [1].

[1] http://aspell.net/0.61/man-html/Unsupported.html


Yeah, context is essential. We've seen plenty of examples of website names where the spaces being squashed out gives alternative meanings. Two come to mind where the last word was "exchange" and it followed a plural. No simple-minded spel chequer is going to be able to figure that out.


An example of text segmentation library for Japanese:

https://en.m.wikipedia.org/wiki/MeCab

Like you said, it does come with a dictionary to work properly.


It's legible (ish) due to the limited ways that English characters can group up - e.g. there's only one valid way to split up "writeslikethis". In JP many common words are only 1-2 characters long, so in general even a very short string of kana can be split up multiple valid ways.
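
To make the ambiguity point concrete, here is a minimal sketch of dictionary-based segmentation that lists every way a spaceless string can be split into known words. The two word lists are tiny, hand-picked assumptions for illustration, not a real dictionary, and real segmenters such as MeCab also score the candidate splits rather than just enumerating them.

```python
from functools import lru_cache

def segmentations(text, vocab):
    """Return every way to split `text` into words found in `vocab`."""
    vocab = frozenset(vocab)
    max_len = max(len(word) for word in vocab)

    @lru_cache(maxsize=None)
    def split_from(start):
        # Reaching the end of the string means one complete split was found.
        if start == len(text):
            return [[]]
        results = []
        for end in range(start + 1, min(len(text), start + max_len) + 1):
            word = text[start:end]
            if word in vocab:
                for rest in split_from(end):
                    results.append([word] + rest)
        return results

    return split_from(0)

# Toy vocabularies (assumptions for illustration only).
ENGLISH_VOCAB = {"write", "writes", "like", "this", "is", "his"}
KANA_VOCAB = {"くるま", "くる", "まで", "で", "まつ"}

english_splits = segmentations("writeslikethis", ENGLISH_VOCAB)
kana_splits = segmentations("くるまでまつ", KANA_VOCAB)

print(english_splits)  # a single split: writes / like / this
print(kana_splits)     # two splits: くる/まで/まつ and くるま/で/まつ
```

With kanji the two kana readings are written differently (来るまで待つ, "wait until they come", versus 車で待つ, "wait in the car"), so the reader never has to choose between them.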


Yeah, no problem here. I've been learning Japanese for a while and have encountered the phenomenon though. Still, if you look at e.g. the front page of JP Wikipedia (https://ja.wikipedia.org/), there's plenty of kana there, be it words that are habitually written entirely in katakana or hiragana, or even kanji words that still have some attached kana to disambiguate readings.

After more practice I've found myself starting to pick up on common dividers, like particles, verb endings, adjective endings, etc. I assume native speakers do this instinctually, much like native English readers aren't really reading letter by letter (wichh is why txet lkie tihs is rbleadae at ntiave seepd for most)


Although you can write hiragana-only with spaces between the words - children's books are written like this. But I still find it way harder to read my kids' hiragana-only books than text using kanji, for whatever reason (familiarity I guess).


I also find hiragana-with-spaces hard to read, even as a gaijin. So I'm somewhat skeptical of the idea that Japanese and Chinese are harder to read because they don't use spaces. No offense to some people here, but I think it would be better to avoid coming at this from a position of "what is the secret that makes western culture so much better?"

After a quick search, there is at least some research out there that suggests that Chinese readers are able to read faster than English readers, e.g. https://www.tandfonline.com/doi/abs/10.1080/1938807990955829.... This comes at a cost though, and that is the time spent learning how to write, but maybe technology like word processors will reduce that cost while retaining the benefits.


>"what is the secret that makes western culture so much better?"

I wrote that example as an attempt to frame how Japanese scripts are written compared to English. If you thought it had anything to do with superiority and inferiority, you read between the lines too much.

Spaces can be used, especially in modern times and are sometimes used for even better readability, but generally and historically Japanese script is one connected string of characters with no breaks except for the occasional punctuation and line breaks.

A connected string of nothing but hiragana and katakana is very infuriating to read, with or without spaces, and as a practical concern the issue manifests quite frequently in any piece of longer-form writing that contains lots of foreign words (because foreign words are all written in katakana).


I'm not aware of such research but it doesn't surprise me at all. Chinese and Japanese have more bits per symbol and thus if symbols can be decoded in a similar timeframe they should read faster.


I think

c a n y o u r e a d t h i s e a s i l y i d i d n t t h i n k s o e i t h e r n o b o d y w r i t e s l i k e t h i s i n e n g l i s h

might get the feeling across a bit better?


There's no way a language with that kind of major "deficiencies" could have survived for over 2k years, and especially through the postwar media development. It's all just creative quirks, guaranteed to fit well within the computational budget of a human brain.

A similar kind of criticism can be made of English: English is such an inefficient language, there are only 26 symbols, 52 if you count "capital" letters but those cannot be used for better compression, and so you cannot just remember a letter but 2-3 sequences of letters as meta-symbols, and those meta-symbols would have tons of special cases when it comes to pronunciation, and so on.

Most books in most languages compress into similar sizes of zip files; it's not like there is always one version that gets to just 512 bytes and another that's always 50MB. I would think that "all languages are explicitly equal" is a stance too dogmatic to hold, but clearly all our human languages are quite close to each other even just looking at that.


There’s a stroke order but the rules are pretty easy to learn. Outside of calligraphy, nobody will likely care if you get the order wrong. There’s also shorthand ways to write kanji called ryakuji.

Yeah, kanji can have lots of readings. There are some rules around when you use an on’yomi vs kun’yomi. A kanji paired with a hiragana is always a kun’yomi reading, but if there’s no kana you just have to know which reading to use. You can only pick that up through context and exposure. Of course any one kanji can have multiple on’yomi and kun’yomi readings. (And then there’s the concept of rendaku which can modify pronunciation based on neighboring syllables.) There’s nothing easy about it, but it is what it is. If you’re serious about learning Japanese it’s something you just have to get used to.

But nevertheless, kanji has many positives. It can make it much faster to read things because they can wrap complex things into a single character. If you have an unfamiliar character, understanding its components can often times provide a hint as to its meaning.

Think of it kind of like reading a conversation entirely in emoji. You may not get the whole meaning, but you can generally pick up the gist. And with far fewer words than it would take to write it out in full.


Does the terseness help with typing too? Or does it come out to more or roughly the same number of key presses to express things?


Kanji take a few more key presses to type. You start by typing the phonetics, and your keyboard/word processor will suggest the kanji that fit the word, which you select with the arrow keys or space bar.
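
As a rough illustration of that workflow, here is a toy sketch of the lookup step: the typed reading is mapped to a list of kanji spellings and the user picks one by index. The candidate table and the function names are made up for this example; a real IME uses a large dictionary and ranks candidates by context and usage history.

```python
# Toy candidate table (an assumption for illustration):
# hiragana reading -> kanji spellings that share that reading.
CANDIDATES = {
    "こうえん": ["公園", "講演", "公演", "後援"],
    "はし": ["橋", "箸", "端"],
}

def candidates(reading):
    """Return kanji candidates for a reading, with the plain kana as a last resort."""
    return CANDIDATES.get(reading, []) + [reading]

def convert(reading, choice=0):
    """Pick one candidate, as the space bar and arrow keys would in an IME."""
    return candidates(reading)[choice]

print(convert("こうえん"))        # first candidate in the toy table
print(convert("はし", choice=1))  # second candidate in the toy table
print(convert("ねこ"))            # unknown reading falls back to the kana itself
```

The extra key presses are the selection step: one to request conversion, and possibly a few more to move through homophones before confirming.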


This picture from the Japanese IME article on Wikipedia is a pretty good demonstration on typical use:

https://en.m.wikipedia.org/wiki/File:IME_demonstratie_-_Mats...


> I wonder how much time is collectively 'wasted' due to kanji/Chinese characters.

There's probably research on this in Chinese educational literature. Pinyin, though now used primarily for computer input and foreign language learning, was originally conceived for use in teaching literacy to native Chinese speakers. Mass literacy in China was a particularly hard nut to crack because of the huge challenge of learning thousands of characters as compared to a small phonetic alphabet.

The ambition of replacing characters with pinyin never really developed momentum for native Chinese speakers. But the increasing use of computer input, I believe, has reduced the emphasis on years of character-writing memorisation drills in Chinese schools.


>each of these characters also have a certain way they should be drawn

I have been learning Japanese for a few years now, and kanji stroke order was honestly one of the easiest things to get right. There are exceptions, but in general, after having practiced like 50 characters, you pick up on patterns and are able to guess stroke order pretty easily. I highly doubt learning that specific aspect causes a significant amount of wasted time.


It also doesn’t truly matter. It just generally leads to being able to fit the parts together fairly easily.


And it becomes easier to read if people have bad handwriting. Because badly written characters with the same stroke order will look similar to each other, they will be easier to read for people used to reading Japanese handwriting.


In the reading direction, it has similar strengths and weaknesses as irregular English spellings. Faster to scan given there's more variety in glyph shapes to take cues from. Preserves etymology and relationships between words, so new words can actually be faster to learn at an intermediate level of fluency. Personally I often can guess the meaning of new words from kanji + context, but if I only hear the pronunciation I have no idea, so I always turn on closed captions on Japanese TV.

In the writing direction, it's both time-consuming to learn initially and time-consuming to write every time, but it does carry a certain satisfaction and delight. So the current trend toward learning only how to read and type them seems economically productive, but the loss of aesthetic enjoyment of calligraphy in daily life is regrettable.


I've heard that the Kanji make Japanese and Chinese much easier to scan quickly once you're fluent.


I've only learned Japanese as far as the N4 test (second of five levels of the standardized tests), but my experience backs this up. Those tests preferred syllabic symbols instead of kanji, and that just made them harder to read for me.


Chinese doesn’t have a writing system that mixes different sets of pictographs. It has traditional and simplified versions of the writing system but they’re not intermixed the way Japanese does it.


Such mixing is simply not required in Chinese since words in all Chinese varieties are not inflected. There are some particles that are quite common (like 了 and 子), but they are very easy to write. The Simplified Characters are further optimized for writing speed. Most importantly 儿 for 兒 and 个 for 個, since these characters are very common in Modern Standard Chinese.


Probably not that much, at least when compared to French or English orthography. Those two are very complex, and it's easy to forget that because we have used them for decades. But try teaching reading, then writing, to a beginner in those languages and you'll see for yourself how complex they actually are. Communication with less educated people is also revealing in that respect (in my company, internal docs and code are riddled with mistakes in both languages). I myself very often have to check whether a letter is doubled or not, or whether an 'e' or an 'a' is used in a word like "dependent", and it doesn't help that cognates existing in French and English are often written slightly differently, which adds to the possible confusion.


> Also communication with less educated people (in my company internal docs and code are riddled with mistakes in both languages)

...but you are still able to make out the meaning without all that much difficulty and without much of a chance of misinterpretation, i.e. these languages are quite robust in the face of transmission errors. How does this work for ideographic scripts like Japanese Kanji or Chinese? When people make the sort of mistakes made by those less educated people mentioned above, does their writing end up similarly readable, i.e. are ideographic scripts similarly robust in the face of transmission errors?


As a native Chinese speaker, I know that many older and less educated people in China do write with errors or using some non-canonical simplified characters, but usually it's not a problem for us. In most cases, the meaning of the word with the character missing or corrupted can be deduced from the context.

For example, it's common to see people in mainland China that cannot write 餐 (can1, meal) correctly, but I never experienced any issue with that in real life.

In addition, I read and write simplified Chinese natively, but I can read traditional Chinese with little difficulty, and even a little bit of Japanese with high kanji density. The characters have evolved significantly, but the remaining similarity is still enough for me to parse the text.


We were once in Japan, my wife (mainland Chinese) was able to read about 1 in 4 characters without any practice. (Admittedly, it's possible some were wrong, but if so they were uncommon as everything she identified as being what we wanted turned out to be right.)


> How does this work for ideographic scripts like Japanese Kanji or Chinese?

For Chinese, it’s usually not difficult. If a stroke or two is off, or if it’s the wrong radical, you can usually figure out pretty quickly with a similar effort to a misspelled word what they meant from context. If they incorrectly used a homonym, you likewise can tell pretty quickly, like “queue” vs. “cue”.

In Chinese this partly happens because you don’t just memorize individual characters, you also memorize pairs/clusters of characters as the actual semantic units.


>...but you are still able to make out the meaning without all that much difficulty and without much of a chance of misinterpretation, i.e. these languages are quite robust in the face of transmission errors

Well, sometimes. There are plenty of single-character errors in English that could change meaning though: presence or absence of a comma, asymptomatic vs symptomatic, bat vs bet, etc.


Look no further than the US constitution, particularly its second amendment, to see how much a single punctuation mark can matter.

Also sentences like: "the Google party featured two strippers, Larry and Sergey." Are the founders of Google the two aforementioned strippers or VIP guests?


Neither of those examples really hinges on the punctuation. The latter is a constructed absurdity based on violating information structure soft constraints (i.e. the way it's phrased is deliberately unhelpful).


For more serious examples: https://en.wikipedia.org/wiki/Serial_comma

The one I chose was a tongue-in-cheek absurdity. There are other cases where there actually is ambiguity.


Nominally, sure. In practice, almost all instances are resolved by context, modulo motivated reasoning.


A lot, I'm sure. I have no experience with Japanese but my wife is a native Chinese speaker.

1) In a language like English, if you know a word and are faced with its written form that you do not know, you can probably figure it out. In Chinese, knowing how to say it gives you no guidance in figuring out the written form; literacy takes years to learn. I have watched two literate Chinese speakers stumped by an unknown character and not even realize it was actually Japanese, not Chinese. (I knew because of the context, not from being able to read it.)

2) It appears to take my wife longer to write something in Chinese (and she's not a word-processor idiot by any means, when she's putting something into the computer she uses handwriting recognition, she's never learned the new ways) than it would take me in English--more strokes and they are less connected than even printing, let alone cursive.

3) Chinese lacks the concept of alphabetizing. I've watched her with her dictionary (we've been together since the 80s) and it's a process of looking stuff up in tables to get you close to your target and then a manual search once you're there. I can find a word in an English->Chinese dictionary far faster than she can find one in a Chinese->English dictionary.

That being said, I'm not going to call someone a word processor idiot because maintaining a skill like that costs time--it simply isn't worthwhile for most people. Many years ago I chose to abandon cursive when I realized that if I wanted to be able to write it decently I would have to deliberately practice--and I would say I write maybe a dozen words a month. Practicing would be a time sink, not a time saver. (Not to mention most of those words are on a whiteboard, too large to do in cursive.) "Look it up" is a perfectly good answer in most cases; it's only an issue for emergency-type skills.


Anyone interested in this train of thought (for or against) should read the Dandelion Dynasty series by Ken Liu.


I mean, I'm interested, but I don't know that I'm that interested. Can you just tell me the conclusion and I'll read Liu based on that? :)


I am "shocked" how much human collective time is wasted on inefficient software and configurations, including:

- UI animations sliding/folding (e.g. 1Hz interactivity vs 120Hz interactivity).

- MS Windows default keyboard settings with slowed down key repeat rate.

- corporate firewalls configured to DROP instead of REJECT the LAN traffic.

- latency to access office documents in the cloud.

- MS Teams bugs and frustrations.

- increased number of mouse clicks in WIN11 to reach common functions.

- other software bugs/annoyances we cannot fix, cf. Jonathan Blow.


- Multi-threaded UI updates, where elements are shifted from their expected position (after a background update), causing mouse clicks to miss the intended element. You need to delay your clicks and wait until the UI updates have settled before clicking on an element.


No--you have it backwards. The problem comes from delayed clicks, it won't be cured by delaying clicks. You click the spot you want and something more loads causing the control to move before the click is processed.

I'm not sure how to do it but the clicks need to be processed against the state of the display when the click was done rather than against the state when the click was processed. The only way to do this that comes to mind is to snapshot the hot zones on the screen before doing anything that will disrupt the screen and any click received during or within 300ms (configurable, I'm figuring a minimum reaction time) after the screen update should be processed against that map, not against the current state.
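
A minimal sketch of that idea, assuming hypothetical `HotZone` and `ClickResolver` types (not any real toolkit's API): keep the layout from just before the last update, and hit-test any click that arrives during the grace window against that snapshot instead of the current layout. It only remembers one previous layout, so several updates in quick succession would need more bookkeeping.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HotZone:
    """A clickable rectangle on screen (hypothetical type for this sketch)."""
    name: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, px, py):
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)

class ClickResolver:
    """Hit-tests clicks against the layout the user was most likely looking at."""

    def __init__(self, zones, grace_ms=300):
        self.grace_ms = grace_ms      # assumed minimum human reaction time
        self.current = list(zones)
        self.previous = None          # snapshot taken just before the last update
        self.changed_at_ms = None

    def update_layout(self, zones, now_ms):
        # Snapshot the old hot zones before replacing them.
        self.previous = self.current
        self.current = list(zones)
        self.changed_at_ms = now_ms

    def resolve(self, px, py, click_ms):
        # Clicks made before or shortly after the update use the old snapshot.
        in_grace = (self.changed_at_ms is not None
                    and click_ms - self.changed_at_ms <= self.grace_ms)
        zones = self.previous if in_grace else self.current
        for zone in zones:
            if zone.contains(px, py):
                return zone.name
        return None

resolver = ClickResolver([HotZone("Save", 0, 0, 100, 30)])
# At t=1000 ms a banner loads where "Save" used to be and pushes it down.
resolver.update_layout(
    [HotZone("Banner", 0, 0, 100, 30), HotZone("Save", 0, 40, 100, 30)],
    now_ms=1000,
)
early = resolver.resolve(10, 10, click_ms=1100)  # inside the grace window
late = resolver.resolve(10, 10, click_ms=1500)   # after the grace window
print(early, late)
```

The trade-off is that a deliberate click on the newly loaded element during the grace window would be attributed to whatever used to be there.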


It will still be a timing game. What if I want to click on the new object within that 300 ms window? I would need to wait 300 ms to do so, which is a flow issue.

If 1 billion current MS Windows users each performed 1000 UI operations a day, and waited 100 ms for a UI animation each time, the collective time wasted on animations would be roughly 3000 years every day.
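
For anyone who wants to check that back-of-the-envelope figure, here is the arithmetic as a short sketch. The inputs are the hypothetical numbers from the comment above, not measurements, and the function name is made up for this example.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

def collective_years_per_day(users, operations_per_day, wait_seconds):
    """Total waiting time across all users in one day, expressed in years."""
    total_seconds = users * operations_per_day * wait_seconds
    return total_seconds / SECONDS_PER_YEAR

# 1 billion users, 1000 UI operations each per day, 100 ms wait per operation.
years = collective_years_per_day(1_000_000_000, 1000, 0.1)
print(f"{years:,.0f} years of waiting per day")
```

That comes to 10^11 seconds per day, a little under 3,200 years, so "3000 years" is the right order of magnitude.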

Fortunately most are just watching movies and shopping online, so not much collective _productive_ time is wasted ;)


no more time than it's spent in spelling and accurate grammar

kanji is useful training for visual memory and can be used to provide memory pegs



