

How many words (or characters) are enough?

Published in Chinese, Observation  |  1 Comment


Reuters has an article reporting that fewer characters are being used in written Chinese than ‘before’ (not sure when ‘before’ was). For example, you typically need to know 900 characters to read 90% of current publications. People usually gawk at the number of characters that Chinese writers and readers supposedly must learn, out of the 50,000 individual characters that exist.

However, as Slate points out, there are currently almost 1,000,000 (one million) English words, 50,000 of which are headwords, the primary, bold-faced words. I also had a statistic around somewhere about how many English words make up 90% of a typical daily newspaper, but cannot find it (silly data…)
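Since I can't find that statistic, here is a rough sketch of how one might measure it oneself: count how often each word (or character) appears in a text, then see how many of the most frequent ones are needed to reach 90% of it. The function name `tokens_needed` and the sample sentence are my own placeholders, not anything from the Reuters or Slate pieces, and splitting on whitespace is a simplifying assumption.

```python
from collections import Counter

def tokens_needed(tokens, coverage=0.9):
    """Count how many distinct tokens, taken most frequent first,
    are needed to cover the given share of the text."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0
    running = 0
    for needed, (_, n) in enumerate(counts.most_common(), start=1):
        running += n
        if running >= coverage * total:
            return needed
    return len(counts)

# Hypothetical sample text; a real test would use a full newspaper issue.
sample = "the cat sat on the mat and the dog sat on the log"
words = sample.split()          # naive whitespace split for English words
print(tokens_needed(words, 0.5))

# For Chinese, each character can be treated as a token: list(text)
```

The same function works for characters if you pass `list(text)` instead of a word list, which makes the two kinds of count at least roughly comparable.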

Chinese characters are also commonly built up from radicals, which indicate what the word can mean or what it deals with. As an engineer, I find that Chinese script feels very “algebraic”: I learn what x and 2^y and π mean. However, when you see “angry” and “hungry”, do the letters give you any idea what the words mean?

Other interesting facts: there are fewer than 100,000 words in French, and 24,000 different words in the complete works of Shakespeare (1,700 of which he invented). About 80% of the information stored in the world’s computers (such as this text) is also in English. And as Joi Ito wonders: “If news is not in English, did it happen?”


Responses

  1. Bob says:

    May 24th, 2006 at 8:20 am (#)

    Is it the case that each character is spoken as one syllable? I’ve noticed when reading credits on Chinese movies that, when an English translation is provided, there’s a character-to-syllable mapping. In fact, you can often find the same syllable in a few names and deduce which character matches it.

    And since I understand characters to be words, that means that one word maps to one syllable as well, presumably with the tonal aspect.

    If that’s the case, would it then follow that there are about 900 syllables used to cover 90% of an evening newscast? Or is spoken Chinese richer than written Chinese?

    And to go off on a tangent, is there some aspect of a written character that immediately tells you which category of word it is — noun, verb, adjective, etc?

    Are the characters for verbs somehow “conjugated” to indicate whether it is I, you, we, he, etc. who is the actor? Or is it done with a separate character?
