On Sun, 16 Jan 2005, Alex Shinn wrote:
> The order you are using is the standard JIS order (almost - in all
> cases the small form sorts before the large form, which you have right
> for A, I, U, E, and O, but not YA, YU, YO or WA).
Thanks for this correction. A second look at my denshi-jishou confirms
that it does indeed sort small ya/yu/yo before large.
> If the data is stored in the database with a JIS-based encoding (any
> of the standard Japanese encodings, plus Unicode also preserves this
> order) then PostgreSQL will sort this properly.
> ...for Hiragana you don't need to do anything special.
This is not true, because a sort based on the numerical representation
of the kana can't give dakuon a lower precedence than the kana that
follow the voiced kana. For example, 「じゃきょう」 sorts before
「しゃく」 in my dictionary, but with a sort based on character codes,
じ (0x3058) comes after し (0x3057), and so じゃきょう would sort after
even 「しんぬ」.
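
To make the difference concrete, here is a minimal sketch in Python (not
anything PostgreSQL does itself). It assumes the readings are stored as
precomposed Unicode hiragana, and the name kana_sort_key is made up for
this example. It compares the kana with their voicing marks stripped
first and uses the original string only to break ties:

```python
import unicodedata

# Combining voiced and semi-voiced sound marks (dakuten, handakuten).
VOICING_MARKS = {"\u3099", "\u309a"}

def kana_sort_key(reading):
    """Hypothetical helper: compare unvoiced kana first, voicing last."""
    # NFD splits じ (U+3058) into し (U+3057) plus U+3099.
    decomposed = unicodedata.normalize("NFD", reading)
    base = "".join(ch for ch in decomposed if ch not in VOICING_MARKS)
    # The original string is only a tie-breaker between equal bases.
    return (base, reading)

words = ["しんぬ", "じゃきょう", "しゃく"]

by_codepoint = sorted(words)                      # plain character-code order
by_dictionary = sorted(words, key=kana_sort_key)  # voicing is secondary

print(by_codepoint)   # ['しゃく', 'しんぬ', 'じゃきょう']
print(by_dictionary)  # ['じゃきょう', 'しゃく', 'しんぬ']
```

This only covers the voicing marks. Other dictionary conventions (the
long-vowel mark in katakana, for instance) would need their own handling.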
> Kanji is the difficult thing to sort, which PostgreSQL can't handle
> because the characters have different pronunciations in different
> contexts and you would need full NLS to figure out the right one.
If names are involved, an NLS won't do it for Japanese. The *only* thing
that will work properly in all instances is if you store the reading as
well as the kanji.
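
As a rough sketch of what storing the reading means in practice (in
Python, with a made-up Person record and illustrative readings, not data
from any real table), each entry carries its own kana reading and the
sort uses that field rather than the kanji:

```python
from dataclasses import dataclass

@dataclass
class Person:
    """Hypothetical record: the kanji as written plus its stored reading."""
    kanji: str
    reading: str  # hiragana reading supplied along with the name

people = [
    Person("東", "ひがし"),
    Person("東", "あずま"),   # same kanji, different reading
    Person("長田", "おさだ"),
]

# Sort on the stored reading; the kanji plays no part in the ordering.
# A plain string comparison is used here to keep the example short.
by_reading = sorted(people, key=lambda p: p.reading)

for p in by_reading:
    print(p.reading, p.kanji)
```

The two entries written 東 end up in different places, which no amount
of analysis of the kanji alone could decide.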
cjs
--
Curt Sampson <cjs@cynic.net> +81 90 7737 2974
*** Contribute to the Keitai Developers' Wiki! ***
*** http://www.keitai-dev.net/wiki/ ***
Received on Mon Jan 17 07:58:06 2005