(keitai-l) Re: Supported Character Sets for I-mode

From: Nick May <nick_at_kyushu.com>
Date: 01/13/06
Message-Id: <634515B0-9204-4215-97A9-19F05E03ED32@kyushu.com>
[ excellent - lots of people have gone and done testing. Thank you!]

On 13 Jan 2006, at 15:21, Curt Sampson wrote:

> Sorry to be rude,

That's quite all right. As long as you can back it up... ;-)

> but benefits you mentioned in your last post are
> complete rubbish.

Which, alas, you can't... Tush tush!

The figures you quote show the benefits are NOT complete rubbish. (Of  
course you don't get 33% - that was an "ideal" figure that ignored  
markup - as I stated explicitly. So if you wanted to say the figures  
are rubbish - yes of course - re-read the original post and you will  
see all sorts of caveats.)
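
For illustration, a minimal sketch of where an "ideal" figure of about 33% comes from, assuming text made up purely of kanji and kana with no markup at all. The sample string is hypothetical; the codec names are Python's standard ones.

```python
# Sketch: bytes needed for markup-free Japanese text in each encoding.
SAMPLE = "日本語のテキスト"  # hypothetical sample: 8 kanji/kana, no markup


def encoded_size(text, encoding):
    """Return the number of bytes `text` occupies in `encoding`."""
    return len(text.encode(encoding))


utf8 = encoded_size(SAMPLE, "utf-8")       # 3 bytes per character here
sjis = encoded_size(SAMPLE, "shift_jis")   # 2 bytes per character here
eucjp = encoded_size(SAMPLE, "euc_jp")     # 2 bytes per character here

ideal_saving = 1 - sjis / utf8
print(utf8, sjis, eucjp, f"{ideal_saving:.1%}")  # 24 16 16 33.3%
```

Any 1-byte ASCII markup costs the same in all three encodings, so it dilutes this figure on a real page.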

> Uncompressed, EUC-JP and Shift-JIS are about 8% smaller than UTF-8.
> compressed, about 7-8% smaller.

Excellent! - a 7 to 8% reduction on my bandwidth bills, a 7 to 8%  
increase in peak capacity. That's about what I would expect as a real  
world benefit (it depends on the amount of markup of course).  
Definitely worth having.
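
Anyone wanting to repeat this sort of measurement on their own pages could start from something like the sketch below. It assumes zlib's deflate at level 9 as a stand-in for whatever compression the server really applies, and the page fragment is hypothetical; the thread does not say how the quoted figures were produced.

```python
import zlib

ENCODINGS = ("utf-8", "euc_jp", "shift_jis")


def measure(page):
    """Map each encoding to (raw_bytes, deflate_compressed_bytes)."""
    sizes = {}
    for enc in ENCODINGS:
        raw = page.encode(enc)
        sizes[enc] = (len(raw), len(zlib.compress(raw, 9)))
    return sizes


if __name__ == "__main__":
    # Hypothetical fragment; substitute real page source for meaningful figures.
    page = "<p>日本語</p>"
    for enc, (raw, packed) in measure(page).items():
        print(f"{enc:10s} raw={raw:5d} compressed={packed:5d}")
```

A fragment this small says nothing useful about compression ratios; the comparison only becomes meaningful on full-sized pages.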

What is PARTICULARLY interesting is:

1) your figures suggest that UTF-8 is NOT more compressible than
EUC-JP and Shift-JIS, despite claims that have been made that it is.

2) your figures are for a certain ratio of markup to content - the
saving would be lower if one had more (1-byte) markup, higher if one
had less. So the more one moves to stylesheets and XHTML 1.0, THE
BIGGER THE BENEFIT!
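
The dependence on the markup ratio can be sketched with a simple uncompressed model. It assumes markup costs 1 byte per character in every encoding and that each Japanese character costs 3 bytes in UTF-8 and 2 bytes in EUC-JP or Shift-JIS; the function name and the sample numbers are made up for illustration.

```python
def two_byte_saving(markup_bytes, japanese_chars):
    """Fractional size reduction of a two-byte encoding relative to UTF-8.

    Assumes 1-byte markup in both encodings, and Japanese characters that
    take 3 bytes in UTF-8 and 2 bytes in EUC-JP/Shift-JIS.
    """
    utf8_size = markup_bytes + 3 * japanese_chars
    two_byte_size = markup_bytes + 2 * japanese_chars
    return (utf8_size - two_byte_size) / utf8_size


print(f"{two_byte_saving(0, 100):.1%}")    # no markup, the ideal case: 33.3%
print(f"{two_byte_saving(950, 100):.1%}")  # 9.5 markup bytes per character: 8.0%
print(f"{two_byte_saving(300, 100):.1%}")  # leaner markup: 16.7%
```

Under this model an 8% saving corresponds to roughly 9.5 bytes of markup per Japanese character, and trimming the markup pushes the saving back up towards the ideal figure.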



> For the compressed pages, you have
> to send 10 packets rather than 9, which in a typical TCP connection
> will increase download time by perhaps 3-4% (it's the latency for the
> connection setup and request/response turnaround that eats a lot of
> time in requests this size).

Indeed. So you admit a 10% reduction in packets sent (9 rather than
10). And a page that your browser will start to display faster too.
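
The packet arithmetic can be sketched as follows. The 1460-byte payload per packet and the 14,000-byte response size are assumptions chosen for illustration, not figures from this thread.

```python
import math

MSS = 1460  # assumed TCP payload bytes per packet; not taken from the thread


def packets_needed(payload_bytes, mss=MSS):
    """Number of full or partial packets needed to carry the payload."""
    return math.ceil(payload_bytes / mss)


utf8_bytes = 14_000                         # hypothetical compressed UTF-8 response
two_byte_bytes = round(utf8_bytes * 0.92)   # 8% smaller, per the quoted figures

utf8_packets = packets_needed(utf8_bytes)
two_byte_packets = packets_needed(two_byte_bytes)
reduction = (utf8_packets - two_byte_packets) / utf8_packets
print(utf8_packets, two_byte_packets, f"{reduction:.0%}")  # 10 9 10%
```

Whether a packet is actually saved depends on where the two sizes fall relative to a packet boundary, so the packet count will not drop for every page.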

So - you quote a large number of figures that clearly demonstrate
the benefits of a two-byte encoding scheme. Faster page load time.
Lower bandwidth bills. Greater peak capacity. Thank you.

So how are these benefits rubbish? Are you going to explain to the
guy with the modem why his phone bill is 8% higher when he accesses a
UTF-8 site?

You may say that they are not worth trading off against the other
benefits of Unicode in your problem domain, but that is a different
argument.

A final thought - if a version of the Firebird browser were available
that let us download pages 5% faster, which of us would not grab it?

7 to 8% - what shall we call it - the "UTF-8 tax"?


Nick
Received on Fri Jan 13 09:38:24 2006