(keitai-l) Re: PHP and Japanese characters: Looking for wisdom

From: James Holmes <nattotastic_at_gmail.com>
Date: 06/05/07
Message-ID: <008f01c7a760$d01cb5f0$020ba8c0@TYPHOON>
> Ideally, you use the character set encoding from the Content-type
> header. If you can figure out how to get this from Javascript, I'd be
> pretty interested to hear about it.

Couldn't you do it using AJAX?

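// Synchronous request (third argument false), so the response
// headers are available as soon as send() returns.
var req = new XMLHttpRequest();
req.open('GET', 'your-url-here.com', false);
req.send(null);

if (req.status == 200)
{
     // The value looks something like "text/html; charset=Shift_JIS"
     alert(req.getResponseHeader('Content-type'));
}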
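
The header comes back as one string, so the charset still has to be pulled out of it. A minimal sketch, assuming the usual "type/subtype; charset=name" layout; parseCharset is a hypothetical helper name, not a browser API:

```javascript
// Hypothetical helper: extract the charset parameter from a Content-Type value.
// Returns null when the header is missing or carries no charset parameter.
function parseCharset(contentType) {
    if (!contentType) {
        return null;
    }
    // Matches ;charset=value, with or without double quotes around the value.
    var match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType);
    return match ? match[1] : null;
}
```

It would be called as parseCharset(req.getResponseHeader('Content-type')). Note that the browser only exposes response headers for same-origin requests (or ones the server explicitly allows), so this is mainly useful for re-fetching the page the script is already running on.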

-James

----- Original Message ----- 
From: "Curt Sampson" <cjs@cynic.net>
To: <keitai-l@appelsiini.net>
Sent: Tuesday, June 05, 2007 6:58 PM
Subject: (keitai-l) Re: PHP and Japanese characters: Looking for wisdom


> On Fri, 1 Jun 2007, Erick Papadakis wrote:
> 
>> My problem is that Japan seems to have had a devil of a time getting
>> to standardize its character sets! Some big sites like isize.com use
>> Shift_JIS, while others such as Goo or Mixi use EUC-JP, while several
>> of the more modern ones (such as blogs) use UTF-8.
> 
> Some use all sorts. Starling's generally use UTF-8, but we convert
> everything to Shift_JIS (on the fly) for Docomo phones.
> 
>> When we capture the TITLE (document.title) from these websites, and
>> then "rawurldecode" the received text in PHP, the string comes up
>> jumbled. If we knew the standard character set before hand, we could
>> have used the right mb_convert_encoding and such, but this is now an
>> issue.
> 
> Ideally, you use the character set encoding from the Content-type
> header. If you can figure out how to get this from Javascript, I'd be
> pretty interested to hear about it.
> 
> If there's a META tag, as Christopher pointed out, you can give that a
> try. But not everybody uses it (for good reason, actually, for those of
> us who do on-the-fly conversion), and the encoding from the content-type
> header overrides it, anyway.
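
The precedence described above (header first, META tag only as a fallback) could be sketched roughly like this. chooseCharset is a hypothetical name, and the regular expressions are a simplification that assumes reasonably well-formed markup:

```javascript
// Hypothetical helper: choose a page's encoding, preferring the HTTP header.
function chooseCharset(contentTypeHeader, html) {
    var paramPattern = /charset\s*=\s*["']?([^"';\s>]+)/i;

    // 1. The Content-Type header wins when it carries a charset parameter.
    var fromHeader = paramPattern.exec(contentTypeHeader || '');
    if (fromHeader) {
        return fromHeader[1];
    }

    // 2. Otherwise look for a META tag that declares a charset.
    var metaTags = (html || '').match(/<meta\b[^>]*>/gi) || [];
    for (var i = 0; i < metaTags.length; i++) {
        var fromMeta = paramPattern.exec(metaTags[i]);
        if (fromMeta) {
            return fromMeta[1];
        }
    }

    // 3. Nothing declared: the caller has to guess or fall back to a default.
    return null;
}
```

When neither source declares a charset the function returns null, which leaves the guessing (for example with PHP's mb_detect_encoding) to the caller.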
> 
>> Would appreciate any insight into how you have solved the issue of
>> different in-coming text into programs.
> 
> For me over the past seven or eight years, mostly, it's been about
> dealing with forms, and I generally just put a hidden text field in the
> form with the character set encoding. (Browsers always post using the
> encoding in which they received the page containing the form from which
> they're posting.)
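
Once the charset is known, whether from a hidden field like this or from the header, the percent-encoded bytes can be decoded with it instead of being treated as one fixed encoding. A sketch in JavaScript rather than PHP, assuming a runtime that provides TextDecoder with the legacy Japanese encodings available; decodePercentEncoded is a hypothetical name:

```javascript
// Hypothetical helper: decode a percent-encoded string whose underlying bytes
// are in `charset` (for example the value carried in a hidden form field).
function decodePercentEncoded(text, charset) {
    var bytes = [];
    for (var i = 0; i < text.length; i++) {
        var hex = text.substr(i + 1, 2);
        if (text.charAt(i) === '%' && /^[0-9a-fA-F]{2}$/.test(hex)) {
            bytes.push(parseInt(hex, 16));          // %XX is one raw byte
            i += 2;
        } else {
            // Anything outside an escape is assumed to be plain ASCII.
            bytes.push(text.charCodeAt(i) & 0xff);
        }
    }
    // TextDecoder throws a RangeError for labels it does not support.
    return new TextDecoder(charset).decode(new Uint8Array(bytes));
}
```

For instance, decodePercentEncoded('%E6%97%A5%E6%9C%AC', 'utf-8') gives 日本; the same two characters sent from a Shift_JIS page would arrive as '%93%FA%96%7B' and need 'shift_jis' as the second argument.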
> 
> cjs
> -- 
> Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974
> 
> Mobile sites and software consulting: http://www.starling-software.com
> 
Received on Tue Jun 5 14:01:17 2007