> Ideally, you use the character set encoding from the Content-type
> header. If you can figure out how to get this from Javascript, I'd be
> pretty interested to hear about it.
Couldn't you do it using AJAX?
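var req = new XMLHttpRequest();
// Synchronous GET (third argument is false). The URL needs a scheme, and
// it must be on the same origin as the page running the script.
req.open('GET', 'http://your-url-here.com/', false);
req.send(null);
if (req.status === 200)
{
    // The header value looks like "text/html; charset=Shift_JIS".
    alert(req.getResponseHeader('Content-Type'));
}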
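
If you want just the charset rather than the whole header, something like the following might work. It is only a sketch: parseCharset and fetchCharset are names I made up, and it assumes the header carries a charset parameter at all (many servers leave it off, in which case you get null and are back to the META tag or guessing). The request is asynchronous here so the page doesn't freeze while waiting.

```javascript
// Hypothetical helper: extract the charset parameter from a Content-Type
// header value such as "text/html; charset=Shift_JIS".
// Returns null when the header is missing or has no charset parameter.
function parseCharset(contentType) {
    if (!contentType) {
        return null;
    }
    // Accept an optionally double-quoted value, e.g. charset="UTF-8".
    var match = /;\s*charset\s*=\s*("?)([^";\s]+)\1/i.exec(contentType);
    return match ? match[2] : null;
}

// Hypothetical helper: request the URL asynchronously and pass the charset
// (or null) to the callback. Same-origin restrictions apply to the URL.
function fetchCharset(url, callback) {
    var req = new XMLHttpRequest();
    req.open('GET', url, true);
    req.onreadystatechange = function () {
        if (req.readyState === 4) {
            var header = req.status === 200
                ? req.getResponseHeader('Content-Type')
                : null;
            callback(parseCharset(header));
        }
    };
    req.send(null);
}
```

For the page the script is already running in, you could call fetchCharset(window.location.href, function (cs) { ... }) and send the result along with the title, so the PHP side knows which encoding to hand to mb_convert_encoding.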
-James
----- Original Message -----
From: "Curt Sampson" <cjs@cynic.net>
To: <keitai-l@appelsiini.net>
Sent: Tuesday, June 05, 2007 6:58 PM
Subject: (keitai-l) Re: PHP and Japanese characters: Looking for wisdom
> On Fri, 1 Jun 2007, Erick Papadakis wrote:
>
>> My problem is that Japan seems to have had a devil of a time getting
>> to standardize its character sets! Some big sites like isize.com use
>> Shift_JIS, while others such as Goo or Mixi use EUC-JP, while several
>> of the more modern ones (such as blogs) use UTF-8.
>
> Some use all sorts. Starling's generally use UTF-8, but we convert
> everything to Shift_JIS (on the fly) for Docomo phones.
>
>> When we capture the TITLE (document.title) from these websites, and
>> then "rawurldecode" the received text in PHP, the string comes up
>> jumbled. If we knew the standard character set beforehand, we could
>> have used the right mb_convert_encoding and such, but this is now an
>> issue.
>
> Ideally, you use the character set encoding from the Content-type
> header. If you can figure out how to get this from Javascript, I'd be
> pretty interested to hear about it.
>
> If there's a META tag, as Christopher pointed out, you can give that a
> try. But not everybody uses it (for good reason, actually, for those of
> us who do on-the-fly conversion), and the encoding from the content-type
> header overrides it, anyway.
>
>> Would appreciate any insight into how you have solved the issue of
>> different in-coming text into programs.
>
> For me over the past seven or eight years, mostly, it's been about
> dealing with forms, and I generally just put a hidden text field in the
> form with the character set encoding. (Browsers always post using the
> encoding in which they received the page containing the form from which
> they're posting.)
>
> cjs
> --
> Curt Sampson <cjs@cynic.net> +81 90 7737 2974
>
> Mobile sites and software consulting: http://www.starling-software.com
>
Received on Tue Jun 5 14:01:17 2007