I want to kill this compression thing once and for all. Too many of
us (me included) are making statements for which we do not have hard
evidence...
Before I do any testing, are people agreed that the code below
will let me compare post-compression lengths for various encoding
formats, given an input file of jpfile.txt? I don't want to mess
around with something and then have to field criticisms later.
Note: I don't know if PHP can do UTF-16, so that bit may not work.
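One way to settle the UTF-16 question before running anything might be
to ask mbstring which encodings it knows about. This is only a sketch,
assuming the mbstring extension is loaded; the variable names ($wanted,
$available, $supported) are made up for illustration.

```php
<?php
// Sketch: check whether mbstring lists each encoding the test needs.
$wanted = array("SJIS", "EUC-JP", "UTF-8", "UTF-16");

// mb_list_encodings() returns the names of all encodings mbstring supports.
$available = mb_list_encodings();

$supported = array();
foreach ($wanted as $name) {
    $supported[$name] = in_array($name, $available);
    echo $name . ": " . ($supported[$name] ? "supported" : "not listed") . "\n";
}
```

If UTF-16 comes back as not listed, that leg of the test can simply be
dropped.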
It is knocked up, verbose code designed for readability. I have not
tested it yet.
Is this OK test code for PHP?
Nick
// grabs test file jpfile.txt
// determines current encoding
// converts from that to a specified encoding
// gzips converted text (including headers...)
// gets length of the variable containing the gzipped data
// repeats for various encodings
// TO BE DONE: echo it all...
// put file content in variable
$str = file_get_contents("jpfile.txt");
// get encoding from list of possibilities.
$encoding= mb_detect_encoding($str,
"ASCII,JIS,UTF-8,UTF-16,ISO-8859-1,EUC-JP,SJIS");
// convert to sjis
$str_sjis = mb_convert_encoding($str, "SJIS" , $encoding);
// gzip
$str_sjis_gz = gzencode($str_sjis, 9);
// get length of gzipped string
$str_len_sjis_gz = strlen($str_sjis_gz);
// rinse, repeat.
$str_eucjp = mb_convert_encoding($str, "EUC-JP" , $encoding);
$str_eucjp_gz = gzencode($str_eucjp, 9);
$str_len_eucjp_gz = strlen($str_eucjp_gz);
$str_utf8 = mb_convert_encoding($str, "UTF-8" , $encoding);
$str_utf8_gz = gzencode($str_utf8, 9);
$str_len_utf8_gz = strlen($str_utf8_gz);
$str_utf16 = mb_convert_encoding($str, "UTF-16" , $encoding);
$str_utf16_gz = gzencode($str_utf16, 9);
$str_len_utf16_gz = strlen($str_utf16_gz);
// echo it all...
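
For the "echo it all" step, the repeated convert/gzip/strlen lines
could be folded into a loop that also prints the results. What follows
is a rough sketch rather than tested code: it assumes the mbstring and
zlib extensions are loaded, it uses a made-up sample string in place of
jpfile.txt so that it runs on its own, and it assumes the script itself
is saved as UTF-8. The names $targets and $lengths are illustrative.

```php
<?php
// Sketch: same steps as above, looped over the target encodings.
// A repeated sample sentence stands in for the contents of jpfile.txt.
$str = str_repeat("日本語のテキストを圧縮するテストです。", 50);
$encoding = "UTF-8"; // assumed encoding of the sample string above

$targets = array("SJIS", "EUC-JP", "UTF-8", "UTF-16");
$lengths = array();

foreach ($targets as $target) {
    // convert, then record both the raw and the gzipped byte counts
    $converted = mb_convert_encoding($str, $target, $encoding);
    $lengths[$target] = array(
        "raw" => strlen($converted),
        "gz"  => strlen(gzencode($converted, 9)),
    );
}

// echo it all...
foreach ($lengths as $target => $len) {
    echo $target . ": " . $len["raw"] . " bytes raw, "
        . $len["gz"] . " bytes gzipped\n";
}
```

Printing the raw length next to the gzipped length makes it easier to
see how much of any difference comes from the encoding itself and how
much from compression. With a real file, $str and $encoding would come
from the file read and mb_detect_encoding() call above instead.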
Received on Fri Jan 13 08:59:25 2006