--- In general it just seems "wrong" to presuppose that a certain
--- url will serve content in a set format.
I don't really have a preference either way, but despite the drift towards
'all content is dynamic' (and hence able to be 'sniff-n-served' based on
USER-AGENT), high-traffic sites will continue to make the concession to
[processing power|database access] and serve static pages (HTML, cHTML, WML,
XML, ...) that have been preprocessed/generated.
By definition this results in "a certain url will serve... a set format", and
prevents the 'disappearing META' solution.
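For what it's worth, 'sniff-n-serve' here just means branching on the
USER-AGENT header. A minimal sketch in Python (the substrings and file names
are purely illustrative, not anyone's real setup):

import os

def choose_template(user_agent):
    # crude sniff-n-serve: pick a page variant from the USER-AGENT header
    # (substrings and file names below are made up for illustration)
    if "DoCoMo" in user_agent:
        return "index.chtml"   # i-mode / cHTML version
    if "UP.Browser" in user_agent:
        return "index.wml"     # WAP / WML version
    return "index.html"        # default desktop HTML

# CGI-style usage: the web server exposes the header as HTTP_USER_AGENT
page = choose_template(os.environ.get("HTTP_USER_AGENT", ""))

The point is simply that a busy site would rather pregenerate all three files
than run even this much logic on every request.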
Thinking about my previous email, using ROBOTS.TXT with a USER-AGENT filter
for keitai crawlers *could* work for everyone.
Its very presence could indicate (to a spider so trained) that the current
site *has support* for the content type.
The Disallow: list could then be as long or short as required, and works
whether your pages 'sniff-n-serve' or live in content-specific sub-dirs. The
record for the keitai USER-AGENT could be as simple as
User-agent: Docomo
Disallow:
(the spec wants at least one Disallow line per record; an empty one disallows
nothing, so the record's mere presence does the signalling).
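If the i-mode pages live in their own sub-dir, the same record can just as
easily fence off everything else; the paths here are made up, purely for
illustration:
User-agent: Docomo
Disallow: /desktop/
Disallow: /cgi-bin/
# /imode/ (or wherever the cHTML pages sit) is left crawlable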
--- Your engine identifying itself is very important
YES. Interestingly, my site was crawled by Google on 1 Feb, with
User-agent: Googlebot/2.1+(+http://www.googlebot.com/bot.html)
Does Google use a 'different' agent when crawling for i-mode, and if so,
what? Has anyone on the list been crawled by an i-mode-savvy spider? What
USER-AGENT did it send?
cd
[ Did you check the archives? http://www.appelsiini.net/keitai-l/ ]