--- In general it just seems "wrong" to presuppose that a certain
--- url will serve content in a set format.
I don't really have a preference either way, but despite the drift towards
'all content is dynamic' (and hence able to be 'sniff-n-served' based on
USER-AGENT), high-traffic sites will continue to make the concession to
[processing power|database access] and serve static pages (HTML, cHTML, WML,
XML, ...) that have been preprocessed/generated.
By definition this results in "a certain url will serve... a set format", and
prevents the 'disappearing META' solution.
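For what it's worth, 'sniff-n-serve' here just means branching on the
USER-AGENT header. A minimal sketch in Python (the substrings and file names
are purely illustrative, not anyone's real setup):

import os

def choose_template(user_agent):
    # crude sniff-n-serve: pick a page variant from the USER-AGENT header
    # (substrings and file names below are made up for illustration)
    if "DoCoMo" in user_agent:
        return "index.chtml"   # i-mode / cHTML version
    if "UP.Browser" in user_agent:
        return "index.wml"     # WAP / WML version
    return "index.html"        # default desktop HTML

# CGI-style usage: the web server exposes the header as HTTP_USER_AGENT
page = choose_template(os.environ.get("HTTP_USER_AGENT", ""))

The point is simply that a busy site would rather pregenerate all three files
than run even this much logic on every request.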
Thinking about my previous email, using ROBOTS.TXT with a USER-AGENT filter
for keitai crawlers *could* work for everyone.
Its very presence could indicate (to a spider so trained) that the current
site *has support* for the content type.
The Disallow: list could then be as long or short as required, and works
whether your pages 'sniff-n-serve' or live in content-specific sub-dirs. The
record for the keitai USER-AGENT could be as simple as
User-agent: Docomo
Disallow:
(the spec wants at least one Disallow line per record; an empty one disallows
nothing, so the record's mere presence does the signalling).
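If the i-mode pages live in their own sub-dir, the same record can just as
easily fence off everything else; the paths here are made up, purely for
illustration:
User-agent: Docomo
Disallow: /desktop/
Disallow: /cgi-bin/
# /imode/ (or wherever the cHTML pages sit) is left crawlable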
--- Your engine identifying itself is very important
YES. Interestingly, my site was crawled by Google on 1 Feb, with
User-agent: Googlebot/2.1+(+http://www.googlebot.com/bot.html)
Does Google use a 'different' agent when crawling for i-mode, and if so,
what? Has anyone on the list been crawled by an i-mode-savvy spider? What
USER-AGENT did it send?
cd
[ Did you check the archives? http://www.appelsiini.net/keitai-l/ ]