using robots.txt is a nice idea!
and so simple to set up.
the mere presence of a line that says:
User-agent: DoCoMo/*
would indicate that there is i-mode content to be found here.
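a fuller stanza along those lines might look like this (the paths are made up
for illustration; robots.txt matches User-agent by token, so the plain string
"DoCoMo" should be enough without a wildcard):

```
# hypothetical robots.txt -- section names and paths are examples only
User-agent: DoCoMo
Disallow: /desktop/       # full-size HTML pages, not for i-mode

User-agent: *
Disallow: /imode/         # keep ordinary crawlers out of the phone pages
```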
but, does Google take the robots.txt file seriously?
i've seen other well known search engines (although not Google)
blast right into directories that i listed as "Disallow".
-rolf
"Craig Dunn" <craig.dunn@conceptdevelopment.net> writes:
...
>
> my 2¢ on META tags:
>
> * The existing ROBOTS.TXT spec allows for rules by USER-AGENT. So if Google
> tells us what they're sending, we could add
> User-agent: Docomo/Google_ROBOT # or whatever
> Disallow: /'dirs-pages that aren't imode' # multiple lines OK
> which would narrow their search dramatically AND fit within existing and
> accepted crawling rules. Note this assumes Google is crawling with a
> different USER-AGENT for their imode catalog than they do for their web
> catalog.
> This works for HDML, WML, whatever IF you divide content by directory -- but
> not if all your pages are doing USER-AGENT sniffing and serving a custom
> presentation for each device.
>
> * if you want a META tag and you're doing major USER-AGENT checking already,
> just insert the META when ROBOT is in the USER-AGENT string. typically
> robots don't like pages being 'customised' for them, but in this case it'll
> save all your *other* users from a pointless 20 or so bytes of download per
> page.
...
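the conditional-META idea in the quoted post could be sketched like this -- a
hedged example, not anyone's actual code: the function name, the HTML, and the
substring checks ("ROBOT", "GOOGLEBOT") are all made up here as one plausible
way to spot a crawler in the USER-AGENT string:

```python
def render_head(user_agent):
    """Build the <head> block, adding a robots META tag only for crawlers."""
    head = ["<head>", "<title>menu</title>"]
    ua = user_agent.upper()
    # hypothetical crawler check: only bots pay the extra ~20 bytes per page
    if "ROBOT" in ua or "GOOGLEBOT" in ua:
        head.append('<meta name="robots" content="index,follow">')
    head.append("</head>")
    return "\n".join(head)

# a crawler gets the META tag; an ordinary i-mode handset does not
print(render_head("Googlebot/2.1"))
print(render_head("DoCoMo/1.0/P502i"))
```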
[ Did you check the archives? http://www.appelsiini.net/keitai-l/ ]
Received on Fri Feb 9 12:12:09 2001