using robots.txt is a nice idea!
and so simple to set up.
the mere presence of a line that says:
User-agent: DoCoMo/*
would indicate that there is i-mode content to be found here.
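a fuller stanza along those lines might look like this (the paths are made up
for illustration; robots.txt matches User-agent by token, so the plain string
"DoCoMo" should be enough without a wildcard):

```
# hypothetical robots.txt -- section names and paths are examples only
User-agent: DoCoMo
Disallow: /desktop/       # full-size HTML pages, not for i-mode

User-agent: *
Disallow: /imode/         # keep ordinary crawlers out of the phone pages
```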
but, does Google take the robots.txt file seriously?
i've seen other well known search engines (although not Google)
blast right into directories that i listed as "Disallow".
-rolf
"Craig Dunn" <craig.dunn@conceptdevelopment.net> writes:
...
>
> my 2¢ on META tags:
>
> * The existing ROBOTS.TXT spec allows for rules by USER-AGENT. So if Google
> tells us what they're sending, we could add
> User-agent: Docomo/Google_ROBOT # or whatever
> Disallow: /'dirs-pages that aren't imode' # multiple lines OK
> which would narrow their search dramatically AND fit within existing and
> accepted crawling rules. Note this assumes Google is crawling with a
> different USER-AGENT for their imode catalog than they do for their web
> catalog.
> This works for HDML, WML, whatever IF you divide content by directory -- but
> not if all your pages are doing USER-AGENT sniffing and serving a custom
> presentation for each device.
>
> * if you want a META tag and you're doing major USER-AGENT checking already,
> just insert the META when ROBOT is in the USER-AGENT string. typically
> robots don't like pages being 'customised' for them, but in this case it'll
> save all your *other* users from a pointless 20 or so bytes of download per
> page.
...
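the conditional-META idea in the quoted post could be sketched like this -- a
hedged example, not anyone's actual code: the function name, the HTML, and the
substring checks ("ROBOT", "GOOGLEBOT") are all made up here as one plausible
way to spot a crawler in the USER-AGENT string:

```python
def render_head(user_agent):
    """Build the <head> block, adding a robots META tag only for crawlers."""
    head = ["<head>", "<title>menu</title>"]
    ua = user_agent.upper()
    # hypothetical crawler check: only bots pay the extra ~20 bytes per page
    if "ROBOT" in ua or "GOOGLEBOT" in ua:
        head.append('<meta name="robots" content="index,follow">')
    head.append("</head>")
    return "\n".join(head)

# a crawler gets the META tag; an ordinary i-mode handset does not
print(render_head("Googlebot/2.1"))
print(render_head("DoCoMo/1.0/P502i"))
```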
[ Did you check the archives? http://www.appelsiini.net/keitai-l/ ]
Received on Fri Feb 9 12:12:09 2001