406 for dynamic pages - mod_rewrite the solution???


marcus-miller
07-03-2002, 06:22 AM
Hi all,

The company I work for pays another small firm to submit our pages to the Inktomi database. (I have just picked up this task, so excuse me if some of my info is a bit ropey.)

<<<<<BACKGROUND INFO>>>>>
The pages are dynamic and use a function of my own to parse the .htm URLs and populate the necessary variables with the data needed to query the databases and build the dynamic parts of the pages.

the urls are in the format of

www.domain.com/page/ - with no vars appended to the .htm string
-or-
www.domain.com/page/3-category-name/

where /page/ is a .php page hosted on a Linux box running Apache, and /3-category-name/ is exploded to give category number 3 plus extra details (the category name) for dynamic meta tags, titles and body text.

When the default URL above is called, the PHP script checks whether the variables are present ( if (!isset($var)) { ... } ); where none exists, it assigns a default value.
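
For readers following along, here is a minimal sketch of the parsing described above. It is written in Python for illustration, not the PHP the site actually uses, and it is not the poster's code: the function name parse_category_path and the two default values are assumptions.

```python
from typing import Tuple

# Assumed defaults, used when the URL carries no category segment.
DEFAULT_CATEGORY_ID = 0
DEFAULT_CATEGORY_NAME = "all"


def parse_category_path(path: str) -> Tuple[str, int, str]:
    """Split '/page/3-category-name/' into (page, category id, category name)."""
    # Drop empty pieces produced by leading and trailing slashes.
    segments = [s for s in path.split("/") if s]
    page = segments[0] if segments else ""

    category_id = DEFAULT_CATEGORY_ID
    category_name = DEFAULT_CATEGORY_NAME

    if len(segments) > 1:
        # '3-category-name' -> number '3', name 'category-name'
        number, _, name = segments[1].partition("-")
        if number.isdecimal():
            category_id = int(number)
            category_name = name or DEFAULT_CATEGORY_NAME

    return page, category_id, category_name
```

With this sketch, "/page/3-category-name/" gives the page name, the number 3 and "category-name", while the bare "/page/" falls back to the defaults, which mirrors the isset() check described above.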

----------------------------------------------------------------------------------

The problem we are having is detailed in the following text from an email from the company doing the submission:

We contacted Inktomi and here is the response returned.

Their server is actually serving our crawler an HTTP 406 error code because
it seems it doesn't like the Accept: header which we send (Accept: text/*).
Normally this happens when people try and serve dynamic content based on the
Accept/User-Agent combination, and don't code for crawler accesses. If you
can get them to check their content-negotiation code for errors and update
it so that it can handle text-only clients then that should solve the
problem.

Support
xxxxxxx xxxxxxxxxxx
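
To make the crawler's complaint concrete, here is a rough Python sketch of the check a server has to get right: a page served as text/html should count as acceptable to a client sending "Accept: text/*". The function name is_acceptable is hypothetical, and the sketch deliberately ignores q-values and other parameters, so it is a simplification of real content negotiation, not what Apache does internally.

```python
def is_acceptable(accept_header: str, content_type: str) -> bool:
    """Return True if content_type matches any media range in accept_header.

    Simplification: parameters such as ';q=0.5' or ';charset=...' are ignored.
    """
    # A request with no Accept header is treated as accepting anything.
    if not accept_header.strip():
        return True

    # 'text/html; charset=utf-8' -> type 'text', subtype 'html'
    bare_type = content_type.split(";")[0].strip().lower()
    ctype, _, csubtype = bare_type.partition("/")

    for item in accept_header.split(","):
        media_range = item.split(";")[0].strip().lower()
        rtype, _, rsubtype = media_range.partition("/")
        # '*' on either side of the slash is a wildcard.
        if rtype in ("*", ctype) and rsubtype in ("*", csubtype):
            return True

    return False
```

If a check like this returns False for the crawler's header, the server answers 406; for an HTML page and "Accept: text/*" it should return True.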


I'm not sure if anyone can help, but a little birdy mentioned something to me about some Apache dark arts going by the name of 'mod_rewrite' that may be a solution to this.


Please excuse me if I have overstated some details here, but this is my second post, and I have just had this problem dropped on my desk and don't know where to start.

All replies appreciated.

Thanks in advance
Marcus
:cheers:

highman
07-03-2002, 06:31 AM
Hi, information on this and links to resources are here:

http://www.ihelpyouservices.com/forums/showthread.php?s=&threadid=504&highlight=mod+rewrite

hth

Alan Perkins
07-03-2002, 06:39 AM
Hi Marcus

I saw your post in the other thread but this is a better place to respond.

I requested this page:

http://www.ringtones-direct.com/ringtones/artist-so-solid-crew.htm

using a spider-like HTTP request and received this in the response:

HTTP/1.1 302 Found
Location: http://www.mobilefun.co.uk/ringtones/artist-so-solid-crew.htm

That's bad - a redirect to another site! But let's assume for a moment it will be followed (I'm not sure that it will). I then requested the redirected-to page and received exactly the same response:

HTTP/1.1 302 Found
Location: http://www.mobilefun.co.uk/ringtones/artist-so-solid-crew.htm

Now that's really bad - an infinite loop. No chance of being indexed.

Your dynamic content delivery system is not taking account of a typical spider's request, which normally consists of an HTTP/1.0 request line plus a Host field, a non-Mozilla User-Agent field and a limited Accept field. You need to change your programming to cope with this kind of request and deliver an HTTP 200 response straight off the bat.

That should be enough for you to work it out from there, if you are the programmer. PM me if you require further assistance. ;)
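
For anyone wanting to reproduce this kind of test, here is a minimal Python sketch of a spider-like request. It is not the tool used for the results above. The function names, the ExampleSpider/1.0 User-Agent string and the default "Accept: text/*" value (borrowed from the Inktomi email) are all assumptions for illustration.

```python
import socket
from typing import Optional, Tuple


def build_spider_request(host: str, path: str,
                         user_agent: str = "ExampleSpider/1.0",  # hypothetical UA
                         accept: str = "text/*") -> bytes:
    """Build a bare HTTP/1.0 GET with Host, User-Agent and Accept fields."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        f"User-Agent: {user_agent}",
        f"Accept: {accept}",
        "",  # blank line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_and_location(response: bytes) -> Tuple[int, Optional[str]]:
    """Extract the status code and Location header (if any) from a raw response."""
    head = response.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    status = int(status_line.split()[1])
    location = None
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.strip().lower() == "location":
            location = value.strip()
    return status, location


def fetch(host: str, path: str, port: int = 80, timeout: float = 10.0) -> bytes:
    """Send the spider-like request and return the raw response bytes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_spider_request(host, path))
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling parse_status_and_location(fetch(host, path)) returns the status code and any Location header. A 302 whose Location is the URL just requested is the loop described above, and a 406 is the Accept problem from the Inktomi email.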

marcus-miller
07-03-2002, 07:29 AM
Hi Alan,

thanks for your reply, but you have thrown me into a world of hurt. :)

I'll give you a little bit more background. The ringtones-direct site is a virtual domain on a Cobalt RaQ4 box with one IP that points to the mobile fun site. The redirect is not meant as some kind of trick.

Is there any chance you could send me some details, links etc. on how you did your spider-style HTTP request and how you think I should solve this (with PHP, Apache mod_rewrite???). I'm sure I can sort it out, but I am really at odds with it at the moment.

Thanks in advance..

marcus

marcus@mobilefun.co.uk

P.S. This is a copy of the PM I sent you, for the benefit of others (and me, if I get more help :) ).

ihelpyou
08-03-2002, 04:09 PM
Welcome to the forums marcus-miller! :hi: