
Recommendations for Indexing a Large Catalog


jbernat
26-08-2005, 15:22/03:22PM
What would be the recommended way to optimize a commerce site with a catalog of approximately 15,000 automotive publications?

The catalog is cross-referenced by vehicle make, model, and year, as well as by type of publication (Owner Manual, Service Manual, Labor Time Guide, Technical Service Bulletin, Wiring Diagram, etc.)

The site currently has a custom product search engine to take the parameters listed above and return pages of results.

Unfortunately, the spider will not get past the form, so almost no product pages are indexed.

Should a decision tree of static pages be generated from the data on a semi-regular basis (as product information is updated) to provide a "browse" or "drill-down" navigation alternative to the product search navigation?

The product pages are dynamic URLs with parameters, but my understanding is that Google will index a dynamic URL linked from a static page. (It just won't go any deeper beyond that dynamic page.) True?

If this decision tree resulted in 10,000-20,000 new pages being added, how would the engines react?

I can think of at least two ways to slice the tree:
1. Make > Model > Year (ending at a list of applicable publication types)
2. Pub Type > Make > Model (ending at a list of model years for which the pubs apply)
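As a rough illustration of the two slicings above, here is a minimal Python sketch that builds both browse trees from the same product rows. The sample rows, the field names, and the `build_tree` helper are all assumptions made for the example; the real catalog schema is not shown in this thread.

```python
from collections import defaultdict

# Hypothetical catalog rows: (make, model, year, pub_type).
CATALOG = [
    ("Ford", "Mustang", 1994, "Owner Manual"),
    ("Ford", "Mustang", 1995, "Owner Manual"),
    ("Ford", "Mustang", 1995, "Wiring Diagram"),
    ("Chevrolet", "Camaro", 1995, "Service Manual"),
]

FIELD_NAMES = ("make", "model", "year", "pub_type")  # assumed field names


def build_tree(rows, key_order):
    """Nest rows three levels deep following key_order.

    The fourth key in key_order becomes the sorted list at each leaf,
    i.e. the links shown on the last drill-down page.
    """
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(set)))
    for row in rows:
        record = dict(zip(FIELD_NAMES, row))
        first, second, third, leaf = (record[k] for k in key_order)
        tree[first][second][third].add(leaf)
    # Convert to plain dicts with sorted leaf lists for stable output.
    return {
        a: {b: {c: sorted(leaves) for c, leaves in cs.items()}
            for b, cs in bs.items()}
        for a, bs in tree.items()
    }


# Slicing 1: Make > Model > Year, ending at publication types.
by_vehicle = build_tree(CATALOG, ("make", "model", "year", "pub_type"))

# Slicing 2: Pub Type > Make > Model, ending at model years.
by_pub_type = build_tree(CATALOG, ("pub_type", "make", "model", "year"))
```

Because both trees come from the same rows, generating either one (or both) on a schedule is mostly a matter of choosing `key_order`.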

Should both be implemented? Current keyword analysis seems to favor strategy #2, but others might opt for #1.

Are there completely different alternatives to this strategy?

I welcome your thoughts and feedback.

Thanks,
Jim

dvduval
27-08-2005, 01:51/01:51AM
I think either method would work, and you might want to employ Mod Rewrite, as it often gets the search engines to index a little better. Using good page titles, good navigation and linking the breadcrumb trail, you should be able to get some pretty decent indexing (I assume the pages each contain sufficient unique content). A nice bonus would be to create a Google Sitemap and a Froogle Feed (and they would both be pretty easy to implement). Let me know if you need any help.
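To make the rewriting idea concrete, here is a small Python sketch of the mapping a rewrite rule would perform: a static-looking path on the outside, the existing query-string URL on the inside. The parameter names, the `/product.php` script name, and both helper functions are hypothetical; on Apache the translation itself would normally live in a rewrite rule rather than in application code.

```python
from urllib.parse import urlencode

# Assumed query-string parameter names, in path order.
FIELDS = ("pub_type", "make", "model", "year")


def slugify(value):
    """Lowercase a value and join its words with hyphens."""
    return "-".join(str(value).lower().split())


def static_path(pub_type, make, model, year):
    """Build the static-looking path that would be linked from browse pages."""
    parts = [slugify(v) for v in (pub_type, make, model, year)]
    return "/" + "/".join(parts) + "/"


def to_dynamic_url(path, script="/product.php"):
    """Translate a static-looking path back to the query-string form."""
    parts = [p for p in path.split("/") if p]
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d path segments, got %d"
                         % (len(FIELDS), len(parts)))
    return script + "?" + urlencode(dict(zip(FIELDS, parts)))


path = static_path("Owner Manual", "Ford", "Mustang", 1995)
dynamic = to_dynamic_url(path)
```

Here `path` is `/owner-manual/ford/mustang/1995/`, and `dynamic` is the parameterized URL the existing product page would actually receive.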

jbernat
29-08-2005, 09:17/09:17AM
Thank you for responding. I am not so certain about the sufficient unique content. The database is rather sparse in many areas, redundant in others.

With so many thousands of products that vary so little in concept (a Chevy Owner Manual vs. a Ford Owner Manual), a boilerplate description of an owner manual appears for many entries. Further, there are even smaller variations, such as a 1994 Ford Mustang Owner Manual vs. a 1995 Ford Mustang Owner Manual. A completely unique description of these two products is highly unlikely to ever be written.

I could consider a denser, "blockier" page organization, with fewer pages and many more links per page, to lessen the duplicate content, but I also wonder if this would be useful to the live visitors.

In any case, I think the Google site map and Froogle feeds are excellent ideas and seem well-suited to their situation.

Will Google index a page (provided in a Google XML sitemap) that cannot be reached by their spider through hyperlinks? Right now they can only be found by submitting the product search form, which the spider cannot do.
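For reference, generating the sitemap file itself is straightforward. The sketch below uses Python's standard `xml.etree.ElementTree` and the sitemaps.org 0.9 namespace (the current form of the protocol; the namespace Google used at launch was different). The example.com URLs and the `build_sitemap` helper are placeholders, not the real site's URLs. It only shows how to list the dynamic product URLs and does not settle whether pages listed this way get indexed.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL.

    ElementTree escapes '&' in query strings to '&amp;' automatically.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Placeholder product URLs; the real ones would come from the database.
product_urls = [
    "http://www.example.com/product.php?make=ford&year=1994",
    "http://www.example.com/product.php?make=ford&year=1995",
]
sitemap_xml = build_sitemap(product_urls)
```

A single sitemap file may list up to 50,000 URLs, so a catalog of about 15,000 products fits in one file.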

Jim