
View Full Version : Is this true about Robots.txt ???


Rockrz
28-04-2007, 11:51/11:51AM
I get an SEO newsletter (I won't say which one, so as not to offend the owner of this board), and the most recent issue had some interesting things to say about a new development with Robots.txt:


Did you know that as of April 11th there is no longer a need to manually submit your sitemap to search engines? Last fall, the major search engines agreed on a sitemaps format. You can now add a simple line to your robots.txt file and let the engines know where your sitemap document resides on your site.

Just include the following line in your robots.txt file and you should be all set:
Sitemap: http://www.yourdomain.com/sitemap.html

Robots.txt has traditionally been used in a more prohibitive fashion - by telling search engine spiders where not to go on your site. This latest sitemaps implementation of robots.txt however is telling the spiders where TO go.
Is this information correct???
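
For anyone who wants to check what a crawler would read from such a file, here is a minimal sketch using Python's standard `urllib.robotparser` (the `site_maps()` method needs Python 3.8 or later). The robots.txt body is made up for illustration, the Sitemap line is copied from the newsletter quote above, and `sitemaps_from_robots` is a hypothetical helper name. The parser only reports the declared URLs; it does not check whether the target is in an accepted sitemap format.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt body; the domain is the newsletter's placeholder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: http://www.yourdomain.com/sitemap.html
"""


def sitemaps_from_robots(text):
    """Return the list of Sitemap URLs declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    # site_maps() returns None when the file has no Sitemap line.
    return parser.site_maps() or []


if __name__ == "__main__":
    print(sitemaps_from_robots(ROBOTS_TXT))
```

The same file can carry both kinds of instruction: Disallow lines saying where not to go, and a Sitemap line saying where the list of pages lives.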

Comeran
28-04-2007, 14:26/02:26PM
I haven't seen anything on this, I will be doing my homework and get back to you. This would be a smart thing for them to do and it could be true but I need to see it to believe it.

Anyone seen this in play yet? Or heard anything about it?

Com-

Connie
28-04-2007, 14:36/02:36PM
I believe so. At least with Google.

g1smd
28-04-2007, 15:01/03:01PM
Matt Cutts or Vanessa Fox mentioned it on one of their blogs about two weeks ago.

They have added a proprietary extension to the robots.txt file specification.

Rockrz
28-04-2007, 17:04/05:04PM
Originally posted by g1smd
They have added a proprietary extension to the robots.txt file specification.

Who is "they", and can you explain specifically what you mean by "proprietary extension"?

Did someone like copyright robots.txt, or something? :eek:

Connie
28-04-2007, 19:24/07:24PM
Google. I'm not sure if any of the other SEs will pay any attention, but Google's spiders will. I'll see if I can find the blog post where this was announced.

You have to keep in mind when Google wants to do something, they do it.

Connie
28-04-2007, 19:40/07:40PM
Here's the blog post where Vanessa Fox announced this: What's new with Sitemaps (http://googlewebmastercentral.blogspot.com/2007/04/whats-new-with-sitemapsorg.html)

Quadrille
28-04-2007, 19:43/07:43PM
And it's in this interview (http://www.webpronews.com/topnews/2007/04/12/a-conversation-with-googles-vanessa-fox)

Dave Hawley
28-04-2007, 23:01/11:01PM
Why is it that the SE (Google) that can and does index the most of any SE is the one that comes up with these ideas? MSN could certainly take a leaf from their book, or at least swallow its pride and use Google sitemaps.

Rockrz
28-04-2007, 23:10/11:10PM
Alrighty then. Sitemaps are chic!

Anybody know of a program that will crawl your website and list all the pages for you so one doesn't have to do it manually?

With today's high technology, one might not want to work too hard unless one has to :p

Quadrille
29-04-2007, 06:40/06:40AM
There are plenty of online and offline ways to do it, but finding one that's right for you means trial and error: some are free, some charge, some are good for small sites, some for larger ones, and some give you more options.

Google webmasters has a list, with live links.

Rockrz
29-04-2007, 10:57/10:57AM
Does it HAVE to be XML...or can it be a simple html webpage that lists all the URLs of the website?

WebSavvy
29-04-2007, 12:53/12:53PM
It can be a text file (e.g., .txt) and contain one URL per line. However, they'd prefer the sitemaps format of XML. It's really not that hard to learn it. It's actually even easier than regular HTML.
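
To show how little there is to either format, here is a rough sketch that builds both from the same list of URLs. The function names and example URLs are made up for illustration; the XML layout (a `urlset` element in the sitemaps.org 0.9 namespace, with one `url`/`loc` pair per page) follows the published sitemaps protocol, and optional tags such as `lastmod` are left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def text_sitemap(urls):
    """Plain-text sitemap: one URL per line."""
    return "\n".join(urls) + "\n"


def xml_sitemap(urls):
    """Minimal XML sitemap: a <urlset> holding one <url><loc> per page."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & when serializing.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    pages = [
        "http://www.yourdomain.com/",            # placeholder URLs
        "http://www.yourdomain.com/about.html",
    ]
    print(text_sitemap(pages))
    print(xml_sitemap(pages))
```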

If your site isn't having any crawl issues, I wouldn't suggest using sitemaps at least until it's out of beta stage.

There have been quite a few reports of serious side effects from using sitemaps. My own site was one that suffered. As soon as the sitemaps came down, all of my URLs went back to the regular index. Sitemaps ended up forcing them into Supplemental, and this is on a previously well-indexed 8-year-old site.

Use with caution.

Connie
29-04-2007, 13:50/01:50PM
Whether it's XML or a txt file, there is a certain way it needs to be set up. This is a map for spiders and not people.

You can use one of the free online programs to make one to see how it needs to be done.

I agree with Deb. No need for one if you're not having crawl issues.

g1smd
29-04-2007, 15:05/03:05PM
>> Anybody know of a program that will crawl your website and list all the pages for you so one doesn't have to do it manually? <<

I know of one that will do that, as well as check for errors and inform you of redirected URLs too.

Xenu LinkSleuth

Quadrille
29-04-2007, 15:27/03:27PM
It does, too - thanks! G1, I'd forgotten that function.

Connie
29-04-2007, 16:41/04:41PM
Unfortunately, at least the version I have will not give you acceptable output for a Google sitemap.

At least in my understanding a Google sitemap has to be in a certain format.

You do not link to the sitemap from any web page. It is for SEs only.

Xenu LinkSleuth will provide you with a sitemap for people. One you can link to from every page.

Not all sitemaps are equal. There are sitemaps specifically for SEs. Your visitors should never see that one. You should never link to that sitemap. That's why you have to tell the SEs where the sitemap is located.

On the other hand the sitemap that Xenu creates will help you build a site map that is both user friendly and SE friendly.

g1smd
29-04-2007, 16:56/04:56PM
The version of Xenu LinkSleuth released late in 2006 has 100% valid HTML code in the site report and the generated sitemap.

The sitemap code can be cut and pasted into a page on the real website, but that page must not be scanned when you run Xenu LinkSleuth to test the site.



If you need to make a TEXT file sitemap for Google, then you can take the Xenu Report and simply highlight the text of all the listed URLs and then copy and paste that to a separate text file.
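
If the copy and paste gets tedious, the same step can be scripted. This is only a sketch: it assumes the report is an ordinary HTML page in which each listed URL appears as the `href` of a link, which may not match the exact layout of any particular Xenu version. `LinkCollector`, `urls_from_report`, and the sample report are all made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, keeping first-seen order."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Skip empty hrefs and URLs that were already collected.
            if name == "href" and value and value not in self.urls:
                self.urls.append(value)


def urls_from_report(html, prefix):
    """Return the linked URLs in `html` that start with `prefix`."""
    collector = LinkCollector()
    collector.feed(html)
    return [url for url in collector.urls if url.startswith(prefix)]


if __name__ == "__main__":
    # Stand-in for a saved report page; the URLs are placeholders.
    report = """
    <ul>
      <li><a href="http://www.yourdomain.com/">Home</a></li>
      <li><a href="http://www.yourdomain.com/about.html">About</a></li>
      <li><a href="http://www.example.org/">External link</a></li>
    </ul>
    """
    # One URL per line, ready to save as a text sitemap.
    print("\n".join(urls_from_report(report, "http://www.yourdomain.com/")))
```

Filtering on the site's own prefix keeps external links out of the list, since a sitemap should only name pages on your own site.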

Rockrz
29-04-2007, 18:13/06:13PM
Originally posted by WebSavvy
If your site isn't having any crawl issues, I wouldn't suggest using sitemaps at least until it's out of beta stage.

No, all my sites do pretty well.

In fact, I learned here several years ago that it was best to use text links on each page so there are URLs on each page that lead to all the other pages.

I've been doing that for several years now and it's been working great. I just saw the sitemap article and thought that might be the newest thing the SEs were starting to require for best ranking.

Sounds like I need to give it a while and check back on the status of this concept. Thanks.

Dave Hawley
30-04-2007, 00:11/12:11AM
Google site maps give great feedback on broken links etc., and it's straight from the Google spider's mouth.

Comeran
30-04-2007, 19:01/07:01PM
I like finding out about dead links BEFORE G does though :p

Xenu is a webmaster's best friend.

Com-

g1smd
30-04-2007, 19:38/07:38PM
I don't trust the XML sitemaps function, and neither do a few other people that I know.

Rockrz
30-04-2007, 19:47/07:47PM
Originally posted by Comeran
I like finding out about dead links BEFORE G does though :p

Why would anyone have dead links on their site in the first place? :confused:

Don't people know what pages have been uploaded on a site they built and are managing?
I know I do.

Connie
30-04-2007, 19:53/07:53PM
Sometimes I make a mistake. :D

Rockrz
30-04-2007, 20:09/08:09PM
I think a lot of that comes down to testing as you're building the site. Plus, being very organized also helps.

Dave Hawley
30-04-2007, 21:16/09:16PM
I don't mind Google telling me about dead links; it's great to know how Googlebot crawls and sees one's site. Plus, it gives a range of other good feedback. You can even speed up or slow down the crawl rate. There's no downside to having Google sitemaps report dead links over any other method.

I too often hear about "problems", but as with most things, it's more often than not a mistake by the Webmaster in the use of XML. Kind of like, my site is dropping and there's no reason for it ;)

I use a .txt based site map and have never had any issues. I have also used many other methods, but don't find them close to Google site maps.

>> I like finding out about dead links BEFORE G does though <<

Unless your site is NOT online, G likely knows about dead links BEFORE you do. You just don't know that G knows :)

Comeran
30-04-2007, 21:54/09:54PM
I just meant that I like to find out as soon as possible, but I do keep an eye on the G crawler info too.

The problem isn't on small sites, but when the site is dynamic and has thousands of pages, Xenu can save your life!

Com-

Dave Hawley
30-04-2007, 22:13/10:13PM
My site is both dynamic and static and has 10 of thousands of pages. New pages are not the problem for me as I can check them as soon as they upload. The problems come about when I re-name, move or delete pages. Then I change all links that I know about, but some may get overlooked. I then check site maps to see what I may have missed. I can often use it to track down any external links as well.

I guess I'm the kind of guy who doesn't like double handling and so now only use site maps. I tend to focus long term mainly, so catching a possible dead link 24hrs early doesn't mean much.

AFAIK Google site maps is the ONLY way to get direct feedback from Google themselves.

Do you use site maps?

Connie
30-04-2007, 22:18/10:18PM
I do test, but I still make mistakes. On the other hand, could some of those dead links be located on other sites?

I've had bad links show up in Google, for pages that the links have not been touched in years from any pages on my site.

My most recent bad link resulted in a 403 error. It was formed like this /wood-widgets/.htm.

#1 I haven't created any new links on my site for wood-widgets in at least a year if not longer.

#2 I have checked every link on my site that links to wood-widgets.

That link is not on my site. Thanks to Google showing bad links, I was able to set up a 301 redirect to take care of it.

Dave Hawley
30-04-2007, 22:24/10:24PM
Like you, Connie, I too make mistakes......just don't own up :)

Yes, the 301 redirect has saved me many a page from external links.

Rockrz
01-05-2007, 02:06/02:06AM
Originally posted by Dave Hawley
My site is both dynamic and static and has 10 of thousands of pages

My gAWd, man! What kind of website could anyone possibly have that needs "10 of thousands of pages" :confused:

How many sets of 10 thousands pages are there on your site?
More than 100,000?

Just out of curiosity, can you post your URL?
I'd be interested in seeing what kind of site needs that many pages.

WebSavvy
01-05-2007, 02:15/02:15AM
There are lots of sites that have 10s of 1000s of pages. :D
I have 22,000 categories and each one has a category page and an add url page = 44,000 pages + more when you count the other pages in the site.

Dave Hawley
01-05-2007, 02:19/02:19AM
It's not a case of "need".

Most of the pages are on our forum. The forum generates a new page with each new Thread (normally about 50 a day).

Forums are one of THE best ways to perpetually add content pages into Google IMO.

Rockrz
01-05-2007, 02:52/02:52AM
Well, I don't recall ever seeing any sites with tens of thousands of pages.

Dave Hawley
01-05-2007, 03:48/03:48AM
Err, how would you know? Truth is you wouldn't and I can prove it. You are on a site right now with pages in the tens of thousands :)

There are literally hundreds of thousands (maybe millions) of sites out there with pages in the tens of thousands. The World is a BIG place.

Rockrz
01-05-2007, 12:26/12:26PM
Originally posted by Dave Hawley
Err, how would you know?

I've got people...

g1smd
01-05-2007, 19:28/07:28PM
>> I don't recall ever seeing any sites with tens of thousands of pages. <<

BBC, CNN, Amazon, e-Bay, ODP, WHO, most large retailers, many universities, most government departments, yada, yada, yada, ....

WebSavvy
01-05-2007, 21:06/09:06PM
Well, I don't recall ever seeing any sites with tens of thousands of pages.
I've got people...
Have your "people" visit DMOZ. :D
10s of 1000s of pages at DMOZ. ;)
They may have already seen it and didn't realize it. :p

Dave Hawley
02-05-2007, 00:16/12:16AM
>> I've got people... <<

Time for some new "people", methinks :D