Search Engine Spam & User Spam


Alan Perkins
26-10-2001, 06:28/06:28AM
Hey all - particularly those who stuck around through the White Paper thread here:

http://www.ihelpyouservices.com/forums/t842/s.html

Something suddenly struck me. I guess it's so obvious that I was looking right through it before, but it's solid all right. There are two kinds of spam!

The White Paper (http://www.ebrandmanagement.com/whitepapers/spam-classification/) is talking about Search Engine Spam. Most who object to the White Paper object on the grounds that it does not consider User Spam.

If I could define these two:

The White Paper currently defines Search Engine Spam as "Any attempt to artificially influence a search engine's ability to calculate relevancy". As a result of the thread here, I'm working on changing that. But do I need to? What if I defined User Spam, right there alongside Search Engine Spam, as something like "Any resource that a user would not expect to receive in response to their query"?

99.9% of SEOs would consider User Spam to be a bad thing and not something to strive for. To me that has been so obvious that it was never even worth talking about.

Until now my focus on search engines, rather than end users, has come from a couple of different perspectives:

1) Having a programming background, I know that those programmers writing search engine algorithms base those algorithms on the ways that good technical architects, information architects, designers and copywriters create good sites. Search engine spam attempts to make a site appear better than it actually is to a search engine robot, yet still deliver the less good site to the user. Note that the "less good" site may still not be User Spam.

2) In many ways the search engine is acting as my agent. I believe it deserves treating with respect, not trickery. They are not *my* users until they get to *my* site.

I'd appreciate your comments on the definitions of Search Engine Spam and User Spam.

Alan

ihelpyou
26-10-2001, 07:28/07:28AM
"Any resource that a user would not expect to receive in response to their query"
Okay. As usual, I am confused. What exactly do you mean by this? Am I thinking too much about it? Is this really as easy to understand as it looks? :confused:

Alan Perkins
26-10-2001, 07:45/07:45AM
"Is this really as easy to understand as it looks?"

Yes! Or maybe not.. I'll give you some examples:

1) You type in "tools" looking for a toolstore site and get a toolstore site, that's not User Spam.
2) You type in "tools" looking for a toolstore site and get a porno site, that is User Spam.

Obvious maybe. But what about this:

3) You type in "tools" looking for a porno site and get a porno site, that's not User Spam
4) You type in "tools" looking for a porno site and get a toolstore site, that is User Spam

This is my point with search engine spam. It's up to the search engine to determine their market and deliver relevant results to that market. If they can find enough users that agree with their opinion, the search engine is in business.

There is no such thing as absolute relevancy.

Alan

ihelpyou
26-10-2001, 07:51/07:51AM
I gotcha now!

Yes. This is what I meant when I said in the other thread that the engines wish they could get into the searcher's head when they produce results. It is a constant struggle with both search engine relevance and searcher relevancy.

Alan Perkins
26-10-2001, 07:57/07:57AM
Search Engine Spam: Any attempt to artificially influence a search engine's ability to calculate relevancy
User Spam: Any resource that a user would not expect to receive in response to their query
I'd appreciate your comments on the definitions of Search Engine Spam and User Spam...

Does the definition of User Spam make the definition of Search Engine Spam easier to swallow? Or is there a more SEO-friendly, still catches-all, version?

Alan

ihelpyou
26-10-2001, 08:12/08:12AM
Sure it does. Just because a search engine might "think" it is giving relevant results, does not necessarily equate to "relevant" for the user.

Alan Perkins
26-10-2001, 08:23/08:23AM
Thanks Doug. :cheers: Anybody else agree/disagree?

MazY
26-10-2001, 10:35/10:35AM
You know me - I have to read it, read it again and then read it again when I have lots of time to consider it again....

Let ya know soon...

Advisor
26-10-2001, 14:37/02:37PM
Hmmm...another tricky one.

I'm left pondering this thought:

You type in "tools" looking for a porno site and get a porno site, that's not User Spam

I wonder what kind of ummm tools they use on those porno sites? :eek:

J

Kal
26-10-2001, 20:08/08:08PM
The ummm "large" variety, I would expect? :rolleyes:

ihelpyou
26-10-2001, 20:11/08:11PM
yea, hmm.//..... some women do like that. :rolleyes: Average is best.

Advisor
26-10-2001, 21:11/09:11PM
Originally posted by ihelpyou
yea, hmm.//..... some women do like that. :rolleyes: Average is best.

:green: That's what all you average guys say!

J

Mel
28-10-2001, 09:13/09:13AM
The big problems that I have here are:

1. What does "artificial" mean?

2. I still maintain that SEs calculate ranks not relevancy. To the best of my knowledge all search engines discuss their "ranking algorithm" not their "relevancy algorithm". I do agree that search engines are very concerned with ensuring that their rankings include relevant results, however.

Mel
28-10-2001, 09:17/09:17AM
2) You type in "tools" looking for a toolstore site and get a porno site, that is User Spam.

LOL why could that not be a bad SE algo?

Advisor
28-10-2001, 09:23/09:23AM
why could that not be a bad SE algo?
I might be wrong, but wasn't Alan saying that User Spam pretty much is a bad SE algo? Or no?

J

Alan Perkins
28-10-2001, 09:39/09:39AM
There seem to be only three reasons for Search Engine User Spam:

1) The wrong resource was delivered to the SE spider (if deliberately, then this was Search Engine Spam, too)
2) SE algo was not up to the job
3) User was not up to the job - i.e. did not perform a "good" enough search or, in some way, supplied incorrect data to the search engine and therefore got incorrect results

More thoughts:
Search Engine Spam occurs at the point when the resource is presented to the Search Engine (i.e. when the spider comes round)
Search Engine User Spam occurs each time a resource is presented to a user that didn't expect it.

In other words (here's a quote out of my next I-Search reply):

When a user receives irrelevant results, the fault can lie with search engine spam, the search engine algorithm or with the user themselves. That's a big puzzle to solve. The White Paper is not attempting to solve the whole puzzle, just put one of the pieces in place. The piece that our industry is responsible for.

Alan

Mel
28-10-2001, 18:50/06:50PM
LOL I thought everyone knew that results depend on the craftsman, not the tool

-or-

A good craftsman never blames his tool

Alan Perkins
29-10-2001, 06:54/06:54AM
I knew "tools" would be a bad example :rolleyes: ! It's just that everyone always uses porn when citing spam. I was trying to show the flip side...

Mel, I've started an FAQ. It's not linked to by the White Paper yet, but you can have a sneak preview by looking here:

http://www.ebrandmanagement.com/whitepapers/spam-classification/faq.htm

Questions answered so far:

Why have you written the White Paper?
Why does the definition of relevancy not include end users?
Why does the definition of spam not include end users?
What is the role of SEO given the definition of Search Engine Spam?
Why is all cloaking spam?

You might enjoy the second answer...it was written for you.

Alan

Mel
29-10-2001, 14:23/02:23PM
Hi Alan:

I have read your FAQ but still cannot quite agree with your perceptions and that is perhaps because we have two different viewpoints and/or perceptions.

I view the user as the single most important item in the system and thus tend to view things from the user's point of view, whereas it seems to me that you perceive the Search Engine as the most important item in the system, and so tend to view things from the Search Engine's point of view. A natural occurrence of your being a programmer, perhaps.

I view the User as more important than the search engines because the search engines have no purpose other than to serve the user; therefore users rank more highly in the scheme of things. Granted that the two are mutually dependent to some degree, but the bottom line is search engines were established only to serve users, and are only useful in that context. By and of themselves they have no usefulness without users, but the same cannot be said of users (or customers, as we sometimes call them).

Therefore, from my viewpoint, if the search engine delivers "bad" results (I am not going to get into a semantics argument by using that well-worn and hackneyed term "relevancy") to the user then that is mostly the fault of either the search engine or the webmaster responsible for the site, though I do not entirely discount the effects of very imprecise search terms.

It is because of the difficulty the search engine sometimes has in interpreting the user's search terms correctly that I have mentioned that there needs to be some sort of direct user feedback to the search engines so that they can learn where their interpretation was wrong or right. While I agree with you that in the very long term, a search engine's success will be determined by market forces, I can see no way that you, as a programmer, can use that information to improve your algorithm in sufficient time to do much good.

It is also for this reason that I do not agree that Search engine relevancy is more precise than User relevancy. While the search engines can calculate their ranking to thousands of decimal places, this is not equivalent to accuracy. It's much like the civil engineer who calculates his stresses to twenty decimal places, then rounds them off to ten decimal places, and then adds a multiplier of two to take care of "imponderables". Looks great, but still just an estimate.

Lastly, in your definition of search engine spam, the word "artificially" is the keyword (no pun intended). Without the precise definition of that term the entire definition of Search Engine spam becomes meaningless. And so we come to the nub of the problem - what is the definition of "Artificially"???

Mel
30-10-2001, 00:13/12:13AM
Hi Alan:

Just ran across this in my web travels and thought it might add something to this thread. This is AltaVista's one-sentence definition of Spam:

Any action taken by a promoter "that would not create a positive user experience" could be labeled as spam and could cause a site to be blocked

I note that AV's definition is with reference to the user, which seems to me to be the correct way to do it.

Seems like Mazy has hit the nail on the head with his new term, but, Sorry Maz, it looks like you are going to have to fight AV for the copyright.

While AV may not be the fount from which all search engine information flows, it is a rare insight into their definition of spam.

Alan Perkins
30-10-2001, 18:14/06:14PM
Hi Mel

Here we go again...thanks for your input.

I view the user as the single most important item in the system
So do I. But, as an SEO, you only get access to the user THROUGH the search engine. It's the search engine's user first.

... Lots of stuff about MelSearch.com :) ...then
It is also for this reason that I do not agree that Search engine relevancy is more precise than User relevancy.

You're right, it's not more precise. What I'm trying to say is that SE relevancy is expressed as a precise number, e.g. 55%. User relevancy is a lot more vague. Ask Joe whether site A is relevant to his search and he may answer "Yes". Ask him whether site B is relevant and he may answer "Yes" again. Ask him which is MORE relevant and he might give you an answer. Ask him HOW MUCH more relevant, and his vocabulary has reached its limits...search engines have a way of EXPRESSING precisely how much more relevant they think one site is versus another. But, in fact, Joe's Boolean Logic (Yes/No) is more precise.

What is the definition of "Artificially"
I'm working on it!

Any action taken by a promoter "that would not create a positive user experience" could be labeled as spam and could cause a site to be blocked
Not precise enough and simply not good enough. They should know better ... that kind of statement is fodder to cloakers, invisitext merchants, the lot. "I didn't want to put the text on the page because it would create a less positive user experience"...

MakeMeTop
30-10-2001, 18:40/06:40PM
For the last 48 hours, I have had (according to my logs) 150 SEO enthusiasts trying to glean from my site why I rank reasonably highly on www2 on Google for the main SEO phrase discussed elsewhere (with an 's').

Given that people on this forum know most of my techniques (which I do not believe are 'spam') - but Doug and Alan would do! Why am I not on the 'sh*t' list of all the SEs - and have never been (and no longer so on Google)?

I don't use hidden text, no hidden links, no link farms (and my current PageRank ****s). I do give Googlebot what is on my visible page (but always have done) - but was downgraded for many months! So:

a) Why am I now given decent rankings on Google?
b) Why wasn't I?
c) Why have I had them consistently on other SEs?
d) Would you class me a spammer?

To help people out (and to show I don't take offence) - by Doug's and Alan's definitions, I am a spammer - but not (according to me) a true spam merchant. I have to say, I don't think I spam at all - but I'm interested in the opinion of my peers :)

ihelpyou
30-10-2001, 18:58/06:58PM
LOL. Are you? To tell you the absolute truth Barry, I have Never gone to your site! I did not know you did things I might not like. :green: NOW, of course, I will be visiting. :)

I have also noted a few visits from www2 the last couple of days.

ihelpyou
30-10-2001, 19:05/07:05PM
Barry, I just looked and could not find anything. Was I supposed to see something? Of course, I did not study the code very much, but nothing stuck out to me. Why would I think you are a spammer?

MakeMeTop
30-10-2001, 19:09/07:09PM
You are always welcome to visit, Doug :cheers:

Major reason for the www2 visits is the waffle about the Aussie SEOs. Nothing to hide on my site - so they are welcome to look - and so are you.

However, as Alan has pointed out before - the reason I made the comment is I have (sometimes) used IP delivery, interesting use of agent delivery, and other things you would hate (no, not the obvious things we ALL hate) - but would make a great discussion over a beer or 10 ;)

Major point is that there is nothing on my site that could be construed as any form of spam for Google. So, it goes back to my question - why was I given a penalty?

ihelpyou
30-10-2001, 19:14/07:14PM
BTW, good-looking site!

Well, if you cloak, simply do not let me know about it. :)

MakeMeTop
30-10-2001, 19:26/07:26PM
Thanks, Doug. A way to go before I make it a true resource site for UK people trying to understand SEO - but getting there. I'm adding this forum (and your site) to my resources section this weekend. Least I can do - this site deserves it for the help it can give others.

ihelpyou
30-10-2001, 19:28/07:28PM
Thanks for the comments Barry. And if I ever get a resource page built, you will have a link as well. :) Working on it.

MazY
31-10-2001, 00:16/12:16AM
The only thing(s) that I could see on your site, Barry, were (a) duplicated content (though within different folders) and (b) exact duplication across different sites.

From my own personal perspective, the only one of those that I am uncomfortable with is the duplication within the same site. Why bother? If you have the page in one folder then I can't see why you need it in two.

But, I am always aware of the fact that one never knows any technical reasons and so forth why a webmaster does the things that he or she does at any time.

It doesn't take a genius to work out why you rank as well as you do but it does take some emulating! lol

Now believe me, I am not one to lecture about penalties. Check my AV rankings, if you can find me. They caught me on one of my "experimental" months some time ago. Seems they have still not quite forgiven me.

As I said elsewhere, you don't know how much is enough until you know how much too much is. Much better to "experiment" with one's own site than with a client's.

MakeMeTop
31-10-2001, 06:31/06:31AM
>(a) duplicated content (though within different folders)

Content delivered by user agent for one particular SE ;)

>(b) exact duplication across different sites

makemetop.com calls .co.uk in a frameset - so basically is done instead of a redirect - not meant to rank anywhere.

microchannel-technologies.com is my admin site where all my client secure data and log-ins are and was never submitted to anything. Amazingly, Google has indexed it, Lycos UK added it to their directory but all other SEs have left it out. So I put up the same content with American spelling. Not a deliberate ploy to enhance rankings.

>Much better to "experiment" with one's own site than with a client's.

You are soooo right!

Alan Perkins
31-10-2001, 07:15/07:15AM
Doug/Barry/MazY:
Stop boardwalking all over my thread :) ! That was an outrageous context switch, Barry.

Barry:
Congratulations on ranking well on Google. If you are doing nothing wrong, you should continue to enjoy those rankings... :)

To help you out, Barry, this thread is about the difference between Search Engine Spammers and Search Engine User Spammers. I would only classify you as a Search Engine Spammer, not someone who deliberately sets out to spam search engine users. Let me try to explain my problem with Search Engine Spam. I'll start by saying I like you Barry :together: , and nothing here is addressed at you personally. I've never evaluated your techniques and have no idea what you are doing (other than cloaking) that I might consider to be spam. So, here goes...

When a user receives an unexpected result, there are many possible reasons, but broadly the three classes of reason are:

1) Webmaster presented misleading content to search engine
2) Search engine algo failed to understand the user's requirement (e.g. didn't base rankings on platform, language, location or other contextual information) or wasn't working with a fresh enough index
3) User failed to provide enough information, or provided the wrong information, to the search engine

I contend that the only thing we as marketers have any control over is (1). In order for search engines to get (2) right, and to improve their ability to get (2) right, we have a duty to do (1) right. As soon as you start justifying your actions by saying that they do not amount to Search Engine User Spam, you open yourself up to endless, pointless arguments about how much more relevant your site is to the search criteria than the next site above/below you, ad infinitum. The search engine is there to obviate those arguments - that's its role. Anybody who justifies Search Engine Spam by saying their content is relevant to end users is ignoring the role of the search engine yet still taking its customers.

When you step over the line and present "slightly" misleading content, you put yourself on the same continuum as the real spam merchants (bait and switch), but you are maybe less than 10% down it, instead of over 90%. When you do a little wrong, you make it easier for others to do a little more wrong, and so on, until eventually kids are deliberately being served porn. Where on this continuum should the line be drawn? I say at 0%. There are lots of ways of doing it totally right and still getting good rankings. The flip-side of the White Paper is to make it clear that none of these ways should be considered spam.

For some time I held the opinion that only cloaked content was spam, i.e. if the Webmaster was presenting the same content to users as to the search engine, then it was up to the search engine to work out how much merit to give certain parts of the page (e.g. on a scale of 0 to 100). I gradually moved to my current opinion - attempted spam such as that reported on this forum is so rife that it warrants demerit (e.g. on a scale of 0 to minus 100) in order to discourage its use.
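
As a rough illustration of the merit/demerit idea, here is a minimal Python sketch. It is purely hypothetical: the function name `page_score` and the rule of simply adding the demerit to the merit are assumptions made for the example, not a description of how any real search engine scores pages.

```python
def page_score(merit: float, demerit: float = 0.0) -> float:
    """Combine a merit (0 to 100) with a demerit (0 down to -100).

    Hypothetical scoring rule: both inputs are clamped to their
    scales and then added together.
    """
    merit = max(0.0, min(100.0, merit))        # clamp merit to 0..100
    demerit = max(-100.0, min(0.0, demerit))   # clamp demerit to -100..0
    return merit + demerit


honest = page_score(70)          # no attempted spam, so no demerit
spammy = page_score(70, -40)     # same content, penalised for attempted spam
print(honest, spammy)
```

Under this assumed rule, a page that attempts spam can end up below a weaker page that does not, which is the discouragement the demerit is meant to provide.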

Alan

Mel
31-10-2001, 07:31/07:31AM
Barry:

Thought about discussing this with you but decided more links was a more productive endeavor.

MakeMeTop
31-10-2001, 07:55/07:55AM
Sorry about that Alan. Yes - I'll agree with your definition (and no offence is taken - I respect your view and like you too).

I don't think anyone is going to object to your definition of search engine user spammers and I would strongly support eradication of misleading listings in SE results.

Alan Perkins
31-10-2001, 09:44/09:44AM
"I would strongly support eradication of misleading listings in SE results."

Good on yer, Barry. :)