take a sneak peek! - sensis Australia


nuthin
08-07-2004, 03:49/03:49AM
telstra's answer to Google?

http://beta.sensis.com.au/

had a quick look at the results returned; they look fairly decent.
i have been trying to find out where some of these results are getting pulled from; my first thought was Looksmart Australia because they of course acquired them; however, the results are totally different from looksmart.com.au :-)

if marketed right, this could be the next big thing in Australia, in particular with people wanting to find & support local business.

knowing telstra, we can expect some form of PPC advertising coming to this engine after it's launched.

nuthin
12-07-2004, 03:23/03:23AM
no Aussies interested? :/
I think it's officially launched now http://www.sensis.com.au/
I must say, the results are quite relevant to my query when I search for something local (.AU based), a much welcome change from trawling through pages of junk at Google.
anyone know where they're getting that data from?
I got clients on there positioned nicely & no idea how..
definitely crawler-based SERPs.. but which bot?? Looksmart's? WiseNut's? :/
haven't seen these results anywhere else except there.

:)

ihelpyou
12-07-2004, 09:04/09:04AM
Don't they have their own robot?

excell
12-07-2004, 09:51/09:51AM
It's looking pretty damn good for something that is associated with Tel$tra :) I am very pleasantly surprised (good to see all my sites where they should be - Hah).

Interesting to note if you search for your terms all in the box "I'm looking for:" and put the location there as any untrained person might.. the results are totally different than if you use it the way intended.

The answer might be found here (http://australianit.news.com.au/articles/0,7204,10113735%5E15336%5E%5Enbv%5E15306-15318,00.html) ... (haven't read much yet)

Adding - But he said Sensis would have to launch the site well in order to break into the market.

True - it will take a while before folks will start sensising it...

nuthin
12-07-2004, 11:10/11:10AM
yes that's very true, have you seen their ads on tv the last week or so?
they're doing a big advertising run now for Sensis, possible free traffic for clients' sites, can't complain :)

i actually think people will start to use it after Google and its changes of late, making it a lot harder to find local relevant content.

it's actually quite nice having the .au search box ticked by default; some of the worldwide results look a bit dodge, but if you're after Australian relevant content it looks nice :-)

I actually set sensis as my new homepage at the office today.. hah.

that's how much more relevant it seems to be compared to other search engines currently out there for Australian content.

excell
12-07-2004, 11:52/11:52AM
It's going to be interesting how their network performs as a whole across the International engines. I wonder if *they* will ever have any SPAM guidelines or ethics statements for their search in the future.. LOL:cool:

Adding - yes I have noticed their prolific off-line advertising.

projectphp
12-07-2004, 21:00/09:00PM
I don't like their results. They look like rubbish!! They need to pack more information onto each page, and have less whitespace.

Still some bugs; try the first link on this page: http://sensis.com.au/search.do;jsessionid=ekoh4e26ohnmb.server1-2?find=nttc&location=&searchIndex=australia for an example :)

Other annoyances:
1. I search for plumber newtown (http://sensis.com.au/search.do?find=plumber+newtown&location=&searchIndex=australia) and I get no Yellow Pages results. Yet I search for plumber with location newtown (http://sensis.com.au/search.do?find=plumber+&location=newtown&searchIndex=australia) and I get Yellow Pages. Great, like that is usable.
2. Yellow pages results look like crap. They should pack more into each result, and make them less vertical.
3. Why not just monetize the info via existing channels? Make Google pay to have local results on google.com.au? Surely that would have been more effective, both for their advertisers and fiscally. Seems like the Sensis motto is a negative, hoarding attitude. They are trying to monetize for their own reasons.
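For anyone comparing the two searches in item 1, the difference is easiest to see by pulling the query strings apart. A minimal Python sketch: the parameter names `find`, `location` and `searchIndex` come straight from the URLs above, while the helper name `search_params` is made up for illustration, and nothing here says how Sensis handles these values on the server side.

```python
from urllib.parse import urlsplit, parse_qs

def search_params(url):
    """Return the query parameters of a search URL as a plain dict."""
    query = urlsplit(url).query
    # keep_blank_values=True so an empty location= still shows up
    parsed = parse_qs(query, keep_blank_values=True)
    # each key appears once in these URLs, so take the first value
    return {key: values[0].strip() for key, values in parsed.items()}

# Everything typed into the "I'm looking for:" box
one_box = search_params(
    "http://sensis.com.au/search.do?find=plumber+newtown&location=&searchIndex=australia")
# Keyword and location split across the two boxes
two_box = search_params(
    "http://sensis.com.au/search.do?find=plumber+&location=newtown&searchIndex=australia")

print(one_box)
print(two_box)
```

In the first URL the suburb travels inside `find` and `location` is empty; in the second it arrives as its own `location` value, which is presumably what triggers the Yellow Pages lookup.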

Interesting that you made it your home page, nuthin. Definitely needs some improvement for me to care much!!!

excell
12-07-2004, 21:41/09:41PM
Seems they are gearing up to get the $s happening soon... lots of pages in the links down the bottom to investigate. (heaps of broken links and 404s)

I still cannot find info on where they are getting data.

Sensis- Bidsmart - Pay for Performance search marketing platform (http://bidsmart.com.au)

Dez
25-07-2004, 22:54/10:54PM
if you do the homework on this you'll find that it's the same data as Inktomi.

This is just another Yahoo client using the now Yahoo owned Inktomi engine.

Folk like Hotbot.com, Anzwers.com.au and hundreds of others have been doing it for years.

Telstra ( and now their Sensis project ) have a history of using 3rd party search tools. Telstra.COM, remember, launched their search engine on the white/yellow pages sites, which was just a rebadged Altavista.

Dez 2.0

projectphp
26-07-2004, 00:26/12:26AM
Actually, Telstra bought Enterprise search stuff off FAST, and their UA is FAST Enterprise Crawler/6 used by Sensis (http://stats.barc.org.au/agent_200406.html).

They use this to create their "Australian Index" and Inktomi for backfill.

I dunno, still think Sensis have a long way to go.

Dez
26-07-2004, 08:31/08:31AM
Originally posted by projectphp
Actually, Telstra bought Enterprise search stuff off FAST, and their UA is FAST Enterprise Crawler/6 used by Sensis (http://stats.barc.org.au/agent_200406.html).

They use this to create their "Australian Index" and Inktomi for backfill.

I dunno, still think Sensis have a long way to go.

hmm... This is neat - I'd have to say that:

FAST Enterprise Crawler/6 used by Sensis

This is actually Yahoo, using the far smarter FAST indexing crawler to feed Inktomi, the best search engine on the planet, which in turn is used by Sensis via a tiny Java front end, probably running on Sun boxes running Solaris ( or PCs running Linux or FreeBSD ), but most likely Sun, as Telstra can afford them and they are 10,000 times more stable than desktop PCs in rack mount cases. That front end sucks search results from the USA and caches them in case you, or someone else, re-run the same search; add a bunch of bandwidth, load balancing and stats, plus their hacked up Looksmart listings manager for paid ads, filled with Overture clients' ads..

* grin *

But that's just my first guess

But they are NOT using the FAST search engine, and that much I can promise you.

FAST has never been and never will be used outside of alltheweb.com, as it's not search engine software; you can't just plug FAST into a homogeneous network of open systems and expect it to work.

the FAST engine is firmware: it's code cut into a bunch of ASICs, four usually, on an ultra-PCI card, four of which plug into a generic DELL PC, so you get 4 x 4 search engines in a single 4RU DELL server ( they used the 5400 series last time I spoke to them ), and they have their own scary FEP and storage architecture running on a neatly tied together switched fabric, so there's no way anyone but ATW is running the FAST engine.

but nice pickup on the Spider text.

Did you notice that Sensis are wising up on anti-crawler activity? View the source of the home page and note:

<!--htdig_noindex-->

too many fruit cakes downloading htdig and spidering the web till it swaps itself to a page-faulted death trying to index more than 10 million URLs, at which time they realise that the /usr/local/htdig-3.2.6/db/urls file is full of mailto: references and hey, perhaps they could sell it to a spam mailer and make that search engine pay back the cost of crawling a tiny tiny bit of the internet.
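For context on that comment: as far as I know, ht://Dig skips any text between a `<!--htdig_noindex-->` marker and a matching `<!--/htdig_noindex-->` marker. The Python sketch below only imitates that behaviour to show the effect; it is not ht://Dig's own code, the closing marker is an assumption based on ht://Dig's defaults, and the sample HTML is invented rather than copied from the Sensis home page.

```python
import re

# Matches one noindex section, markers included (non-greedy, spans newlines).
NOINDEX_BLOCK = re.compile(
    r"<!--htdig_noindex-->.*?<!--/htdig_noindex-->",
    re.DOTALL | re.IGNORECASE,
)

def strip_noindex(html):
    """Drop every section wrapped in htdig noindex markers."""
    return NOINDEX_BLOCK.sub("", html)

# Invented sample page, purely for illustration
sample = (
    "<p>Search Australia</p>"
    "<!--htdig_noindex--><a href='/nav'>site nav</a><!--/htdig_noindex-->"
    "<p>Results</p>"
)

print(strip_noindex(sample))
```

A crawler that honours the markers would only ever index what is left after the stripping step; a crawler that ignores them sees the whole page, so the markers are a polite request rather than a defence.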

Dez 2.0

Dez
26-07-2004, 11:39/11:39AM
hmm.. I'm curious about what sensis is running now that it's taken up a few milliseconds of my brain - I wonder what good old Greg Ellis the GM of Sensis Search has put his you know what's on the line with.

Let's do some quick profiling of www.sensis.com.au for fun, this should prove entertaining.

ok, let's figure out what they are actually running; let's start at the front end - are they really doing something smart with caching and load balancing?

I suspect they are, given that it's a telstra project; they would be stupid not to use all the zillions of dollars of tax payers' money they have poured into their Internet Data Centres in Sydney and Melbourne:


dez$ telnet www.sensis.com.au 80
Trying 203.36.59.8...
Connected to www.sensis.com.au.
Escape character is '^]'.
HEAD / HTTP/1.0

HTTP/1.0 403 Forbidden
Date: Mon, 26 Jul 2004 13:19:56 GMT
Content-Length: 257
Content-Type: text/html
Server: NetCache appliance (NetApp/5.5R4)
Connection: keep-alive

Connection closed by foreign host.
dez$


well, we know that they are using Telstra's favourite caching provider, Network Appliance's NetCache, for the front end processing.

so most likely we're talking load balancing and reverse proxy back to a small cluster of search servers - nothing very exciting here - I guessed as much already.
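The same banner check can be scripted instead of eyeballed. A minimal Python sketch that parses a response head like the one in the telnet session above; the function name `parse_response_head` is made up for illustration, and the sample text is simply the response pasted from that session, so no request is sent.

```python
def parse_response_head(raw):
    """Split a raw HTTP response head into (status line, headers dict)."""
    lines = raw.strip().splitlines()
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header section
        # split on the first colon only, so times in Date: stay intact
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers

# Response head copied from the telnet session above
raw = """HTTP/1.0 403 Forbidden
Date: Mon, 26 Jul 2004 13:19:56 GMT
Content-Length: 257
Content-Type: text/html
Server: NetCache appliance (NetApp/5.5R4)
Connection: keep-alive
"""

status, headers = parse_response_head(raw)
print(status)
print(headers["server"])
```

Keep in mind the `Server` header only names whatever answered on port 80, here the cache, and says nothing certain about the boxes behind it.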


Let's find out if it's in the Sydney IDC or in the Melbourne IDC:

dez$ traceroute www.sensis.com.au
gigabitethernet2-5.exi1.melbourne.telstra.net (203.50.77.18 ) 25.828 ms 26.297 ms 26.576 ms
9 exhi1-colo-r02 (139.130.193.218 ) 65.867 ms 203.78 ms 147.133 ms
10 202.12.166.101 (202.12.166.101) 25.699 ms 26.585 ms 25.788 ms
11 * *^C


well that's nice and easy, so SENSIS is hosted in the Telstra COLO space in the Telstra Melbourne IDC ( internet data centre ). I won't put addresses, what floor level or part of the floor for security reasons ( laughable though I guess in light of the fact that I'm profiling the web site ).


What's listening over there I wonder, let's nmap 'em, a bit naughty but we're not pushing the legal boundaries too far, yet:


dez$ nmap -v -sS -O www.sensis.com.au

Starting nmap V. 2.54BETA31 ( www.insecure.org/nmap/ )
Host (203.36.59.8) appears to be up ... good.
Initiating SYN Stealth Scan against (203.36.59.8)
Adding open port 80/tcp
The SYN Stealth Scan took 160 seconds to scan 1554 ports.
Warning: OS detection will be MUCH less reliable because we did not find at least 1 open and 1 closed TCP port
For OSScan assuming that port 80 is open and port 43672 is closed and neither are firewalled
For OSScan assuming that port 80 is open and port 41926 is closed and neither are firewalled
For OSScan assuming that port 80 is open and port 44013 is closed and neither are firewalled
Interesting ports on (203.36.59.8):
(The 1553 ports scanned but not shown below are in state: filtered)
Port State Service
80/tcp open http

No OS matches for host (test conditions non-ideal).
TCP/IP fingerprint:
SInfo(V=2.54BETA31%P=i386-redhat-linux-gnu%D=7/27%Time=41052F67%O=80%C=-1)
TSeq(Class=TR%IPID=RD%TS=U)
T1(Resp=Y%DF=N%W=400%ACK=S++%Flags=BAS%Ops=ME)
T2(Resp=N)
T3(Resp=Y%DF=N%W=400%ACK=S++%Flags=A%Ops=)
T4(Resp=Y%DF=N%W=400%ACK=S%Flags=R%Ops=)
T5(Resp=N)
T6(Resp=N)
T7(Resp=N)
PU(Resp=N)

TCP Sequence Prediction: Class=truly random
Difficulty=9999999 (Good luck!)
IPID Sequence Generation: Randomized

Nmap run completed -- 1 IP address (1 host up) scanned in 176 seconds
dez$



ignore the TCP/IP fingerprint attempts, this was run on a Linux box, and gave us:
SInfo(V=2.54BETA31%P=i386-redhat-linux-gnu%D=7/27%Time=41052F67%O=80%C=-1)


on a Sun UltraSPARC it gave us this:
SInfo(V=2.54BETA33%P=sparc-sun-solaris2.7%D=7/27%Time=41051E9A%O=80%C=-1)
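For anyone without nmap (or root) handy, a much cruder check is a plain TCP connect to a single port. The Python sketch below is not a SYN scan and does no OS fingerprinting; it only reports whether a full connection could be opened. The helper name `port_is_open` and the 3 second default timeout are my own choices for illustration.

```python
import socket

def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and completes the handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # refused, timed out (e.g. filtered), or name lookup failed
        return False
```

Calling `port_is_open("www.sensis.com.au", 80)` would be the equivalent of the one open port nmap reported; a filtered port simply shows up as `False` after the timeout, so this cannot tell "closed" from "firewalled" the way the scan above can.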



Now, what are they really running behind the NetApp cache engine(s)? It's probably safe to assume that the NetCache box(es) are dedicated to SENSIS, as they are inside the Telstra COLO space, where nothing is shared. Anything is possible, but let's say they are not shared: so dedicated NetCache(s) at the front, but what's that NetCache crap actually "caching"?

Well, our good old pals at Netcraft often get this right:

http://uptime.netcraft.com/up/graph/?host=www.sensis.com.au


This gives us:

OS: NT4/Windows 98 | Server: Apache | Date: 12-Jul-2004 | IP: 203.36.59.8 | Netblock owner: Telstra Internet



Let's get whatever is listening on port 80 to tell us what it thinks it is:

dez$telnet www.sensis.com.au 80
Trying 203.36.59.8...
Connected to www.sensis.com.au.
Escape character is '^]'.
GET /index.html HTTP/1.0

HTTP/1.0 403 Forbidden
Date: Mon, 26 Jul 2004 13:22:47 GMT
Content-Length: 1378
Content-Type: text/html
Server: NetCache appliance (NetApp/5.5R4)
Connection: keep-alive
dez$


yep, it's NetApp caches alright, version 5.5R4 apparently!?


now given that 203.36.59.8 is the virtual IP for the front end of the most likely fault tolerant, load balanced, fail over ( heart beat ) interface to the NetCache(s), and Netcraft have data up to as late as the 12th of this month, July 2004, as horrible as it seems, it may be that SENSIS really are running Windows NT ( most likely 2000 Server though ). But I find that hard to believe; it just doesn't seem a smart platform to run the front end of the search engine on.

But a little hunting around and you can quickly find that if they are running Windows NT of some form ( let's assume it's at least 2000 Server or 2003 ), and they are running Apache on Windows, then the Java application server they are running is something like either Orion ( http://www.orionserver.com/ ) or Resin ( http://www.caucho.com/ ) or similar, given that every page served has a .do extension.

So we're talking about a Windows platform, so it's Intel 100%; at least it's Apache, a Java apps server like Orion or Resin or similar, and most likely a portal engine to manage the content, with the search plugin from Inktomi, either going to a private searchable index kept locally or more likely a virtual private index crawled by FAST for them.

Well if it really is Windows, at least someone was clueful and savvy enough to realise that running IIS was going to sink the boat pretty quickly, so they have gone with Apache, which, as long as they have dealt with the recent SSL issues ( security problem with Apache mod_ssl under Windows ), means they are in good shape, because lord knows there's no way you're going to secure a Windows box running IIS!

Info on the SSL issue at:
http://news.netcraft.com/archives/2004/07/22/surge_in_scans_seeking_ssl_servers.html

But for the life of me I can't see them running this on NT. That's just so wrong, but I guess folk like WiseNut built a half decent engine on Windows 2000 server before they gave up, sold out to Looksmart, and now just suck Google pond scum for results..

I can't get my head around the idea that sensis is running their front end on Windows; it just seems wrong. Perhaps they are running the NetCache on a Windows platform, but that seems odd too, as Telstra are so big and cash rich they just buy more yellow boxes, and by default the yellow boxes come running Unix or their own hacked up Linux ( remember NetCache came from the Harvest project et al ).

And then I remembered seeing this ( http://www.computerworld.com.au/index.php/id;443172121;fp;16;fpid;0 ) around the traps so they are talking about linux and more importantly Unix and their existing Sun One environment - but hey, who can tell these days.

hmm... bored now, but that was fun..

Anyone else want to add to this?

Dez 2.0