SearchEngineWatch joins the link counting fray

Danny Sullivan is skeptical about the accuracy of the result counts from Google and Yahoo that Tristan Louis used in two studies. Those studies concluded that Yahoo has better coverage of blogs than Google, which in turn has better coverage than Technorati. Danny posted an email conversation with Tristan about the studies. It’s a little hard to follow the lines of argument, but it’s well worth reading because it illuminates how hard it is to get a handle on the search giants’ index sizes, and especially their blog coverage.

Danny, from his exchange with Tristan:

Also, Google did say “of about” with the numbers it reports. That’s not an accident. They’re saying that this is an estimate. But no disagreement with me. If you put up a count, it would be nice if the count was as accurate as possible. Google’s have come under question.

Hmm. From what I’ve seen in Tristan’s data and my own testing, it’s Yahoo’s counts that ought to come under question, specifically for link: queries.

Danny to Tristan again:

The link: command is completely different than the site: command. The link command tells you nothing about the size of the index. As for a confirmation that all links aren’t reported, this past blog post from SEW gives you confirmation and this page on Google mentions links are only a sampling of what Google knows although this other Google page fails to make this clear.

link: and site: are very different, that’s true enough. And maybe the link command doesn’t tell you much about the size of an index, but if link collection methods are similar between Yahoo and Google (and why wouldn’t they be, it’s a relatively easy part of the whole game), then the counts ought to be similar. But they’re not, not by a long shot.
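To make "the counts ought to be similar" concrete, here is a minimal sketch of how one might compare link: counts from two engines blog by blog. The function name and the numbers are made-up placeholders for illustration, not Tristan's data or anything I measured.

```python
from statistics import median

def count_ratios(counts_a, counts_b):
    """Return counts_a / counts_b per blog, for blogs with a nonzero count in counts_b."""
    ratios = {}
    for blog, a in counts_a.items():
        b = counts_b.get(blog)
        if b:  # skip blogs that are missing from counts_b or have a zero count
            ratios[blog] = a / b
    return ratios

# Hypothetical placeholder counts, NOT real results from any search engine.
yahoo_counts = {"blog-a.example": 9000, "blog-b.example": 400, "blog-c.example": 50}
google_counts = {"blog-a.example": 3000, "blog-b.example": 100, "blog-c.example": 50}

ratios = count_ratios(yahoo_counts, google_counts)
typical_ratio = median(ratios.values())
print(typical_ratio)
```

If link collection really were similar across engines, the typical ratio should sit near 1. A median far from 1 across a hundred blogs would suggest that at least one engine's estimates are off, or that it reports only a sample.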

By the way, a big thanks to Tristan for posting his studies and kicking off this discussion. Most of us don’t take the time to do analysis of that depth to support our opinions, let alone post the entire method and dataset so others can reproduce it, shoot holes in it, or go off on tangents from it.

(I stumbled onto Danny’s post via John Battelle)

No Comments

What’s up with Yahoo’s link count estimates?

Dave Sifry is chiming in on some analysis done by Tristan Louis about how well Google, Yahoo and Technorati are covering the blogosphere. Briefly, here’s what Tristan did: He ran link: queries on Google, Yahoo and Technorati for the blogs in the Technorati Top 100 and recorded the number of results reported by each search engine. For example, taking BoingBoing, the 1st blog on that list:
Read the rest of this entry »


Fallows on getting answers

Great column on the state of search by James Fallows in today’s New York Times (online version here), entitled “Enough Keyword Searches, Just Answer My Question”. Fallows doesn’t mince words.


The nature of blog search

The new Technorati is beautiful. The UI is beautifully conceived and lavishly rendered, and completes the integration of tags and photos with search that Technorati has been working on for some time. It strikes me as the first of its generation of blog search engines that has fully grown up to be what it wants to be, and the UI implementation is head and shoulders above its peers. And yet, when you use it, you have the feeling of opening the door to an overstuffed closet. There’s a lot of stuff that comes tumbling at you.

The presentation reflects some real qualities of the blogosphere: In aggregate, the blogosphere is noisy, diverse, urgent, in-your-face, gah! Technorati gets across the busy-ness of the blogosphere of the last few hours, where bloggers continuously decant their paragraphs and photographs into the teeming “world live web”, as Technorati used to call it. Is this the best way to do blog search? Should blog search be a megaphone or an earphone? Should it be an amplifier, a repeater, a filter, or a tuner? Some of each? Something else entirely? A purple frog?


Blog Search news roundup

Peter Caputa has a concise roundup of blog search news from this week.


Google’s “Secret Lab”? Ho-hum.

Henk van Ess makes a dramatic show of scooping a story about a Google “Secret Lab”, which consists of an army of students worldwide who rate Google search results and new features using an eval UI. Ho-hum. I’m not steeped enough in Googlemania to know whether this is some kind of scandal, e.g. whether Google has claimed that it doesn’t use human raters. Every search company needs something like this, appropriately scaled for its content and audience, of course.

This flash movie shows some screenshots of what’s purported to be the Google eval UI. More or less what you’d expect, but not as nice as some others I’ve seen …


Blogebrity

Now that’s funny. The Blogebrity site is an entry in this contest to create an effective viral marketing site. I dunno, maybe this kind of thing is even instructive to some people, but a little satire is good enough for me. I’ll leave it to others to cover the marketing lessons in this.

(Via Micro Persuasion).


MindSet, intent-driven search from Yahoo! Research

This is very interesting. Yahoo Research MindSet is a search UI with a slider that lets you indicate your “intent” by moving it between shopping on one end and research on the other. It then re-ranks the search results accordingly. Works pretty well. Try a search for shoes or search for wind surfing.
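For the curious, here is a rough sketch of how a single-axis slider like this could re-rank results. To be clear, this is my guess at the general idea, not Yahoo's actual method: the `rerank` function, the per-result `commercial` score, and the example URLs are all hypothetical.

```python
def rerank(results, slider):
    """Re-rank results by intent.

    results: list of (url, relevance, commercial) tuples, both scores in [0, 1].
    slider:  0.0 means pure shopping intent, 1.0 means pure research intent.
    """
    def score(item):
        _, relevance, commercial = item
        # High for commercial pages when the slider is near 0,
        # high for non-commercial pages when the slider is near 1.
        intent_match = (1 - slider) * commercial + slider * (1 - commercial)
        return relevance * intent_match

    return sorted(results, key=score, reverse=True)

# Hypothetical results: equally relevant, differing only in how commercial they are.
results = [
    ("shop.example/shoes", 0.8, 0.9),
    ("wiki.example/shoe-history", 0.8, 0.1),
]

shopping_order = [url for url, _, _ in rerank(results, slider=0.0)]
research_order = [url for url, _, _ in rerank(results, slider=1.0)]
print(shopping_order[0], research_order[0])
```

The appeal of one axis is that a single number is enough to blend the two extremes, which may be part of why the effect feels so transparent.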

The whole effect is surprisingly transparent, probably because it’s a single axis and a fairly natural one for web users at that. I’m not sure a slider (vs. a couple of radio buttons to represent each extreme of the spectrum) is the right UI presentation, but the slider’s a fun toy for the research guys at Yahoo!, I’m sure.

(Via Geeking with Greg, who includes a characteristic dismissal of anything not invented at Findory. I’m with him on the sliders, although I’m not against tuning knobs of all kinds, period.)

I think search has to get smarter, and I think users will know what to do with a few judiciously chosen and well-implemented control knobs, at least until some kind of consensus on a new “ideal search engine” emerges, a decade or so from now (if ever).

My car has a setting for the wipers that moves the wipers only when a sensor on the windshield detects a certain amount of moisture. I use that sometimes, and it works pretty well, but it’s not perfect and there’s also a manual control that activates the wipers at fixed intervals, a feature which I don’t think is going away anytime soon. The knobs currently available (collectively known as “advanced search”, e.g. date ranges and all words vs. any of the words) in search engines are not intuitive and they certainly fail the “grandma test”. But a few knobs that more intuitively allow the user to guide the search will be well received, I think. Yahoo! is going in the right direction here.

But while the shopping vs. research axis of intent is useful (it cuts down results spam, for now, and folds a Froogle-like tab into the main search results), it’s only the Flatland of intent, a modest beginning. The next trick would be to accommodate additional axes of intent (maybe whole hyperspaces of intent?), without giving up the transparency and intuitiveness of the UI.


Bloglines search coming this summer

Stephen Baker @ Businessweek:

The CEO of Bloglines (now a division of AskJeeves) says that his company will release a blog search engine this summer which will surpass the likes of Technorati, Feedster, and PubSub. “The challenge,” he says, “is to create world-class blog search, which we don’t think exists now.”

Of course, lots of companies, big and small, are chasing that vision. Fletcher says that with improved search, Bloglines will lead users to the relevant blogs, and then help them organize all the feeds pouring onto their desktop. He sees the technology automatically grouping the feeds, or perhaps ranking them according to the user’s interests (as documented by clicks).

If anyone wants to read the notes from this interview, have at them. And if you find stories or angles there that I should have stressed, let me know.

Via buzzhit! : Bloglines to enter blog search fray this summer:

Not surprisingly (mostly because it was noted at the time of the AskJeeves acquisition of Bloglines), BusinessWeek is reporting (via an interview with Mark Fletcher) that Bloglines intends to enter the blog search fray this summer, taking on PubSub, Technorati, Feedster and the ever improving BlogPulse.

Could MSN, Google and Yahoo be that far behind? Unlikely.


The smallest flying web server in the world

linuxdevices.com:

Researchers at the University of Essex are using Linux and tiny embedded computer modules to build fleets of unmanned aircraft that fly in flocking formations like birds, while performing parallel, distributed computing tasks using Bluetooth-connected Linux clustering software…

(Via boingboing)
