Sphere, the related content company I co-founded three years ago, has been acquired by AOL.
I think I logged about 185 hours of sleep and 3 blog posts on remylabs over the last three years, but building Sphere has been really fun, thanks to the best team in the business. We have a small but incredibly great team at Sphere (a photo of our handsome crew is here), and I’d like to congratulate and thank each of them for being a part of this adventure:
Our world-class business team: Jeff Yolen and Josh Guttman.
Our stellar technology team: Alex Bendig, Andy Cabell, Kevin Cowan, Adam Embick, Mike Garfias, Michael Harzheim, Sven Henderson, Troy Vitullo and Jeremy Rice.
And of course my co-founders, Steve Nieker and Tony Conrad, as well as a superb group of investors and advisors including our 4th co-founder and advisor Toni Schneider.
Not strictly part of our team, but not far from it: Some of the best partners and customers (early adopters and others) that a startup can hope for.
I’m very excited to join AOL. Sphere’s content discovery products are a great fit for AOL’s sites and platforms, and I look forward to working with the great people there. We’ve worked with several groups within AOL over the last couple of years, so I know first-hand that the place is chock-full of smart people.
Sphere: Related Content
April 15th, 2008
I stumbled across this profoundly cool tool this week. From the Alloy homepage:
Alloy [is] a simple structural modeling language based on first-order logic. The [Alloy Analyzer] can generate instances of invariants, simulate the execution of operations (even those defined implicitly), and check user-specified properties of a model. Alloy and its analyzer have been used primarily to explore abstract software designs. Its use in analyzing code for conformance to a specification and as an automatic test case generator are being investigated in ongoing research projects.
Or, put another way, you can use Alloy to model a software system, complete with facts and assertions about the model, and have the Alloy Analyzer check correctness of the model. It can discover exception cases allowed by your model that violate your assertions. Now you can do agile software design without diving into code right away (even the most elegant code obscures your model’s abstractions with implementation details, and makes it harder to revise your design), and you can do real modeling without generating stacks of UML diagrams. The model is expressed in the Alloy language, and diagrams are generated dynamically by the Analyzer. Here’s an example from Daniel Jackson’s book on Alloy. This system models a set of traffic lights at a junction:
module chapter4/lights ----- The model from page 127

abstract sig Color {}

one sig Red, Yellow, Green extends Color {}

fun colorSequence: Color -> Color {
    Color <: iden + Red->Green + Green->Yellow + Yellow->Red
}

sig Light {}
sig LightState {color: Light -> one Color}
sig Junction {lights: set Light}

fun redLights [s: LightState]: set Light { s.color.Red }

pred mostlyRed [s: LightState, j: Junction] {
    lone j.lights - redLights[s]
}

pred trans [s, s': LightState, j: Junction] {
    lone x: j.lights | s.color[x] != s'.color[x]
    all x: j.lights |
        let step = s.color[x] -> s'.color[x] {
            step in colorSequence
            step in Red->(Color-Red) => j.lights in redLights[s]
        }
}

assert Safe {
    all s, s': LightState, j: Junction |
        mostlyRed [s, j] and trans [s, s', j] => mostlyRed [s', j]
}

check Safe for 4 but 1 Junction
The last line tells the Alloy Analyzer to check the Safe assertion against every case with up to 4 lights and at most 1 junction, that is, to look for a transition that takes a mostlyRed junction to a state that is no longer mostlyRed. It comes back with:

“No counterexample found” means the model makes safe junctions. 27ms means it’s freakin’ fast. The Alloy Analyzer doesn’t enumerate cases one at a time the way a naive exhaustive checker would; it translates the model into a boolean formula and hands it to a SAT solver, which gives Alloy an efficient way to check huge spaces within the scope you specify. Checking a few billion cases in the modeling phase is a lot cheaper, and gives much better coverage with less effort, than running a few hundred unit tests when you’re already writing code.
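To get a feel for what checking every case in a small scope means, here is a rough Python sketch that brute-forces the same traffic-light model. It is not how the Alloy Analyzer works (Alloy hands the problem to a SAT solver instead of looping), and the function names and the guard_red switch are my own inventions for illustration. The scope is simply "up to N lights, one junction".

```python
from itertools import combinations, product

COLORS = ("Red", "Yellow", "Green")

# colorSequence: a light may keep its color or advance one step in the cycle.
COLOR_SEQUENCE = {(c, c) for c in COLORS} | {
    ("Red", "Green"), ("Green", "Yellow"), ("Yellow", "Red"),
}


def mostly_red(state, junction):
    """At most one light in the junction is not red (Alloy's 'lone')."""
    return sum(1 for x in junction if state[x] != "Red") <= 1


def trans(s, s2, junction, guard_red=True):
    """Mirror of the trans predicate; guard_red=False drops the red rule."""
    changed = [x for x in junction if s[x] != s2[x]]
    if len(changed) > 1:  # at most one light changes per transition
        return False
    for x in junction:
        if (s[x], s2[x]) not in COLOR_SEQUENCE:
            return False
        leaves_red = s[x] == "Red" and s2[x] != "Red"
        if guard_red and leaves_red and any(s[y] != "Red" for y in junction):
            return False  # a light may leave Red only if all lights are red
    return True


def find_counterexample(max_lights, guard_red=True):
    """Try every junction and state pair; return a violation of Safe or None."""
    for n in range(max_lights + 1):
        lights = range(n)
        states = list(product(COLORS, repeat=n))
        junctions = [j for k in range(n + 1) for j in combinations(lights, k)]
        for j in junctions:
            for s in states:
                if not mostly_red(s, j):
                    continue
                for s2 in states:
                    if trans(s, s2, j, guard_red) and not mostly_red(s2, j):
                        return j, s, s2
    return None
```

find_counterexample(4) returns None, in line with the Analyzer's verdict. Call it with guard_red=False and it returns a junction where a second light turns green while another is still yellow, which is exactly the kind of hole you want a tool to find for you.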
Here is the diagram generated by Alloy for the small model above:

One slightly annoying thing is that the book uses Alloy 3 syntax, which changed a little in Alloy 4, the current version. The Alloy team has published a thorough guide (“how to update the book for Alloy 4”) that lists every change and the page number on which it occurs.
Alloy is free and there are binaries available for OS X, Windows and Linux.
April 26th, 2007
FORTUNE has an article (“Blogging for Dollars”) that covers Umbria, a company based here in Colorado that tracks what bloggers are saying about its clients (aka mining blogs for market intelligence).
Economically, this market is finally starting to take shape: the ideas and attempts have been out there for a few years, but consumer companies have been on the fence about whether the blogosphere is worth listening in on. Until recently, that is. Umbria claims they’ll have $2M in revenue this year and will be profitable next year, but the overall market for this kind of service is still only $20M according to the article (Intelliseek has about a third of that market).
Technologically, Umbria also sounds pretty interesting. They claim to have a competitive edge in automating most of the process:
Umbria’s solution is entirely software-based. [Umbria’s] competitors also meet with clients to interpret the data and suggest strategic responses. “Ultimately we rely on both technology and humans for analysis,” says Max Kalehoff, marketing director for BuzzMetrics [another Umbria competitor]. “Umbria takes an extremely automated approach.”
Umbria’s technology sounds like a pipeline of parsers that generates features that in turn drive product and sentiment classifiers (and those drive reporting):
Every few hours Umbria sends an application called a spider out over the web to scour the blogosphere for postings about the firm’s clients, most of which are big consumer companies, such as Electronic Arts, SAP, and Sprint. By analyzing keywords in blogs, Umbria can classify each citation thematically. In the case of Sprint, for example, Umbria’s software can tell whether a blogger is talking about customer service, the company’s advertisements, or a particular calling plan.
Another big challenge is to decipher what’s on a blogger’s mind. To figure out whether an opinion is strong or tepid, for example, it helps to know that “awesome” is a stronger endorsement than “pretty cool,” and that “shoddy” is less damning than “abominable.” Umbria has several employees with Ph.D.s in linguistics and artificial intelligence who are forever tweaking the software to make it better at categorizing opinions.
I can’t help thinking that more manual tweaking goes into each client’s setup than this description lets on, but still, I’m glad they’re seeing success, and I bet those linguists are having fun with the blogosphere, even if they have to do a bit of slumming to come up with their rules:
The software can also estimate the author’s age and gender. Elongated spellings (“soooooooo”), multiple exclamation marks (!!!), and acronyms such as POS (“parent over shoulder”) suggest a teenage female member of Generation Y (born after 1979). The blogger is probably a teenage boy if a posting is rife with hip-hop terminology such as “aight” (translation: “all right”) and “true dat” (“I agree!”).
There you have it, you don’t even have to know the language to have your voice heard by the people who want to sell you more stuff. Now that’s power. On one side of that function, at least.
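For the curious, the pipeline the article hints at (keywords to themes, a lexicon for opinion strength, surface cues like elongated spellings) can be sketched in a few lines of Python. Everything here is made up for illustration: the keyword lists, the weights and the function name are assumptions of mine, not Umbria’s actual software, which is surely far more involved.

```python
import re

# Hypothetical keyword lists, one per theme (not taken from the article).
THEME_KEYWORDS = {
    "customer service": {"support", "representative", "hold"},
    "advertising": {"ad", "commercial", "billboard"},
    "calling plan": {"minutes", "plan", "roaming"},
}

# Hypothetical strength lexicon: the sign is polarity, the size is intensity.
SENTIMENT_STRENGTH = {
    "awesome": 2.0,
    "pretty cool": 1.0,
    "shoddy": -1.0,
    "abominable": -2.0,
}

ELONGATED = re.compile(r"(\w)\1{3,}")  # same character four or more times
MULTI_BANG = re.compile(r"!{2,}")      # two or more exclamation marks in a row


def extract_features(text):
    """Turn one blog snippet into features a classifier could consume."""
    lowered = text.lower()
    words = set(re.findall(r"[a-z']+", lowered))
    themes = sorted(t for t, kws in THEME_KEYWORDS.items() if words & kws)
    sentiment = sum(
        weight
        for phrase, weight in SENTIMENT_STRENGTH.items()
        if phrase in lowered
    )
    return {
        "themes": themes,
        "sentiment": sentiment,
        "elongated": bool(ELONGATED.search(lowered)),
        "multi_bang": bool(MULTI_BANG.search(text)),
    }
```

Feed it "Sprint support was abominable, soooooooo slow!!!" and you get the customer service theme, a sentiment of -2.0 and both stylistic flags set. The hard part is everything this toy leaves out: negation, sarcasm, and keeping the word lists current.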
December 8th, 2005
Joel Spolsky’s experience with Windows Live was as disappointing as mine. He calls it the Marimba Phenomenon:
The Marimba Phenomenon is what happens when you spend more on PR and marketing than on development. “Result: everybody checks out your code, and it’s not good yet. These people will be permanently convinced that your code is simple and inadequate, even if you improve it drastically later.”
Hadn’t heard it called that before, but it happens often enough, and the name is great: In the mid-90s, the company Marimba trumpeted and eventually launched a product called Castanet that was something like a Java-based push platform (this wasn’t long after, or maybe even concurrent with, PointCast; remember them?). To Java types, and many non-Java types (or Java non-types?), Marimba Castanet sounded foundational, revolutionary, indispensable, and had an aura of universal usefulness, with a hip 90s name to boot. I think the product is still around in a different incarnation. Anyway, the release of Castanet revealed a product that just didn’t live up to the high expectations that had been set. It worked (and probably still works), but it wasn’t foundational, revolutionary, indispensable or universal. So we moved on.
Microsoft has a bit more staying power than Marimba, but as long as there’s choice in the Web marketplace, it can’t afford too many launches going off the rails like this.
November 2nd, 2005
Windows Live looks like Microsoft’s tardy and half-baked answer to My Yahoo! It’s a customizable portal, with placeholders for weather and news and feed subscriptions etc. According to Bill Gates’ announcement today (video at CNET), Windows and Office are not required to use Windows Live. But try it from Safari on your Mac, and you’ll get just a fraction of the page (only an MSN search box). On Firefox (at least on Mac) you’ll see this:
Firefox support is coming soon. Please be patient
You know, a garage startup can maybe get away with this kind of thing. But this is Microsoft! And the announcement was a major event, not some leak of an internal research project. OK, so it works only in IE, and I guess Windows Live is destined to be the home page for millions of unsuspecting users of the next version of IE. But if Microsoft wants the rest of us to pay attention and if it wants to be taken seriously in its efforts to catch up with the new realities of Web-as-Desktop (call it whatever you want, but don’t call it Web 2.0), then it has to demonstrate that it’s a) adding some value — that’s TBD for Windows Live — and b) not going to make a fool of itself by trying to bring its insidious embrace-and-extend practices to Web content. That would be fun to watch, though. Never a dull moment … [nor a productive one].
November 1st, 2005
I like my mail to be portable. When I switched to a Mac earlier this year, I installed Thunderbird and moved (or imported, can’t remember) my existing mailboxes over from my Windows laptop. Thunderbird is perfect for this kind of thing: it’s available on Windows, Linux and Mac, so you can take your mail with you and continue where you left off.
But there have been three things that have been bugging me about Thunderbird for a good 8 months. I’ve faithfully kept up with minor version upgrades, but none of these annoyances have been fixed. Today I installed Thunderbird 1.5 Beta 2 and all three issues are still there. Apparently, they are not going to be fixed in 1.5, so I’m going to grudgingly abandon portable mail and start using the Mac Mail app.
If you’re curious, here are some bug reports of the three annoyances:
#1. You can specify where attachments are to be saved, but if you configure Thunderbird to automatically open attachments of certain types, it ignores the setting of where to save them, and it instead litters your desktop with them. The reason for this has to do with shared code between Firefox and T-bird that reads the download destination from Safari’s configuration. I’m sure that the code commonality between T-bird and Firefox, as well as reading config settings from Safari sounded good as a design goal, but it leads to utterly annoying behavior. There are at least three bugs filed about it, going back to at least March:
Bug 241523
Bug 286094
Bug 238789
#2. The next one is just a plain old bug, but it’s been there a long time (December ‘04):
Bug 274500
#3. The last of my three Thunderbird annoyances is that the ‘new message count’ that’s shown in the dock includes new junk mail, which has already been moved to the junk folder. You can see why this one’s in my top 3 list. There’s no new mail (only junk mail, not in the inbox), yet the dock icon shows N new messages. Grrr. This one’s also been around in bugzilla since December ‘04:
Bug 274688
I hate giving up portability, but seeing that none of these three buggy bugging bugs have been fixed in Thunderbird 1.5 was the last straw. Maybe I’ll try T-bird again after a sabbatical on Mac Mail.
In other news, I also installed Firefox 1.5 Beta 2. So far, so good. No complaints. Had to leave some extensions behind, but they’ll catch up.
October 28th, 2005
Now this is cool. Tagyu is an “auto-tagging” service of sorts, created by Adam Kalsey. You paste in some text (or submit via their REST API) and it suggests tags, using some kind of a similarity metric between your text and already tagged texts in Tagyu’s index (gathered from del.icio.us etc.).
So far, I’ve tried a few different texts, and about half the time the returned tags are great. This is impressive, because this is not an easy problem to solve, but 50% precision is not quite enough for prime time. If someone (sploggers?) unleashes Tagyu to auto-tag a large volume of posts that feed back into the del.icio.us and Tagyu system, that would be detrimental to improving precision of the system, unless you could assign some kind of a score to the quality of tags (yes, that’s a chicken/egg thing).
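Here is a minimal Python sketch of the kind of similarity-based tagging I’m guessing at: score the new text against already-tagged texts with cosine similarity over word counts, then let the nearest neighbors vote for tags. The tiny index, the function names and the parameters k and max_tags are hypothetical; I have no idea what Tagyu actually does under the hood.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a deliberately crude text representation."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_tags(text, index, k=2, max_tags=3):
    """Let the k most similar tagged texts vote, weighted by similarity."""
    query = bag_of_words(text)
    scored = sorted(
        ((cosine(query, bag_of_words(doc)), tags) for doc, tags in index),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter()
    for similarity, tags in scored[:k]:
        for tag in tags:
            votes[tag] += similarity
    return [tag for tag, score in votes.most_common(max_tags) if score > 0]


# A made-up index of (text, tags) pairs standing in for already-tagged posts.
INDEX = [
    ("a service that suggests tags for bookmarks", ["tagging", "tools"]),
    ("recipe for tomato soup with basil", ["cooking"]),
    ("social bookmarks and tags on the web", ["tagging", "web"]),
]
```

With this index, suggest_tags("suggest tags for my bookmarks", INDEX) puts "tagging" first and never offers "cooking". It also shows why feeding auto-generated tags back into the index is risky: every bad tag that lands in the index gets to vote on the next text.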
Maybe we need some kind of a large-scale tag-quality feedback system. Some clever piece of JavaScript that lets you click “this tag is right on” or “this tag is a cruel joke” when reading someone’s blog or feed. Of course, if you’re an idiot at tagging, you’re not going to install that piece of JavaScript. An aggregator might be the best place to do that, where attention.xml lives (eventually).
This is the first service of this kind that I’m aware of, and there are lots of applications of this kind of thing in blog search. There could be an ad-matching app in there, too. And, an intermediate step in Tagyu is matching content to other content (and then to tags). I hope Adam Kalsey keeps up the R&D effort on this. Tagyu has a super-clean looking site. Very nice.
btw, for this post’s text, Tagyu returned the following tags: tagging del.icio.us tools. Looks good to me.
-----
(Via BuzzMachine)
October 12th, 2005
I downloaded a 30-day trial of Filemaker for OS X about 2-3 weeks ago. Had some ideas for a notetaking system with tagging, dynamic cross-linking, flexible querying, stuff like that. I haven’t had time to even unzip the trial, but I’ve already received two phone calls and two follow-up emails from Filemaker sales reps. Don’t they have anything better to do? Isn’t Filemaker selling without this kind of pestering? What’s wrong with these people? If I have trouble with it, I know where to go. If I want to buy a license, I know where to go for that, too.
At this rate, it’s unlikely that I’ll even take the time to install the trial. I’ll keep using MacJournal and see if I can uncover some features that get me closer to what I’m imagining my notetaking application to be. MacJournal is a nice piece of software, and I haven’t been pestered by them once.
September 26th, 2005
If you’re a word nerd or, heck, if you just speak or are learning English, you’ll enjoy The Word Nerds, a weekly podcast by two brothers on topics like The Cold War and Hostile Language, Collective Nouns, The Unnamed Antecedent and segments like The Rude Word of the Week. It’s obvious these guys spend hours preparing for their podcasts. They don’t just sit down and start talking (I won’t name names. Well, maybe in a future post I will). Anyway, The Word Nerds gets the coveted remylabs podcast seal of approval, which means I’ll stay subscribed in NetNewsWire (does OPML, unlike iTunes) as long as they keep ‘casting.
August 4th, 2005
Article in NY Times today, Yahoo is wooing I.B.M. Technical Talent:
Yahoo plans to announce Thursday that it is recruiting scientists who pioneered an advanced search-engine technology at I.B.M.’s Silicon Valley research laboratory.
…
Prabhakar Raghavan, a computer scientist who once led the Clever effort, joined Yahoo last week as head of research. He left I.B.M. in 2000 to become a vice president and chief scientist at Verity Inc., a maker of search and retrieval software for corporations; he was later named chief technical officer.
…
Yahoo offers one of the best opportunities to explore new ideas in search, Mr. Raghavan said
…
One area that will be pursued is new search technologies related to digital media.
It’s been fun to watch Google being forced from the position of category killer to more-or-less evenly matched contestant over the last year or two. There’s a mind-boggling amount of innovation happening in search, which is levelling the playing field for new entrants, but even the stuff we’re seeing now is only the beginning. Search, and other modes of information retrieval, will become even more ubiquitous and integrated than they are now, and we’ll wonder how an OS like Windows without integrated search ever came to dominate a market. The desktop market itself may go away (yes, I’ve been reading Paul Graham’s book Hackers and Painters, which contains this great essay on server-based software from 2001, which is still relevant and engaging, as are his many other essays).
Search is poised to become the great collective memory, and new research being brought to market in real services, along with the availability of public APIs, will speed progress toward that reality. But it won’t be just the extent of information covered by search that will grow, but also interconnectivity of search services and, most importantly, new modes of retrieving information (the only mode now in widespread use is keyword search, which is as old as computer science itself, or much older if you count manual versions such as file cabinets and card catalogs and other manually compiled indexes). I don’t see any reason why search shouldn’t aim to duplicate in software all of the modes in which humans retrieve information in their own brains (by context, by association and so on) or from others, by interactive question answering or guided discovery.
July 28th, 2005