Comment: Dave Lemen (Sep 28, 2006)
Thanks for posting these slides! We're currently developing some search analysis tools and I think we'll benefit from some of the ideas in your presentation. I wish I could have attended the seminar to hear it live. (And I hope the storm that came through during happy hour didn't ruin the evening for you.)
Hanging over the reception desk at Google headquarters is a scrolling display of searches pulled from the Google.com logs. We implemented a Web page like that using an Atom feed of the most recent queries, and an AJAX page that polls that feed and does the scrolling. We've found that simply watching that stream for a few minutes can be very educational, and offers a view of the data that's more useful than simply looking at the N most popular queries (which are all navigational). I'd recommend something like that for any enterprise search implementation.
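To make the polling idea concrete, here is a minimal sketch of the feed-reading half in Python (the page described above did this in the browser with AJAX). It assumes a hypothetical feed layout in which each Atom entry carries the query text in its title and a unique id; the names `parse_recent_queries`, `new_queries`, and `SAMPLE_FEED` are illustrative and not taken from the actual implementation.

```python
import xml.etree.ElementTree as ET

# Standard Atom 1.0 namespace (RFC 4287).
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample feed: one entry per logged query (assumed layout).
SAMPLE_FEED = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Recent queries</title>
  <entry><id>q-101</id><title>vacation policy</title></entry>
  <entry><id>q-102</id><title>expense report form</title></entry>
</feed>
"""


def parse_recent_queries(atom_xml):
    """Return (entry_id, query_text) pairs from an Atom document, in feed order."""
    root = ET.fromstring(atom_xml)
    pairs = []
    for entry in root.findall("atom:entry", ATOM_NS):
        entry_id = entry.findtext("atom:id", default="", namespaces=ATOM_NS)
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        pairs.append((entry_id, title.strip()))
    return pairs


def new_queries(atom_xml, seen_ids):
    """Return queries whose entry ids are not yet in seen_ids, and record them."""
    fresh = []
    for entry_id, query in parse_recent_queries(atom_xml):
        if entry_id not in seen_ids:
            seen_ids.add(entry_id)
            fresh.append(query)
    return fresh
```

A poller would fetch the feed every few seconds, call `new_queries` with the same `seen_ids` set each time, and append whatever comes back to the scrolling display, so that repeated polls of an unchanged feed add nothing.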
Comment: Lou (Sep 30, 2006)
Dave, thanks for the note (and the happy hour soldiered on, successfully, despite the rain). Would love to see more of your work if it's ready/available for public consumption.
Comment: Walter Underwood (Oct 3, 2006)
Hmm, I disagree with several of the conclusions in this presentation. Some are small issues, like using logs to build a controlled vocabulary. It isn't exactly "controlled" if it is built from logs. That is a "found" vocabulary.
My most serious disagreement is with the suggestion of date-sorting for the Financial Times. This is certainly not "obvious" to me. Date-sorting relevance results is usually a bad idea and would be the last thing I'd recommend. Date-sorting will take recent results which are barely relevant and push them to the top. Not useful at all.
In addition, the customers have to know that date-sorting exists, decide that it might be useful in this case, and use it. Very unlikely.
Here are some solutions that are more useful and more usable than a date-sorting option:
* extract the date and restrict results to that automatically
* if the dates are recent, use a recency factor in the relevance
* offer "by year" narrowing links on all results pages
* make sure the date information is searchable as text in the format that people use in queries
Every one of these is simpler and more effective than sorting by date.
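As a rough illustration of the first two suggestions, the sketch below pulls a four-digit year out of a query and applies a multiplicative recency boost to a relevance score. Everything here is an assumption made for the example: the function names, the year-only date pattern, and the half-life and weight values are hypothetical, and a real engine would tune or replace all of them.

```python
import re
from datetime import date

# Matches a standalone four-digit year from 1900 to 2099 (assumed query format).
YEAR_RE = re.compile(r"\b(?:19|20)\d{2}\b")


def extract_year(query):
    """Return (year, remaining_query); year is None if no year is found."""
    match = YEAR_RE.search(query)
    if match is None:
        return None, query
    rest = query[:match.start()] + query[match.end():]
    # Collapse the whitespace left behind by removing the year.
    return int(match.group(0)), " ".join(rest.split())


def recency_boosted(score, doc_date, today, half_life_days=365.0, weight=0.5):
    """Scale a relevance score by a boost that halves every half_life_days.

    The boost only adds to the base score, so an old document keeps its
    original relevance instead of being pushed down by a pure date sort.
    """
    age_days = max((today - doc_date).days, 0)
    decay = 0.5 ** (age_days / half_life_days)
    return score * (1.0 + weight * decay)
```

With this shape, `extract_year("enron earnings 2001")` yields the year to restrict on plus the remaining query text, and the recency boost reorders results only among documents that were already relevant, which is the difference from sorting by date.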
"SA" means "system administrator". Using that for "search analytics" makes the slides *very* hard to read.
Comment spam has forced me to close comment functionality for older entries. However, if you have something vital to add concerning this entry (or its associated comments), please email your sage insights to me (lou [at] louisrosenfeld dot com). I'll make sure your comments are added to the conversation. Sorry for the inconvenience.