|
It’s the beginning of the year, so
you’ll see dozens of “Year in Review” and “Predictions for the Coming
Year” articles about the search engine industry. I have to admit, I was
going to jump on the bandwagon myself, but as I started looking at what
I would say, one thing dominated the future landscape for the next 4
years to such an extent that it made all the other developments pale in
comparison. This tsunami of change will shape and affect every corner of
the business. Search as we know it will be swept away because of it, and
all the search providers we know will scramble to readjust and find
their place in the new landscape. When Microsoft enters search, all else
will become a footnote in the history of the web.
So, forget Yahoo and Google. Sorry, LookSmart and Ask Jeeves, you’ve been
pushed off the front page. Today, the spotlight is on Microsoft, and how
they will likely change the face of web search. In this column, I won’t
be talking about industry impact. Instead, with the help of our Organic
Search wiz, Rob Sullivan, I’m looking at the promise of Microsoft’s
research itself, and what the tool may actually look like.
MS Search...It’s all about Indexing
First, Microsoft is looking to solve a long-standing desktop irritation.
And when they find the answer, it will change the indexing of file
information forever.
The current way of finding files on your computer leaves a lot to be
desired. There has been no single system that effectively searches
content from multiple file formats. To solve that problem, Microsoft is
looking to employ three different technologies. First, to ensure
compatibility, Microsoft will continue to use its NTFS file system.
It will combine that with the indexing capabilities of a SQL Server
relational database and the file labeling potential of XML. The new
system is called WinFS.
WinFS
The problem with current file systems is that they are hierarchical. Files
occupy one single place within a nested pyramid of file folders. But
people don’t tend to think that way. A file may be relevant in a number
of different ways, depending on the context in which you’re looking for
it.
The other problem with hierarchical systems is that they need a librarian.
Someone has to establish and organize the hierarchy. Usually this
organization is established in anticipation of the context in which
you’ll have to reaccess this information.
I know there are people out there who are diligent about filing away
every single document in a well organized file system, but for the 99%
that make up the rest of us, our hard drives are a vast junk yard of old
files, spreadsheets and emails. More often than not, we desperately use
Microsoft’s find file application to try to track down that elusive bit
of information we’re looking for.
The other problem is that there is no good way to quickly search a
number of different file formats for a scrap of information that may be
hidden in one of them.
Microsoft’s new WinFS will work on top of the current NTFS structure,
but it will introduce a dramatic new way of indexing files and their
contents. XML tags will be used to send relevant information to an SQL
database. It will bridge the current gap between indexable structured
data stored in a database and data that has been un-indexable because it
sits in unstructured formats such as Word documents, webpages and email
messages. It also allows users to add “metadata” (identifying tags) to
existing files. For instance, a picture file could include information
about the subject of the picture, or a sound file could include
information about the audio captured.
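To make the idea concrete, here is a minimal Python sketch of XML descriptions feeding a relational index. It is not WinFS and does not use any Microsoft API. The XML layout, the table names and the helper functions are all assumptions invented for illustration, and SQLite stands in for the SQL database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML descriptions of three files in three different formats.
DESCRIPTIONS = [
    '<file path="photos/beach.jpg" kind="image">'
    '<tag name="subject">vacation</tag><tag name="place">Maui</tag></file>',
    '<file path="mail/itinerary.eml" kind="email">'
    '<tag name="subject">vacation</tag></file>',
    '<file path="docs/budget.doc" kind="document">'
    '<tag name="subject">finance</tag></file>',
]

def create_index():
    """Create an in-memory database with one table for files and one for tags."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, path TEXT, kind TEXT)")
    conn.execute("CREATE TABLE tags (file_id INTEGER, name TEXT, value TEXT)")
    return conn

def index_description(conn, xml_text):
    """Parse one XML description and store the file and its tags."""
    root = ET.fromstring(xml_text)
    cur = conn.execute(
        "INSERT INTO files (path, kind) VALUES (?, ?)",
        (root.get("path"), root.get("kind")),
    )
    file_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO tags (file_id, name, value) VALUES (?, ?, ?)",
        [(file_id, tag.get("name"), tag.text) for tag in root.findall("tag")],
    )

def find_by_tag(conn, name, value):
    """Return the paths of all files carrying the given tag, whatever their format."""
    rows = conn.execute(
        "SELECT f.path FROM files f JOIN tags t ON t.file_id = f.id "
        "WHERE t.name = ? AND t.value = ? ORDER BY f.path",
        (name, value),
    )
    return [row[0] for row in rows]

conn = create_index()
for description in DESCRIPTIONS:
    index_description(conn, description)

# One query finds both the photo and the email, with no folder hierarchy involved.
print(find_by_tag(conn, "subject", "vacation"))
```

The point of the sketch is that the query never mentions a folder or a file format. The tags do the work a hand-built hierarchy would otherwise have to do.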
Stuff I’ve Seen
A Microsoft research team has been working on a prototype application
called SIS, or Stuff I’ve Seen. Although its focus is to help users find
files and
information on their desktop, its implications for web search
functionality could be dramatic. It pulls information from multiple file
formats, including emails and webpages, and records them in a single
index. This allows the user to search through them using a powerful
interface that allows for the application of several filters at the same
time. The search process becomes a real-time, iterative process, allowing
the user to quickly narrow down the search to the most relevant
findings.
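The filter-stacking idea can be sketched in a few lines of Python. This is only an illustration of the general approach, not the SIS prototype itself. The `Item` fields, the sample entries and the `search` helper are all made up for the example.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Item:
    """One entry in a single index that mixes emails, webpages and documents."""
    kind: str
    title: str
    author: str
    seen: date

# Made-up sample index.
INDEX = [
    Item("email", "Conference travel plans", "dana", date(2003, 11, 3)),
    Item("webpage", "Conference schedule", "organizers", date(2003, 11, 10)),
    Item("document", "Conference slides draft", "dana", date(2003, 12, 1)),
    Item("email", "Lunch on Friday?", "lee", date(2003, 12, 2)),
]

def search(index, text, **filters):
    """Return items whose title contains `text` and that match every filter."""
    hits = [item for item in index if text.lower() in item.title.lower()]
    for field, wanted in filters.items():
        hits = [item for item in hits if getattr(item, field) == wanted]
    # Most recently seen items first.
    return sorted(hits, key=lambda item: item.seen, reverse=True)

# Each added filter narrows the previous result set.
broad = search(INDEX, "conference")
narrower = search(INDEX, "conference", author="dana")
narrowest = search(INDEX, "conference", author="dana", kind="email")
print(len(broad), len(narrower), len(narrowest))
```

Because every format lives in the same index, adding a filter is just one more pass over the current hits, which is what makes the narrowing feel immediate.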
Implicit Query
“Stuff I’ve Seen” gives the user a powerful tool to find files and
information on their desktop. Implicit Query (the link goes to an
interesting PowerPoint presentation prepared by the Microsoft Research
team) goes one step further by continually searching and retrieving
information based on what the user is doing. As the program tracks user
behaviour, it refines its model of what is important and relevant to the
user and filters the search results accordingly. This is an extension of
Microsoft’s Lumiere research, which has modeled the Bayesian logic
behind the current automated assistance functionality.
In an example, a Microsoft researcher was typing an email to a colleague
about an upcoming conference. As she was typing, Implicit Query brought
up presentations, slides and documents prepared for the conference in
its results panel. In another instance, she was preparing an email to
another colleague about a broken link in her group’s website. Before she
was finished, she was shown an unopened email that contained the fix.
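Here is a rough Python sketch of the basic mechanism: treat whatever the user is typing as the query and rank indexed items by how many terms they share with it. It deliberately uses plain word overlap rather than the Bayesian modeling mentioned above. The stopword list, the sample documents and the function names are all assumptions for the example.

```python
import re
from collections import Counter

# A small, hand-picked stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "to", "for", "and", "of", "in", "on",
             "is", "i", "about", "our", "my"}

# Made-up stand-ins for items already in the desktop index.
DOCUMENTS = {
    "slides.ppt": "Conference keynote slides on search personalization",
    "fix.eml": "The broken link on the group website is fixed by updating the URL",
    "budget.xls": "Quarterly budget figures",
}

def terms(text):
    """Count the lowercase words in `text`, ignoring stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(word for word in words if word not in STOPWORDS)

def implicit_query(draft, documents, limit=3):
    """Rank documents by how many terms they share with the text being typed."""
    draft_terms = terms(draft)
    scored = []
    for name, body in documents.items():
        doc_terms = terms(body)
        score = sum(min(count, doc_terms[word]) for word, count in draft_terms.items())
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]

# The user never runs a search; the draft itself is the query.
print(implicit_query("Hi, about the upcoming conference keynote", DOCUMENTS))
print(implicit_query("There is a broken link on our website", DOCUMENTS))
```

A real system would rerun this as the draft changes and would weigh far more signals than shared words, but the shape is the same: the query comes from the user’s activity, not from a search box.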
Memory Landmarks
A third Microsoft project doesn’t hold nearly the same promise for web
search, but it would make an interesting add-on feature. Memory
Landmarks can add historical markers to a list of chronological search
results. For example, if you were searching for articles regarding the
capture of Saddam Hussein, you could sort the list by date and Memory
Landmarks would indicate where on the list the capture took place.
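As a small illustration, the Python sketch below merges a dated list of search results with a list of dated landmark events so that each landmark appears at the right point in the timeline. The article titles and the `with_landmarks` helper are invented for the example, and this is only one plausible way to do it, not a description of Microsoft’s prototype.

```python
from datetime import date

def with_landmarks(results, landmarks):
    """Merge dated results and dated landmarks into one chronological list.

    Both inputs are lists of (date, text) pairs. A landmark that falls on
    the same day as a result is placed before that result.
    """
    tagged = [(when, 1, "result", text) for when, text in results]
    tagged += [(when, 0, "landmark", text) for when, text in landmarks]
    return [(kind, when, text) for when, _, kind, text in sorted(tagged)]

# Made-up article titles; the landmark date is the day of the capture.
results = [
    (date(2003, 12, 10), "Search for former leader continues"),
    (date(2003, 12, 14), "Capture announced in Baghdad"),
    (date(2003, 12, 20), "Analysis: what happens next"),
]
landmarks = [(date(2003, 12, 13), "Saddam Hussein captured")]

timeline = with_landmarks(results, landmarks)
for kind, when, text in timeline:
    marker = "***" if kind == "landmark" else "   "
    print(marker, when.isoformat(), text)
```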
What Will MS Search Look Like
I think the above prototype applications give us some real clues as to
what Microsoft Search will look like. As Microsoft works on the new
Longhorn OS, we have to remember:
- As Microsoft works on ways to index and search files locally, it’s a
  logical extension to apply the new technology to web search.
- Longhorn’s Indigo makes a major move away from object-oriented
  programming towards web services. There will be a much richer and deeper
  exchange of information between your local computer and web service
  sites. This allows for much greater localization in search tools.
- Microsoft has a long history of incorporating what were third-party
  standalone applications into their applications and operating systems.
  They have already identified search as one of the key activities people
  do online.
- Microsoft’s ASI (Adaptive Systems and Interaction) research department
  is working to make their systems more intuitive and intelligent by
  letting them learn how the user works and adapt themselves accordingly.
- Microsoft is working on desktop applications that will dramatically
  change how people launch searches for information.
Given all this, here is what I believe Microsoft Search will eventually
look like:
Microsoft will use WinFS as the basis for eventually indexing every
document on the web. Remember, because it’s integrated at the OS level,
it will be native to every Microsoft IIS server on the internet. It gets
around the current problem of the “invisible” web by allowing web
publishers to include metadata to allow for quicker indexing. Its SQL
foundation will make indexing of database-held information quick and
transparent, as was shown when legal publisher LexisNexis™ allowed
Microsoft to index a portion of their huge database.
This common indexing procedure will erase the dividing line between
desktop searches and web searches. The entire web will be accessible
from the Microsoft search sidebar. What’s more, the next evolution of
“Stuff I’ve Seen” and “Implicit Query” will monitor what you’re working
on and provide suggested information sources and files from both your
desktop and the web.
If a user wants to launch a manual search, the current trial and error
method of search (try a search, check the results, refine the query and
try again if you don’t find what you’re looking for) will become much
quicker and more powerful with an interface that allows for real-time
updating of results as filters are applied and parameters are tweaked.
I’m willing to bet that Microsoft will also unveil leading-edge natural
language query technology that will mine web data based on interpreted
concepts and not the current structured query method used on most search
engines. By the time Microsoft Search is unveiled, I believe a more
intuitive search interface will be standard on all the leading search
portals.
Search functionality will eventually be integrated into every Microsoft
application, much as the ubiquitous Office Assistant (I can’t tell you
how much I hate that damned paper clip... until I need him) is now.
Microsoft will be able to capitalize on this by selling sponsored search
suggestions that will also be offered via the implicit query channel.
For instance, if you’re writing an email about an upcoming business trip
to New York, the Microsoft search pane will offer airfare and hotel
specials, as well as suggestions of things to do while in New York.
Microsoft will be able to monitor everything you do. The more you do on
your desktop, the more Microsoft and its applications will learn about
your preferences and priorities. Their ASI research will allow them to
adapt search functionality and personalize it just for you. So the
search results you see won’t be the same as everyone else’s.
Personalization will move beyond just geographic location to take into
account the types of sites you tend to visit, business priorities, your
typical workday activities and even your lifestyle interests. Big
Brother lives, and his name is Microsoft!
Finally, Microsoft’s Indigo feature in Longhorn will remove the
distinction between server side tasks and client side tasks. Therefore,
local indexes will be utilized whenever possible to increase search
performance and the options for personalization. The line between your
desktop and the internet will become more and more indistinct as time
goes on.
Implications for the Real World
Today, I just wanted to focus on what Microsoft’s search could look
like. In writing this, I kept asking the same question, “Boy, I wonder
what this means for Google?” Obviously, the gang at Google is very aware
of the impending threat of Microsoft search. So, in the next NetProfit,
I’m going to ask our head of Organic Search, Rob Sullivan, to join me
for a little brainstorming. We started chatting today by the water
cooler and he has some very interesting theories. Stay tuned! |