The Search Engine Professionals at Rank for $ales.com --- In business since 1997.


Clustering technology used at Google

October 12, 2004

Google just gave a sneak preview of its next steps to improve Internet search, and clustering technology played a critical role.

During a panel discussion of research lab leaders at the Web 2.0 conference here, one of Google's top researchers previewed the search company's work in clustering both entities and words as a way to better glean users' intentions and distill information on the Web.

Another area of Google's research is statistical machine translation, which automatically translates Web pages into other languages, said Peter Norvig, director of search quality at Google.

"[We're] trying to go just beyond keywords and the linking structure of the Web, the innovation that we brought to search, and get behind the deeper meaning," Norvig said during his presentation.

In clustering, Norvig demonstrated a six-month-old project called "named entities abstraction," where Google's researchers are analyzing the company's large Web index to extract entities—such as the name of a company—from the structure of content and then decipher their relationship to one another.

For example, Norvig said, researchers are looking for ways to break down sentences by looking for a phrase like "such as" and grabbing the names that follow it. The goal is to not only pull out the name but also its clusters, so that a name such as "Java" can be associated both with the computer language and with language in general, Norvig said.
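The "such as" idea can be illustrated with a short Python sketch. This is not Google's implementation, which the article does not describe; the pattern, the function name extract_entity_categories and the sample sentences are assumptions made up for illustration. It treats the single lowercase word before "such as" as a category and the capitalized names after it as entities.

```python
import re
from collections import defaultdict

# Assumed shape of a name: one or more capitalized words, e.g. "Java" or "Sun Microsystems".
NAME = r"[A-Z]\w*(?: [A-Z]\w*)*"
# Assumed separators inside a list of names: ", ", ", and ", ", or ", " and ", " or ".
SEPARATOR = r"(?:, (?:and |or )?| and | or )"

# "<category> such as <Name>, <Name> and <Name>"
SUCH_AS = re.compile(
    rf"\b(?P<category>[a-z]+) such as (?P<names>{NAME}(?:{SEPARATOR}{NAME})*)"
)
NAME_SPLIT = re.compile(SEPARATOR)


def extract_entity_categories(text):
    """Map each name found after 'such as' to the set of categories it followed."""
    categories = defaultdict(set)
    for match in SUCH_AS.finditer(text):
        for name in NAME_SPLIT.split(match.group("names")):
            categories[name].add(match.group("category"))
    return dict(categories)


if __name__ == "__main__":
    sample = (
        "Sun promotes languages such as Java, Python and Perl. "
        "Coffee comes from islands such as Java and Sumatra."
    )
    for name, cats in sorted(extract_entity_categories(sample).items()):
        print(name, sorted(cats))
```

In this toy input, "Java" ends up linked to two categories, which is the kind of ambiguity the clusters are meant to expose. A real system would need far more patterns and much better handling of noun phrases.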

"We want to be able to search and find these [entities] and the relationships between them, rather than you typing in the words specifically," Norvig said.

With word clustering, the focus is on making the search engine better at understanding the multiple meanings of a word, Norvig said. Google started working on word clustering about three years ago.

Apropos of the heated U.S. presidential election, Norvig demonstrated a prototype of word clustering with results both for President Bush and for his Democratic contender, Sen. John Kerry.

Bush appeared in clusters for words around "president" and "White House," to name some examples, but the results drew laughter when he also appeared in descriptive categories such as "idiot" and "chimp."

"This is what the Web says, not my opinion," Norvig said following the laughter.

Kerry appeared within groups for "senator" and for his wife, "Teresa Heinz Kerry," as well as for "Bob Kerry," a former senator with whom some people may confuse him.

None of the clustering approaches is publicly available, though Norvig said in an interview following the panel that they may become Google Labs betas in the future.

Google Labs often publicly prototypes features and services that sometimes become new offerings. News alerts and Google's local search are among the labs' graduates.

"Certainly one application for clusters is in results pages, and it may be something we do at some time," Norvig said in the interview.

A growing number of search startups have targeted the automatic clustering of search results. Vivisimo Inc., one of the best-known of these startups, recently launched the Clusty search site, which groups results gathered from other search engines into clusters, or categories, as a way of drilling down into results.
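As a rough illustration of what grouping results into clusters can mean, here is a minimal Python sketch. The article does not say how Vivisimo or Google cluster results, so the approach below (greedy grouping by Jaccard word overlap), the function names, the stopword list and the threshold are all assumptions chosen for brevity.

```python
import re

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "on", "to", "is"}


def tokens(snippet):
    """Lowercased content words of a result snippet, as a set."""
    return {w for w in re.findall(r"[a-z]+", snippet.lower()) if w not in STOPWORDS}


def jaccard(a, b):
    """Share of words the two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_results(snippets, threshold=0.25):
    """Greedily put each snippet into the most similar existing cluster,
    or start a new cluster when nothing reaches the threshold."""
    clusters = []  # each cluster: {"words": set of words, "snippets": list of str}
    for snippet in snippets:
        words = tokens(snippet)
        best, best_score = None, threshold
        for cluster in clusters:
            score = jaccard(words, cluster["words"])
            if score >= best_score:
                best, best_score = cluster, score
        if best is None:
            clusters.append({"words": set(words), "snippets": [snippet]})
        else:
            best["words"] |= words
            best["snippets"].append(snippet)
    return [cluster["snippets"] for cluster in clusters]


if __name__ == "__main__":
    results = [
        "Java programming language tutorial",
        "Learn the Java programming language",
        "Java island travel guide Indonesia",
        "Travel guide to the island of Java",
    ]
    for group in cluster_results(results):
        print(group)
```

With these four made-up snippets for the query "Java", the programming results and the travel results land in separate groups. The outcome depends on the order of the results and on the threshold, one reason simple approaches like this are fragile.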

While it might make sense for startups to deploy clustering technology today, Norvig said, Google still views the technology as too immature. It is useful for only a small percentage of search results, he said, so Google is focusing on improving the technology and broadening its usefulness.

"Our take is that the state of the art is not there yet," Norvig said.

With machine translation, Google is bringing to bear its formidable Web index—which at last count included 6 billion documents, images and items—as well as its computing resources. Google is well-known for having one of the largest clusters of Linux-based servers, which number in the thousands.

Google already provides a Web-page translation feature, but Norvig said it is based on technology from a third party. Google's own research project is based on homegrown technology that eventually could translate Web pages and links more automatically, he said.

Source: SF Gate.com



Copyright © Rank for Sales 2003
