Google counts more than 1 trillion unique Web URLs

July 27, 2008, 02:22 PM, IDG News Service

In a discovery that would probably send the Dr. Evil character of the "Austin Powers" movies into cardiac arrest, Google recently detected more than a trillion unique URLs on the Web.

This milestone awed Google search engineers, who see the Web grow by several billion individual pages every day, company officials wrote in a blog post Friday.

In addition to announcing this finding, Google took the opportunity to promote the scope and magnitude of its index.

"We don't index every one of those trillion pages -- many of them are similar to each other, or represent auto-generated content ... that isn't very useful to searchers. But we're proud to have the most comprehensive index of any search engine, and our goal always has been to index all the world's data," wrote Jesse Alpert and Nissan Hajaj, software engineers in Google's Web Search Infrastructure Team.

It had been a while since Google had made public pronouncements about the size of its index, a topic that routinely generated controversy and counterclaims among the major search engine players years ago.

Those days of index-size envy ended when it became clear that most people rarely scan more than two pages of Web results. In other words, what matters is delivering 10 or 20 really relevant Web links, or, even better, a direct factual answer, because few people will wade through 5,000 results to find the desired information.

It will be interesting to see if this announcement from Google, posted on its main official blog, will trigger a round of reactions from rivals like Yahoo, Microsoft and Ask.com.

In the meantime, Google also disclosed how, and how often, it analyzes these links.

"Today, Google downloads the web continuously, collecting updated page information and re-processing the entire web-link graph several times per day. This graph of one trillion URLs is similar to a map made up of one trillion intersections. So multiple times every day, we do the computational equivalent of fully exploring every intersection of every road in the United States. Except it'd be a map about 50,000 times as big as the U.S., with 50,000 times as many roads and intersections," the officials wrote.
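To make the map analogy concrete, here is a minimal sketch in Python of what "exploring every intersection" means on a link graph. It is not Google's method, which the blog post does not describe. The toy graph `LINKS`, the function `explore`, and the helper `implied_us_intersections` are all hypothetical names made up for illustration. The sketch uses a plain breadth-first traversal and also works out the figure the quote implies: one trillion URLs divided by 50,000 suggests a U.S. road map of roughly 20 million intersections.

```python
from collections import deque

# Hypothetical toy link graph: each key is a page, each value lists the pages it links to.
LINKS = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": [],
    "e": ["a"],  # links into the graph but is not reachable from "a"
}


def explore(graph, start):
    """Breadth-first traversal: return pages in the order they are first reached."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        page = queue.popleft()
        order.append(page)
        for target in graph.get(page, []):
            if target not in seen:  # visit each "intersection" only once
                seen.add(target)
                queue.append(target)
    return order


def implied_us_intersections(total_urls=10**12, scale=50_000):
    """Arithmetic implied by the quote: a map 50,000 times as big as the U.S."""
    return total_urls // scale


print(explore(LINKS, "a"))         # ['a', 'b', 'c', 'd']
print(implied_us_intersections())  # 20000000
```

Page "e" never appears in the output because nothing reachable from "a" links to it, which is one reason a crawler working from known links can only count the URLs it has discovered.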
