PageRank is defined as follows: We assume page A has pages T1…Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. There are more details about d in the next section. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows: PR(A) = (1-d) + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn)) – Sergey Brin, Lawrence Page; The Anatomy of a Large-Scale Hypertextual Web Search Engine; 1998.
Google is one of the smartest search engines. It was created by Stanford University students Sergey Brin and Larry Page, and retains a well-thought-out structure and methodology. The site runs on thousands of linked PCs distributed across centres around the world, and usually provides relevant results with very fast response times.
Google uses a page ranking system to determine which sites are returned at the top in response to a search query. In Google’s words:
In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote.
This method ensures that the pages most highly recommended by other pages are returned closest to the top of a search listing. This technique is really a kind of centralized analysis of peer-to-peer data, and works quite well.
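The voting idea can be made concrete with a small sketch of the formula quoted at the top of this page. This is an illustration only, not Google's implementation: the function name `pagerank`, the three-page `toy_web` graph, the starting rank of 1.0, and the fixed iteration count are all assumptions chosen for the example.

```python
def pagerank(links, d=0.85, iterations=50):
    """Repeatedly apply PR(A) = (1 - d) + d * sum(PR(T) / C(T)).

    links maps each page to the list of pages it links to (no duplicates).
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    # Assumption: every page starts with the same rank of 1.0.
    ranks = {page: 1.0 for page in pages}

    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            # Each page T linking here passes on PR(T) / C(T),
            # where C(T) is the number of links going out of T.
            incoming = sum(
                ranks[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - d) + d * incoming
        ranks = new_ranks
    return ranks


# Hypothetical toy web: A links to B and C, B links to C, C links to A.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, page C collects votes from both A and B, so it ends up ranked above them. Page A is ranked above B even though each has a single incoming link, because A's vote comes from the highly ranked C while B receives only half of A's vote.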
The Google advanced search page provides a menu-driven way to search the Internet. The text equivalents of its features are listed below:
Search Features

| Function | Query Example | Results |
|---|---|---|
| Boolean | +space +mars -venus | “space” and “mars” but not “venus” |
| | space OR mars | “space” or “mars” |
| | space and (mars OR venus) | “space” and either “mars” or “venus” |
| Phrases | +electric +"fastest car" | includes phrase “fastest car” |
| Fields | allinurl:garden rose | “garden” and “rose” in the URL |
| | gardens filetype:pdf | searches only PDF format files |
| | inurl:garden | “garden” in the URL |
| | link:livinginternet.com | pages that link to “livinginternet.com” |
| | related:livinginternet.com | pages that are related to “livinginternet.com” |
| | site:garden | “garden” in the site domain name |
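As a rough illustration of how the operators listed above combine into a single query string, here is a minimal sketch. The helper `build_query` and its parameter names are hypothetical and are not part of any Google API. It only assembles text in the syntax shown on this page.

```python
def build_query(required=(), excluded=(), phrases=(), any_of=(), fields=None):
    """Compose a query string using the operators listed on this page."""
    # +word requires a term, -word excludes it.
    parts = [f"+{word}" for word in required]
    parts += [f"-{word}" for word in excluded]
    # Quoted text is matched as an exact phrase.
    parts += [f'+"{phrase}"' for phrase in phrases]
    # Alternatives are grouped in parentheses and joined with OR.
    if any_of:
        parts.append("(" + " OR ".join(any_of) + ")")
    # Field operators such as site:, inurl:, or filetype: take the form name:value.
    for name, value in (fields or {}).items():
        parts.append(f"{name}:{value}")
    return " ".join(parts)


print(build_query(required=["space", "mars"], excluded=["venus"]))
print(build_query(required=["electric"], phrases=["fastest car"]))
print(build_query(required=["gardens"], fields={"filetype": "pdf"}))
```

Keeping each operator in its own argument makes it harder to forget a prefix or a closing quote when queries are generated by a program.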
Other Google search options are described below.
- Languages. You can set your page display preferences to a wide range of languages, and search in international language sets.
- SafeSearch. Provides a safe search option in which most adult content pages are filtered out and not returned in search results.
- Wildcards. Google doesn’t support wildcards, so instead of searching for “comed*”, for example, you need to search for “+comedy +comedian”.
Resources. Additional Google-related resources are listed below:
- Google Groups — Provides free read access to an excellent Usenet newsgroups archive, and provides posting capability with registration for an account
- Google Image Search — A search function for images
- Google Zeitgeist — Trends and patterns in the use of Google
- johnny.ihackstuff.com — Google searches that display sensitive data mistakenly posted to the web.