Search Engine Indexing and Page Rank Algorithm

Search Engine Indexing

Search engines like Google maintain huge databases called “indexes” of all the keywords and the web addresses of pages where these keywords appear.
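As a rough illustration of what such an index might look like, here is a minimal sketch of an "inverted index": a dictionary that maps each keyword to the set of web addresses where it appears. The page URLs, the page text and the function name build_index are made up for this example; a real search engine index is far larger and more sophisticated.

```python
import re
from collections import defaultdict

# Hypothetical pages: the URLs and their text are invented for illustration.
PAGES = {
    "https://example.com/cats": "Cats are small pets",
    "https://example.com/dogs": "Dogs are loyal pets",
}

def build_index(pages):
    """Map each lowercase keyword to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

index = build_index(PAGES)
# Both pages mention "pets", so both URLs are listed under that keyword.
print(sorted(index["pets"]))
```

Looking up a keyword in this structure is a single dictionary access, which is why an index of this kind is so much faster than re-reading every page for each query.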

When web designers create a new website, they can ask a search engine to scan their web pages and add them to its index, for example by completing an online submission form (e.g. https://www.google.com/webmasters/tools/submit-url).

Search engines also use “bots” (software robots), called web-bots or spider bots, that crawl the web around the clock (24/7): they scan web pages, update the index and follow hyperlinks to move from one page to another. They can therefore find new pages as long as these are linked from pages that have already been indexed.
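The way a spider bot discovers pages by following hyperlinks can be sketched as a breadth-first traversal. The example below is only a simulation under stated assumptions: the miniature "web" is a hard-coded dictionary (LINKS) rather than real pages fetched over the network, and the page names and the function name crawl are hypothetical.

```python
from collections import deque

# Hypothetical miniature web: each page name maps to the pages it links to.
LINKS = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A", "D"],
    "D": [],
    "E": ["A"],  # no page links to E, so a crawl starting at A never finds it
}

def crawl(start, links):
    """Breadth-first crawl: return pages in the order they are discovered."""
    discovered = [start]   # keeps the discovery order
    seen = {start}         # fast membership test
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                discovered.append(target)
                queue.append(target)
    return discovered

print(crawl("A", LINKS))  # ['A', 'B', 'C', 'D']
```

Page E is never reached because no crawled page links to it, which is why a brand new website may need to be submitted to the search engine or linked from an existing page.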

For each indexed URL, the index also stores a page rank score: a number used to sort search results before they are displayed to the end user.

Page Rank Algorithm

When a user uses a search engine (e.g. Google) the following steps take place:

• A user submits a search query using Google’s search engine.
• Google searches all of the pages/URLs it has indexed for relevant content (based on keywords).
• Google sorts the relevant pages/URLs based on their PageRank scores.
• Google displays a results page, placing the pages/URLs with the highest PageRank score (assumed importance) first.
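The steps above can be sketched in a few lines of code. This is a simplified illustration, not how Google actually works: the index contents, the scores and the function name search are all made up, and "relevant" is assumed to mean "contains every word of the query".

```python
# Hypothetical index and PageRank scores, invented for illustration.
INDEX = {
    "python": {"a.example", "b.example", "c.example"},
    "tutorial": {"b.example", "c.example"},
}
PAGE_RANK = {"a.example": 0.5, "b.example": 0.2, "c.example": 0.3}

def search(query, index, page_rank):
    """Return the URLs containing every query word, highest score first."""
    words = query.lower().split()
    if not words:
        return []
    # Step 2: keep only the pages that contain all of the keywords.
    matches = set.intersection(*(index.get(word, set()) for word in words))
    # Steps 3 and 4: sort the matching pages by page rank score, best first.
    return sorted(matches, key=lambda url: page_rank.get(url, 0.0), reverse=True)

print(search("python tutorial", INDEX, PAGE_RANK))  # ['c.example', 'b.example']
```

Note that a.example has the highest score overall but is not listed for this query, because it does not contain the keyword "tutorial": relevance is checked first, and the score only decides the order.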

So we can define the page rank score as a numerical value, calculated by a search engine for each page on the web to measure its degree of importance.

The page rank of a page is regularly updated when the spider bots of a search engine crawl the web.

Page Rank Formula?

Google does not disclose the exact formula it uses today, and it is a safe bet that the calculation is not simple. The key concept of this formula, though, is as follows:

PageRank for a given page = Initial PageRank + (PageRank of linking page 1 / number of outbound links of linking page 1) + …

Here, each bracketed term is the ranking power passed on by one page that links to your page: that page's PageRank score divided by the number of outbound links it has. (“+…” means repeat this term for every page that links to your page.) The total ranking power your page receives is the sum of these terms.
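To make the arithmetic concrete, here is a small worked example of the simplified formula above. The function name simplified_page_rank and all of the numbers are invented for illustration only.

```python
def simplified_page_rank(initial, inbound):
    """Apply the simplified formula from the text.

    inbound is a list of (score_of_linking_page, its_outbound_link_count)
    pairs, one pair per page that links to the page being scored.
    """
    return initial + sum(score / outbound for score, outbound in inbound)

# Hypothetical numbers: the page starts with a score of 1 and is linked from
# page P (score 4, 2 outbound links) and page Q (score 6, 3 outbound links).
score = simplified_page_rank(1.0, [(4.0, 2), (6.0, 3)])
print(score)  # 1 + 4/2 + 6/3 = 5.0
```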

Which means:

• The more web pages link towards your web page, the higher the page rank score for your page.
• The higher the page rank score of pages linking towards your web page, the higher the page rank score for your page.
• If a web page has a high page rank score but also has hundreds of outbound links, it will not pass on much page rank score to your page. (This minimises the impact of “link sharing” used to artificially boost a page rank score.)
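These effects can be observed by computing scores for a small set of pages. Because every score depends on the scores of other pages, the calculation is usually repeated until the values settle. The sketch below uses the classic published formulation of PageRank with a "damping factor" of 0.85; the damping factor, the number of iterations, the page names and the function name page_rank are assumptions for this example and are not taken from the simplified formula above.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Iteratively estimate scores for a small link graph.

    links maps each page to the list of pages it links to; every linked
    page is assumed to appear as a key in links as well.
    """
    pages = list(links)
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}  # start with equal scores
    for _ in range(iterations):
        # Every page receives a small base score...
        new_ranks = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            if targets:
                # ...plus an equal share of the score of each page linking to it.
                share = damping * ranks[page] / len(targets)
                for target in targets:
                    new_ranks[target] += share
            else:
                # A page with no outbound links spreads its score over all pages.
                share = damping * ranks[page] / n
                for other in pages:
                    new_ranks[other] += share
        ranks = new_ranks
    return ranks

# Hypothetical graph: A, B and D all link to C, and C links back to A.
LINKS = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = page_rank(LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this example C ends up with the highest score because three pages link to it, and A comes second even though only one page links to it, because that one link comes from the highly ranked page C.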

The following diagram shows how the page rank score of pages A to G is calculated. Each arrow on this diagram represents a hyperlink from one page to another page.

Nowadays, the ranking algorithm used by Google is based on a more complex formula that takes a wide range of other factors into consideration, such as:

• Is your website layout responsive / mobile friendly?
• Is your website secure? (e.g. using the HTTPS protocol)
• Is your website regularly updated?
• How long does it take for your web page to load on average?
• etc.

SEO: Search Engine Optimisation

SEO refers to a range of techniques used to increase your visibility on major search engines. When designing and editing websites, web designers and web authors try to:

• Use keyword-rich web addresses (domain names, file names for HTML pages and picture files).
• Use meta tags in each HTML page (in the head section, invisible to the user but used by search engines).
• Use a lot of relevant keywords in the text of the web pages.
• Increase their page rank score by getting other websites to include hyperlinks to their website:
   • Registering their website on business directories.
   • Offering link exchange/sharing with other websites.
   • Sharing links towards their website on social networks.
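To illustrate how meta tags, which are invisible to visitors, can still be read by software, here is a minimal sketch that extracts them from a page using Python's standard html.parser module. The sample HTML, the class name MetaTagParser and the tag contents are made up for this example; how real search engines actually use these tags is not covered here.

```python
from html.parser import HTMLParser

class MetaTagParser(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            name = attributes.get("name")
            content = attributes.get("content")
            if name and content is not None:
                self.meta[name.lower()] = content

# Hypothetical page: the title, description and keywords are invented.
HTML = """<html><head>
<title>Handmade Oak Tables</title>
<meta name="description" content="Handmade oak tables built to order.">
<meta name="keywords" content="oak, tables, furniture">
</head><body><p>Welcome</p></body></html>"""

parser = MetaTagParser()
parser.feed(HTML)
print(parser.meta["description"])  # Handmade oak tables built to order.
print(parser.meta["keywords"])     # oak, tables, furniture
```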