Typically, a special crawler program visits a website and reads each page’s source code; this process is referred to as ‘spidering’ or ‘crawling’. The page is then compressed and stored in the search engine’s repository, known as the index; this stage is called ‘indexing’. Finally, when a query is submitted to the search engine, it retrieves matching pages from the index and ranks them among the other results found for that query; this is known as ‘ranking’.
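The three stages can be illustrated with a toy sketch. The “web” below is a small in-memory dictionary of made-up pages, and ranking is done by simple term counting; a real engine fetches pages over HTTP and uses far more sophisticated scoring.

```python
# Toy sketch of the crawl -> index -> rank pipeline described above.
# PAGES stands in for the web; the page names and text are invented.
from collections import defaultdict

PAGES = {
    "a.html": "perth seo audit guide",
    "b.html": "seo ranking factors and links",
    "c.html": "gardening tips for perth",
}

def crawl_and_index(pages):
    """Build an inverted index: word -> set of pages containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.split():
            index[word].add(url)
    return index

def rank(index, query):
    """Score each page by how many query terms it contains."""
    scores = defaultdict(int)
    for word in query.split():
        for url in index.get(word, ()):
            scores[url] += 1
    return sorted(scores, key=scores.get, reverse=True)

index = crawl_and_index(PAGES)
print(rank(index, "perth seo"))  # a.html ranks first: it matches both terms
```

The inverted index is the key data structure here: it lets the engine answer a query by looking up words rather than re-reading every page.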
Usually, crawler-based search engines consider many factors beyond what they find on the page itself. Before putting a page into its index, a crawler looks at how many other pages in the index link to that page, the text used in those links, the linking pages’ PageRank, and whether the page can be found in directories under similar categories. Such off-page factors are vital considerations when a crawler-based engine evaluates a page.
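PageRank, one of the off-page factors mentioned above, can be sketched in a few lines: pages that many other pages link to accumulate a higher score. The link graph below is invented for illustration, and the damping factor of 0.85 is the value given in the original PageRank paper; Google’s production algorithm is far more elaborate.

```python
# Minimal PageRank via power iteration. links maps each page to the
# list of pages it links to; the graph here is purely illustrative.
def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {}
        for p in pages:
            # Each page q passes an equal share of its rank to every
            # page it links to; sum the shares flowing into p.
            incoming = sum(rank[q] / len(links[q])
                           for q in pages if p in links[q])
            new[p] = (1 - damping) / n + damping * incoming
        rank = new
    return rank

# "home" receives a link from every other page, so it scores highest.
links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home"],
}
scores = pagerank(links)
print(max(scores, key=scores.get))  # home
```

The point for SEO is visible even in this toy: a page’s score depends on who links to it, not on anything written in its own HTML.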
Theoretically, someone can artificially increase a page’s relevance for certain keywords by adjusting the corresponding areas of its HTML code. However, they have far less control over which other pages on the internet link to theirs. Off-page factors are therefore much harder to manipulate, which makes them a more trustworthy measure of relevance in the crawler’s eyes.
From here, we look at the main spider-based search engines and how they index and rank a site. Although this step does not deal with search engine optimisation itself, it provides information on how each search engine looks at web pages.
Google is the leading search engine among giants such as Bing and Yahoo. It holds a search share of more than 60% and indexes billions of pages, enabling users to search for any information they desire. It also builds tools and services such as web applications, business solutions and advertising networks in order to hold on to its top position. By submitting a site to Google via https://www.google.com/webmasters/tools/submit-url-ac, its owner can get it indexed within one or two months. Alternatively, a website owner can sign in to a Google account and submit a sitemap of the website via Google Webmaster Tools.
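A sitemap of the kind submitted via Google Webmaster Tools is just an XML file listing the URLs you want crawled. A minimal one can be generated as follows; the domain and paths are placeholders for your own pages.

```python
# Build a minimal sitemap following the sitemaps.org XML format.
# The example URLs are placeholders, not real pages.
from xml.sax.saxutils import escape

def build_sitemap(urls):
    entries = "\n".join(
        f"  <url><loc>{escape(u)}</loc></url>" for u in urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</urlset>"
    )

print(build_sitemap([
    "https://www.example.com/",
    "https://www.example.com/about",
]))
```

The full sitemap format also supports optional tags such as last-modified dates and change frequencies, but the bare list of `<loc>` entries above is enough for a crawler to discover the pages.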
It is worth bearing in mind that Google can ignore a submission request for some time. Even if the engine crawls the site, it may not index it if no links point to it. However, if Google finds a website by following links from other already-indexed pages that are spidered regularly, the site can be included without any submission at all. The chances of this happening are much higher if Google finds the site via a directory listing. Submitting a website can therefore help, but the best way to get indexed is through links.
In the past, Google typically performed monthly updates: a deep web crawl at the start of the month, a calculation of PageRank for the retrieved pages roughly two weeks later, and an update of the index database at the end of the month. Nowadays, the search engine has switched to a model of incremental daily updates, sometimes known as ‘everflux’. In June 2013, Matt Cutts announced that the Panda algorithm would be updated each month, with each update rolled out slowly over the course of the month: Google now runs the update on a certain day and then pushes out its impact over the following ten or so days.
Want some insight into your web page’s on-page SEO factors? Click here to go to the SEO Perth Audit Tool.