Search Engine Basics

Total time: 2h 55m | Lessons: 8 | Rating: 4.3 / 5

Introduction to Search Engines

Time required: 15m
Teacher: Paddy Moogan

In this lesson we will cover:

  • What the World Wide Web consists of;
  • History and purpose of search engines;
  • Brief overview of crawling, indexing and ranking;
  • Recent search engine developments.

To become an effective SEO, you need to understand not only the principles of on-page and off-page factors, but also users and how the Internet works.  This understanding gives you a firm grounding in how search works and how users interact with the web.

What does the World Wide Web consist of?

The web is fundamentally a collection of pages and files which are interlinked by a complex set of hyperlinks.  These hyperlinks allow users and search engines to navigate their way around the web to discover new content.  Before search engines existed, the only way to find your way around the web was to type in the exact address of the page you wanted or to click on hyperlinks which would lead you to different pages.
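
To make the idea of interlinked pages concrete, here is a minimal Python sketch of how a program might pull the hyperlinks out of a page.  It uses only the standard library; the sample HTML and the example.com addresses are made up for illustration, and real crawlers handle far more cases than this.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the target of every <a href="..."> found on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own address.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute addresses of all hyperlinks in `html`."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content, used only for illustration.
SAMPLE_HTML = """
<p>Read our <a href="/about">about page</a> or visit
<a href="https://example.org/news">the news site</a>.</p>
"""

print(extract_links(SAMPLE_HTML, "https://example.com/index.html"))
```

Each address returned is a place a user could click through to, or a search engine could visit next.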

The files can be a number of things including:

  • Images
  • Videos
  • PDFs
  • Flash videos
  • JavaScript

These files can be used to improve web pages so that they are more than just plain text.  In the early days of the Internet, these files were very hard for the search engines to crawl, let alone understand and index.  This was partly because crawling technology was not very advanced, but also because crawling and indexing files other than plain text and images required a lot of resources that the search engines simply did not have or could not afford at the time.

As search engines' resources and technology have improved, and with the introduction of high speed internet connections, web pages have become far richer in the types of content they can provide.

Search engines can still struggle, however, with crawling and indexing certain types of content.  They are improving all the time and have made significant progress, but right now, you need to be aware of what they struggle with.  We cover this in more detail below in the “Potential Problems for Search Engines” section.

As an SEO, you need to be aware of these file types so that you can enrich the websites you work with, which can ultimately give users a better experience.  At the same time, you need to be aware of how these file types can cause problems.

History and Purpose of Search Engines

The World Wide Web has been mainstream for around 20 years, and has grown phenomenally in that time. In the early 1990s when the web was young, it was far more difficult for an average web user to create their own website. Websites were mainly hosted by tech savvy companies or hobbyists.

In those days, there was no such thing as a ‘search engine’; websites were discovered by word of mouth, or through one of the few ‘What’s new on the web?’ type pages that listed new sites. This was not very efficient to begin with, and as the web grew over the next couple of years it became clear that a solution was needed.

During 1993 and 1994 the first web search engines sprang up, followed over the next couple of years by many commercial engines, including Excite, AltaVista and Yahoo!. The number of webpages and users had grown to the point where discovering the content you were looking for was simply no longer manageable via a centralised list.

Google itself started in 1996 and was called BackRub when Larry Page and Sergey Brin began working on it.  Google was the first search engine to realise the power and potential of hyperlinks as a signal of trust and authority.  Page and Brin talked about this in depth in their university paper released in 1997.  Shortly after, PageRank was born and pushed Google ahead of its competitors on both the relevancy and quality of its results.
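
To give a feel for how hyperlinks can act as a signal of authority, here is a heavily simplified, textbook-style sketch of a PageRank-like calculation in Python.  It is not Google's actual implementation: the four-page web is hypothetical, and the damping value of 0.85 and the number of iterations are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by the links pointing at them.

    `links` maps each page to the list of pages it links to. Every
    linked-to page must also appear as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page shares its score equally among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web, used only for illustration.
TOY_WEB = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}

scores = pagerank(TOY_WEB)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy web, "home" receives the most links and ends up with the highest score, while "orphan", which nothing links to, ends up with the lowest.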

The World Wide Web now consists of billions of web pages, and search engines are a daily part of most people’s lives.

For a truly in-depth history of search engines, technically dating back to 1945, we’d recommend taking a look at Search Engine History.

The 3 Steps of Search Engines: Crawling, Indexing and Ranking

There are 3 main areas to understand when looking at search engines: Crawling, Indexing, and Ranking. We are going to look at these in detail over the course of this module, but they can be summarised as follows:

  • Crawling - This is the process that search engines use to discover new content. They have sophisticated programs that visit web pages and follow the links on them to find new pages.
  • Indexing - The search engines maintain a copy of the content of all web pages they have visited. This index is stored on a large collection of computers, in such a manner that it can be searched through very rapidly.
  • Ranking - This is the area of search engines that SEO is most concerned with. When a user performs a search on any search engine, the engine needs a ‘recipe’ (known as an algorithm) it can use to evaluate the pages in its index to determine which are most relevant, and thus determine in which position (rank) they are returned to the user.
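
The three steps above can be sketched in a few lines of Python.  This is a toy model under stated assumptions, not how any real search engine works: the "web" is a small made-up dictionary of pages held in memory, the index simply counts words, and the ranking 'recipe' is nothing more than how often the query words appear on each page.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": address -> (page text, outgoing links).
WEB = {
    "/home": ("welcome to our cookie shop", ["/recipes", "/shops"]),
    "/recipes": ("chocolate cookie recipes and more chocolate ideas", ["/home"]),
    "/shops": ("find cookie shops in london", ["/home", "/recipes"]),
    "/hidden": ("a page nobody links to", []),
}


def crawl(start):
    """Crawling: visit pages breadth-first by following their links."""
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        text, links = WEB[url]
        yield url, text
        for link in links:
            if link in WEB and link not in seen:
                seen.add(link)
                queue.append(link)


def build_index(pages):
    """Indexing: record, for each word, the pages it appears on and how often."""
    index = defaultdict(dict)
    for url, text in pages:
        for word in text.split():
            index[word][url] = index[word].get(url, 0) + 1
    return index


def search(index, query):
    """Ranking: order pages by how many times they contain the query words."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically by address.
    return sorted(scores, key=lambda url: (-scores[url], url))


index = build_index(crawl("/home"))
print(search(index, "chocolate cookie"))
```

Notice that "/hidden" never makes it into the index: no page links to it, so the crawler never discovers it.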

To help give you a bit more of a basic introduction to this process, here is a useful video from Google which explains it quite well.  Whilst we don’t always feel that videos from Google are good for practitioners, they do sometimes produce good videos like this one which give great insight into how they think:

Recent Search Engine Developments

For many years search engines determined which pages were most relevant for a given query based solely on the content of those pages, and how other pages on the web referred to them. All of the information that the search engines examined to make the determination of relevancy was encapsulated within the web itself.

Anyone searching for a specific word or search phrase would get the same results as everyone else who searched from within the same country.

However, over the last few years this has changed in two important ways:

  • Social Networking - Sites such as Facebook and Twitter provide the search engines with important clues about which webpages people are talking about, or have shared with each other. These clues (we call them signals) give the search engines additional information, allowing them to change the ‘recipe’ for determining a site’s ranking.
  • Personalised Search - Similarly, the search engines have been able to use a specific user’s social network usage, and their previous searches, to determine what is more important to them personally. This means that different users searching for the same search phrase might now see somewhat different results.

There have also been other major developments over recent years which have changed the way that people search.  Google in particular has become much more advanced at using machine learning and user data to predict the best results for a given query.  There are two Google features which demonstrate this ability and show the advances they have made:

  • Google Suggest - launched in August 2008, Google Suggest uses advanced algorithms and machine learning to predict what you may be searching for.  As you start typing your query, Google suggests keywords for you.  This allows you to refine your query as you go and get ideas for what you may want.
  • Google Instant - launched in September 2010, Google Instant significantly changed how people search by creating dynamic results as the user typed their query.  The results would update “live” without the user even having to press enter.
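
As a rough illustration of the suggest-as-you-type idea, here is a minimal Python sketch.  It is not how Google Suggest works: the real feature uses machine learning, whereas this sketch only matches the typed prefix against a small, made-up list of past queries and orders the matches by invented popularity counts.

```python
from bisect import bisect_left

# Hypothetical past queries with made-up popularity counts.
QUERY_COUNTS = {
    "chocolate cake": 120,
    "chocolate cookies": 300,
    "chocolate cookie recipe": 180,
    "chocolate shops london": 40,
    "cheap flights": 500,
}

# Keeping the queries sorted lets us jump straight to the first possible match.
SORTED_QUERIES = sorted(QUERY_COUNTS)


def suggest(prefix, limit=3):
    """Return up to `limit` stored queries that start with `prefix`, most popular first."""
    prefix = prefix.lower()
    start = bisect_left(SORTED_QUERIES, prefix)
    matches = []
    for query in SORTED_QUERIES[start:]:
        if not query.startswith(prefix):
            break  # sorted order means no later query can match either
        matches.append(query)
    matches.sort(key=lambda q: -QUERY_COUNTS[q])
    return matches[:limit]


print(suggest("choc"))
print(suggest("chocolate s"))
```

Typing more characters narrows the list, which mirrors how the suggestions change as a user refines their query.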

As an SEO, it is important not only to be aware of these developments, but also to understand how they affect your work.  In particular you need to figure out how they may change the way people search and the types of keywords they use.

Lesson action:

  • Go to Google and refine a search query using Google Instant.  Start with a query where you are looking for a product or a service, then change the query to one where you’re looking for a local business.  Notice how the search results change.

Here are some examples of what you should be looking for.

A search for “chocolate cookies” shows me image results and some recipes:

Now if I change my search to “london chocolate cookie shops” I see local results and a map:

Using a Search Engine

Time required: 15m
Teacher: Paddy Moogan

Advanced searches

Time required: 20m
Teacher: Paddy Moogan

Crawling

Time required: 15m
Teacher: Paddy Moogan

Indexing

Time required: 25m
Teacher: Paddy Moogan

Ranking

Time required: 15m
Teacher: Paddy Moogan

Q&A: Test your Knowledge

Time required: 10m
Teacher: Paddy Moogan

Curated Resources

Time required: 1 hour
Teacher: Paddy Moogan