What is indexing and how to improve it. Number of duplicates and search results

  • Tutorial

We recently completed an SEO course at Netology and put together a checklist of what must be done on the “I love individual entrepreneur” (iloveip.ru) website for search engine optimization. These tips, however, are universal for any project. In this article you will find a list of practical recommendations from the entire course of 13 lectures by 8 different specialists, as well as useful links and services that will help you improve your website's SEO.


Who is this article for:

  • web designers and developers who want to create websites that are natively optimized for SEO,
  • owners of Internet resources who want to understand SEO on their own in order to increase search traffic.

Disclaimer: these tips are unlikely to get you to the top for high-frequency queries, but they will let you fix all the technical and textual errors on the site and improve your rankings.


SEO work mainly consists of 5 stages:

  1. Technical audit of the site.
  2. Audit of commercial factors.
  3. Selection of the semantic core.
  4. Internal and external website optimization.
  5. Link building.

Technical audit of the site

1) Check if all pages of the site are in search:

  • by the number of search results (in Google using site:site.ru, in Yandex using host:site.ru),
  • in Yandex.Webmaster (Indexing → Pages in search) or in Google Search Console (Google Index → Indexing status).

2) Check for duplicates on the site. Duplicates are pages with the same content but different URLs. Duplicates can be complete (the content matches 100%) or partial (a high percentage of the content matches). Duplicate pages must be deleted.


3) Check for blank pages (those with no content). Blank pages can be:

  • deleted,
  • closed from indexing (in the robots.txt file, see below),
  • filled with content.

4) Check for junk pages (those that do not contain useful content). Junk pages can be:

  • closed from indexing,
  • made useful.

5) Check for the robots.txt file. This is a text file in the root directory of the site that contains special instructions for search robots. For more details, see the Yandex and Google help. The file size must not exceed 32 KB.


6) In the robots.txt file you can specify general rules for all search engines and separate rules for Yandex. The rules for Yandex must additionally include the Host directive (the main mirror of your site, with or without www) and the Sitemap directive with a link to your sitemap. You can check the robots.txt file in Yandex.Webmaster.


User-agent: *
Disallow: /contacts/

User-agent: Yandex
Disallow: /contacts/
Host: www.iloveip.ru
Sitemap: http://www.iloveip.ru/sitemap.xml

Example robots.txt file from our website


7) Check for the presence of the sitemap.xml file. This is an analogue of a site map designed specifically for search robots. For more details, see the Yandex and Google help. You can create a sitemap using this link. You can check the sitemap.xml file in Yandex.Webmaster.
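
For reference, a minimal sitemap.xml sketch (the URL and date are hypothetical placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <url> entry per page that search robots should crawl -->
  <url>
    <loc>http://www.site.ru/contacts/</loc>
    <lastmod>2018-01-15</lastmod>
  </url>
</urlset>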


8) Check for broken links (links to non-existent or inaccessible pages). All broken links, both external and internal, must be removed. You can check for broken links in the Screaming Frog SEO Spider tool (downloaded to your computer; there is a free version) or online using the Technical analysis tool from SeoWizard (a paid service). You can also check for broken links in Yandex.Webmaster: Indexing → Crawl statistics (see HTTP code 404).


9) Check for redirects on the site. Types of redirects:

  • 301 - the requested document has been permanently moved to a new URL,
  • 302 - the requested document is temporarily available at a different URL.

It is better not to abuse redirects: links on site pages that lead to pages with a redirect contribute to the “loss” of link juice.
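
For example, in Apache both types of redirect can be set in the .htaccess file with the Redirect directive (a minimal sketch; the paths are hypothetical):

# Permanent move (301)
Redirect 301 /old-page/ http://www.site.ru/new-page/

# Temporary move (302)
Redirect 302 /sale/ http://www.site.ru/new-year-sale/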


You can check redirects in Yandex.Webmaster: Indexing → Crawl statistics (see the pages' HTTP codes).


10) Check the site loading speed (it should be less than 3 seconds). This is one of the important factors affecting how search engines rank a site. You can check it using Google PageSpeed or in Google Search Console (Crawling → Crawl statistics → Time taken to load a page).


11) Set up a 404 error for deleted or non-existent pages. This can be done in the .htaccess file. For more details, see the Yandex help.
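
For example, in an Apache .htaccess file (a minimal sketch; the path to the error page is a hypothetical placeholder):

# Serve a custom page with the 404 status for missing URLs
ErrorDocument 404 /404.html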


12) Check the server responses and the .htaccess file. The most common mistakes:

  • Both versions of the site, with and without www (for example, site.ru and www.site.ru), are available. This hurts indexing: the search engine tries to exclude duplicates and may choose as the original a completely different page than the one you are promoting.
  • There are no redirects for pages with or without a “/” at the end. If pages both with and without a trailing slash return a 200 server response (the page is available), then once they get into the search engine index they are complete duplicates of each other. Both problems are fixed with 301 redirects, as in the .htaccess sketch after this list.
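
A minimal .htaccess sketch for both redirects, assuming Apache with mod_rewrite enabled (site.ru stands in for your domain; adapt the rules to your main mirror):

RewriteEngine On

# Redirect the non-www version to the www main mirror (301, permanent)
RewriteCond %{HTTP_HOST} ^site\.ru$ [NC]
RewriteRule ^(.*)$ http://www.site.ru/$1 [R=301,L]

# Add a trailing slash to URLs that are not files and lack one
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !/$
RewriteRule ^(.*)$ http://www.site.ru/$1/ [R=301,L]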

13) Check that URLs are correctly formed. Non-leaf pages (sections, subsections) should contain a “/” at the end of the URL, and end pages (product pages, articles) should not. It is recommended to apply this format only to new pages, since changing the URLs of old pages will cause those documents to lose their age.


14) Try to use human-readable (“friendly”) URLs. An example of a non-friendly URL: yoursite.net/viewpage.php?page_id=23. Basic recommendations:

  • you can use foreign words (/contacts/) or transliteration (/kontakty/),
  • use a hyphen “-” as a separator between words,
  • there should be no more than 2-3 words between the “/” delimiters in a URL,
  • the URL length should not exceed the competitor average.

15) Maintain folder hierarchy in the URL. For example:


site.ru/section-name/subsection-name/final-page


This will help Yandex create navigation chains and reflect them in your site's snippet in the search results. For more details, see the Yandex help.


16) Check how the site displays on mobile devices. This can be done in Yandex.Webmaster (Tools → Mobile page check) or in Google Search Console.


17) Specify the page encoding with meta charset="utf-8" in the head.


18) Check the presence and uniqueness of the title, description, and h1 tags on each page.


19) The title tag should be as close to the beginning of the head as possible.


20) Try to include all keywords in the title tag, with the most popular keyword closer to the beginning of the tag.


21) The maximum length of the title tag is 150 characters; the optimum is 60 characters.


22) Identical words should not be repeated in the title tag (at most 2 times); you can use synonyms, related words, or other words from queries. For example: A bank loan secured by a room. Get a loan secured by a room in Moscow.


23) To separate different phrases in the title tag (for example, the page name and the site name), use the “|” symbol.


24) The description tag does not directly affect site ranking, but search engines can use its content for the site's snippet in the search results.


25) The description can be from 100 to 250 characters long; the optimum is 155 characters. This is usually one or two meaningful sentences describing the page and including search queries.


26) Specify Open Graph protocol metadata in the head so the site is correctly represented on social networks.


27) Add a favicon to the site's root directory.


28) Styles and scripts should be loaded in the head as separate files.
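
Putting items 17-28 together, here is a sketch of a page head (all values are hypothetical placeholders):

<head>
  <meta charset="utf-8">
  <title>Loan secured by a room | Bank Name</title>
  <meta name="description" content="Get a bank loan secured by a room in Moscow. Favorable terms, quick approval.">
  <!-- Open Graph metadata for social networks -->
  <meta property="og:title" content="Loan secured by a room">
  <meta property="og:description" content="Get a bank loan secured by a room in Moscow.">
  <meta property="og:url" content="http://www.site.ru/loans/room/">
  <meta property="og:image" content="http://www.site.ru/images/room-loan.jpg">
  <!-- Favicon from the root directory -->
  <link rel="icon" href="/favicon.ico">
  <!-- Styles and scripts as separate files -->
  <link rel="stylesheet" href="/css/style.css">
  <script src="/js/main.js"></script>
</head>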


29) There can only be one h1 heading per page.


30) The h1 heading should not copy the title.


31) The h1 heading can contain from 1 to 7 words and should include an exact occurrence of the main search query. For example: Loan secured by a room.


32) Avoid nested tags inside the h1 tag (e.g. span, em, a href, etc.).


33) Maintain the h2-h6 heading sequence and include other keywords in them. The h2-h6 tags should be used only to mark up SEO texts.


34) Use semantic layout (for paragraphs use p, not div), and try to include keywords in lists, tables, images (the alt and title attributes), and emphasis (em, strong).


35) The alt and title attributes of an image should be different. Alt is the alternative text shown if the image does not load. Title is the image's caption, which pops up when you hover over the image and can also appear in search.
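
A short sketch of such semantic markup (the text and file names are hypothetical):

<h1>Loan secured by a room</h1>
<p>A bank loan secured by a room is a way to <strong>quickly get money</strong> against real estate.</p>
<h2>Loan terms</h2>
<ul>
  <li>amount of up to 3 million rubles,</li>
  <li>term of up to 10 years.</li>
</ul>
<!-- alt and title differ, as item 35 recommends -->
<img src="/images/room.jpg" alt="Room offered as loan collateral" title="Loan secured by a room in Moscow">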


36) Add Schema.org micro markup to the site. There is a micro markup validator in Yandex.Webmaster.
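
For example, minimal Schema.org markup for an organization in microdata format (the company details are hypothetical):

<div itemscope itemtype="http://schema.org/Organization">
  <span itemprop="name">Company Name LLC</span>
  Phone: <span itemprop="telephone">+7 (495) 000-00-00</span>
  Address: <span itemprop="address">Moscow, Example St., 1</span>
</div>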


37) If you are planning to move your site to https, read this.

Audit of commercial factors

38) Commercial factors are important for commercial sites.


39) The site should contain contacts:

  • phone numbers,
  • an online consultant,
  • a callback request,
  • address and directions,
  • working hours.

40) Place legal information on the website:

  • contract offer, terms of service,
  • company details,
  • exchange/return terms,
  • delivery conditions.

41) Place an assortment on the website:

  • price list,
  • quantity of goods in stock,
  • discounts, promotions.

42) Add information that inspires confidence:

  • reviews,
  • portfolio (examples of work),
  • video,
  • vacancies.

43) Use email on your own domain (for example, [email protected]).


44) If news is published on the site, make sure it is updated.


45) Indicate the current year in the copyright (©) notice.


46) Strive to keep your website design modern and mobile-friendly.

Selection of semantic core

Selecting a semantic core is a big topic that deserves a separate article. Here we will focus on the basic principles.


47) Before selecting the semantic core, you need to understand what types of user queries exist and which queries you will use to promote the site:

  • Navigation (brand) queries - the user searches for a specific site or place on the Internet. For such queries the site is usually in first place anyway, and promotion is not necessary.
  • Informational queries - the user searches for information regardless of the site (for example, how to treat a cold).
  • Transactional queries - the user wants to perform some action (“download”, “buy”, etc.). Commercial queries are always transactional, but not all transactional queries are commercial (for example, “download for free”).

Commercial pages (online stores, company websites) should be promoted with commercial queries; informational pages (forums, blogs, articles) with informational queries.


48) You can determine whether a query is informational or transactional using a search engine. Enter the phrase and look at the search results. If they are mainly informational articles, the query is informational; if they are commercial pages, it is commercial.

Site indexing is the process by which search robots find, collect, process, and add information about a web resource to the search engine's database.

A search index is a search engine's database that stores all the information search robots have found on indexed sites.

Explanation of the terms “site indexing” and “search index”

Indexing a web resource means that bots visit its pages, analyze the content they contain, and add it to the database. This is done so that users can later find information on the resource through key queries in search engines.

Simply put, the user goes to a search engine, enters a query in the search bar, and in response receives a list of web pages that search robots have indexed.

Indexing is a mandatory part of how search engines operate. A specialized database is created for this purpose, and the search results are generated from it.

The search index of any site depends directly on its content, external and internal links, and the availability of images, graphs, and other materials. By entering a query in the search bar, the Internet user accesses the index. The results are then ranked based on this data: a list of pages ordered by decreasing relevance to the query.

Imagine that the World Wide Web is a large library. It needs a catalogue that makes searching for the necessary materials much simpler. All books in the library have their own code, and the codes are grouped by topic, section, and other parameters.

When a person comes to the library and asks for a book on a certain topic (makes a request), the librarian goes to the required section, takes out all the books that match, and selects the most suitable one for the reader.

Search engines work on a similar principle: the user makes a query, the search engine retrieves all relevant pages and displays the most relevant ones.

On a note: at the end of the last century, indexing worked precisely on this cataloguing principle - bots searched resources for the keywords that made up the database. Nowadays, in addition to keywords, robots take into account many other content parameters, including uniqueness, informativeness, literacy, and much more. This is what modern indexing is based on.

Year after year, search algorithms become more refined and the database fills with more and more additional information, while search becomes much easier and more relevant for users.

How do Yandex and Google index sites?

There are two types of robots involved in indexing:

  1. Basic robots, which study the content on the pages of the Internet resource.
  2. Fast robots, which analyze and index new materials added after a site update.

For a web resource to be indexed by the most popular search engines, the webmaster needs to tell them about the project:

  • Add the site for indexing by filling out the search engine's special form in services such as Google Webmaster, Yandex.Webmaster, etc. This method is slow - from two weeks or longer - because the project joins a queue.
  • Submit the resource for indexing by posting links on other websites. This method is the most effective, because bots consider pages found this way useful and index them much faster - within two weeks, and if you're lucky, even within 12 hours.

In most cases, new sites and pages are indexed within 1-2 weeks. Many people note that Google includes Internet resources in its index much faster - in just a few days. This is because it indexes pages with both high-quality and low-quality material, but only useful content gets ranked.

In Yandex a similar process is slower, but only informative and useful pages make it into the index, and the garbage is weeded out immediately.

Indexing Internet sites takes place in 3 stages:

  1. The robot finds a resource and studies the information contained in it.
  2. Adds the found material to the database.
  3. After 1-2 weeks, the information that has successfully passed indexing is included in the search engine results.

How to check indexing in Google and Yandex

You can check whether a site or page has been indexed in Yandex or Google in 3 ways:

  1. Using the tools at webmaster.yandex.ru or google.com/webmasters. In Yandex, go to “Site Indexing” and then “Pages in Search”. In Google, open Search Console, select the “Google Index” section, and find the necessary data in the “Status” menu.
  2. Through browser plugins. The most popular today is RDS Bar.
  3. By entering the command site:domain.ru in the search bar.

How to make indexing faster?

Naturally, any webmaster wants robots to index their site as soon as possible, because this determines how quickly its material appears in search results and attracts new visitors. To speed up indexing, follow these recommendations:

  • Add the project to the search engine.
  • Constantly update the site with new content that is unique, informative, and useful to the target audience.
  • Place the project on reliable and high-speed hosting.
  • Create convenient navigation: every page should be reachable in no more than 3 clicks from the main page.
  • Correctly configure the robots.txt file, namely: block indexing of service pages and remove unnecessary restrictions.
  • Check the number of keywords, eliminate errors in the source code.
  • Provide internal linking (connect the website's pages with links).
  • Create a site map. You can even make a sitemap separately for robots and for visitors.
  • Post links to portal articles on social networks.

How to block a resource from indexing?


There are times when you need to block search engine robots from accessing a project or its individual pages, parts of text, or images. As a rule, site owners do this when they want to hide information from public access, or to hide sites under development, technical pages, duplicate pages, and so on. You can do it as follows:

1. Using the robots.txt file.
Create a text document named robots.txt in the root of the site and write in it the rules for search engines, which consist of two parts. The first (User-agent) tells which search engine should follow the commands, and the second (Disallow) prohibits indexing of certain material. To prevent all search engines' robots from indexing the entire resource, write the command:

User-agent: *
Disallow: /

2. Through a meta tag.
This method is better suited for preventing a single page from being indexed. The noindex and nofollow tags allow you to disable indexing of a separate page or a fragment of text for robots of all search engines. They are written in the code of the specific page you want to close from indexing.

Command to disable indexing of the entire document:
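
A standard robots meta tag, placed in the page's head, does this (a minimal sketch):

<meta name="robots" content="noindex, nofollow">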

Prohibition for a specific search engine robot:
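
Here the name attribute targets a specific crawler; for example, for Yandex's robot (googlebot works the same way for Google):

<meta name="yandex" content="noindex">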

The role of indexing in website promotion

Without indexing, web resources would not appear in search engines. Regularly updating a website with quality content leads to more frequent visits by search bots, faster indexing, higher positions in search results, and an influx of traffic.

In addition to content quality, search robots also take traffic and visitor behavior on the site into account when assessing its usefulness for ranking. Indexing is therefore one of the most important processes in the SEO promotion of Internet resources.

And in order for information to be successfully entered into the search engine database, bots must make sure that the materials contained on the site are useful to visitors.

Conclusion

Site indexing is the process of collecting information from Internet resources and placing it in a search engine's database, and the search index is that database itself, containing all the materials from the sites.

Without indexing the site and getting it into the index, it is impossible to promote the project, attract traffic, and, accordingly, earn income from it. From the moment a site gets into the index, its age starts counting - and the longer a document has been in the index, the better it is ranked.

I found 16 more new ways to speed up site indexing.

The point of speeding up indexing is to attract search robots (spiders) to the site. The main thing is to know where these spiders are found :)

1. Questions and Answers Services

Lately, question-and-answer services have become very popular. They attract a large number of visitors, their content is constantly updated, and search spiders practically live there.

The point is not to spam links to your site in every comment. It is better to make one meaningful comment than several meaningless ones. Find a category related to your site's topic and read it for a while. Then find a question you are competent in, and help the person asking with meaningful and useful advice, appropriately including a link to your website.

Over time new services will appear, so from time to time type the query “questions and answers” into a search engine to find them.

2. Social networks for bloggers

There are still few social networks for bloggers on the RuNet, but they are well indexed by search engines and also provide direct links to your site. Study the materials on these sites, and if your site's topic matches the topics published there, post 1-2 announcements of your most interesting materials. Then from time to time you can add announcements of your best articles.

There used to be a good service, BlogParad.ru, but it has not been working recently.

3. Comments in popular blogs

Search engine spiders often visit popular blogs. A comment on such a blog can effectively attract search spiders to your site. You do, of course, need to write an intelligent comment. Do not put a link in the comment text - in many cases this will get the comment deleted. When adding a comment, it is enough to indicate your website address in the appropriate field. Write your name as a name, not as a keyword.

How to find popular blogs? Very simple. Go to the Yandex blog rating and you will find many popular blogs.

4. Posts and comments in popular communities

Communities differ from blogs in that you can join them and publish your own notes. You can also leave comments in communities, just like on regular blogs. Under no circumstances should you spam a community - you will be promptly removed and your access denied.

Popular communities have tens of thousands of participants. In order to find popular communities, go to the Yandex community rating.

Most communities are hosted on the blogging services LiveJournal, Liveinternet, and Blogs.Mail.ru. To join communities, you must first register with these services. After that, you will have a blog at your disposal.

By the way, it is not at all necessary to publish a link to your site in the community itself. It is enough to write several posts on your fresh free blog and put a link to your website in them. Then you post a note or comment in the community without any links. The search spider will follow your profile link, index your free blog, and then your website.

5. Ordering posts on the Blogun service

Another way to speed up site indexing is to order paid posts on blogs through the Blogun service. There you will find many blogs on any topic whose authors are ready to write posts about your site in various formats.

Blogs are indexed very quickly by search engines, so placing links to your site in them is very effective for speeding up indexing.

Another important advantage of Blogun, which I really like, is that you pay for a link to your site only once, and it remains for the life of the blog. You don't have to pay for it monthly, as on other link exchanges.


6. Ping services

After you have organized a blog on your site, go in the admin panel to

Options - Writing - Update Services
(Settings-Writing-Update Services)

and at the very bottom you will find a list of ping services.

Then copy my list and paste it into that window. Now, when you publish a post on your blog, all these services will receive a signal that you have new material, and search spiders will soon visit your blog and site.

7. Add your site's RSS feed to RSS aggregators

RSS aggregators are sites that publish announcements of materials from various sites using their RSS feeds. RSS aggregators are well indexed by search engines, largely because they have a lot of constantly updated content - namely, the announcements from RSS feeds.

You can add your website's RSS feed to RSS aggregators for free. If your site does not have an RSS feed, then organize a blog. All blogs have an RSS feed, so you can add it to RSS aggregators.

8. Redirect your site's RSS feed via Feedburner

9-10. Links in forum signatures

According to my observations, a link in the signature on the Searchengines.ru optimizers' forum helps search engines index a new site within 1-4 days. A link in the signature on the English-language DigitalPoint webmasters forum lets the Google robot index your site within 15-45 minutes.

11. Add your news to StumbleUpon or Digg

Using this method you can attract the Google robot to your website within 5-30 minutes. To do this, write one interesting article in English and publish it on your website. Then register on the social networks StumbleUpon and Digg and add your English page there.

You can read more about website promotion using StumbleUpon in my article.

12. Create a blog using free blogging services

Blogs on popular services are indexed quite quickly by search engines. It is enough to create a blog there and write several articles or notes with links to your site. To speed up indexing further, you can add friends or leave comments on popular blogs. The robot usually comes to the site 1-5 days after the free blog is created.

Here is a list of the most popular blogging services:

13. Publish a press release

A press release is a great way to speed up site indexing. You can read more about press releases and how to use them to promote websites in my article.

To speed up the indexing of English-language sites, you can use my latest one.

14. Paid posts on blogs

Posting is a form of blog advertising where the blogger places a direct link to the advertiser at the beginning or end of a post. Choose blogs that are updated frequently and have good traffic. Costs range from 3 to several tens of dollars per link in a post.

Go to the blog you like, look for the “Advertising on the blog” section and write to the blogger. Most bloggers offer advertising on their blogs, so finding a suitable blog will not be a problem.

15. Write a guest post on a popular blog

A guest post is when you write an interesting and useful article on a certain topic and, by agreement with a blogger, send it to him, and he publishes it on his blog, indicating your authorship and placing a link to your site.

If you have professional knowledge in some field and can write an interesting article, this method is great for speeding up indexing. Write an article and send it to one or more bloggers, noting that it is unique. Many bloggers will be willing to publish your material if it suits their blog's topic, and you will effectively speed up your site's indexing.

16. Interview a popular blogger

Why would an interview speed up site indexing? The point is that many bloggers enjoy giving interviews, and in most cases they will link directly to the interview page on your site. And search robots visit popular blogs often.

This method is not suitable for all sites, but if you have a young and interesting thematic site or blog, it is a good way not only to speed up indexing but also to get additional visitors.

You can read more about how to interview in my article.

In general, I can say that if you use the methods described above for speeding up site indexing, as well as the methods from the article, you are guaranteed to bring search robots to your site.

P.S. Today's postscript is an amazing video, in continuation of the article, about real graffiti fans. I've never seen anything like it:

Site indexing is the presence of a site's pages in the search engine database. For a site to be indexed, a search bot must visit it.

In this article we will look at how you can speed up the indexing of your site and make sure it is indexed perfectly.

For complete understanding:

Site indexing is the crawling of a website by search engine robots and its addition to their database. Search robots enter into the database information about the site and all of its pages that are available for search. The bots index links, images, videos, and other elements on the site.

In order for the site to start being indexed, you need to:

1. Make sure that indexing is open in robots.txt

It often happens that people forget to open a site for indexing in the robots.txt file - being closed is the default in many site management systems.

When a site is closed from indexing, the file looks something like this:
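
User-agent: *
Disallow: /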

To open the site, you just need to remove the “/” from the Disallow directive (an empty Disallow allows everything) - for example, as in the robots.txt on our website:

The Disallow lines close unnecessary pages - in our case, service sections and other sections that do not need to be indexed.

2. Add your site to search engines

The easiest way to make search engines aware of your site is to add it through their forms. We covered this in detail for the main search engines in a separate article.

Blogs and media sites are indexed best of all; you can place links on them using link exchanges, which offer a large choice of sites of varying quality, from bad to good.

5. Adding a site to social bookmarks

Social bookmarking sites include bobrdobr.ru, memori.qip.ru, moemesto.ru, mister-wong.ru, and hundreds of others.

By using social bookmarking you can quickly attract search robots.

To get a site indexed as quickly as possible, it is best to use all the methods at once - the return will be better.

If the site has not been indexed for a long time, first check whether it is closed from indexing; if everything is fine, use the methods described above to attract search bots to the site.

Website optimization is a necessary stage of website promotion. If a site is not indexed, it cannot be promoted in search, so first you need to get it indexed.