Pages disappear from the Yandex index. What happened and how to deal with it - Archangel's Blog

We have opened applications for the Yandex Manager School. The school is designed for students and recent graduates of humanities and technical universities who have decided on a profession, but want to gain knowledge and experience. This year we invite to classes not only novice managers, but also those who want to improve in the field of marketing and product analytics.

We have updated the “Photos” section in the Yandex.Disk application for Android. Now it displays all your photos: both those uploaded to the cloud and those in the smartphone’s memory. You no longer need to remember where to look for what. Pictures from the vacation the year before last, fresh scans of documents, video from a matinee in kindergarten: everything is stored in one application.

Yandex has entered into an agreement of intent with Hyundai Mobis, a subsidiary of Hyundai Motor Group and one of the world's largest manufacturers of automotive components. The companies plan to jointly develop a software and hardware system that will allow the creation of unmanned vehicles of the 4th and 5th level of autonomy. The complex will be based on Yandex technologies.

Yandex.Translator has learned to support translations with examples. If you ask the service how to translate a particular word, it will not only list the options, but also show examples of context - how this word was used in books, films and television series and how it was translated. Comparing different options, it is easier to understand the shades of meaning and choose the most suitable translation.

Yandex.Mail now has a tool for managing mailings. With its help, you can get rid of unnecessary mailings and clear your inbox of old letters. Mail distinguishes important messages from unimportant ones: if you unsubscribe from a store's mailing list, you will continue to receive order confirmations.

We have opened recruitment for a summer internship. It can be completed at Yandex offices in eight cities. Interns join one of the teams as a developer or analyst and work together on Yandex services. Basic requirements for interns are knowledge of mathematics and programming ability. More information about the internship will be available at the “Summer at Yandex” event on February 21.

Yandex has established an award named after Ilya Segalovich. It will be awarded to young scientists and scientific leaders from universities in Russia, Belarus and Kazakhstan who have achieved serious achievements in the field of computer science. The size of the bonus for undergraduate and graduate students will be 350 thousand, and for scientific supervisors - 700 thousand rubles. The first award ceremony for the Segalovich Prize laureates will take place in April of this year.

Let's meet New Year together. On the night of December 31 to January 1, Yandex will show its own holiday show for the first time. When preparing it, we sought to preserve the best traditions of New Year's broadcasting, but relied on new formats, new humor and new faces. Guests of the holiday will be musicians, artists and bloggers, about whom Yandex has been asked a lot throughout the year.

A section with audio lectures and podcasts has opened on Yandex.Music. They are recorded by Meduza, Arzamas, Lifehacker and other authors and projects. The podcasts cover different topics, from personal effectiveness to contemporary art. For now, the section is operating in beta mode: podcasts can be listened to on the web and on the Music mobile website. In the future they will appear in the application.

Yandex has found a place for a new headquarters. It will be located on Kosygina Street, in the Gagarinsky district of Moscow. This is where our story once began: the CompTek company was located at the Institute of General Genetics, and the first real Yandex office was located at the Computing Center of the Russian Academy of Sciences.

Two new devices with Alice are going on sale: IRBIS A and DEXP Smartbox. These are smart assistants for the home, created by third-party manufacturers based on the Yandex.IO platform. The devices look like mini-speakers, and inside they have a computer on which Alice lives: she runs errands and answers questions. The devices will be sold in M.Video and DNS stores at a price of 3,290 rubles.

Introducing Yandex.Phone, a smartphone controlled by Alice. The voice assistant carries out instructions and anticipates the wishes of the owner of the device. Yandex applications and services help her with this. The phone will be available for purchase in electronics stores and on the Internet at a price of 17,990 rubles.

The Russian crazy printer, that is, the State Duma, decided to find something else to ban. This time, news aggregators, such as Yandex.News, were targeted.

On Wednesday, the State Duma introduced amendments to the law “On Information, Information Technologies and Information Protection,” better known as the “law on bloggers,” and to the Administrative Code. The bill introduces a special legal regime for Internet resources that collect, process and distribute information (a new concept of “news aggregator” is introduced). The regime will apply to aggregators with more than 1 million users per day. The federal body responsible for control and supervision in the media sector (Roskomnadzor) has been entrusted with maintaining their register. The draft obliges news aggregators to “verify the accuracy of disseminated socially significant information,” including information whose source is the media, and provides for extrajudicial “taking measures to suppress the dissemination of false information” upon a complaint from authorized bodies. The list of those bodies will be determined by the government. A news aggregator must be a domestic legal entity in which foreign participation does not exceed 20%. Six months are given to bring the company structure into compliance with the new legal requirements. The company is also obliged to store the information it disseminates for six months. The aggregator will be liable (fines from 400 thousand rubles for individuals up to 5 million rubles for legal entities) if it does not remove false information at the request of Roskomnadzor. Kommersant

Most comments boil down to the fact that aggregators will not be able to exist in such conditions - that is why Yandex has already stated that Yandex.News will have to be closed, and Mail.ru will most likely sell its aggregator.

It seems true that in the wave of criticism of the proposal there are already signs of victory for the authors of the bill. Please note that one of the main arguments for protecting aggregators is the claim that aggregators index media that are already licensed. Now, this is not true. Firstly, a publication or website does not need to be registered as a media outlet in order to be indexed by Yandex.News.

And secondly, a publication or website does not need to be registered as a media outlet in Russia to be indexed by Yandex.News.

It’s easy to imagine what a “compromise” solution might turn out to be: the law could easily be adjusted so that aggregators would escape liability only if the original message was published in a media outlet registered in Russia. Aggregators would then have to choose between closing down altogether and cutting all non-Russian sources from the Russian version of their service, while also requiring a certificate of registration from all Russian sources. And the final goal will be achieved: Russian users will not be able to access news reports from sources not controlled by the Russian state, be it Ukrainskaya Pravda or Reuters.


The rating is calculated over exactly six months, so it may fall even if many good new links are pointing to you now, if there were more of them six months ago.
We understand that this approach is not obvious, and some things are already outdated, so we are currently working on improving the ranking algorithm.
Blog ranking is, of course, one of the possible ways of determining popularity. But by no means the only one.


earlyhawk
Please tell me, are there any plans to close the ranking of blogs on Yandex?
If unfortunately not, is it possible for the rating not to take into account cheating technologies, such as:
1) Repost by clicking on anything with an automatic link to the advertising merchant Professional_Blogger,
2) Idiot tests with the publication of the result, which contains a link to the cheater without any restrictions
3) Hidden links in comments?
Thank you!
ps
Yes, and thanks a lot for closing the top, which turned into an easily (and inexpensively) manageable PR tool for Leaders_Khomyachkov, etc. =)

anton
There are no plans to close the blog rankings. Ratings in Blog Search will change in the future to better reflect the real state of things, less to help cheaters show themselves and more to help users find what is interesting and necessary for them.
Already now, when building a rating, automatic and semi-automatic links have less weight than those placed manually.


Do homemade repost buttons affect ratings? Or are links provided by bloggers more valuable?
anton
They have an impact, but less than manually placed links.


How does Yandex treat promotions and purchases of links by blogs? A third, and maybe even half, of the first two pages of the rating already consists of dummy blogs created for one purpose: to get a high rating.
vityok_m4_15
When will the rating take into account the uniqueness of the content?
there are too many copy-pasters in top bloggers

anton
We have a negative attitude towards any attempts to deceive users and algorithms and are working to clearly distinguish an artificially inflated blog from a truly authoritative one.


Microblogs have become the cheapest (1-3 rubles per link) and most effective cheating tool, and Twight.ru is thriving. Will the weight of Twitter links be reduced?
dikiymugchina
Given the various advertising services on Russian Twitter, are there any plans to reduce the weight passed by links from microblogs in the near future?
anton
We do not plan to reduce link weight for any specific blog hosting site. We try to automatically detect paid links on all services. As I already said, the structure and formula of the rating should soon change, so that the influence of such cheating on it will be significantly reduced.

dikiymugchina
Anton, I’m interested in the attitude towards collective blogs. What does Yandex understand about a collective blog and do such blogs have the right to participate in the ranking?
anton
Any blogs have the right to participate in the ranking. The second question is which one.


How can you explain the high rating of the Yandex blog (http://clubs.ya.ru/company/)? I’ve never seen any references to it in friends’ feeds, and in general it’s not “well-known”: it doesn’t get into the TOPs, and it’s not cited by copy-pasters.
anton
The authority of the Yandex corporate blog is calculated on the same basis as all other blogs; we, of course, do not give it any preferences. Now it has more than 60 thousand readers, and the number of links to it over the past six months is about 7.5 thousand. These indicators are quite comparable with other blogs in the TOP.


The "Today in Blogs" section on the Yandex home page appears and disappears, then shrinks to three entries. How can you explain this?
anton
The “Main Topics of the Day” block includes only really actively discussed events, so if there are very few of them, we do not show this block at all. There was also an isolated case when a block disappeared due to technical reasons.


What happens to blog search when we see the message "Sorry, the service is temporarily unavailable. Try refreshing the page or come back later"?
semigr
What is Yandex doing to improve the stability of blog search? Yandex blog search is buggy too often. Today it returned the following several times (maybe it still does): “Sorry, the service is temporarily unavailable. Try refreshing the page or come back later.” Google's blog search works more consistently (I’ve never had any failures). I wish you success in surpassing Google on a global scale.
anton
Currently, less than half a percent of users per day see an error message in Blog Search. We continue to work on the speed and stability of the service and hope that soon this percentage will decrease even more.

saboy
I always compare Yandex with Google blog search
IMHO it is done more intelligently: the search results are generated in the same way as in regular search, there is no stupid concept of “authority” that everyone manipulates as they please, and fresh entries are always above older ones
the top lines are occupied by thematic blogs rather than a hodgepodge of cheaters of all stripes; spam sits lower in the search results and dies off very quickly
Question 1: is it a principled position to form a rating using a different algorithm (not like in regular search) or is there simply no possibility/desire to do this?
question 2: how are you going to combat the cheating of the “readers” parameter? I think it’s no secret to anyone that you can artificially add at least 10 thousand readers to this rating - accordingly, the trust in this classifier will drop to zero (and it weighs quite a lot in the rating)
proposal: take away authority from those who recommend/advertise/spam blogs; roughly speaking, you put a link, your authority fell twice as much as you received for a similar incoming link; all promotions and advertising are killed in one fell swoop, and bloggers write what really concerns them; I think it's as easy as shelling pears
We discussed this topic here yesterday
micoff.livejournal.com/111783.html
The author also has a very relevant question
I’ll be glad if Anton stops by and says something about it

anton
1) Blog rankings and blog search results are two different things. Authority affects the rating, but has nothing to do with search results. By default, blog search results are ranked by date (most recent at the top). If desired, they can be sorted, on the contrary, in chronological order, or by relevance, that is, by the degree of correspondence to the query text.
2) We try to fight any cheating when it worsens the quality of the resulting product for our users. But for obvious reasons I won’t tell you how we will fight :)
Thanks for your offer.


I know that Yandex really doesn’t like to talk about upcoming changes, so just reassure me. Tell me that they are coming, that we will begin to trust authority again and will not have to look for other tools for evaluating blogs, such as the number of readers.
anton
We are thinking about new metrics that will make it easier to find your way around blogs, microblogs, social networks, etc. And perhaps it will no longer be “authority.”

chinz
Tell me, what do you think is the main motivating factor for a blogger in the rating system - the desire to find out how cool he is, the desire to make friends or read the coolest bloggers, something else? Why are people even interested in ratings? We do not consider business issues, we are only interested in the psychological aspect.
anton
Probably, people like to play ratings because it is a way to show themselves better than others in something, like any competition.

micoff
My questions:
1. Why does Yandex use a policy of double standards?
a) toward link cheating.
Yandex punishes sites that buy links en masse and arrange mass link exchanges (it lowers their place in search results and excludes them from indexing). But it encourages blogs that do the same, raising them to the top of the TOP blogs.
b) toward stolen content.
Yandex constantly catches and pessimizes so-called “varezniks”, sites with stolen content created for the purpose of making money from advertising. But blogs that are essentially the same “varezniks” occupy the first four lines in the blog ranking.
2. Why can’t Yandex install a text uniqueness filter that filters out advertising blogs where none of the entries are original?
After all, a blog is in essence a personal diary, and unique content matters much more here than for a site. If you install a filter that requires at least 51% unique text for inclusion in the rating, then the entire rating will change and only “blogs made for people” will remain in the TOP.

anton
1. These seem to be statements rather than questions. Thank you for your opinion.
2. Unfortunately, mechanistic criteria do not always work. If tomorrow we set a limit of “51% unique text,” then there will be more spammers who comply with it than normal people. Alas.
Our goal is to make an interesting service for users, and not to outline clear boundaries, or to arrange a game for site authors.

bsitnikov
Question: many ratings based on the Yandex API show completely different results. Is this because Yandex returns different results depending on the time of day, the time of year, the workload or the phases of the moon, or is it filtering applied directly by the webmasters of the local ratings?
Filtration, by the way, is also not a simple matter. everyone remembers how the same Subject in the rating on design.ru set the filtering to Other. Another one disappeared from the rating, only the rating began to show nonsense, incompatible with life and with a delay of a couple of days.
Is this a crooked API or a tricky one?

anton
We do not provide a ready-made rating, but rather data on which ratings can be based. The result depends on how you arrange the coefficients. That is why different people end up with different ratings.

drugalev
Tell me how to quickly and inexpensively enter the TOP10? To whom and how much should I contribute? =)
anton
Places in the top are not for sale. To get in there, you need to keep a diary that is interesting to a large number of people.

miha_vxc
1. Are there any plans to keep any global statistics (total number of posts, number of blogs, geo distribution of bloggers)?
2. Does a nickname link in LiveJournal comments affect the TCI of the commentator?
3. Which blog from the top 10 do you like?
4. Should we expect “instant” indexing in the near future? Google still wins in this parameter.

anton
1. We periodically publish studies of the Russian-language blogosphere - this data is there. (http://company.yandex.ru/facts/researches/)
2. Sorry, I didn’t understand the question.
3. My friend feed includes Tyoma and Leonid Kaganov. But, of course, my personal preferences do not in any way affect how our ratings are structured or anything else; I try to think more about the users of the service and how to make it convenient and interesting for them.
4. Entries have been appearing “instantly” in Blog Search for a very long time. For the last 4 years, they have been included in search in less than 10 minutes, and over the last year most entries can be found with us within a minute of being written.
Nowadays, the main search includes documents that have literally just appeared - for example, on news resource sites.

ctulhubris
Anton, which bloggers do you read and why, if you read them at all?
Do you have anonymous clones on social services?)))
Will you come to Yarushnik next time?)
What books do you like to read, what is the last one, what would you recommend reading?
What kind of music do you like? Do you play any instrument?
What is your hobby? :-)

Let's take a look at the TOP blogs together.
Are you sure that the word “authority” (with the base “authority”, the meaning of which is easy to Google and Yandex) is well suited to describe it?
Maybe it’s worth changing it to - “popularity”, “promoted”, “celebrity”, “there’s a plug in every barrel”, “intrusiveness”, “venality”, “exchange of links”?
After all, no one in their right mind says that Ksyusha Sobchak, Timati or Sergei Zverev are authoritative.
They say celebrities, they say popular people, they say famous people, they say public people, they even say stars or elite, but to say about them that they are authoritative...
And the second question:
Do you maintain your own blog and can you sell it if necessary? If yes, then for what amount?
anton
1) I believe that the meaning is more important than the name. In any case, we didn’t think about simply changing the word, we are thinking about more global changes. I have already talked about them here.
2) I do, although not very actively. I don't understand what the point of selling a personal blog would be, so I wouldn’t sell it (although I was offered money more than once for my blog toster, which I ran a long time ago and which was quite popular).

stormax
Isn’t Yandex going to buy Tema’s rating engines and, without straining, make a new and cool little thing out of them, so that Tema will be as excited about it as before? :))
anton
No.

pe3yc
I won't have a question, but a suggestion.
Dear Anton,
We both know that the Yandex-blog-search project is in complete trouble. The database is 95% filled with anything but blogs (mainly splogs, doorways and other garbage), indexing works through the roof, the relevance of the search results leaves much to be desired, and your ugly games with the assessment and rating of content further aggravate the situation. The reputational damage for the parent company is obvious; there is nothing to compensate for it.
The only way out of this situation (apart from, of course, curtailing the project) must begin with the complete and irrevocable elimination of ratings: blog ratings, topic ratings, and any ratings whatsoever. The tasks of a search engine (and Yandex is a search engine after all, not “Field of Miracles”) do not include evaluating and rating content. And when and if a search engine arrogates to itself the functions of evaluation and rating, it inevitably turns into the disgrace that JPPB is today.
Of course, eliminating ratings in itself is not a panacea. This will not replace other mandatory steps: creating a relevant blog database, fixing the engine (or, better, replacing it: why not use technologies that have proven themselves in the main search?), clearing spam from search results, etc.
But without canceling the ratings, all other actions are meaningless. With such weights you will not swim out of all this shit.

anton
Thank you, I read your opinion. Since this is not a question, I will allow myself not to answer anything.

Pages disappear from the Yandex index. What happened and how to deal with it

Lately, cases in which Yandex excludes individual pages, and sometimes entire sites, from the index have become more frequent. I have seen this behavior of Yandex discussed on many forums and clubs, but a couple of weeks ago I experienced the joys of it firsthand on one of my blogs.

When I checked the visitor statistics, I noticed that search traffic from Yandex, which was usually a little less than from Google, had fallen sharply! I looked into Yandex.Webmaster and found the answer: more than 80% of my blog's pages had disappeared from the index...

My first thought was that the site had come under some kind of filter, but why? I did not sell or buy links for this site, all articles were written by me personally and cannot be a copy. The site is updated stably and quite often, and there are also active discussions in the comments. In a word, it is clear that nothing is clear...

It became even more fun yesterday when I discovered that my blog had completely disappeared from the Yandex index, along with several other of my sites! Moreover, all sites have completely different topics and different structures and run on different engines. Today, however, all sites have again appeared in the Yandex index, but the number of indexed pages for each of them has noticeably decreased.

So far I can’t find any reasonable explanation for this behavior of Yandex, other than its internal technical problems, but nevertheless I will try to conduct an experiment on returning pages to the index. For this purpose, we will take the following few steps.

1. Create an archives page and sitemap.xml file

Since the blog in question runs on WordPress, I installed the SiteTree plugin. In my experience, this is the most successful plugin for creating both an archives page and a sitemap for search engines, sitemap.xml. The plugin supports Russian and is very flexible in its settings.

So let's create a new contents or archives page and simply select it in the plugin settings. That's it: the plugin automatically generates an up-to-date contents page in accordance with our settings.
Also in the settings, we simply activate the sitemap.xml function, and the plugin does the rest of the work itself and also notifies search engines about any updates on the site.
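
For readers who are not on WordPress or who want to see what the plugin produces, here is a minimal sketch of building a sitemap.xml by hand. It is only an illustration: the function name build_sitemap, the example.com URLs and the dates are assumptions, not something taken from SiteTree.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as a string for (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages; replace with the real URLs of the blog
pages = [
    ("https://example.com/", "2011-05-01"),
    ("https://example.com/archives/", "2011-05-01"),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The resulting string can be saved as sitemap.xml in the site root, which is where the next step expects to find it.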

2. Notifying Yandex about the presence of the sitemap.xml file

To do this, first of all, you need to add the address of the sitemap.xml file in the Yandex.Webmaster tools, and also specify its full URL in the robots.txt file that search spiders read. In the same file you need to check that all the required pages are open for indexing.

If the site sorts articles in too many ways, say by categories, by tags, by authors and by months, then it is advisable to keep a couple of the most comprehensive ones, such as categories and dates, and close the rest to robots to avoid duplicate content.
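
As a rough sketch of this step, the snippet below holds a sample robots.txt and checks it with Python's standard urllib.robotparser. The /tag/ and /author/ paths and the example.com address are assumptions chosen for illustration; the real paths depend on the blog's permalink settings. The site_maps() call needs Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: tag and author archives are closed,
# everything else stays open, and the sitemap URL is given in full.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tag/
Disallow: /author/
Sitemap: https://example.com/sitemap.xml
"""


def load_rules(robots_txt):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Pages that should stay indexable versus duplicate listings
print(rules.can_fetch("Yandex", "https://example.com/category/seo/"))
print(rules.can_fetch("Yandex", "https://example.com/tag/seo/"))
print(rules.site_maps())
```

Running a check like this before uploading the file helps catch a rule that accidentally closes pages you want indexed.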

3. Updating old articles and adding blog content

Since every time an article is updated, WordPress sends a notification (ping) to search engines, it makes sense to update the most important articles on the blog. It is almost impossible to change all the articles, especially if it is a fairly mature blog, but you can always choose those that were popular or, on the contrary, poorly visited, based on statistics.
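
One possible way to shortlist articles from statistics is sketched below. It assumes you can export visit counts per article as a simple mapping; the function name pick_articles_to_update and the sample numbers are made up for the example.

```python
def pick_articles_to_update(visits, n=3):
    """Return (most_visited, least_visited) article URLs, n of each."""
    ranked = sorted(visits.items(), key=lambda item: item[1], reverse=True)
    most = [url for url, _ in ranked[:n]]
    # Skip anything already in the first list when the blog is small
    least = [url for url, _ in ranked[-n:] if url not in most]
    return most, least


# Hypothetical monthly visit counts exported from a statistics tool
visits = {
    "/yandex-index/": 950,
    "/wordpress-plugins/": 400,
    "/sitemap-howto/": 120,
    "/old-news/": 15,
    "/hello-world/": 3,
}
most_visited, least_visited = pick_articles_to_update(visits, n=2)
print(most_visited, least_visited)
```

Which of the two lists to start with is a judgment call: popular articles bring the quickest return, while neglected ones often have the most room for improvement.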

Ideas for what to add or change in an article are easy to find by analyzing the search queries that brought visitors to the site, as well as reader comments on the article. This way we will not only update and extend the articles, but also make them more valuable for new visitors.

Before the next step, it is also advisable to add a few new entries that the search bot has not yet seen, and to keep regularly updating the blog with new articles. Perhaps the most important thing for a blog is that updates be regular, even if not very frequent.

4. Running links through Yandex's favorite services

To drive Yandex's fastbot back to your blog, let's try to find it on Yandex's favorite resources:


  • Twitter- If