The best Internet search engines. The most famous search engines on the Internet in Russian

On the Internet, a special website where a user who submits a query receives links to sites that match it. A search system consists of three components: (1) a search robot; (2) system indexes; and (3) programs,... ... Financial Dictionary


search engine - A site used to search for other sites. The search is carried out by entering keywords into the search box. Unlike with directories, a site can be found through a search engine even if it has not been registered there beforehand.... ... Technical Translator's Guide


Search engine (English: search engine) - A tool for searching for information on the Internet. As a rule, a search engine works in two stages. A special program (search robot, machine, agent,... ... Encyclopedic Dictionary of Media


Most of the time a user spends on the Internet goes to searching for information of interest. There are many ways to get it: you can look in an online encyclopedia, subscribe to a newsletter on the topic and study the incoming mail, or ask knowledgeable people on a forum. But the most universal way to find something on the Internet is to use one of the many search engines. Services that search across millions of websites are perhaps the fundamental link of the World Wide Web. Without Google, Yahoo, Yandex and the many other search engines familiar today, being online would be like walking blind through a forest. The importance of search engines can hardly be overestimated: many users set a search engine as their start page, and for many people that is where an endless journey through the network begins. However, the effectiveness of online searching differs from person to person: one finds information instantly, another takes a long time, and a third may not find anything useful at all. What is the reason? The answer is simple: searching the Internet is like fishing. You need to know where to fish and what to fish with, that is, where to look and how to look. In this article we will talk about how best to search the Internet and which search engines exist besides those that are “on everyone’s lips.”

However, we will start with the systems you already know. Knowing a search engine's address does not mean knowing how to use it. Let's check how well you understand search queries. The accuracy of the results depends first of all on how skillfully you formulate the query. For example, if you are looking for material for a term paper, you do not need to enter its topic verbatim, especially if the topic is narrow. You will find much more valuable information if you pick keywords, that is, words that are certain to appear in your paper. If you are looking for a lost manual for a car radio, entering the model number will probably return a huge number of sites offering to sell you the radio. To weed out unnecessary links, you can search within the results or exclude certain words from the search. Almost every search engine also has an advanced search function, another good way to filter out unnecessary results. It may let you search only pages that were updated recently, only pages in a certain language, or only sites in a domain zone you specify. You can save a significant amount of search time if you know and use the query language syntax. Each search engine has its own features. For example, when searching on Yandex, the following techniques are useful:

To search for words that should appear on the page in one sentence, put the & symbol between them
To exclude a specific word from search results, add it to your query by prefixing it with ~~
To find pages that contain at least one of the words specified in the search query, separate them with |
To search for a word in the specified form, put an exclamation mark in front of it
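
The four Yandex techniques above can be sketched as small string helpers. This is only an illustration of the operators as this article describes them; the function names are made up for the example, and the current Yandex query language may differ.

```python
# Hypothetical helpers that assemble query strings using the operators
# described in the list above (&, ~~, | and !).

def same_sentence(*words):
    """Words that should appear together in one sentence."""
    return " & ".join(words)

def any_of(*words):
    """Pages that contain at least one of the words."""
    return " | ".join(words)

def exclude(query, word):
    """Drop pages that contain the given word."""
    return f"{query} ~~{word}"

def exact_form(word):
    """Search for the word exactly in the form given."""
    return "!" + word

# Example: a car radio manual, without shop pages.
query = exclude(same_sentence("radio", "manual"), "buy")
print(query)
```

The resulting string is what you would type into the search box.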

The Google search engine also has its secrets. Here are just a few of them:

To search for information on a specific site (and only on it), enter the word site, a colon and the site's address in the query field (for example, site:example.com)
To search for a phrase that should appear on the page in its entirety, put it in quotation marks
To exclude pages that contain a specific word from search results, add it to your query by preceding it with a minus sign
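
As a rough sketch, the three Google techniques above can be combined into one query string. The function and parameter names here are invented for illustration, and example.com is a placeholder domain.

```python
from urllib.parse import urlencode

def google_query(terms="", phrase=None, site=None, exclude=()):
    """Build a query from plain terms, an exact phrase, excluded words and a site filter."""
    parts = []
    if terms:
        parts.append(terms)
    if phrase:
        parts.append(f'"{phrase}"')               # exact phrase in quotation marks
    parts.extend(f"-{word}" for word in exclude)  # minus sign excludes a word
    if site:
        parts.append(f"site:{site}")              # restrict the search to one site
    return " ".join(parts)

def search_url(query):
    """Turn the query into a search URL with proper escaping."""
    return "https://www.google.com/search?" + urlencode({"q": query})

q = google_query("car radio", phrase="user manual", site="example.com", exclude=["buy"])
print(q)
print(search_url(q))
```

Letting urlencode do the escaping avoids broken links when the query contains quotation marks or colons.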

These are just a few touches that can help make your online search more efficient. If you want to achieve optimal results, we advise you to become more familiar with the syntax of the query language, which is described in detail in the help system of your favorite search engine. There is no doubt that Google and Yandex are indispensable tools for searching the Internet - search in these systems is convenient, flexible and very accurate. But, nevertheless, this does not mean that alternative search engines do not have the right to exist. Yes - they index fewer pages, yes - their methods for selecting resources are largely controversial. But such search engines have one undeniable advantage - they offer something new, different from accepted standards. Since alternative search services use a different approach to selecting resources that match the query, the search result will be completely different than in the case of conventional search engines. So, if long searches on well-known services did not lead to anything, this means one thing - you need to change tactics and try other methods of searching for information using alternative search engines. Often, alternative search engines use one or more lists of resources that have been found by Google, Yahoo and other major systems to collect results. These results are filtered, the best are selected and often visualized for better understanding using a diagram, site map, tag cloud, etc. Developers of alternative search engines sometimes go so far in their search for a new universal interface that it is sometimes difficult to recognize a search engine in a web page. And yet, these are search engines. Unusual and strange, at first glance...

FindSounds.com - searches for sounds

This resource is intended for users engaged in creative work. It lets you search for sound files in different formats: wav, mp3, aiff, au. The database contains a wide variety of sounds: animal cries, screeching cars, ringing, knocking, sirens, buzzing insects, the roar of explosions and gunfire, splashing water and so on. Sound files can be searched by various criteria, for example by size, by the number of channels (stereo/mono), by sampling frequency and by bit depth. In the search results, the resource shows not only links to the files found but also their main characteristics, along with a graph of the sound amplitude that gives an idea of what the sample sounds like.

The FindSounds sound effects database can be used in a variety of areas - from the development of computer games and other applications, to the creation of presentations and all kinds of clips. The search engine can be useful, for example, for those who create interactive web graphics and want to add variety to the site by accompanying the clicking of page navigation elements with different sounds.

Gnod.net - will select music, books and films to suit your tastes

When someone wants to read a new book, hear some new music or watch a movie, they usually ask a friend or acquaintance whose opinion they respect. However, finding someone willing to give an opinion is not so easy. Firstly, not everyone likes to give advice: recommending something means taking on a share of responsibility, and many are stopped by the question “What if he doesn’t like the film I recommend?” Secondly, the adviser must understand what exactly the other person will like and what will leave them cold. After all, there is no accounting for taste. But there is an easier way to get good advice: use a search engine made specifically for this purpose. Say you want to listen to a new band but have neither the time nor the desire to hunt for good music. The gnod.net resource will ask you for the names of several musical artists you like, analyze them and suggest a singer or group that you should also like. The service has several databases: music artists, films, books and people. Accordingly, it comprises four services: Gnod Music, Gnod Books, Gnod Movies and Flork. The last of these, Flork, is a social experiment in finding people who would be interested in talking to each other. We were happy to test the music section and entered three artists: Gerry and the Pacemakers, The Beatles and The Hollies. The selection was not random: all three groups belong to the sixties and to the phenomenon known as the British Invasion. All of them played beat music, so the search engine had to suggest a band or artist in the same style. And so it did. The result offered to us was The Archies, whose cheerful song Sugar Sugar was on the lips of all Americans in the late sixties.

After playing with the search engine for some time, we concluded that gnod.net usually gives sound advice and is rarely wrong. For clarity, it can present its “advice” as an animated cloud of the names of groups, authors or films. You can add to the database yourself by having “conversations” with the search engine and answering its questions with “I like this” or “I don’t like this.”

Alldll.net - finds library files

We recommend bookmarking this search engine right away, as sooner or later it will come in handy. Probably everyone has at least once run into the problem of a missing dll library. This usually means that programs or games refuse to launch and the message “Couldn't find *****.dll” appears on the screen. There can be many reasons for this: the file may be missing because a previously installed application was removed incorrectly, because the file was accidentally corrupted, and so on. In addition, the developer may simply not have included the library in the product's distribution.

Correcting the situation is very simple: find the missing file on the Internet, download it and copy it to the directory of the program that refuses to start, or to the ..\WINDOWS\system32\... folder. This service makes it quick and easy to find and download the missing file. The resource www.alldll.net is a searchable database of the most popular dll libraries. The files are sorted alphabetically, and there is a search function. You can find the file you need even if you know only the approximate name of the library. Just start typing in the query field, and a long list of files beginning with the letters you have typed will appear at the bottom of the page.
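
The "start typing and see matching files" behaviour can be sketched as a prefix lookup in an alphabetically sorted list. This is only one possible approach, not a description of how alldll.net actually works, and the file names are sample data.

```python
import bisect

def files_with_prefix(sorted_names, prefix):
    """Return all names that start with the prefix, using binary search on a sorted list."""
    prefix = prefix.lower()
    start = bisect.bisect_left(sorted_names, prefix)
    # "\uffff" sorts after ordinary characters, so this marks the end of the prefix range.
    end = bisect.bisect_left(sorted_names, prefix + "\uffff")
    return sorted_names[start:end]

# Sample data: lowercased and sorted once, then searched many times.
names = sorted(n.lower() for n in [
    "msvcr100.dll", "d3dx9_43.dll", "msvcp140.dll", "xinput1_3.dll", "MSVCR120.dll",
])

print(files_with_prefix(names, "msvcr"))
```

Because the list is sorted, each keystroke costs two binary searches instead of a scan of the whole database.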

Medpoisk.ru - search for medical information

Although this search engine is powered by Google's engine, that in no way reduces its value. Medpoisk.ru is a search engine designed to search exclusively on medical sites. It is an excellent tool for every physician and for anyone who wants an answer to a medical question. How to treat a particular disease, what the contraindications of a particular medicine are, which doctor to see: all this and much more can be found out by “asking” the search engine. The site includes a job board for medical professionals. It also contains a catalog of medical institutions sorted by region, with the addresses of clinics, medical centers of various specialties, maternity hospitals, diagnostic centers, beauty salons and so on. We sincerely hope you will use this search service only out of curiosity and not out of necessity.

Taggalaxy.de - search for images and photos

Perhaps you have heard about the popular image sharing service Flickr.com? This is the same service that was blocked by the Chinese authorities in 2007 after photographs of the sad events of 1989 in Tiananmen Square, located in the Chinese capital Beijing, appeared on its pages. Flickr.com is one of the first Web 2.0 services, and the number of images uploaded by users is in the billions. The number of pictures uploaded to the servers of this service is so large that in order to find a specific image in this ocean of photographs and paintings, a separate search engine is needed. The service offers an image search service, but there is a more interesting way to search for images - using the unusual search engine taggalaxy.de. This search service is a tool for searching images on Flickr.com, with previews. What makes it unusual is the search interface, which is completely three-dimensional. The process of searching by keyword is reminiscent of some kind of computer game - different celestial bodies fly in outer space, between which you can move in the virtual world.

After the keyword query is completed, a system of the sun and planets that revolve around the star will appear on the screen. Each celestial body has its own purpose and is “signed” with a word. In the center of the galaxy is the sun, the key query, all other bodies are auxiliary words, clarifications. If you click on the sun, this object will come closer, and photographs will fly towards it from all sides and surround it, the content of which is determined by the search query. This three-dimensional model with photographs can be rotated in virtual space, examining in detail and searching for the image of interest. After this, just click on the picture to enlarge it in size, and then you can better examine it and read the description.

While working with this search engine, you can use the scrolling function - it allows you to zoom in or out of three-dimensional planets. The remaining planets that are visible in the search engine interface after the request are auxiliary words that allow you to clarify the request. For example, if you enter “Sky” into the search field, then among the qualifying words-planets there will be the words “clouds”, “sunset”, “blue” and other tags of similar meaning that users specified when using the Flickr.com service. The disadvantage of the search engine is that taggalaxy.de does not support the Russian language, so queries can only be entered in Latin.

Nigma.ru - filters results from other search engines

Among all the search engines on the Internet, there is a special group that differs from the rest in offering multi-search, that is, simultaneous search in several search engines. One of these multi-search systems is the Russian service Nigma.ru.

Nigma has its own index, but it also lets you search all the most popular search engines at once, including Google, MSN, Yandex, Rambler, AltaVista, Yahoo and Aport. Its mechanism for selecting results differs from most accepted methods: the engine clusters the results. What does this mean? Imagine you want to find out what “rendering” is. Having compared the results from different search engines, the Nigma.ru engine selects the most likely ones and, on the left side of the window next to the list of results, displays so-called clusters: “visualization”, “creation”, “system”, “rendering”, “process”, “studio max”, “computer graphics” and other words and phrases. Each cluster represents a thematic group of the documents found, so you can quickly narrow the search or refine the query. In Nigma.ru you can also use categories to limit the area from which results are drawn, for example searching only music resources or showing only images. Another feature may interest schoolchildren and students: Nigma.ru offers the Nigma-mathematics and Nigma-chemistry services. The first quickly solves simple equations and performs arithmetic, the second works with chemical reaction formulas. The service recognizes more than a thousand physical and mathematical constants and units of measurement, so you can quickly convert from one unit to another.
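
To make the idea of multi-search with clusters more concrete, here is a minimal sketch. It is not Nigma's actual algorithm, which is not described in this article: the scoring rule (sum of reciprocal positions) and the word-based grouping are assumptions chosen for simplicity, and all data is invented.

```python
from collections import defaultdict

def merge_results(ranked_lists):
    """Merge ranked URL lists: a URL placed high by several engines gets the best score."""
    scores = defaultdict(float)
    for results in ranked_lists:
        for position, url in enumerate(results, start=1):
            scores[url] += 1.0 / position
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))

STOP_WORDS = frozenset({"the", "a", "of", "in", "and", "what", "is"})

def cluster_by_word(titles):
    """Group titles by shared words; words found in only one title form no cluster."""
    clusters = defaultdict(list)
    for title in titles:
        for word in set(title.lower().split()):
            if word not in STOP_WORDS:
                clusters[word].append(title)
    return {word: found for word, found in clusters.items() if len(found) > 1}

merged = merge_results([["a", "b", "c"], ["b", "a", "d"]])
clusters = cluster_by_word([
    "Rendering in computer graphics",
    "Computer graphics studio",
    "What is rendering",
])
print(merged)
print(sorted(clusters))
```

Clicking a cluster in such a scheme would simply show the titles stored under that word.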

Searchme.com - search engine with preview

Everyone knows that in order to find specific information on the Internet, you need to spend a lot of time. When viewing search results, the user basically opens resources at random, not knowing for sure whether he will find what interests him on the new page, or whether it will be a waste of time. The creators of the search service searchme.com thought about this problem and came up with an original solution. The essence of this solution was to create a search engine in which the user could look at a rough thumbnail of the page before it loaded. This would allow us to form an additional opinion about the seriousness of the resource and its content.

The implementation of this idea is simply magnificent: the search engine has a beautiful animated three-dimensional interface and shows results as an animated ribbon of thumbnail screenshots of the web pages that contain the search keyword. The ribbon of results, like a strip of old film negatives, can be scrolled in the browser window using a slider located under the row of images. The thumbnails load instantly, so there are no delays in drawing the results. Working with the results is especially convenient in full-screen mode, where you can even make out the text of articles in the thumbnails. To appreciate the convenience of this system, just try browsing news resources. The photos for the lead stories on a publication's front page immediately make it clear which news that resource considers most important.

Torrent-finder.com - searches torrent trackers

This is a specialized torrent search engine. There are many sites on the Internet that search torrent resources. However, torrent-finder.com has an undeniable advantage over the others: it lets you search for files on a huge number of trackers simultaneously.

Many users need the Internet in order to get answers to the queries (questions) they enter.

If there were no search engines, users would have to independently search for the sites they need, remember them, and write them down. In many cases, finding something suitable “manually” would be very difficult, and often simply impossible.

Search engines do all this routine work of searching, storing and sorting information on websites for us.

Let's start with the famous Runet search engines.

Internet search engines in Russian

1) Let's start with the domestic search engine Yandex. It works not only in Russia but also in Belarus, Kazakhstan, Ukraine and Turkey. There is also Yandex in English.

2) The Google search engine came to us from America and has a Russian-language version.

3) The domestic search engine Mail.ru, which is also behind the social networks VKontakte and Odnoklassniki, as well as My World, the well-known Answers Mail.ru and other projects.

4) The intelligent search engine Nigma: http://www.nigma.ru/

Since September 19, 2017, the “intelligent” Nigma has not been working. It stopped being of financial interest to its creators, who switched to another search engine called Coc Coc.

5) The well-known company Rostelecom has created the Sputnik search engine.

There is also a Sputnik search engine designed specifically for children, which I have written about.

6) Rambler was one of the first domestic search engines.

There are other famous search engines in the world:

Bing,
Yahoo!,
DuckDuckGo,
Baidu,
Ecosia.

Let's try to figure out how a search engine works, namely how sites are indexed, how the indexing results are analyzed and how search results are generated. The principles of operation are roughly the same for all search engines: they find information on the Internet, store it and sort it so that it can be served in response to user queries. But the algorithms that search engines use can differ greatly. These algorithms are kept secret, and their disclosure is prohibited.

If you enter the same query into the search boxes of different search engines, you can get different answers. The reason is that every search engine uses its own algorithms.

The purpose of search engines

First of all, you need to know that search engines are commercial organizations. Their goal is to make a profit. You can make a profit from contextual advertising, other types of advertising, and from promoting the necessary sites to the top of the search results. In general, there are many ways.

Profit depends on the size of the audience, that is, on how many people use the search engine. The larger the audience, the more people an ad will be shown to and, accordingly, the more that advertising will cost. Search engines can grow their audience through their own advertising and by attracting users with better services, better algorithms and more convenient search.

The most important and difficult thing here is the development of a fully functioning search algorithm that would provide relevant results for the majority of user queries.

The work of a search engine and the actions of webmasters

Each search engine has its own algorithm, which must take into account a huge number of different factors when analyzing information and compiling results in response to a user’s request:

the age of a particular site,
website domain characteristics,
quality of content on the site and its types,
features of navigation and site structure,
usability (convenience for users),
behavioral factors (the search engine can determine whether the user found what he was looking for on the site or the user returned to the search engine again and there again looks for an answer to the same query)
etc.

All this is needed so that the results are as relevant as possible to the user's query. At the same time, search engine algorithms are constantly changing and being refined. As they say, there is no limit to perfection.

On the other hand, webmasters and optimizers are constantly inventing new ways to promote their sites, which are not always honest. The task of the developers of the search engine algorithm is to make changes to it that would not allow “bad” sites of dishonest optimizers to appear in the TOP.

How does a search engine work?

Now let's talk about how a search engine actually works. The process consists of at least three stages:

scanning,
indexing,
ranking.

The number of sites on the Internet is simply astronomical. And every site is information, information content that is created for readers (living people).

Scanning

Scanning (crawling) is the search engine's journey around the Internet to collect new information, analyze links and find new content that can be returned to users in response to their queries. For scanning, search engines use special programs called search robots or spiders.

Search robots are programs that automatically visit websites and collect information from them. A crawl can be primary, when the robot visits a new site for the first time. After the initial collection of information from the site and its entry into the search engine database, the robot returns to the site's pages with some regularity. If anything has changed (new content has been added, old content has been deleted), the search engine records these changes.

The main task of a search spider is to find new information and send it to the search engine for the next stage of processing, that is, for indexing.
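
The behaviour of a search robot described above can be sketched in a few lines. To keep the example runnable without a network connection, the "Internet" here is a small dictionary of invented pages; a real robot would download pages over HTTP, respect site rules and revisit pages on a schedule.

```python
from collections import deque
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href values of all <a> tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start, fetch, limit=100):
    """Visit pages breadth-first, following links, and return {url: html}."""
    seen = {start}
    queue = deque([start])
    pages = {}
    while queue and len(pages) < limit:
        url = queue.popleft()
        html = fetch(url)          # fetch is any function url -> html or None
        if html is None:
            continue
        pages[url] = html          # hand the page over for indexing
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:   # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return pages

# An invented four-page "web" used instead of real HTTP requests.
FAKE_WEB = {
    "/": '<a href="/news">News</a> <a href="/about">About</a>',
    "/news": '<a href="/">Home</a> <a href="/news/1">Story</a>',
    "/about": "<p>About us</p>",
    "/news/1": "<p>Story text</p>",
}

pages = crawl("/", FAKE_WEB.get)
print(list(pages))
```

The set of seen addresses is what keeps the robot from going in circles when pages link back to each other.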

Indexing

A search engine can search for information only among those sites that are already included in its database (indexed by it). If crawling is the process of searching and collecting information that is available on a particular site, then indexing is the process of entering this information into the search engine database. At this stage, the search engine automatically decides whether to enter this or that information into its database and where to enter it, in which section of the database. For example, Google indexes almost all the information found by its robots on the Internet, while Yandex is more picky and does not index everything.

For new sites, the indexing stage can take a long time, so a new site may wait a long while for visitors from search engines. New information that appears on old, well-promoted sites, by contrast, can be indexed almost instantly and end up in the “index”, that is, in the search engine database, almost immediately.
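
A common way to picture the search engine database is an inverted index: for every word, a list of the pages that contain it. The sketch below is a simplified illustration with invented page texts, not the storage scheme of any particular search engine.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map every word to the set of page addresses where it occurs."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(url)
    return index

def lookup(index, query):
    """Return the pages that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

pages = {
    "a": "Search engines index pages",
    "b": "Robots crawl pages",
    "c": "Engines rank pages",
}
index = build_index(pages)
print(sorted(lookup(index, "engines pages")))
```

A page that has not been added to the index cannot be found by lookup, which is exactly the situation of a new site that has not been indexed yet.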

Ranking

Ranking is the ordering of information that has already been indexed and entered into a search engine's database: it determines what the search engine will show its users first and what will be placed a “rank” lower. Ranking can be regarded as the stage at which the search engine serves its client, the user.

On the search engine's servers, the collected information is processed and results are generated for a huge range of possible queries. This is where the search engine's algorithms come into play. All sites in the database are classified by topic, and topics are divided into groups of queries. For each group of queries, a preliminary list of results can be compiled and later adjusted.
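
As a very rough sketch of ranking, the example below orders pages by how often the query words occur in them. Real search engines combine a huge number of factors and keep their formulas secret, so counting words is only an assumption made for the sake of a short example; the page texts are invented.

```python
import re
from collections import Counter

def rank(pages, query):
    """Order pages by the number of times the query words occur in them."""
    terms = re.findall(r"\w+", query.lower())
    scored = []
    for url, text in pages.items():
        counts = Counter(re.findall(r"\w+", text.lower()))
        score = sum(counts[term] for term in terms)
        if score:                                   # skip pages with no match at all
            scored.append((score, url))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first
    return [url for _, url in scored]

pages = {
    "a": "cats and dogs",
    "b": "cats cats cats",
    "c": "birds",
}
print(rank(pages, "cats"))
```

Adding further factors, such as site age or user behaviour, would mean replacing the single count with a weighted sum of several scores.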

At first glance, it may seem that only Yandex can be better than Google, and even that is not a fact. These companies invest huge amounts of money in innovation and development. Does anyone really have a chance not only to compete with the leaders, but also to win? Lifehacker's answer: “Yes!” There are several search engines that have succeeded. Let's look at our heroes.

DuckDuckGo

What is this

This is a fairly well-known open source search engine. Servers are located in the USA. In addition to its own robot, the search engine uses results from other sources: Yahoo! Search BOSS, Wikipedia, Wolfram|Alpha.

Why it is better

DuckDuckGo positions itself as a search engine that provides maximum privacy and confidentiality. The system does not collect any data about the user, does not store logs (no search history), and the use of cookies is as limited as possible.

DuckDuckGo does not collect or share personal information from users. This is our privacy policy.
Gabriel Weinberg, founder of DuckDuckGo

Why do you need this

All major search engines are trying to personalize search results based on data about the person in front of the monitor. This phenomenon is called the “filter bubble”: the user sees only those results that are consistent with his preferences or that the system deems as such.

DuckDuckGo creates an objective picture that does not depend on your past behavior on the Internet, and eliminates thematic advertising from Google and Yandex based on your queries. With DuckDuckGo, it’s easy to search for information in foreign languages: Google and Yandex by default give preference to Russian-language sites, even if the query is entered in another language.

Nigma

What is this

“Nigma” is a Russian metasearch system developed by Moscow State University graduates Viktor Lavrenko and Vladimir Chernyshov. It searches through the indexes of Google, Bing, Yandex and others, and also has its own search algorithm.

Why it is better

Searching through the indexes of all major search engines allows you to generate relevant results. In addition, Nigma divides the results into several thematic groups (clusters) and invites the user to narrow the search field, discarding unnecessary ones or highlighting priority ones. Thanks to the Mathematics and Chemistry modules, you can solve mathematical problems and request the results of chemical reactions directly in the search bar.

Why do you need this

Eliminates the need to search for the same query in different search engines. The cluster system makes it easy to manipulate search results. For example, Nigma collects results from online stores into a separate cluster. If you do not intend to buy anything, then simply exclude this group. By selecting the “English-language sites” cluster, you will receive results only in English. The Mathematics and Chemistry modules will help schoolchildren.

Unfortunately, the project is not currently being developed, as the developers have moved their activity to the Vietnamese market. Nevertheless, “Nigma” is not yet outdated, and in some respects it still outdoes Google. Let's hope development resumes.

What is this

not Evil is a system that searches the anonymous Tor network. To use it, you need to go to this network, for example, by launching a specialized browser with the same name. not Evil is not the only search engine of its kind. There is LOOK (the default search in the Tor browser, accessible from the regular Internet) or TORCH (one of the oldest search engines on the Tor network) and others. We settled on not Evil because of the clear allusion to Google itself (just look at the start page).

Why it is better

It searches where Google, Yandex and other search engines have no access at all.

Why do you need this

The Tor network contains many resources that cannot be found on the law-abiding Internet. And as government control over the content of the Internet tightens, their number will grow. Tor is a kind of Network within the Network: with its own social networks, torrent trackers, media, trading platforms, blogs, libraries, and so on.

YaCy

What is this

YaCy is a decentralized search engine that works on the principle of P2P networks. Each computer on which the main software module is installed scans the Internet independently, that is, it is analogous to a search robot. The results obtained are collected into a common database that is used by all YaCy participants.

Why it is better

It’s difficult to say whether this is better or worse, since YaCy is a completely different approach to organizing search. The absence of a single server and owner company makes the results completely independent of anyone's preferences. The autonomy of each node eliminates censorship. YaCy is capable of searching the deep web and non-indexed public networks.

Why do you need this

If you are a supporter of open source software and a free Internet, not influenced by government agencies and large corporations, then YaCy is your choice. It can also be used to organize a search within a corporate or other autonomous network. And even though YaCy is not very useful in everyday life, it is a worthy alternative to Google in terms of the search process.

Pipl

What is this

Pipl is a system designed to search for information about a specific person.

Why it is better

The authors of Pipl claim that their specialized algorithms search more efficiently than “regular” search engines. In particular, priority sources of information include social network profiles, comments, member lists, and various databases that publish information about people, such as court decisions. Pipl's leadership in this area is confirmed by assessments from Lifehacker.com, TechCrunch and other publications.

Why do you need this

If you need to find information about a person living in the US, Pipl will be much more effective than Google. Databases of Russian courts are apparently inaccessible to the search engine, so it does not cope as well with Russian citizens.

What is this

Another specialized search engine. Searches for various sounds (house, nature, cars, people, etc.) in open sources. The service does not support queries in Russian, but there is an impressive list of Russian-language tags that you can search for.

Why it is better

The output contains only sounds and nothing extra. In the search settings you can set the desired format and sound quality. All sounds found are available for download. There is a search for sounds by pattern.

Why do you need this

If you need to quickly find the sound of a musket shot, the drumming of a sapsucker, or the cry of Homer Simpson, then this service is for you. And these examples come only from the available Russian-language tags; in English the range is even wider. Seriously, though, a specialized service needs a specialized audience. But what if it comes in handy for you too?

The life of alternative search engines is often fleeting. Lifehacker asked the former general director of the Ukrainian branch of Yandex, Sergei Petrenko, about the long-term prospects of such projects.

The fate of alternative search engines is simple: they remain very niche projects with a small audience, and therefore without clear commercial prospects or, conversely, with complete clarity that there are none.

If you look at the examples in the article, you can see that such search engines either specialize in a narrow but popular niche, which, perhaps, has not yet grown enough to be noticeable on the radars of Google or Yandex, or they are testing an original hypothesis in ranking, which is not yet applicable in regular search.

For example, if search on Tor suddenly turns out to be in demand, that is, if results from there are needed by at least one percent of Google's audience, then ordinary search engines will of course start solving the problem of how to find those results and show them to the user. If audience behavior shows that, for a significant share of users and a significant number of queries, results produced without user-dependent factors seem more relevant, then Yandex or Google will begin to produce such results.

“Be better” in the context of this article does not mean “be better at everything.” Yes, in many aspects our heroes are far from Google and Yandex (even far from Bing). But each of these services gives the user something that the search industry giants cannot offer.

Search engines have long been an integral part of the Russian Internet. They are now huge and complex mechanisms that serve not only as an information search tool but also as an attractive arena for business.

Most search engine users have never thought (or thought about it, but did not find an answer) about the principle of operation of search engines, the scheme for processing user requests, what these systems consist of and how they function...

This master class is designed to answer the question of how search engines work. However, you will not find here the factors that influence the ranking of documents. Still less should you count on a detailed explanation of the Yandex algorithm. According to Ilya Segalovich, the director of technology and development of the Yandex search engine, it can be learned only “under torture” from Ilya Segalovich himself...

2. Concept and functions of a search engine

A search engine is a software and hardware complex designed to search the Internet and to respond to a user request, given as a text phrase (search query), by producing a list of links to sources of information ordered by relevance to the request. The largest international search engines are Google, Yahoo and MSN. On the Russian Internet these are Yandex, Rambler and Aport.

Let's take a closer look at the concept of a search query using the Yandex search engine as an example. The user should formulate the search query according to what he wants to find, as briefly and simply as possible. Let's say we want to find information in Yandex on how to choose a car. To do this, we open the Yandex main page and enter the search query “how to choose a car.” Our task then comes down to opening the links to sources of information that are returned for our request. However, it is quite possible that we will not find the information we need. If this happens, then either the request needs to be rephrased, or the search engine database really does not contain any relevant information for it (this can happen with very “narrow” queries such as “how to choose a car in Arkhangelsk”).

The primary goal of any search engine is to deliver to people exactly the information they are looking for. It is impossible to teach users to make “correct” requests to the system, that is, queries that comply with the operating principles of search engines. Therefore, developers create algorithms and operating principles for search engines that allow users to find the information they are looking for.

This means the search engine must “think” the same way the user thinks when searching for information. When a user makes a request to a search engine, he wants to find what he needs as quickly and easily as possible. Receiving the result, he evaluates the performance of the system, guided by several basic parameters. Did he find what he was looking for? If he didn’t find it, how many times did he have to rephrase the query to find what he was looking for? How much relevant information could he find? How quickly did the search engine process the query? How convenient were the search results presented? Was the result you were looking for the first or the hundredth? How much unnecessary garbage was found along with useful information? Will the necessary information be found when accessing a search engine, say, in a week, or in a month?

To give satisfactory answers to all these questions, search engine developers constantly improve search algorithms and principles, add new functions and capabilities, and try in every possible way to speed up the operation of the system.

3. Main characteristics of a search engine

Let us describe the main characteristics of search engines:

Completeness
Completeness is one of the main characteristics of a search system, which is the ratio of the number of documents found by request to the total number of documents on the Internet that satisfy the given request. For example, if there are 100 pages on the Internet containing the phrase “how to choose a car,” and only 60 of them were found for the corresponding query, then the completeness of the search will be 0.6. Obviously, the more complete the search, the less likely it is that the user will not find the document he needs, provided that it exists on the Internet at all.
Accuracy
Accuracy is another main characteristic of a search engine, which is determined by the degree to which the found documents match the user's query. For example, if the query “how to choose a car” returns 100 documents, 50 of which contain the phrase “how to choose a car” while the rest simply contain these words (“how to choose the right radio and install it in a car”), then the search accuracy is considered equal to 50/100 (=0.5). The more accurate the search, the faster the user will find the documents he needs, the less “garbage” of various kinds will be found among them, and the less often the found documents will fail to correspond to the request.
Relevance
Relevance, used here in the sense of how up to date the index is, is an equally important component of search. It is characterized by the time that passes from the moment documents are published on the Internet until they are entered into the search engine's index database. For example, the day after interesting news appears, a large number of users turn to search engines with related queries. Less than a day has passed since the news was published, but the main documents have already been indexed and are available for search, thanks to the so-called “fast database” of large search engines, which is updated several times a day.
Search speed
Search speed is closely related to its load resistance. For example, according to Rambler Internet Holding LLC, today, during business hours, the Rambler search engine receives about 60 requests per second. Such workload requires reducing the processing time of an individual request. Here the interests of the user and the search engine coincide: the visitor wants to get results as quickly as possible, and the search engine must process the request as quickly as possible, so as not to slow down the calculation of subsequent queries.
Visibility
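
The completeness and accuracy examples above reduce to two simple ratios, sketched below. The function names are chosen here for illustration; in the information retrieval literature these two measures are commonly called recall and precision.

```python
def completeness(found_matching: int, total_matching: int) -> float:
    """Share of all documents matching the query that the search actually found."""
    return found_matching / total_matching


def accuracy(exact_matches: int, returned: int) -> float:
    """Share of the returned documents that really correspond to the query."""
    return exact_matches / returned


# Figures from the text: 60 of 100 matching pages were found,
# and 50 of 100 returned documents contain the exact phrase.
print(completeness(60, 100))  # 0.6
print(accuracy(50, 100))      # 0.5
```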

4. Brief history of the development of search engines

In the initial period of Internet development, the number of its users was small, and the amount of available information was relatively small. For the most part, only research staff had access to the Internet. At this time, the task of searching for information on the Internet was not as urgent as it is now.

One of the first ways to organize access to network information resources was the creation of open directories of sites, in which links to resources were grouped by topic. The first such project was the Yahoo.com website, which opened in the spring of 1994. After the number of sites in the catalog increased significantly, the ability to search the catalog for the necessary information was added. It was not yet a search engine in the full sense, since the search area was limited to the resources present in the catalog rather than covering all Internet resources.

Link directories were widely used in the past but have almost completely lost their popularity, since even modern catalogs, huge in volume, contain information about only a negligible part of the Internet. The largest directory on the network, DMOZ (also called the Open Directory Project), contains information about 5 million resources, while the Google search engine database consists of more than 8 billion documents.

In 1995, the search engines Lycos and AltaVista appeared. The latter was a leader in the field of information search on the Internet for many years.

In 1997, Sergey Brin and Larry Page created the Google search engine as part of a research project at Stanford University. Google is currently the most popular search engine in the world!

In September 1997, the Yandex search engine, which is the most popular on the Russian-language Internet, was officially announced.

Currently, there are three main international search engines, Google, Yahoo and MSN, which have their own databases and search algorithms. Most other search engines (of which there are a large number) use the results of these three in one form or another. For example, AOL search (search.aol.com) uses the Google database, while AltaVista, Lycos and AllTheWeb use the Yahoo database.

5. Composition and principles of operation of the search system

In Russia, the main search engine is Yandex, followed by Rambler.ru, Google.ru, Aport.ru, Mail.ru. Moreover, at the moment, Mail.ru uses the Yandex search engine and database.

Almost all major search engines have their own structure, different from others. However, it is possible to identify the main components common to all search engines. Differences in structure can only be in the form of implementation of the mechanisms of interaction of these components.

Indexing module

The indexing module consists of three auxiliary programs (robots):

Spider is a program designed to download web pages. The spider downloads a page and extracts all internal links from it. The HTML code of each page is downloaded. Robots use the HTTP protocol to download pages. The spider works as follows: the robot sends the request “GET /path/document” and some other HTTP request commands to the server. In response, the robot receives a text stream containing service information and the document itself.

For each downloaded page, the following is saved:

Page URL
Date the page was downloaded
HTTP header of the server response
Page body (HTML code)
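
The exchange described above can be sketched without touching the network: one helper composes the GET request, and another splits a raw response into the service information and the document, filling the four fields listed. This is only an illustration; `PageRecord`, `build_request`, `parse_response` and the example URL are hypothetical names, and a real spider would also handle redirects, encodings and errors.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class PageRecord:
    url: str                 # page URL
    downloaded_at: datetime  # date the page was downloaded
    headers: dict            # HTTP header of the server response
    body: str                # page body (HTML code)


def build_request(host: str, path: str) -> str:
    """Compose a minimal HTTP/1.1 GET request for the given document."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"


def parse_response(url: str, raw: str) -> PageRecord:
    """Split a raw HTTP response into service information and the document itself."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    headers = {"status": lines[0]}  # status line, e.g. "HTTP/1.1 200 OK"
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return PageRecord(url, datetime.now(timezone.utc), headers, body)


# A canned response stands in for what the server would send back.
raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html><body>Hi</body></html>"
record = parse_response("http://example.com/path/document", raw)
print(build_request("example.com", "/path/document").splitlines()[0])
print(record.headers["status"], record.headers["content-type"])
```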

Crawler (“traveling” spider) is a program that automatically follows all the links found on a page. It selects all links present on the page, and its job is to determine where the spider should go next, based either on those links or on a predetermined list of addresses. Following the links it finds, the crawler searches for new documents that are still unknown to the search engine.
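
A minimal sketch of this behavior, assuming a breadth-first queue of addresses and a set of already known URLs. The `fetch` callable, the `PAGES` dictionary and the `site.example` addresses are placeholders for the spider and the real web; production crawlers add politeness delays, robots.txt handling and prioritization.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, fetch, limit=100):
    """Visit pages breadth-first, starting from a predetermined list of addresses."""
    frontier = deque(seeds)   # where the spider should go next
    seen = set(seeds)         # addresses already known to the crawler
    visited = []
    while frontier and len(visited) < limit:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:      # page could not be downloaded
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return visited


# A tiny in-memory "web" replaces real downloads.
PAGES = {
    "http://site.example/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://site.example/a": '<a href="/b">B</a> <a href="/">home</a>',
    "http://site.example/b": "no links here",
}
print(crawl(["http://site.example/"], PAGES.get))
```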

Indexer (robot indexer) is a program that analyzes web pages downloaded by spiders. The indexer parses the page into its component parts and analyzes them using its own lexical and morphological algorithms. Various page elements are analyzed, such as text, headings, links, structural and style features, special service HTML tags, etc.

Thus, the indexing module allows you to crawl a given set of resources using links, download encountered pages, extract links to new pages from received documents, and perform a complete analysis of these documents.

Database

A database, or search engine index, is a data storage system, an information array in which specially converted parameters of all documents downloaded and processed by the indexing module are stored.
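
The text does not say how the "specially converted parameters" are stored. One common structure for such an index is an inverted index, which maps each word to the documents and positions where it occurs. The sketch below assumes that structure and replaces real lexical and morphological analysis with plain lowercasing; `tokenize` and `build_index` are illustrative names.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Very rough stand-in for lexical analysis: lowercase words only."""
    return re.findall(r"\w+", text.lower())


def build_index(documents: dict[str, str]) -> dict[str, dict[str, list[int]]]:
    """Map each word to {document id: [positions of the word in that document]}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for position, word in enumerate(tokenize(text)):
            index[word].setdefault(doc_id, []).append(position)
    return index


docs = {
    "doc1": "How to choose a car",
    "doc2": "How to choose the right radio and install it in a car",
}
index = build_index(docs)
print(sorted(index["car"]))     # documents that contain "car"
print(index["choose"]["doc2"])  # positions of "choose" in doc2
```

Looking up a word then costs one dictionary access instead of a scan over every stored document.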

Search server

The search server is the most important element of the entire system, since the quality and speed of the search directly depend on the algorithms that underlie its functioning.

The search server works as follows:

The request received from the user is subjected to morphological analysis. The information environment of each document contained in the database is generated (it will subsequently be displayed in the form of a snippet, that is, text information corresponding to the request on the search results page).
The resulting data is passed as input to a special ranking module. Data is processed for all documents, so that each document receives its own rating, which characterizes how relevant the various components of the document stored in the search engine index are to the query entered by the user.
Depending on the user’s choice, this rating can be adjusted by additional conditions (for example, the so-called “advanced search”).
Next, a snippet is generated, that is, for each document found, the title, a short abstract that best matches the query, and a link to the document itself are extracted from the document table, and the words found are highlighted.
The resulting search results are transmitted to the user in the form of a SERP (Search Engine Result Page) – a search results page.
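
The steps above can be sketched as a toy pipeline: normalize the query, rate every document, build a snippet with the found words highlighted, and return the ordered list. The rating used here (share of a document's words that occur in the query) is only a placeholder assumption, not the ranking of any real search engine, and all function names are illustrative.

```python
import re


def normalize(text):
    """Stand-in for morphological analysis: lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def make_snippet(text, terms, width=8):
    """Cut a short fragment around the first found word and mark hits with brackets."""
    words = text.split()
    is_hit = lambda w: w.lower().strip(".,!?") in terms
    first_hit = next((i for i, w in enumerate(words) if is_hit(w)), 0)
    start = max(0, first_hit - 2)
    window = words[start:start + width]
    return " ".join(f"[{w}]" if is_hit(w) else w for w in window)


def search(query, documents, top_n=10):
    """Return (rating, document id, snippet) tuples, best rating first."""
    terms = set(normalize(query))
    results = []
    for doc_id, text in documents.items():
        words = normalize(text)
        if not words:
            continue
        rating = sum(1 for w in words if w in terms) / len(words)
        if rating > 0:
            results.append((rating, doc_id, make_snippet(text, terms)))
    results.sort(key=lambda r: (-r[0], r[1]))
    return results[:top_n]


docs = {
    "doc1": "How to choose a car",
    "doc2": "How to choose the right radio and install it in a car",
    "doc3": "Cooking recipes",
}
for rating, doc_id, snippet in search("how to choose a car", docs):
    print(f"{rating:.2f} {doc_id}: {snippet}")
```

With this placeholder rating, the document consisting entirely of query words is listed first, and the document with no matching words is left out of the results.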

As you can see, all these components are closely related to each other and work in interaction, forming a clear, rather complex mechanism for the operation of the search system, which requires huge amounts of resources.

6. Conclusion

Now let's summarize all of the above.

The primary goal of any search engine is to deliver to people exactly the information they are looking for.
Main characteristics of search engines:
1. Completeness
2. Accuracy
3. Relevance
4. Search speed
5. Visibility
The first full-fledged search engine was the WebCrawler project, published in 1994.
The search system includes the following components:
1. Indexing module
2. Database
3. Search server

We hope that our master class will allow you to become more familiar with the concept of a search engine and better understand the main functions, characteristics and operating principles of search engines.