
The History of the Development of Search Engines

What was the first search engine in Runet? Yandex, Aport or Rambler?

The very first Runet search engines (of which, according to one of Rambler's founders, there were two or three) sank into oblivion very quickly. Among them were morphological extensions to the AltaVista system, which did not leave us their names. We will therefore have to choose from those that remain:

Rambler

The creation of "Rambler" began in 1996, when there were only a few dozen sites in the Russian segment of the Internet. Development ended by the autumn of that year. The rambler.ru domain was registered on September 26, and October 8, 1996, on the birthday of one of the creators, Rambler was open to users.

Rambler is thus the very first Runet search engine among those still in existence.

Aport

The Aport search engine was developed by February 1996, but at that time it searched only the site russia.agama.com. The number of covered sites gradually grew, and by its official opening on November 11, 1997, Aport was already searching 10,000 sites. Thus, Aport was one of the first search engines in Runet, but because of the initially limited scope of its search it cannot be recognized as the oldest.

Yandex

CompTek, the company that developed Yandex, was founded in 1989. In 1993, CompTek created Yandex as a program for searching a hard drive. In 1996, the ability to search the Web was added to the program. In 1997, the first search robot was written, Runet was indexed, and on September 23, 1997, the official presentation of Yandex took place.

Yandex is therefore not the oldest, but CompTek's search technologies and its research in linguistics and morphology are the oldest in Russia.

In the early days of the Internet, users were a privileged minority and the amount of available information was relatively small. Access was mainly limited to employees of large educational institutions and laboratories, and the data obtained were used for scientific purposes. At that time, using the Web was far less a part of everyday life than it is now.

In 1990, the British scientist Tim Berners-Lee (also the inventor of the URI, URL, HTTP, and the World Wide Web) created the website info.cern.ch, the world's first publicly accessible directory of Internet sites. From that moment on, the Internet began to gain popularity not only in the scientific community but also among ordinary owners of personal computers.

Thus, the first way of making information resources on the Internet easier to reach was the creation of site catalogs, in which links to resources were grouped by topic.

The first project of this kind is considered to be Yahoo, opened in April 1994. As the number of sites in it grew rapidly, it soon became possible to search the catalog by query. This was, of course, not yet a full-fledged search engine: the search was limited to the data held in the directory.

In the early stages of the Internet's development, link directories were used very actively but gradually lost their popularity. The reason is simple: even the many resources in modern directories represent only a small fraction of the information available on the Internet. For example, the largest directory on the net, DMOZ (the Open Directory Project), contains information on just over five million resources, which is nothing next to Google's search base of more than eight billion documents.

The largest Russian-language catalog, the Yandex catalog, contains information on just over one hundred and four thousand resources.

Timeline of search engine development

1945 – The American engineer Vannevar Bush published an article with the idea that later led to the invention of hypertext and with a discussion of the need for a system for quickly extracting data from information stored in this way (the equivalent of today's search engines). The memory-extender device he described (the memex) contained original ideas that were eventually embodied on the Internet.

1960s – Gerard Salton and his group at Cornell University developed the SMART information retrieval system. SMART is an acronym for Salton's Magic Automatic Retriever of Text. Gerard Salton is considered the father of modern search technology.

1987-1989 – Archie, a search engine for indexing FTP archives, was developed. Archie was a script that automatically downloaded the directory listings of FTP servers into local files; a fast search for the needed information was then performed over those local files. The search was based on the standard Unix grep command, and user access to the data was provided via telnet.

In the next version, the data was divided into separate databases: one contained only the names of text files, another held entries linking to the hierarchical directories of a thousand hosts, and a third connected the first two. This version of Archie was more efficient than the previous one, since it searched only file names, eliminating much of the earlier duplication.

The search engine grew more and more popular, and the developers thought about how to speed it up. The database mentioned above was replaced by one based on compressed-tree theory. The new version essentially built a full-text database instead of a list of file names and was significantly faster than before. In addition, minor changes allowed Archie to index web pages. Unfortunately, work on Archie soon ceased for various reasons.

1993 – Wandex, the world's first search engine for the World Wide Web, was created. It was based on the World Wide Web Wanderer bot developed by Matthew Gray at the Massachusetts Institute of Technology.

1993 – Martijn Koster creates Aliweb, one of the first search engines on the World Wide Web. Site owners had to add their pages to the Aliweb index themselves for them to appear in search results. Since too few webmasters did so, Aliweb never became popular.

April 20, 1994 – Brian Pinkerton of the University of Washington released WebCrawler, the first bot to index pages in full. Its main difference from its predecessors was that users could search for any keyword on any web page; today this is the standard for every search engine. WebCrawler was the first system known to a wide audience. Unfortunately, its throughput was low, and the system was often unavailable during the daytime.

July 20, 1994 – Lycos opened, a serious advance in search technology created at Carnegie Mellon University. Michael Mauldin was in charge of the search engine and is still the leading figure at Lycos Inc. Lycos launched with a catalog of 54,000 documents. In addition, the results it returned were ranked, and it took prefixes and approximate matches into account. But Lycos's main distinction was its ever-growing catalog: by November 1996 it had indexed 60 million documents, more than any other search engine of the time.

January 1994 – Infoseek was founded. It was not truly innovative but had a number of useful additions, one popular example being the ability to add your page to the index in real time.

1995 – AltaVista launched. The new search engine quickly won recognition from users and became the leader among its kind. The system had practically unlimited bandwidth for that time and was the first search engine in which queries could be formulated in natural language, including complex queries. Users were allowed to add or remove their own URLs within 24 hours. AltaVista also offered many search tips and tricks. A major merit of AltaVista was its support for many languages, including Chinese, Japanese, and Korean: in 1997 no other search engine on the Web worked with several languages, let alone rare ones.

1996 – The AltaVista search engine launched a morphological extension for the Russian language. In the same year, the first domestic search engines, Rambler.ru and Aport.ru, were launched. Their appearance marked a new stage in the development of Runet, allowing Russian-speaking users to query in their native language and to respond quickly to changes taking place on the Web.

May 20, 1996 – The Inktomi corporation appeared, along with its search engine HotBot, created by two teams from the University of California. When the site appeared, it quickly became popular. In October 2001, Danny Sullivan wrote an article titled "Inktomi's Spam Database Opened to the Public," describing how Inktomi accidentally made its database of spam sites, by then containing about a million URLs, available to the general public.

1997 – A turning point in the development of search engines in the West, when S. Brin and L. Page of Stanford University founded Google (the project was originally called BackRub). They developed a search engine that gave users high-quality search taking morphology and misspelled words into account, and improved the relevance of results.

September 23, 1997 – Yandex was announced and quickly became the most popular search engine among Russian-speaking Internet users. With its launch, domestic search engines began competing with one another, improving their search and site indexing, refining their results, and offering new services.

Thus, the development and formation of search engines can be characterized by the stages listed above.

Today, three leaders have established themselves in the global market: Google, Yahoo, and Bing. They have their own databases and their own search algorithms. Many other search engines use the results of these three. For example, AOL uses the Google database, while AltaVista, Lycos, and AllTheWeb use the Yahoo database. All other search engines use the results of the engines listed, in various combinations.

A similar analysis of the search engines popular in the CIS countries shows that mail.ru serves Google's search results (while adding developments of its own), and Rambler, in turn, serves Yandex's. The entire Runet market can therefore be divided between these two giants.

That is why, in the CIS countries, website promotion is as a rule carried out only in these two search engines.

In the early years of the Internet community, active Internet users were a minority, and the amount of information on Internet resources was relatively small. For the most part, only employees of scientific laboratories and large educational institutions had access to the worldwide information network. In general, using a network resource was not as routine as it is today.

History of search engines

A big step toward bringing the Internet to the masses was the appearance in 1990 of the website info.cern.ch, the first public directory of Internet sites. Its creator was the British scientist Tim Berners-Lee, who is also considered the creator of the URI, URL, HTTP, and the World Wide Web. From that moment, Internet sites became relevant not only in specialized user circles but also among ordinary owners of home computers. In this directory, information resources were arranged in groups of similar topics, which greatly eased the search for information.

But progress did not stop there, and in 1994 the search technology known as Lycos, developed at Carnegie Mellon University, was born. This catalog, created by Michael Mauldin, started with a resource of more than 50,000 documents. Lycos allowed approximate matches to queries, and search results were ranked by how well the output matched the input. The resource was also constantly replenished with new Internet pages. By November, Lycos held over 55 million pages and documents, far more than any document catalog of the day.

At the end of 1994, the Infoseek resource appeared. It had a number of advantages over other resources, for example, letting users add sites to the catalog database in real time.

The new search engine monster of 1995 was AltaVista. It quickly earned popularity among Internet users and took a leading position in its field. Its main feature was the ability to formulate queries in natural, colloquial language, and users were also allowed to add their own URLs. But AltaVista's main merit was its support for multiple language packs, such as Korean, Japanese, and Chinese, as well as Russian.

A huge step in search technology was the emergence on the Internet of a new search engine whose name every user now knows: Google. In 1997, L. Page and S. Brin of Stanford University introduced new features into the search algorithms of their creation. The system ranked results by relevance and took morphology and possible spelling errors into account when processing queries.

There are three major leaders in the search engine market these days: Bing, Google, and Yahoo. They have search algorithms and databases of their own making at their disposal. The many other search engines largely use the work of these three titans.

Thanks to search engines, it has become easier for ordinary people to explore the vast expanses of the information field. Without their development, it would be impossible to improve the ways people exchange information.

The architecture of a search engine typically includes a search robot (crawler), an indexer, and a query processor; these components are described in the section "How the search engine works" below.


History

Chronology

Year | System | Event
1993 | W3Catalog | launch
1993 | Aliweb | launch
1993 | JumpStation | launch
1994 | WebCrawler | launch
1994 | Infoseek | launch
1994 | Lycos | launch
1995 | AltaVista | launch
1995 | Daum | founded
1995 | Open Text Web Index | launch
1995 | Magellan | launch
1995 | Excite | launch
1995 | SAPO | launch
1995 | Yahoo! | launch
1996 | Dogpile | launch
1996 | Inktomi | founded
1996 | Rambler | founded
1996 | HotBot | founded
1996 | Ask Jeeves | founded
1997 | Northern Light | launch
1997 | Yandex | launch
1998 | Google | launch
1999 | AlltheWeb | launch
1999 | GenieKnows | founded
1999 | Naver | launch
1999 | Teoma | founded
1999 | Vivisimo | founded
2000 | Baidu | founded
2000 | Exalead | founded
2003 | Info.com | launch
2004 | Yahoo! Search | final launch
2004 | A9.com | launch
2004 | Sogou | launch
2005 | MSN Search | final launch
2005 | Ask.com | launch
2005 | Nigma | launch
2005 | GoodSearch | launch
2005 | SearchMe | founded
2006 | Wikiseek | founded
2006 | Quaero | founded
2006 | Live Search | launch
2006 | ChaCha | launch (beta)
2006 | Guruji.com | launch (beta)
2007 | Wikiseek | launch
2007 | Sproose | launch
2007 | Wikia Search | launch
2007 | Blackle.com | launch
2008 | DuckDuckGo | launch
2008 | Tooby | launch
2008 | Picollator | launch
2008 | Viewzi | launch
2008 | Cuil | launch
2008 | Boogami | launch
2008 | LeapFish | launch (beta)
2008 | Forestle | launch
2008 | VADLO | launch
2008 | Powerset | launch
2009 | Bing | launch
2009 | KAZ.KZ | launch
2009 | Yebol | launch (beta)
2009 | Mugurdy | closure
2009 | Scout | launch
2010 | Cuil | closure
2010 | Blekko | launch (beta)
2010 | Viewzi | closure
2012 | WAZZUB | launch
2014 | Sputnik | launch (beta)

At an early stage of the Internet's development, Tim Berners-Lee maintained a list of web servers posted on the CERN website. As sites multiplied, maintaining such a list by hand became harder and harder. The NCSA website had a dedicated "What's New!" section where links to new sites were published.

The first computer program for searching on the Internet was Archie (the name is "archive" without the "v"). It was created in 1990 by Alan Emtage, Bill Heelan, and J. Peter Deutsch, computer science students at McGill University in Montreal. The program downloaded the file listings of all accessible anonymous FTP servers and built a database that could be searched by file name. Archie did not, however, index the contents of those files, since the amount of data was so small that everything could easily be found by hand.

The development and spread of the Gopher network protocol, created in 1991 by Mark McCahill at the University of Minnesota, led to two new search programs, Veronica and Jughead. Like Archie, they searched file names and headers stored in Gopher index systems. Veronica (Very Easy Rodent-Oriented Net-wide Index to Computerized Archives) allowed keyword searches of most Gopher menu titles across all Gopher listings. Jughead (Jonzy's Universal Gopher Hierarchy Excavation And Display) retrieved menu information from specific Gopher servers. Although the name of the Archie search engine had nothing to do with the "Archie" comic book series, Veronica and Jughead are nevertheless characters from those comics.

By the summer of 1993 there was still no single system for searching the web, although numerous specialized directories were maintained by hand. Oscar Nierstrasz at the University of Geneva wrote a series of Perl scripts that periodically copied these pages and rewrote them into a standard format. This became the basis for W3Catalog, the web's first primitive search engine, launched on September 2, 1993.

Probably the first search robot written in Perl was the World Wide Web Wanderer, a bot created by Matthew Gray in June 1993. This robot built the search index "Wandex". The purpose of the Wanderer was to measure the size of the World Wide Web and find all web pages containing the words from a query. In 1993, the second search engine, Aliweb, also appeared. Aliweb did not use a crawler; instead, it waited for notifications from website administrators that an index file in a specific format was present on their sites.

JumpStation, created in December 1993 by Jonathan Fletcher, searched and indexed web pages with a crawler and used a web form as the interface for formulating queries. It was the first Internet search tool to combine the three essential functions of a search engine (crawling, indexing, and searching proper). Because of the limited computing resources of the time, indexing, and hence search, was limited to the titles and headings of the pages the crawler found.

Search engines took part in the dot-com bubble of the late 1990s. Several companies made spectacular market entries, generating record profits in their IPOs. Some have since abandoned the public search market and work only with the corporate sector, Northern Light among them.

The idea of selling keywords was taken up in 1998 by goto.com, then a small company with its own search engine. The move marked a shift for search engines from competing with one another to becoming one of the most profitable businesses on the Internet: search engines began selling the top places in search results to individual companies.

The Google search engine has held a prominent position since the early 2000s. The company achieved its standing through the good search results produced by the PageRank algorithm, presented to the public in the article "The Anatomy of a Large-Scale Hypertextual Web Search Engine" by Google's founders Sergey Brin and Larry Page. This iterative algorithm ranks web pages based on an estimate of the number of hyperlinks pointing to a page, on the assumption that "good" and "important" pages receive more links than others. Google's interface is designed in a spartan style, with nothing superfluous, unlike many of its competitors, who built their search engines into web portals. Google became so popular that imitators appeared, for example Mystery Seeker (the "mystery search engine").
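
The iterative idea is easy to sketch. The following Python fragment is a simplified illustration of the principle, not Google's actual implementation; the toy link graph and the damping value are invented for the example.

```python
# A minimal power-iteration sketch of the PageRank idea described above.
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if not outgoing:                  # a dangling page shares rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Pages that receive links from "important" pages end up ranked higher.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
print(pagerank(toy_web))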

Search for information in Russian

In 1996, search that took Russian morphology into account was implemented in the AltaVista search engine, and the original Russian search engines Rambler and Aport were launched. On September 23, 1997, the Yandex search engine opened. On May 22, 2014, Rostelecom opened the national search engine Sputnik, which as of 2015 was in beta testing. On April 22, 2015, a new service, Sputnik.Children, designed specially for children with increased safety, was launched.

The methods of cluster analysis and metadata search have gained great popularity. Of the international engines of this kind, the best known was Clusty, by Vivisimo. In 2005, with the support of Moscow State University, the Nigma search engine, which supports automatic clustering, was launched in Russia. In 2006, the Russian metasearch engine Quintura opened, offering visual clustering in the form of a tag cloud. Nigma also experimented with visual clustering.

How the search engine works

The main components of a search engine are the search robot (crawler), the indexer, and the search engine proper (query processor).

As a rule, these systems work in stages. First the crawler fetches content, then the indexer generates a searchable index, and finally the search engine provides the functionality for searching the indexed data. To keep the search engine up to date, this indexing cycle is repeated.

Search engines work by storing information about many web pages, which they obtain from HTML pages. A search robot, or "crawler", is a program that automatically follows all the links found on a page and extracts them. Starting from links or from a predefined list of addresses, the crawler looks for new documents not yet known to the search engine. The site owner can exclude certain pages using robots.txt, a file that can prevent the indexing of particular files, pages, or directories of the site.
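
As an illustration of this stage, here is a minimal crawler sketch in Python using only the standard library. It is a toy under simplifying assumptions, not any real engine's crawler: it keeps a frontier queue of links, fetches pages, and honours the site's robots.txt via urllib.robotparser. Real crawlers add politeness delays, deduplication, and distributed queues.

```python
from collections import deque
from html.parser import HTMLParser
from urllib import request, robotparser
from urllib.parse import urljoin, urlparse

class LinkExtractor(HTMLParser):
    """Collects the href targets of all <a> tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed, max_pages=10):
    """Breadth-first crawl from a seed URL, honouring robots.txt."""
    robots = robotparser.RobotFileParser()
    robots.set_url(urljoin(seed, "/robots.txt"))
    try:
        robots.read()
    except OSError:
        pass                                  # no robots.txt reachable: allow all
    frontier = deque([seed])
    seen = {seed}
    pages = {}
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        if not robots.can_fetch("*", url):    # the site owner excluded this page
            continue
        try:
            html = request.urlopen(url, timeout=5).read().decode("utf-8", "replace")
        except OSError:
            continue                          # unreachable page: skip it
        pages[url] = html
        extractor = LinkExtractor()
        extractor.feed(html)
        for link in extractor.links:
            absolute = urljoin(url, link)
            if urlparse(absolute).scheme in ("http", "https") and absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return pages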

The search engine analyzes the content of each page for subsequent indexing. Words can be extracted from titles, from the page text, or from special fields called meta tags. The indexer is the module that analyzes a page, first splitting it into parts using its own lexical and morphological algorithms; every element of the web page is isolated and analyzed separately. The page data is stored in an index database for use in later queries, and the index makes it possible to find information quickly in response to a user's request.

A number of search engines, like Google, store all or part of the original page (the so-called cache) as well as various information about it. Other systems, like AltaVista, store every word of every page they find. The cache helps speed up the retrieval of information from pages that have already been visited. Cached pages always contain the text the user specified in the query, which is useful when the page has since been updated and no longer contains that text, while the copy in the cache is still the old one. This situation is related to link rot and to Google's usability-oriented approach of returning short chunks of cached text containing the query words. The principle of least surprise applies: the user expects to see the search words in the texts of the pages returned. Besides speeding up searches, cached pages may also contain information that is no longer available anywhere else.
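
The indexing step described above can be reduced to a small sketch: build an inverted index mapping each word to the set of pages that contain it. A toy version under simplifying assumptions; real indexers also normalize morphology, record word positions, and treat titles and meta tags specially.

```python
import re
from collections import defaultdict

def build_index(pages):
    """pages: dict mapping URL -> page text (e.g. a crawler's output)."""
    index = defaultdict(set)
    for url, text in pages.items():
        # Lowercase and split into words before adding postings.
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(url)
    return index

docs = {
    "page1": "Search engines index web pages",
    "page2": "The crawler follows links between pages",
}
index = build_index(docs)
print(index["pages"])   # {'page1', 'page2'}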

The search engine proper works with the output files produced by the indexer: it accepts user queries, processes them against the index, and returns the search results.

When a user enters a query (usually a few keywords), the system checks its index and returns a list of the most relevant web pages, sorted by some criterion, usually with a brief annotation containing the document's title and sometimes fragments of its text. The search index is built by a special technique from information extracted from web pages. Since 2007, Google has allowed searching by document creation time (open the "Search Tools" menu and specify a time range). Most search engines support boolean AND, OR, and NOT operators in queries, which lets users refine or broaden the set of keywords; the system then searches for words or phrases exactly as entered. Some engines allow approximate search, in which users widen the search area by specifying a maximum distance between keywords. There is also conceptual search, which relies on statistical analysis of how the query words and phrases are used in the texts of web pages; such systems let users compose queries in natural language. An example of such a search engine is Ask.com.
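
The boolean operators mentioned above map directly onto set operations over an inverted index. A minimal sketch, with toy data standing in for a real index:

```python
# Toy inverted index: word -> set of documents containing it.
index = {
    "pages":   {"page1", "page2"},
    "crawler": {"page2"},
    "engines": {"page1"},
}

def boolean_search(index, words, operator="AND"):
    sets = [index.get(w.lower(), set()) for w in words]
    if not sets:
        return set()
    if operator == "AND":
        return set.intersection(*sets)
    if operator == "OR":
        return set.union(*sets)
    if operator == "NOT":  # docs with the first word but none of the rest
        return sets[0].difference(*sets[1:]) if len(sets) > 1 else sets[0]
    raise ValueError(operator)

print(boolean_search(index, ["pages", "crawler"], "AND"))  # {'page2'}
print(boolean_search(index, ["pages", "crawler"], "NOT"))  # {'page1'}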

The usefulness of a search engine depends on the relevance of the pages it finds. While millions of web pages may contain a given word or phrase, some are more relevant, popular, or authoritative than others, so most search engines use ranking methods to bring the "best" results to the top of the list. Which pages are more relevant, and in what order results should be shown, each engine decides in its own way, and the methods, like the Internet itself, change over time. Two main types of search engine thus emerged: systems of predefined and hierarchically ordered keywords, and systems in which an inverted index is generated from text analysis.
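
One classic way to implement such ranking is TF-IDF weighting: a page scores higher the more often it uses the query words, discounted by how common those words are across all pages. This is a generic textbook sketch, not the formula of any particular engine.

```python
# A toy TF-IDF ranker, for illustration only.
import math
import re
from collections import Counter

def tfidf_rank(docs, query):
    """docs: dict mapping url -> page text. Returns URLs, best first."""
    tokenized = {url: re.findall(r"\w+", text.lower()) for url, text in docs.items()}
    n = len(docs)
    scores = {}
    for url, words in tokenized.items():
        counts = Counter(words)
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for ws in tokenized.values() if term in ws)
            if df and words:
                tf = counts[term] / len(words)   # term frequency in this page
                score += tf * math.log(n / df)   # idf damps very common words
        scores[url] = score
    return sorted(scores, key=scores.get, reverse=True)

docs = {
    "a": "search engines rank web pages by relevance",
    "b": "web pages link to other web pages",
    "c": "cooking recipes for every day",
}
print(tfidf_rank(docs, "web relevance"))   # 'a' comes first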

Most search engines are commercial enterprises that profit from advertising; in some of them, top positions in the results for given keywords can be bought. Engines that do not sell positions in the results earn from contextual advertising that matches the user's query: such ads are displayed on the results page, and the engine earns each time a user clicks one.

Search Engine Types

There are four types of search engines: robotic, human-driven, hybrid, and meta-systems.

  • systems using search robots
They consist of three parts: a crawler ("bot", "robot", or "spider"), an index, and the search engine software. The crawler is needed to traverse the network and build lists of web pages; the index is a large archive of copies of web pages; and the software's purpose is to evaluate search results. Because the crawler constantly explores the network, the information in such systems is more up to date. Most modern search engines are of this type.
  • human-controlled systems (resource catalogs)
These systems rely on lists of web pages compiled by people. A directory entry contains the address, title, and a brief description of a site, and the catalog searches for results only in the page descriptions submitted by webmasters. The advantage of directories is that every resource is checked manually, so the quality of the content is better than in results obtained automatically by systems of the first type. The drawback is that these directories are updated manually and can lag significantly behind the real state of affairs; page rankings cannot change instantly. Examples of such systems are the Yahoo directory, DMOZ, and Galaxy.
  • hybrid systems
Search engines such as Yahoo, Google, and MSN combine the functions of systems using search robots and of human-controlled systems.
  • meta-systems
Metasearch engines combine and rank the results of several search engines at once (a sketch of the merging idea follows this list). They were useful in the days when every engine had a unique index and search engines were less "smart"; since search has improved greatly, the need for them has decreased. Examples: MetaCrawler and MSN Search.
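
A sketch of the result-merging idea behind metasearch engines, using simple reciprocal-rank fusion; real systems use more elaborate schemes, and the engine names and results here are invented.

```python
from collections import defaultdict

def merge_results(ranked_lists):
    """ranked_lists: list of result lists, each ordered best-first."""
    scores = defaultdict(float)
    for results in ranked_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / rank   # higher positions contribute more
    return sorted(scores, key=scores.get, reverse=True)

engine_a = ["site1", "site2", "site3"]
engine_b = ["site2", "site4", "site1"]
print(merge_results([engine_a, engine_b]))  # site2 and site1 rise to the top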

Search engine market

Google is the most popular search engine in the world, with a market share of 68.69%. Bing occupies second position; its share is 12.26%.

The most popular search engines in the world:

Search engine | Market share, July 2014 | Market share, October 2014 | Market share, September 2015
Google | 68.69% | 58.01% | 69.24%
Baidu | 17.17% | 29.06% | 6.48%
Bing | 6.22% | 8.01% | 12.26%
Yahoo! | 6.74% | 4.01% | 9.19%
AOL | 0.13% | 0.21% | 1.11%
Excite | 0.22% | 0.00% | 0.00%
Ask | 0.13% | 0.10% | 0.24%

Asia

In the countries of East Asia, and in Russia, Google is not the most popular search engine. In China, for example, the Soso search engine is more popular.

In South Korea, the domestic search portal Naver is used by about 70% of users. Yahoo! Japan and Yahoo! Taiwan are the most popular search engines in Japan and Taiwan, respectively.

Russia and Russian-language search engines

According to LiveInternet data in June 2015 on the coverage of Russian-language search queries:

  • All-language:
    • Yahoo! (0.1%) and the search engines this company owns: Inktomi, AltaVista, AlltheWeb
  • English-language and international:
    • AskJeeves (Teoma engine)
  • Russian-language: most "Russian-language" search engines index and search texts in many languages (Ukrainian, Belarusian, English, Tatar, and others). They differ from "all-language" systems, which index all documents in a row, in that they mainly index resources located in domain zones where the Russian language dominates, or otherwise limit their robots to Russian-language sites.

Some of the search engines use external search algorithms.

Quantitative Google Search Engine Data

The number of Internet users, and with it the demands placed on search engines, grows constantly. To increase the speed of finding the needed information, large search engines run a large number of servers, usually grouped into server centers (data centers). Popular search engines have server centers scattered all over the world.

In October 2012, Google launched the Where the Internet Lives project, where users are given the opportunity to get acquainted with the company's data centers.

The following is known about the operation of Google's data centers:

  • The total capacity of all Google data centers, as of 2011, was estimated at 220 MW.
  • When Google planned a new complex of three buildings with a total area of 6.5 million m² in Oregon in 2008, Harper's Magazine estimated that such a large complex would consume over 100 MW of electricity, comparable to the energy consumption of a city of 300,000 people.
  • The estimated number of Google servers in 2012 is 1,000,000.
  • Google's spending on data centers was $1.9 billion in 2006 and $2.4 billion in 2007.

The size of the World Wide Web indexed by Google as of December 2014 is approximately 4.36 billion pages.

Search engines that take into account religious prohibitions

The global spread of the Internet and the growing popularity of electronic devices in the Arab and Muslim world, in particular in the countries of the Middle East and the Indian subcontinent, contributed to the development of local search engines that take Islamic traditions into account. Such search engines contain special filters that help users avoid prohibited sites, such as sites with pornography, and allow them to use only sites whose content does not contradict the Islamic faith. Shortly before the Muslim month of Ramadan, in July 2013, Halal Googling was introduced to the world: a system that gives users only "correct" halal links by filtering search results received from other engines such as Google and Bing. Two years earlier, in September 2011, the I'mHalal search engine was launched to serve users in the Middle East. However, according to its owner, the service soon had to close for lack of funding.

The lack of investment and the slow pace of technology adoption in the Muslim world have hindered progress and stood in the way of a serious Islamic search engine's success. Large investments in Muslim lifestyle web projects have failed; one of them, Muxlim, received millions of dollars from investors such as Rite Internet Ventures and, according to the last post from I'mHalal before it shut down, promoted the dubious idea that "the next Facebook or Google can only come from the Middle East if you support our brilliant youth." Nevertheless, Islamic internet experts have for years been defining what does or does not conform to Sharia, classifying websites as "halal" or "haram". All former and current Islamic search engines are essentially either specially indexed data sets or major search engines such as Google, Yahoo, and Bing with an added filtering system that keeps users away from haram sites such as those about nudity, LGBT topics, or gambling, and anything else considered anti-Islamic.

Among other religion-oriented search engines are Jewogle, a Jewish version of Google, and SeekFind.org, a Christian site with filters that keep users away from content that could undermine or weaken their faith.

Personal results and filter bubbles

Many search engines, such as Google and Bing, use algorithms to selectively guess what information a user would like to see, based on the user's past activity in the system. As a result, users are shown only information consistent with their past interests. This effect is called the "filter bubble".

As a result, users receive much less information that contradicts their point of view and become intellectually isolated in their own "information bubble". The "bubble effect" can thus have negative consequences for the formation of civic opinion.

Search engine bias

Although search engines are programmed to rank websites by some combination of their popularity and relevance, experimental research indicates that various political, economic, and social factors in fact influence search results.

This bias can be a direct result of economic and commercial processes: companies that advertise on a search engine may become more popular in its organic search results. The removal of search results that do not comply with local laws is an example of the influence of political processes: Google, for instance, will not display certain neo-Nazi websites in France and Germany, where Holocaust denial is illegal.

Bias can also be a consequence of social processes, since search engine algorithms are often designed to exclude non-mainstream points of view in favor of more "popular" results. The indexing algorithms of the major search engines give priority to American sites.

A search bomb is one example of an attempt to manipulate search results for political, social, or commercial reasons.

See also

  • Qwika
  • Electronic library: lists of libraries and search systems
  • Web developer toolbar


The story of how search engines appeared begins in July 1945, when the American scientist Vannevar Bush published the famous article "As We May Think," in which he predicted the appearance of personal computers and formulated the idea of hypertext. Note that Bush himself participated in creating prototypes of the search engines we use today: back in 1938 he developed and patented a device for quickly searching for information on microfilm.

Although Vannevar Bush is considered the founder of search technologies and of the idea behind the Internet, it was other scientists who put his ideas into practice. In 1958, the Advanced Research Projects Agency (ARPA) was created in the United States under the Department of Defense; from 1963 to 1969 its scientists worked on a completely new concept that allowed information to be transmitted over a computer network.

At first, this connection, which allowed the transfer of encrypted data, was planned for military purposes, but the level of security of the transmissions turned out to be very low, so the military abandoned further development.

The idea of creating a computer network was resurrected only in the late 1980s, helped by several US universities that connected their networks to pool their educational library holdings.

The 1990s saw dramatic growth of the Internet. In February 1993, Marc Andreessen of the NCSA (National Center for Supercomputing Applications, www.ncsa.uiuc.edu) finished the initial version of Mosaic, a program for displaying hypertext under UNIX. Mosaic had a convenient graphical interface and became the prototype of the browsers we use today, and the Internet began to gain popularity.

In the mid-1990s, to find the information you needed you had to use a directory of sites. There were not many such directories at the time, and they did not boast an abundance of sites, but the information in them was ordered by headings and topics. It is worth noting that by 1993 three search bots were already operating on the web. These were non-commercial developments; when a large amount of information flooded in, they could not cope with the load and disappeared amid the rapid growth of the Internet.

Since 1995, the leading place on the global Internet has been occupied by search engines that later grew very large: in the West, Google, Yahoo, and AltaVista, and in Russia, Yandex, Rambler, and Aport.

Let us turn to the history of search engine development in Russia, where a far from easy path awaited our search engines, with victories as well as defeats.

The Yandex company began its development in 1990, but only in 1997 did it become the search engine we know so well.

Yandex is considered the undisputed leader in Russia: its monthly audience coverage, according to leading experts, amounts to roughly half of Russia's regular Internet audience, figures head and shoulders above the potential audiences of Aport and Rambler. Recently, a fairly powerful search, Go Mail, grew out of another large e-mail service; in that case the company used the Yandex algorithm, so search from the Mail system's pages can be attributed to Yandex search. A recent scandal, however, forced Mail Group to move away from Yandex search; the exact reasons for the quarrel are still unknown.

Yandex search takes headings into account, as well as the mandatory presence of a word in the body of a document. Preference is given to words that form a phrase, stand close to each other, and appear in the same paragraph. A distinctive feature of Yandex search is that it takes the morphology of the Russian language into account: a query will return documents containing other grammatical forms of the query words as well.
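
A crude illustration of the idea (using English suffixes for readability, and in no way Yandex's real morphology engine): reduce both indexed words and query words to a common stem, so that different grammatical forms match.

```python
# Toy suffix-stripping "stemmer"; real morphology uses full dictionaries.
SUFFIXES = ("ies", "es", "s", "ing", "ed")

def stem(word):
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def normalize(text):
    return {stem(w) for w in text.lower().split()}

doc = "photos of nature"
query = "nature photo"
# Both "photos" and "photo" reduce to "photo", so the document matches.
print(normalize(query) <= normalize(doc))   # True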

Rambler was the first Runet search service, opened in the fall of 1996 by a group of scientists from a microbiology research institute in Pushchino, near Moscow. Rambler's search was built on indexing the key words of a page: those set in bold (strong and b tags) and those appearing frequently in headings (h1 tags). Unlike Yandex, Rambler could ignore keyword meta tags, which is why its search liked to be called "pure"; in practice, proper purity of search was still not evident, a problem that crops up in other search engines as well. At the moment, Rambler's search positions have fallen considerably, and experts predict that the system will be repurposed into an ordinary entertainment portal. The only thing keeping it afloat is its own advertising network, Blogun.

The Aport search engine was first demonstrated in February 1996, during an Agama press conference in honor of the opening of the Russian Club; at that time it was not yet a search engine covering the whole Internet. Aport's difference from other engines was that it could search for the given keywords not only in page text but also in image captions (alt attributes) and in descriptions. The innovation did not last long: other engines repeated it, and now Aport has nothing left to surprise its users with. As of 2011, the Aport search engine is likely to be absorbed by larger players in the search market.

Search Disadvantages

At present, search engines continue to improve their search technology by every available means. But, unfortunately, none of them can boast a perfect search, however highly developed it may be. Today the main disadvantages of search engines are underdeveloped query-generalization systems and a heavy dependence on the choice of information sources. A lack of informativeness can still be partly compensated by the abundance of search results, but explaining to a computer in human language what a person wants to find has not yet become reality. Because of this, no search engine can call itself an encyclopedia. It is no secret, however, that the future belongs to informative search focused on the processing of human concepts.