
PageRank Calculation Explained

To calculate the PageRank for a page, you need to take into account all internal and external links to that page. Below is the equation for calculating the PageRank value of page A.

PR(A)=(1-d) + d(PR(t1)/C(t1) + … + PR(tn)/C(tn))

PR(t1) … PR(tn) are the PageRank values of the pages t1 … tn that link to page A.

C(t1) … C(tn) are the numbers of outgoing links on each of those linking pages.

d is the damping factor (attenuation coefficient), usually taken as 0.85.

A page "votes" with its PageRank for every page it links to. The total weight a page can pass on is its own PageRank multiplied by 0.85, and this amount is divided evenly among all the pages its outgoing links lead to.

The equation implies that a single link from a page with a PageRank of 4 and five outbound links transfers more weight than a link from a page with a PageRank of 8 and 100 outbound links (4/5 = 0.8 versus 8/100 = 0.08 before damping). The more outbound links a page has, the less PageRank each link passes on.
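The comparison can be checked with a few lines of Python. This is only a sketch: the helper name `vote_per_link` is made up for illustration, and it assumes that the PR4 and PR8 figures are plugged into the formula as raw values.

```python
def vote_per_link(page_rank: float, outbound_links: int, d: float = 0.85) -> float:
    """Weight passed through each outbound link: d * PR / C."""
    if outbound_links <= 0:
        raise ValueError("a page with no outbound links passes no weight")
    return d * page_rank / outbound_links


small_page = vote_per_link(4, 5)      # page with PR 4 and 5 outbound links
large_page = vote_per_link(8, 100)    # page with PR 8 and 100 outbound links
print(f"PR4 / 5 links:   {small_page:.3f} per link")
print(f"PR8 / 100 links: {large_page:.3f} per link")
```

Under these assumptions the link from the smaller page carries ten times more weight, because the number of outbound links grows by a factor of 20 while the PageRank only doubles.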

Note that when a page votes for other pages, its own PageRank is not reduced: the voting page does not give its PageRank away. It is like a shareholders' meeting, where each shareholder votes according to the number of shares owned but does not hand them over. Later we will see that pages can nevertheless lose some PageRank indirectly.

The equation shows clearly where the PageRank of any page comes from. Suppose we have two pages, A and B, that link to each other, and there are no other links on these pages. Here is what happens:

Calculating Google PageRank for Page A

Step 1: Calculate the PageRank value for page A

Page A now has a new PageRank value, calculated from the weight of the inbound link from page B. But page B's own PageRank depends on the link it receives from page A, so the value just obtained for A cannot be accurate until the PageRank of page B is known.

Calculating Google PageRank for Page B

Step 2: Calculate the PageRank value for page B

Page B now has a new PageRank value too, but it cannot be accurate either, because the PageRank of page A used in the calculation was itself inaccurate.

We can't calculate the exact PageRank for page A until we know the PageRank for page B, and we can't calculate the exact PageRank for page B until we know the PageRank for page A.

We can recalculate the PageRank values for pages A and B over and over, each time using the values obtained in the previous step. Every result differs from the previous one, and because the inputs are always approximate, every result is approximate too.

The way around the problem is to repeat the calculation many times. Each pass gives slightly more accurate results, although in principle the exact values are never reached, because each calculation is based on approximate inputs.

Sooner or later we reach a point where further iterations barely change the results. This also explains why recalculating PageRank for all the pages in Google's index takes so much time and computing power.
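The two-page example can be replayed in code to watch the values settle. This is a minimal sketch: the starting guess of 0 for both pages and the function name `iterate_two_pages` are assumptions made for illustration, not part of the original walkthrough.

```python
def iterate_two_pages(start: float = 0.0, d: float = 0.85, rounds: int = 50):
    """Alternately recompute PR(A) and PR(B) for two pages that link only to each other."""
    pr_a = pr_b = start
    history = []
    for _ in range(rounds):
        pr_a = (1 - d) + d * pr_b   # B has one outbound link, so C(B) = 1
        pr_b = (1 - d) + d * pr_a   # A has one outbound link, so C(A) = 1
        history.append((pr_a, pr_b))
    return history


history = iterate_two_pages()
for step in (0, 1, 9, 49):
    pr_a, pr_b = history[step]
    print(f"round {step + 1:>2}: PR(A) = {pr_a:.6f}  PR(B) = {pr_b:.6f}")
```

Each round changes the values less than the one before, and both pages approach 1, which is the value that satisfies PR = 0.15 + 0.85 * PR.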

The only thing we can be sure of is that a link from any source increases PageRank for our site.

What is the best way to manage the indexing of internal links on a site in order to increase the PR of individual pages? Consider the formula that calculates PR for the current page A:

PR(A) = (1-d) + d(PR(T1)/C(T1) + … + PR(Tn)/C(Tn))

Here d is the damping factor for link weight. Its exact value is not disclosed by Google; it is usually taken as 0.85. For our purposes the exact value does not matter, since we want to evaluate the PR of selected pages on the site relative to all the others;
T1, …, Tn are the pages linking to A;
PR(T1), …, PR(Tn) are the PR values of the linking pages;
C(T1), …, C(Tn) are the numbers of links on the linking pages.

Special cases:

  1. If a page links to itself, that link is not taken into account in the calculation.
  2. Links to pages that themselves have no outgoing links are not taken into account either.
  3. Two or more identical links from the same page count as one.
  4. Google may apply filters to some sites that impair the flow of link weight and distort the PR formula. We do not consider this effect here.

How can this formula be used when its right-hand side contains the PR of pages that still have to be calculated? Take all the pages on the Internet indexed by Google, set the initial PR of each to one, and then calculate the PageRank of every page in turn. That is the first iteration, in which each page receives some PR value. We then repeat the calculation many times, each time using the values obtained in the previous step as the PR of the pages. A notable property of the algorithm is that, whatever initial PR we choose and in whatever order we process the pages, after a sufficiently large number of iterations we arrive at the same numbers.
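The claim about initial values is easy to try on a small graph. The sketch below is an illustration only: the three-page site and the function name `pagerank` are hypothetical, and the special cases listed above are reduced to skipping self-links.

```python
def pagerank(links, d=0.85, iterations=100, initial=1.0):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)).

    `links` maps each page to the list of pages it links to.
    """
    pr = {page: initial for page in links}
    for _ in range(iterations):
        for page in links:
            incoming = sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if src != page and page in targets
            )
            pr[page] = (1 - d) + d * incoming
    return pr


# Hypothetical three-page site used only for illustration.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
from_ones = pagerank(links, initial=1.0)
from_fives = pagerank(links, initial=5.0)
for page in links:
    print(page, round(from_ones[page], 4), round(from_fives[page], 4))
```

Both runs print the same values for each page, even though one starts every page at 1 and the other at 5.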

However, the familiar integer PR from 0 to 10 is not what we obtained in the previous paragraph. PR 0…10 is the so-called Toolbar PageRank (TLPR). It was introduced so that all PR values could be represented in absolute terms, regardless of the number of pages on the web. It is derived from PR using two coefficients:

base is a number that depends on the number of pages in the Google index and other factors; it is usually taken to be 7;
a is a reduction factor, 0 < a ≤ 1, most often taken as 1.

The coefficients base and a, as well as the TLPR formula itself, are not important to us here. What matters is that an increase in TLPR always goes together with an increase in PR, so we will concentrate on the latter. Let's forget about external links to other resources and try to calculate PR based on internal factors only. Say we have a website with six pages:

Each page has a menu: "Main page", "About the site", "List of articles". The menu items are linked from every page of the site. "List of articles" also links to the article pages. The PageRank values for this link layout are shown in the diagram above. When calculating PR, I ran 100 iterations with an initial value of one and rounded the results to two decimal places.

Suppose we want to promote only the main page. To increase its PR, it would be logical to allow indexing of only those links that lead to it. At the same time, no page should be cut off from the site, so every page must carry at least one indexed link:

Sure enough, the PR of the target page has gone up. Now let's add a link from it to "Article 1" and see how the distribution changes:

It would seem that by placing an extra link on the main page we take link weight away from it and so weaken it. In fact, the opposite happens: the link weight comes back with interest! With the same step we also raise "Article 1".

Now suppose we change our minds and decide to promote only the list of articles:

We have just obtained the highest PR calculated so far, 2.8, for the list of articles. As this example shows, it is easier to raise the PR of a page that has many internal links, provided, of course, that the pages it links to link back to it. The same effect appeared when we added a link from the main page to "Article 1".

And now let's break the logical structure of the site: we put links from the main page to all the other pages, and from every page to the main one. All other links are closed to indexing.

  1. The best way to increase the PageRank of pages with many links is to set up backlinks to them. Such pages include forums, lists of articles, sitemaps, etc.
  2. The PR of a page rises considerably if it is linked from the PageRank-accumulating pages described in item 1.
  3. To increase the PR of the main page, it is useful to place on it announcements of articles, news, etc. that lead to the pages with the full text. Again, don't forget about backlinks.

And here is a script that will help you with the calculation of PR. Experiment with different options for indexing links on the site.

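    <?php
    // Array of site pages: the first element of each page is its name;
    // all other elements are indices (in this array) of the pages it links to.
    $pages = array(
        array("Main page", 1, 2),
        array("About site", 0, 2),
        array("List of articles", 0, 1, 3, 4, 5),
        array("Item 1", 0, 1, 2),
        array("Item 2", 0, 1, 2),
        array("Item 3", 0, 1, 2),
    );
    $d = 0.85; // damping factor
    // Set the initial PR of every page to 1.
    $pr = array_fill(0, count($pages), 1.0);
    // Number of iterations = 100.
    for ($i = 0; $i < 100; $i++) {
        for ($j = 0; $j < count($pages); $j++) {
            $add = 0; // weight gained from incoming links
            for ($k = 0; $k < count($pages); $k++) {
                if ($k == $j) continue; // a page's link to itself is ignored
                // NOTE: the original listing breaks off here. Everything below
                // is a reconstruction from the formula, not the author's code.
                $links = array_slice($pages[$k], 1);
                if (in_array($j, $links, true)) {
                    $add += $pr[$k] / count($links);
                }
            }
            $pr[$j] = (1 - $d) + $d * $add;
        }
    }
    foreach ($pages as $j => $page) {
        printf("%s: %.2f\n", $page[0], $pr[$j]);
    }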

Colleagues, we are finally ready to present a revolutionary feature of Netpeak Spider: internal PageRank calculation! Nothing remains of the old calculation mechanism, and to introduce the new one we first had to ship the previous release, which radically changed the crawling algorithm inside the program. We have prepared this guide, which you can return to directly from the interface of the new internal PageRank calculation tool.

What is PageRank

PageRank is the relative weight of the page, calculated by the formula:

PR(A) = (1 - d) / N + d * (PR(B) / L(B) + PR(C) / L(C) + ...)

  • N is the total number of active nodes (pages) involved in the calculation;
  • d is the damping factor (0.85 is typically used);
  • L is the number of outgoing links on the linking page.

By convention, at iteration zero (0) the PageRank of every page is the same and equals 1 / N. Each subsequent iteration uses the weight of all incoming links, where the weight of a link is the linking page's PageRank from the previous iteration divided by its number of outgoing links (L in the formula).
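A compact way to see this variant of the formula at work is the sketch below. It is not Netpeak Spider's implementation: the function name `pagerank_normalized` and the four-page site are hypothetical, and every page is assumed to have at least one outgoing link.

```python
def pagerank_normalized(links, d=0.85, iterations=15):
    """PR(A) = (1 - d) / N + d * sum(PR(B) / L(B)), starting from 1 / N at iteration 0."""
    n = len(links)
    pr = {page: 1.0 / n for page in links}          # iteration 0
    for _ in range(iterations):
        previous = pr                                # weights from the previous iteration
        pr = {
            page: (1 - d) / n + d * sum(
                previous[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            for page in links
        }
    return pr


# Hypothetical four-page site in which every page has at least one outgoing link.
links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "post"],
    "post": ["home", "blog"],
}
ranks = pagerank_normalized(links)
for page, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:<6} {value:.4f}")
print("sum:", round(sum(ranks.values()), 6))
```

Unlike the in-place loop used earlier, each iteration here reads only the values of the previous iteration, which matches the description above. Because of the division by N, the values of all pages add up to 1.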

Especially for you, we have prepared several tables that clearly show the operation of the algorithm:

Google calculates this parameter for every page on the Internet, whereas Netpeak Spider calculates internal PageRank, which is limited to the crawled site or list of URLs.

Why Calculate Internal PageRank

This feature is revolutionary at least because it allows you to get real insights about your project:

1. Understand exactly how link juice is distributed throughout the site and where it is concentrated.

2. Determine which pages that are unimportant for search engine promotion are receiving excess link weight.

3. Know which pages are "dangling nodes" and are simply "burning" incoming link juice.

Assuming that external links lead to your site, just imagine how much SEO budget can be saved by implementing a more effective internal linking scheme.

How to Calculate Internal PageRank

Netpeak Spider provides 2 ways to calculate internal PageRank:

1. Automatic

Just select the special parameter "Internal PageRank" in the crawl settings on the "Parameters" tab and it will be calculated automatically when the crawl process is paused or after it has successfully completed.

Please note that in order to calculate this indicator, it is necessary to enable the “Outgoing links” parameter, since it is outgoing links that are the basis for obtaining backlinks, without which internal PageRank cannot be calculated.

2. Manual (using a separate tool)

To call a special tool, go to the menu "Tools" → "Internal PageRank Calculation".

Here you will see the following blocks:

2.1. Settings that are also used for the automatic calculation method:

  • number of iterations [from 5 to 50] → more iterations give more accurate results; in our experience about 15 iterations is the best trade-off, giving the desired result quickly, so Netpeak Spider uses 15 iterations by default;
  • only internal links → a setting that removes the influence of all external outgoing links on the calculation;
  • only links in the [All results] / [Filters] tab → a setting that limits the calculation to the links on the corresponding tab: use [Filters] when you need to calculate PageRank only within a certain category of the analyzed site;
  • results display mode → "Real" shows the exact PageRank values, but may be inconvenient for sites with a large number of pages; "Adaptive" shows the same data multiplied by a special coefficient, which makes large sites easier to work with.

Please note that if you uncheck both the "only internal links" and "only links in the [All results] / [Filters] tab" checkboxes, Netpeak Spider will download and analyze all outgoing links from all crawled pages during the calculation. In this case, links with the "Not Crawled" status code may appear in the report. This is necessary in order to calculate internal PageRank as correctly as possible, based on the actual outgoing links.

2.2. The formula by which the internal PageRank is calculated, as well as the above parameters N, d and a link to this article.

2.3. Ignored URL List: You can add a link to this list to completely exclude it from PageRank analysis. This function allows you to work with calculations very flexibly, changing internal linking directly in the program.

Note that it is not a single link on a particular page that is excluded, but the entire node: imagine that there is not a single link to this page from the entire site (inbound links) and not a single link from this page to other pages of the site (outbound links).
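In graph terms, ignoring a URL means deleting the node together with all of its edges. The sketch below shows the idea on a plain dictionary; the function name `exclude_node` and the sample site are made up for illustration and say nothing about how the tool stores its data.

```python
def exclude_node(links, ignored):
    """Return a copy of the link graph without `ignored`.

    Neither the links pointing to the ignored page nor the links
    leading from it remain in the result.
    """
    return {
        page: [target for target in targets if target != ignored]
        for page, targets in links.items()
        if page != ignored
    }


links = {
    "home": ["about", "login"],
    "about": ["home", "login"],
    "login": ["home"],
}
print(exclude_node(links, "login"))
```

Running PageRank on the original graph and on the reduced one shows how the weight would be redistributed if the page were removed from the internal linking.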

2.4. Export data from table to file in CSV/Excel format.

2.5. The resulting table that contains the following columns:

  • block "Pages"→ serial number (#) and link to the page;
  • block "Iterations"→ after starting the calculations, the corresponding columns with data for each iteration will appear here;
  • block "Relationships" → shows the number of outgoing and incoming links; each list can be opened by double-clicking or from the context menu. These reports let you drill down and come back with the usual "Back" / "Forward" buttons, giving full access to the link graph;
  • block "Algorithmic Analysis" → parameters determined by the PageRank algorithm itself, namely "Link status" (read more about this parameter below) and "Final link", which is shown when the algorithm found a redirect;
  • block "Main parameters"→ allows you to see the server response code and the content type of the corresponding pages;
  • block "Indexing Options"→ combines parameters that critically affect the distribution of link weight on the site: instructions from Robots.txt, Canonical, X-Robots-Tag, Meta Robots, as well as the final URL of the redirect and the Refresh tag, if they are present on the page.

At the bottom of the table, the “Sum of all PageRanks” is calculated → at each iteration, the sum should be equal to 1 (in “Real” mode) or 10 to the appropriate power (in “Adaptive” mode). If the sum differs from the specified values, then the analyzed site has dangling nodes on which you are losing link juice.
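The effect of a dangling node on the sum can be reproduced with a small experiment. This is a sketch under assumptions: the function name `rank_sums` and both sample sites are hypothetical, and the calculation follows the formula given earlier in "Real" mode, not the tool's actual code.

```python
def rank_sums(links, d=0.85, iterations=15):
    """Return the sum of all PageRank values after each iteration (start: 1 / N per page)."""
    n = len(links)
    pr = {page: 1.0 / n for page in links}
    sums = [sum(pr.values())]
    for _ in range(iterations):
        # The comprehension reads the old `pr` before the name is rebound.
        pr = {
            page: (1 - d) / n + d * sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            for page in links
        }
        sums.append(sum(pr.values()))
    return sums


closed = {"home": ["a", "b"], "a": ["home"], "b": ["home"]}
leaky = {"home": ["a", "pdf"], "a": ["home"], "pdf": []}   # "pdf" has no outgoing links

print("no dangling nodes:", round(rank_sums(closed)[-1], 6))
print("one dangling node:", round(rank_sums(leaky)[-1], 6))
```

In the first site the sum stays at 1 on every iteration. In the second, the weight that reaches the page without outgoing links is not passed on, so the sum falls below 1.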

2.6. Status panel, which, together with the resulting table, shows all the steps of the algorithm, allowing users to see the dynamics of the calculations.

When exiting the "Internal PageRank Calculation" tool, the data of the last iteration will be automatically placed in the main table of the program in the corresponding column. If before that there was already some data in the main table, then more recent ones will overwrite them.

Calculation algorithm

Once again, to calculate internal PageRank, you must enable the Outbound Links parameter, which covers all relationships between pages, allowing you to take into account basic indexing instructions, link attributes, and link weight redirect options.

The whole process consists of 2 consecutive stages:

1. Building a connection graph → the purpose of this stage is to build the relationship of links and set their status:

1.2. Initial analysis → splitting links into the OK, Dangling node, and Redirect statuses (read more about link statuses below).

1.4. Counting incoming links.

1.5. Final analysis → detailed analysis of outgoing and incoming links, as well as determining the "Final link" values and the links with the "Unrelated node" status.

2. Internal PageRank Calculation → starting from iteration 0 and up to the one specified in the settings.

Link statuses

The most interesting part of the PageRank algorithm is that all links are logically divided into 4 statuses:

1. OK

These are HTML pages with a server response code of "200 OK", which contain outgoing links and can be:

  • noindex, that is, non-indexed → yes, you read that right: non-indexed pages also pass link weight
  • with the Canonical tag pointed to itself
  • with the Refresh tag pointed to itself

2. Dangling node

Pages with 0 outbound links: these pages do not pass link juice on, so it is lost entirely.

This type includes:

  • 2xx pages that simply don't contain outbound links
  • 2xx pages closed in Robots.txt
  • 2xx nofollow pages in X-Robots-Tag or Meta Robots instructions
  • 2xx pages, but not HTML and thus no outgoing links
  • 3xx links closed in Robots.txt
  • 3xx links with infinite redirect (status code "3xx Redirect Loop")
  • 4xx pages
  • 5xx pages
  • pages returning any other server response code
  • redirect pages (Canonical or Refresh) that did not reach the target page: in this case the status code "Endless Redirected" is displayed, meaning an endless redirect
  • outgoing links that are not in the “All results” table → note that by default, with the “only internal links” and “only links on the [All results] / [Filters] tab” checkboxes disabled, Netpeak Spider will try to find all links on the site regardless of the crawl settings. This is necessary in order to get a complete and accurate picture of how link weight is passed

3. Redirect

This type includes:

  • 3xx pages
  • 2xx pages with Canonical tag pointed to another page
  • 2xx pages with Refresh tag pointed to another page

4. Unrelated node

Pages that have no incoming links. They can appear when:

  • crawling a site with indexing instructions disabled (Robots.txt, Canonical, Refresh, X-Robots-Tag, Meta Robots and the nofollow attribute on links) → note that with these instructions disabled, Netpeak Spider crawls the site differently from search engine robots, but the PageRank algorithm always follows these instructions, so some of the links found during crawling may be unreachable for the PageRank algorithm;
  • crawling your own list of URLs → the links may not be related to each other in any way.
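The four statuses can be summarized as a small decision function. This is a simplified sketch, not the tool's logic: the function name, its parameters, and especially the order in which the rules are checked are assumptions, and cases such as redirect loops are folded into "a redirect with no usable target".

```python
def link_status(url, status_code, is_html=True, outbound_links=0,
                redirect_target=None, blocked=False, inbound_links=1):
    """Classify a crawled URL into one of the four link statuses.

    redirect_target: final URL of a 3xx redirect, Canonical or Refresh (None if absent).
    blocked: closed in Robots.txt or marked nofollow in X-Robots-Tag / Meta Robots.
    """
    if inbound_links == 0:
        return "Unrelated node"          # assumed to take priority over the other rules
    if blocked:
        return "Dangling node"
    points_elsewhere = redirect_target is not None and redirect_target != url
    if 300 <= status_code < 400:
        return "Redirect" if points_elsewhere else "Dangling node"
    if not 200 <= status_code < 300:
        return "Dangling node"           # 4xx, 5xx and any other response code
    if points_elsewhere:
        return "Redirect"                # Canonical / Refresh pointing to another page
    if not is_html or outbound_links == 0:
        return "Dangling node"
    return "OK"


print(link_status("/a", 200, outbound_links=3))
print(link_status("/a", 200, outbound_links=3, redirect_target="/b"))
print(link_status("/file.pdf", 200, is_html=False))
print(link_status("/old", 301, redirect_target="/new"))
print(link_status("/orphan", 200, outbound_links=2, inbound_links=0))
```

A Canonical or Refresh tag that points to the page itself leaves the page in the OK status, which is why the function compares `redirect_target` with `url`.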

3 new errors

Immediately after the automatic or manual calculation of internal PageRank, 3 types of errors will appear in the main interface of the program if they are present on the site:

  • PageRank: dangle → as mentioned above, these are pages without outgoing links that do not pass link weight on, which disrupts the natural distribution of link weight across the site;
  • PageRank: redirect → pages that redirect link juice: these can be pages that return a 3xx redirect or contain Canonical / Refresh tags pointing to a different URL;
  • PageRank: missing links → unreachable pages to which no incoming links were found.

In brief

Colleagues, we have implemented the most accurate algorithm for calculating internal PageRank, which gives you a number of insights into the site being analyzed: how exactly link weight is distributed across pages, which pages that are unnecessary for SEO receive excess weight, which “dangling nodes” are present on the site and, finally, how to fix these errors.

Try the new feature, experiment with different settings, and implement new, more efficient internal linking schemes! :)

PageRank is one of the main external indicators of a site. It strongly affects the popularity of your resource on the Internet and the potential income you can earn from it (for example, by selling links on the pages of your site).
In this article, I want to describe in detail everything that relates to Google's PageRank.

What is PageRank and what is it for?
As you know, PageRank is a numerical indicator, used by the Google search engine, of the relative authority of a web page among all other pages on the Internet. PageRank is based on the way a scientist's authority is judged in academic circles: by who cites the scientist's work and how often.
PageRank features:
- the indicator is assigned not to the resource as a whole but to each individual page of the site (as a rule, the main page has the highest PageRank, since it has the largest number of links pointing to it);
- a link leading from a page does not reduce the PageRank (static weight) of that page;
- the PageRank level does not determine the relevance of a page, that is, a page will not reach the top positions for a search query just because it has more weight. PageRank does affect position to some extent, but Google gives preference to quality page content that matches the search query.

What is PageRank for? After all, it does not affect relevance.
Webmasters need it to raise the price of links placed on their resources. If a link on a page (not the main one) with PR = 0 costs 10 cents at most, then on a page with PR = 4 it costs many times more.
A high PageRank also indicates the authority of the page and that the Google search engine fully recognizes it. Taken together, such pages allow Google to form an opinion about the topic of the resource. I can't say for certain, but I think that quite often Google fails to find the specific information requested, returns resources on a similar topic instead, and ranks them according to their PageRank level, as if suggesting to the user where the information of interest might be found.

How to calculate PageRank?
To calculate PageRank for a page, you need to take into account all internal and external links to this page:
- the more external links point to the page, the more PageRank weight is transferred to it;
- the more links there are on a page (internal links as well as external links to other resources), the smaller the share of PageRank each link receives, because the weight is distributed evenly across all of them. Thus, all links on a page receive the same weight.

Based on this, you should build the site's internal linking so that PageRank is passed to all pages, not all at once but along a chain. And the longer the chain, the more weight the pages in it receive (you can stop PageRank from being passed through a link by adding the rel=nofollow attribute to it).

The following equation can be used to calculate the PageRank for a page:

PR(A) = (1-d) + d(PR(t1)/C(t1) +... + PR(tn)/C(tn))

PR() - the PageRank of a page as a number (a floating point value);
A - the page whose PageRank we are determining;
t1...tn - the pages linking to page A;
C(t) - the number of outgoing links on the linking page t;
d - the damping factor, usually taken as 0.85.

The page passes the PageRank value to all the pages it links to. In this case, the PageRank value is calculated as the page's own PageRank value multiplied by 0.85. Then, this value is distributed evenly among all the pages to which it refers.

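Using the table, we can roughly estimate how much PageRank our page will gain from a given number of links to it. Each row is a number of links; each column is the PageRank of the pages that link to ours:

| Number of links | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | +1 | +2 | +3 | +4 | +5 | +6 | +7 | +8 |
| 4 | 0 | 0 | +1 | +2 | +3 | +4 | +5 | +6 | +7 | +8 | +9 |
| 19 | 0 | +1 | +2 | +3 | +4 | +5 | +6 | +7 | +8 | +9 | +10 |
| 101 | +1 | +2 | +3 | +4 | +5 | +6 | +7 | +8 | +9 | +10 | - |
| 555 | +2 | +3 | +4 | +5 | +6 | +7 | +8 | +9 | +10 | - | - |
| 3 055 | +3 | +4 | +5 | +6 | +7 | +8 | +9 | +10 | - | - | - |
| 16 803 | +4 | +5 | +6 | +7 | +8 | +9 | +10 | - | - | - | - |
| 92 414 | +5 | +6 | +7 | +8 | +9 | +10 | - | - | - | - | - |
| 508 277 | +6 | +7 | +8 | +9 | +10 | - | - | - | - | - | - |
| 2 795 522 | +6 | +7 | +8 | +9 | +10 | - | - | - | - | - | - |
| 15 375 379 | +7 | +8 | +9 | +10 | - | - | - | - | - | - | - |
| 84 564 584 | +8 | +9 | +10 | - | - | - | - | - | - | - | - |
| 449 527 525 | +9 | +10 | - | - | - | - | - | - | - | - | - |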
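One pattern in the table is worth checking numerically: how quickly the number of links in the first column grows from row to row. The snippet below simply divides each link count by the previous one; the counts are copied from the table, and the variable names are chosen for illustration.

```python
link_counts = [1, 4, 19, 101, 555, 3055, 16803, 92414, 508277,
               2795522, 15375379, 84564584, 449527525]

# Ratio between each row's link count and the previous row's.
ratios = [later / earlier for earlier, later in zip(link_counts, link_counts[1:])]

for count, ratio in zip(link_counts[1:], ratios):
    print(f"{count:>13,d}  x{ratio:.2f}")
```

After the first few rows the ratio settles at roughly 5.5, so each further step in the table requires several times more links than the previous one.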

You can check the value of PR pages on

Everyone uses it, but few people know how it works. Google PageRank is one of the most important parameters for web developers.

Searching through billions of existing pages, and the millions more created every day, is harder than you might imagine. PageRank is just one of hundreds of factors Google uses to improve search quality. But how does it work, which factors influence it and which do not, and what do we actually know about PageRank?

In this article, we present only the facts.

Over the past week, we have looked at a lot of facts and assumptions that seemed realistic to us. In addition, we've compiled some academic research on search and 16 useful PageRank tools.

The most important facts are briefly described at the beginning of the article.

How does PageRank work?

  1. PageRank is one of the many methods Google uses to determine the relevance and importance of a page.
  2. Google interprets a link from page A to page B as a vote by A in favor of B. Not only the number of votes is taken into account, but also the quality of the voting pages.
  3. PageRank is based on the number of incoming links, but not on that alone: relevance and quality also matter.
  4. Not all links contribute equally to PageRank.
  5. If there is only one link on a page with PR8, then the site to which it links will receive all the PR that it can transmit, but if there are 100 links, then each link will transmit only a part of this PR.
  6. Bad incoming links do not affect PR.
  7. PR takes into account the lifetime of the site, the relevance of incoming links and the time of their existence.
  8. When calculating PR, content is not taken into account.
  9. PR is calculated not for the site as a whole, but for each page separately.
  10. Every incoming link counts, with the exception of links from banned sites.
  11. PR is not just an integer from 0 to 10; it is a real number.
  12. Each successive PR level is harder to reach; presumably a logarithmic scale is used.
  13. PR is recalculated constantly, but the toolbar data is updated only every few months.
  14. Google tries to find pages that are both authoritative and relevant.

Factors Affecting PageRank

  1. Frequent site updates do not automatically increase PR.
  2. High PR does not guarantee high positions in search results.
  3. DMOZ and Yahoo! do not automatically increase PR.
  4. .edu and .gov sites do not automatically increase PR.
  5. Internal pages do not necessarily have a lower PR than the main one.
  6. Links from Wikipedia do not automatically increase PR.
  7. Links with the nofollow attribute do not affect PR.
  8. Effective internal links affect PR.
  9. Links from thematic sites have a stronger effect.
  10. The text used in a link can often be more important than the linking page's PR.
  11. Outgoing and incoming links to high-quality thematic sites have a positive effect on PR.
  12. Several identical links from one page are considered as one.
  13. The site can be banned for links to banned sites.

1.1 What is PageRank?

  • PR is just one of the methods Google uses to determine the relevance and importance of a page. [PageRank Explained Correctly 6]
  • Google uses many factors to rank pages, and PageRank is one of the best known. PR reflects two important things: how many pages link to a given page, and how important the linking pages are. Five or six links from sites such as www.cnn.com [7] or www.nytimes.com [8] may be more useful than many more links from less established sites. [Google Librarian Central 9]
  • PR can only reflect the approximate quality of a page; it says nothing about its topical relevance, which can only be determined from the context of links and from factors such as keyword density, page title, etc. [PageRank: An Essay 10]

1.2 How does PageRank work?

  • Nobody knows exactly how Google calculates PR. [Google PageRank Explained 11]
  • PR(A) = (1-d) + d(PR(t1)/C(t1) + … + PR(tn)/C(tn)). This is the approximate formula for calculating PR, where t1 … tn are the pages linking to A, C(tn) is the number of outgoing links on the corresponding page, and the coefficient d is usually equal to 0.85.
  • We can assume that PR is calculated as PR = 0.15 + 0.85 * (the share of PR passed to our page by each linking page). The amount of PR a page can use to vote for others is slightly less than its own PR, 0.85 * PR to be exact, and it is divided among the pages it links to. [Google's PageRank 12]
  • The PR calculation is based on distributing a page's own PR among the pages it links to. For example, if a page with PR8 has only one link, the page it links to receives all the available PR, but if the page has 100 links, each of them receives only a hundredth of the available PR. [The Importance of PageRank 13]
  • As a consequence, a link from a page with PR4 and 5 outbound links is more effective than a link from a page with PR8 and 100 outbound links. The PR of linking pages is important, but so is the number of outbound links they contain: the more outbound links, the less PR passes through each. [Google's PageRank 12]
  • PR uses incoming links as an indicator of a page's importance. Google interprets a link from page A to page B as a vote by page A in favor of page B. Not only the number of votes is taken into account, but also the quality of the voting pages. The higher the PR of a page, the more its vote counts. [Google: technology 14]
  • Not all links are equally useful. The higher the PR of the linking page, the more PR it passes on, but this PR is shared equally among all the pages it links to. Therefore, a link from a page with PR4 and a single outbound link can yield more than a link from a page with PR5 and 100 outbound links. A typical example is the well-known "million dollar" home pages: such a page, with PR7 and hundreds of outbound links, passes only insignificant PR to other pages despite its own importance. [Google PageRank Explained 11]
  • Each successive PR level is much harder to reach than the previous one. PR uses a logarithmic scale, which means that going from PR0 to PR1 takes a single step, PR3 is somewhat harder to reach, PR4 harder still, and PR5 significantly harder. [Google Page Rank FAQ 15]
  • PR is calculated not for the site as a whole but for each individual page, and it depends recursively on the PR of the pages that link to it. [The Page Rank algorithm 17]
  • Google combines PR with sophisticated text-matching techniques: many aspects of the content of a page and of the pages that link to it are analyzed in order to find the pages that best match the user's query. [What Is Google PageRank? 18]
  • PR is recalculated constantly, but toolbar data is updated only every few months; new sites are assigned PR0. [Google PageRank Explained 11]
  • PR is not just an integer from 0 to 10; it is a real number. It is correct to think of PR as a real number, because internal calculations use many gradations, not just the values from 0 to 10 displayed in the toolbar. [Matt Cutts 19]
  • The robot does not analyze sites instantly. It often takes two full updates for all inbound links to be detected, counted, and displayed as inbound links. [Google FAQ 20]

1.3 Factors affecting PageRank

  • Every incoming link counts, with the exception of links from banned sites. PR is a kind of voting system: each link to a page is a vote in its favor. High PR pages are considered more important and in some cases their votes carry more weight, but in general the more inbound links the better. [Google PageRank FAQ 21]
  • Adding new pages can decrease PR. The total PR of the site increases, but one or more old pages lose part of their PR, which goes to the new pages. The more pages are added, the more PR the existing ones lose. On large sites this effect is invisible, but on small sites it can sometimes be observed. [PageRank Explained 12]
  • Decreased PR. A page's PR may decrease because some important links that gave it PR have disappeared, or because the PR of the pages linking to it has dropped. [Google PageRank FAQ 22]
  • Headings (h1, … , h6) and strong tags are important but do not affect PR. Use meta tags, titles, and b and strong tags, but keep the content readable and useful. Pay attention to the text surrounding keywords: search engines are getting better at semantics, so the context of keywords is very important.
  • An effective internal site structure is of great importance. Pages on the site should be linked in the simplest possible way; ideally no page should be more than three clicks away from the main page. [23]
  • Links from and to related sites with high PR are very important. The closer the topics of the pages, the more PR the link passes. Links to reputable sites on similar topics show search engines that the site is useful to visitors; this is not always true for sites that have been around for several years and have a high Google ranking. By linking only to high-quality sites, you can gain some advantage over competitors. [Let Google's Algorithm Show You The Traffic 23, FAQ 15]
  • Link text matters. The more specific the link text, the better Google can relate it to user queries.
  • Link farms are penalized. Google prefers pages containing fewer than 100 outbound links; pages with a large number of links are considered link farms and are penalized. [Google FAQ 24]
  • Incoming links from popular sites are very important. If a page is linked to by high PR pages, it receives a portion of their reputation.
  • A site can be banned if it links to banned sites. Be very careful with outgoing links and do not link to suspicious sites (link scams, banned sites, etc.). Google can penalize your site for such links, so always check the PR of the sites you link to. [SiteProNews 25]
  • Fraud is punished with a PR penalty and may result in a ban. Hidden text, redirects, cloaking, automated link exchange and other actions that contradict Google's quality guidelines 26 may result in a site being banned by Google.
  • Google takes into account the age of the site, the relevance of incoming links, and how long they have existed. If an incoming link is not relevant, it will not generate much PR.
  • Myth: the higher the PR, the higher the position in the search results. Of course, pages with a high PR tend to appear above competitors with a lower PR, but we must not forget that Google takes the context of incoming links into account, and only pages whose incoming links relate to the words in the query can rank high in the results for that query. [