How is PageRank Calculated?
To calculate the PageRank for a page, all of its inbound
links are taken into account. These are links from within
the site and links from outside the site.
PR(A) = (1-d) + d(PR(t1)/C(t1) + ... + PR(tn)/C(tn))
That's the equation that calculates a page's PageRank. It's
the original one that was published when PageRank was being
developed, and it is probable that Google uses a variation
of it but they aren't telling us what it is. It doesn't matter
though, as this equation is good enough.
In the equation 't1 - tn' are pages linking to page A, 'C'
is the number of outbound links that a page has and 'd' is
a damping factor, usually set to 0.85.
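As a minimal sketch, the equation can be written as a short Python
function. The function name and the input values below are made up
for illustration; only the formula itself comes from the published
equation.

```python
def page_rank(inbound, d=0.85):
    """Apply PR(A) = (1-d) + d(PR(t1)/C(t1) + ... + PR(tn)/C(tn)) once.

    inbound: one (pagerank, outbound_link_count) pair for each page
             that links to page A.
    d:       the damping factor, usually 0.85.
    """
    return (1 - d) + d * sum(pr / c for pr, c in inbound)


# Hypothetical example: two pages link to A. The first has a PageRank
# of 1.0 and 2 outbound links, the second 0.5 and 1 outbound link.
print(page_rank([(1.0, 2), (0.5, 1)]))

# A page with no inbound links is left with just (1 - d).
print(page_rank([]))
```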
We can think of it in a simpler way:-
a page's PageRank = 0.15 + 0.85 * (a "share" of
the PageRank of every page that links to it)
"share" = the linking page's PageRank divided by
the number of outbound links on the page.
A page "votes" an amount of PageRank onto each
page that it links to. The amount of PageRank that it has
to vote with is a little less than its own PageRank value
(its own value * 0.85). This value is shared equally between
all the pages that it links to.
From this, we could conclude that a link from a page with
PR4 and 5 outbound links is worth more than a link from a
page with PR8 and 100 outbound links. The PageRank of a page
that links to yours is important but the number of links on
that page is also important. The more links there are on a
page, the less PageRank value your page will receive from
it.
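The arithmetic behind that conclusion can be sketched in Python.
This assumes, purely for illustration, that the PR4 and PR8 figures
can be plugged straight into the equation as if the scale were
linear. The function name is made up.

```python
def link_share(pr, outbound_links, d=0.85):
    """PageRank that one link passes on: d * PR(t) / C(t)."""
    return d * pr / outbound_links


small_page = link_share(4, 5)     # PR4 page with 5 outbound links
big_page = link_share(8, 100)     # PR8 page with 100 outbound links

print("PR4, 5 links:  ", small_page)
print("PR8, 100 links:", big_page)
print("PR4 link is worth more:", small_page > big_page)
```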
If the PageRank value differences between PR1, PR2, ... PR10
were equal then that conclusion would hold up, but many people
believe that the values between PR1 and PR10 (the maximum)
are set on a logarithmic scale, and there is very good reason
for believing it. Nobody outside Google knows for sure one
way or the other, but the chances are high that the scale
is logarithmic, or similar. If so, it means that it takes
a lot more additional PageRank for a page to move up to the
next PageRank level than it did to move up from the previous
PageRank level. The result is that it reverses the previous
conclusion, so that a link from a PR8 page that has lots of
outbound links is worth more than a link from a PR4 page that
has only a few outbound links.
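To see how a logarithmic scale could reverse the conclusion, here
is a rough sketch in Python. The base of 5 is a pure assumption
chosen for illustration; nobody outside Google knows the real
scale. The function names are made up too.

```python
BASE = 5  # hypothetical base; the real scale is not public


def underlying_value(displayed_pr, base=BASE):
    """Assumed underlying value behind a displayed PR level."""
    return base ** displayed_pr


def link_share(displayed_pr, outbound_links, d=0.85):
    """Share passed through one link, using the assumed underlying value."""
    return d * underlying_value(displayed_pr) / outbound_links


pr4_link = link_share(4, 5)      # PR4 page with 5 outbound links
pr8_link = link_share(8, 100)    # PR8 page with 100 outbound links

print("PR8 link is worth more:", pr8_link > pr4_link)
```

With this assumption the PR8 link comes out ahead, and the same
holds for any base comfortably above 2. On a linear scale the PR4
link wins instead.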
Whichever scale Google uses, we can be sure of one thing.
A link from another site increases our site's PageRank. Just
remember to avoid links from link farms.
Note that when a page votes its PageRank value to other pages,
its own PageRank is not reduced by the value that it is voting.
The page doing the voting doesn't give away its PageRank and
end up with nothing. It isn't a transfer of PageRank. It is
simply a vote according to the page's PageRank value. It's
like a shareholders meeting where each shareholder votes according
to the number of shares held, but the shares themselves aren't
given away. Even so, pages do lose some PageRank indirectly,
as we'll see later.
Ok so far? Good. Now we'll look at how the calculations are
actually done.
For a page's calculation, its existing PageRank (if it has
any) is abandoned completely and a fresh calculation is done
where the page relies solely on the PageRank "voted"
for it by its current inbound links, which may have changed
since the last time the page's PageRank was calculated.
The equation shows clearly how a page's PageRank is arrived
at. But what isn't immediately obvious is that it can't work
if the calculation is done just once. Suppose we have 2 pages,
A and B, which link to each other, and neither has any other
links of any kind.
This is what happens:-
Step 1: Calculate page A's PageRank from the value of its
inbound links
Page A now has a new PageRank value. The calculation used
the value of the inbound link from page B. But page B has
an inbound link (from page A) and its new PageRank value
hasn't been worked out yet, so page A's new PageRank value
is based on inaccurate data and can't be accurate.
Step 2: Calculate page B's PageRank from the value of its
inbound links
Page B now has a new PageRank value, but it can't be accurate
because the calculation used the new PageRank value of the
inbound link from page A, which is inaccurate.
It's a Catch 22 situation. We can't work out A's PageRank
until we know B's PageRank, and we can't work out B's PageRank
until we know A's PageRank.
Now that both pages have newly calculated PageRank values,
can't we just run the calculations again to arrive at accurate
values? Not in one more pass. We can run the calculations again
using the new values and the results will be more accurate, but
each pass still starts from inaccurate values, so its results
are still slightly inaccurate.
The problem is overcome by repeating the calculations many
times. Each time produces slightly more accurate values. In
fact, total accuracy can never be achieved because the calculations
are always based on inaccurate values. 40 to 50 iterations
are sufficient to reach a point where any further iterations
wouldn't produce enough of a change to the values to matter.
This is precisely what Google does at each update, and it's
the reason why the updates take so long.
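The two-page example can be run as a short Python sketch. It
assumes every page starts at 0 (the text only says the old value is
abandoned, so the starting value is a guess) and it updates pages
one after another, as in Step 1 and Step 2. The function and
variable names are made up.

```python
def iterate_pagerank(links, d=0.85, iterations=50, start=0.0):
    """Repeat the published equation over every page.

    links: maps each page to the list of pages it links to.
    start: assumed starting value for every page.
    """
    pr = {page: start for page in links}
    for _ in range(iterations):
        for page in links:
            # Each inbound link votes the linking page's current
            # PageRank divided by its number of outbound links.
            votes = sum(pr[src] / len(targets)
                        for src, targets in links.items()
                        if page in targets)
            pr[page] = (1 - d) + d * votes
    return pr


two_pages = {"A": ["B"], "B": ["A"]}
for n in (1, 2, 10, 50):
    print(n, iterate_pagerank(two_pages, iterations=n))
```

Each extra pass changes the values by less than the previous one
did, and both pages settle close to 1.0.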
One thing to bear in mind is that the results we get from
the calculations are proportions. The figures must then be
set against a scale (known only to Google) to arrive at each
page's actual PageRank. Even so, we can use the calculations
to channel the PageRank within a site around its pages so
that certain pages receive a higher proportion of it than
others.
NOTE:
You may come across explanations of PageRank where the same
equation is stated but the result of each iteration of the
calculation is added to the page's existing PageRank. The
new value (result + existing PageRank) is then used when sharing
PageRank with other pages. These explanations are wrong for
the following reasons:-
1. They quote the same, published equation - but then change
it from
PR(A) = (1-d) + d(......)
to
PR(A) = PR(A) + (1-d) + d(......)
It isn't correct, and it isn't necessary.
2. We will be looking at how to organize links so that certain
pages end up with a larger proportion of the PageRank than
others. Adding to the page's existing PageRank through the
iterations produces different proportions than when the equation
is used as published. Since the addition is not a part of
the published equation, the results are wrong and the proportioning
isn't accurate.
According to the published equation, the page being calculated
starts from scratch at each iteration. It relies solely on
its inbound links. The 'add to the existing PageRank' idea
doesn't do that, so its results are necessarily wrong.
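The difference in proportions can be checked with a rough Python
sketch. The three-page site, the starting value of 0 and all the
names below are assumptions made for illustration only.

```python
# Hypothetical site: A links to B and C, and both link back to A.
LINKS = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}


def run(links, add_existing, d=0.85, iterations=50):
    """Iterate the equation, optionally adding to the existing value."""
    pr = {page: 0.0 for page in links}
    for _ in range(iterations):
        for page in links:
            votes = sum(pr[src] / len(targets)
                        for src, targets in links.items()
                        if page in targets)
            result = (1 - d) + d * votes
            # Published equation: start from scratch each time.
            # Altered version: add the result to the existing value.
            pr[page] = pr[page] + result if add_existing else result
    return pr


def proportions(pr):
    """Each page's fraction of the site's total."""
    total = sum(pr.values())
    return {page: value / total for page, value in pr.items()}


published = proportions(run(LINKS, add_existing=False))
altered = proportions(run(LINKS, add_existing=True))
print("published:", published)
print("altered:  ", altered)
```

In this sketch the altered version's totals keep growing with every
pass, and page A ends up with a noticeably different proportion of
the total than the published equation gives it.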