Interesting things regarding the Google PageRank algorithm


Post by Darth Pincho » Thu Oct 12, 2006 3:45 pm

Well friends, I've found a really interesting and simple PDF that explains quite well how Google PR works. I've uploaded it to my site to share with the people in this thread.

[MOD] Please provide us with the original link for this article[/MOD]

It was downloaded using eMule. What can I do?

Let's start discussing. Supposing this document is accurate and this is part of the real Google PR algorithm, I notice some interesting things.

It is clear to me that the more inbound links my page has, the more PR it will achieve, since PR is built from the sum of PRn/Cn over all linking pages (PRn being the PR of linking page n and Cn its number of outbound links). But what happens if the page linking to mine has PR0? Hmm... Reading further, I found that every page, even an orphan one, has at least 0,15 PR, and the Google Toolbar only shows integers from 0 to 10, so a page with 0,99999999 PR still shows as PR 0 in the toolbar.

Suppose a page like this thread has about 70 outbound links and one of them points to my page. Bearing in mind that its PageRank is somewhere between 0,15 and 0,99 according to my Google Toolbar right now, the contribution to the PageRank of the page in my signature is between 0,15/70 = 0,00214 and 0,99/70 = 0,0141. Quite interesting, huh? :wink:

This may seem insignificant to some readers. But imagine the effect of multiplying those tiny decimals by 1000 posts like this one: your PR would increase by somewhere between 2,14 and 14,1... Now I understand why people spam forums, something I strongly disagree with, by the way.

Some people might think my calculations are too optimistic, so let's say this page has 140 outbound links and its PageRank is divided among those 140 links. Our previous numbers drop by half. They are still good numbers: PageRank increases of 1,07 and 7,05 are great too.
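
The arithmetic in the last few paragraphs can be sketched in a few lines of Python. This is only an illustration of the simplified per-link share PRn/Cn described above, not Google's actual computation. The function name `link_share` and the `damping` parameter (left at 1.0 to match the figures in the post, which do not apply a damping factor) are assumptions made for this example.

```python
def link_share(source_pr: float, outbound_links: int, damping: float = 1.0) -> float:
    """PR passed through a single link: damping * PRn / Cn (hypothetical helper)."""
    if outbound_links <= 0:
        raise ValueError("outbound_links must be positive")
    return damping * source_pr / outbound_links


POSTS = 1000  # number of forum pages carrying the link, as in the post

for outbound in (70, 140):
    low = POSTS * link_share(0.15, outbound)   # linking pages at the 0.15 minimum
    high = POSTS * link_share(0.99, outbound)  # linking pages just below toolbar PR 1
    print(f"{outbound} outbound links: between {low:.2f} and {high:.2f}")
```

Passing `damping=0.85` would scale every figure down by 15%, which is what the full formula quoted later in the thread does.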

I conclude this effect is the so-called PageRank leaking defined in the PDF I offered. It is like taking a little bit of the PageRank of every forum where I post a link to my forum. Well, DCZ, don't get mad about that: I put a php-seo.com link on my forum home page (PR2 now, by the way), so I help you too, hehehe.

According to this, a good strategy would be to find the best-ranked threads on boards related to your page and post USEFUL INFO THAT HELPS USERS. Please do not spam. If you reach 1000 posts and those pages get spidered by Google, you will see a PR increase in the next Google dance. Please correct me if I'm wrong, DCZ!!!

The most exciting thing I discovered here (remember I'm a newbie in SEO stuff) is that a link in a DMOZ category can boost your PR a lot, because DMOZ has great PR. But my happiness was destroyed when I found that the bodybuilding pages in DMOZ are PR 0 :cry:

Surfing through DMOZ, I find that I would benefit more from being linked from:
http://www.dmoz.com/World/Espa%c3%b1ol/ ... ulturismo/ (22 sites listed, plus the standard outbound links in the header and footer)
than from:
http://www.dmoz.com/Sports/Strength_Spo ... ybuilding/ (462 sites listed, plus the standard outbound links in the header and footer)

This is because the /culturismo/ category has fewer links (Cn) to divide its PR among than the /bodybuilding/ one. This is clear to me.
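
To put rough numbers on the comparison, here is a small Python sketch. It assumes, purely for illustration, that both category pages have the same PR (1.0) and the same number of header and footer links (30). Neither value comes from DMOZ or from the PDF, and `share_per_link` is a made-up helper.

```python
def share_per_link(page_pr: float, listed_sites: int, nav_links: int) -> float:
    """PR share each outbound link receives: PRn / Cn (hypothetical helper)."""
    return page_pr / (listed_sites + nav_links)


ASSUMED_PR = 1.0   # assumption: same PR for both category pages
ASSUMED_NAV = 30   # assumption: header + footer links on each page

culturismo = share_per_link(ASSUMED_PR, 22, ASSUMED_NAV)
bodybuilding = share_per_link(ASSUMED_PR, 462, ASSUMED_NAV)

print(f"/culturismo/ share per link:   {culturismo:.5f}")
print(f"/bodybuilding/ share per link: {bodybuilding:.5f}")
print(f"ratio: {culturismo / bodybuilding:.1f}x")
```

The ratio only holds if the two pages really have similar PR. A higher PR on the larger category would narrow the gap.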

So I have a big question: why am I supposed to get a bigger PR increase from DMOZ than from this simple post?

Another important thing is the PageRank passed from your internal pages' inbound links to your home page. For example, if I start getting links to my internal pages from all over the bodybuilding community on the internet, some leaked PageRank is going to be transferred to those internal pages. Later, after many iterations of Google's PR calculation, it will be passed on to my home page, because all my internal pages link to HOME. Even though each page only adds tiny decimals to my home page (around 0,001, I think), having 5000+ content pages indexed in many directories and search engines will increase my home PR. This is another reason to ADD CONTENT TO YOUR SITE. The more content you have, the more chances of being visited by some user, and the more chances to transfer PR to your home page.

In conclusion, DCZ, according to these calculations, if I had 3000 posts on different pages, related and unrelated to mine, with the links in my signature on all of them, how much of an increase in my home page PR could I expect? I currently have PR3. Do you think I could reach 7, or is it too difficult to get to 7 despite the theory in this post?

3000 links on the worst pages on the internet (assuming 150 outbound links per page) means 3000 * (0,15 / 150) = 3. That is a 3-point PR increase. I take into consideration that Google needs to index all those 3000 pages too.
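
As a quick check of the expression above, the following Python sketch evaluates it under the same assumptions (0,15 PR per linking page, 150 outbound links per page, damping ignored) and also inverts it to estimate how many such links a given gain would take. `total_gain` and `links_needed` are hypothetical helper names.

```python
def total_gain(num_links: int, source_pr: float, outbound_links: int) -> float:
    """Sum of PRn / Cn over num_links identical linking pages."""
    return num_links * (source_pr / outbound_links)


def links_needed(target_gain: float, source_pr: float, outbound_links: int) -> float:
    """Number of identical links required to reach target_gain."""
    return target_gain / (source_pr / outbound_links)


gain = total_gain(3000, 0.15, 150)
print(f"3000 links -> gain of about {gain:.2f}")

needed = links_needed(10, 0.15, 150)
print(f"links for a 10-point gain: about {needed:.0f}")
```

Evaluated this way, 3000 such links add up to about 3 points, and a 10-point gain would take about 10,000 of them.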

Well, I hope this post is useful to everyone here. :wink:
Last edited by Darth Pincho on Mon Oct 16, 2006 12:53 pm, edited 1 time in total.
Webmaster at:
http://www.tupincho.net
Discussion board at:
http://www.tupincho.net/foro/
Darth Pincho
PR1
PR1
 
Posts: 154
Joined: Fri Sep 22, 2006 8:37 pm


Post by Darth Pincho » Thu Oct 12, 2006 7:05 pm

Another thing. I was looking at how many sites link to www.phpbb.com, and I found approximately 2,670,000.

If we use the above-mentioned algorithm and suppose 500 outbound links per page (exaggerated) and 0,15 PR for each page (obviously too low, since there are great phpBB boards out there linking to phpbb.com), we get:
2670000 * (0,15/500) = 801. This means phpBB would have a PR of 801, I mean 10. But that is not true: it has PR 9. WHY?

Is the damping factor variable? Is it not always 0,85? That would mean the minimum PR value for a page is a lot less than 0,15.

Now let's do the calculation in reverse. We can start from the fact that phpBB has a PageRank of 9. OK:

(1 - Damping) + Damping * Sum (pr/outbound links) = 9

If we speculate that the damping is about 0,95, that means:

(1 - 0,95) + Damping * Sum (pr/outbound links) = 9

(0,05) + Damping * Sum (pr/outbound links) = 9

Damping * Sum (pr/outbound links) = 8,95

0,95 * Sum (pr/outbound links) = 8,95

Sum (pr/outbound links) = 8,95 / 0,95 = 9,42

pr = (9,42 / 2670000) * 500

pr = 0,0018 approximately. This is the average PR of all those 2,67 million pages if we use a damping factor of 0,95.
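
The reverse calculation above can be reproduced with a short Python sketch. It follows the post in treating the toolbar value 9 as if it were the raw PR, and in assuming that every linking page has the same PR and 500 outbound links. Both are simplifying assumptions of the post, and `average_source_pr` is a made-up helper name.

```python
def average_source_pr(target_pr: float, damping: float,
                      num_links: int, outbound_per_page: int) -> float:
    """Solve PR = (1 - d) + d * N * (pr / C) for pr."""
    link_sum = (target_pr - (1 - damping)) / damping
    return link_sum * outbound_per_page / num_links


for d in (0.85, 0.95):
    pr = average_source_pr(9, d, 2_670_000, 500)
    print(f"damping {d}: average linking-page PR of about {pr:.4f}")
```

With either damping value the result falls well below the (1 - d) floor that the formula gives every page, which is the inconsistency the post is pointing at.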

ANY IDEA HOW TO FIND OUT THE REAL VALUES?

Post by dcz » Sun Oct 15, 2006 1:38 pm

Sorry for the delay, but your long post deserved a proper answer, and thus more than 5 minutes to write ;)

Then, two things:
First, I'd prefer that you provide us with the original link for your article, because even if it's tagged "sample", I am not sure you have the right to distribute it.
Second, please do not reply to your own posts too much. Even if what you say is interesting, the edit function is enough until you get answers ;)

So you like maths. I do too, but my theoretical physics background also taught me to always keep a global view of any phenomenon.
Mathematics is a great tool, but it can also lead us to miss some parameters when we try to describe everything through functions and formulas.

All you describe here is very true. Even though I am sure the Google formula is more complex, it's more or less the PageRank theory.

I think the Google formula is more complex because there are many known filters applied to PageRank, such as the sandbox, blacklisting, keyword density analysis and, I am quite sure of it, deep linguistic analysis, etc.

From what I have observed, I really think we can talk about linguistic analysis. First, Google can figure out which language is used on a web site even when there is no lang meta tag.

Then another thing makes me think this way. Let me explain.
It's about the well-known underscore "_" problem, which correctly states that we had better use hyphens "-" rather than underscores to separate keywords in our URLs.
The experiment, run several times, was to post two random words separated with an underscore, something like kjsdfhpkohfd_kkdkhdfkh, and then two other ones separated this time with a hyphen, let's say reoaeoay-poiypoy.
After the page was crawled, the conclusion was that only the two random words separated with a hyphen could be searched for individually (e.g. searching for reoaeoay or poiypoy alone). The two random words separated with an underscore weren't searchable alone.

So many SEOers claimed that only the hyphen was a separator. I think this misses half of the experiment.
At the same time, I was running a site map for a web site of mine, located in a folder called site_map/, and guess what: I was able to run a search query like "name_of_my_site" plus "map" (and "map" was not part of the title ;) ) and find my site map, with "map" highlighted in the result URL.

So this clearly means half of the experiment was missed. We can conclude that Google is performing quite a deep language analysis, because in the underscore case it is still able to find existing words.
Now you'll ask, why the difference between hyphen and underscore?
Because the hyphen is actually used as a separator in many languages and the underscore is not. So when Google found two random words, obviously not listed in any dictionary, it still analyzed the hyphen as a text separator, and even though it could not find an entry in any dictionary for the two random words, it treated them as separable.
For the underscore, as it's not used as a separator in any language, Google treated the two unintelligible random words as a single one, something like a symbol or a script file name, for example.
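
The random-word result matches a convention that is common in text processing, where the underscore counts as a word character and the hyphen does not. The Python sketch below shows that convention using the standard `re` module. It is only an analogy and makes no claim about how Google actually tokenizes pages; `split_words` is a made-up helper.

```python
import re


def split_words(text: str) -> list[str]:
    """Split on anything that is not a letter, digit or underscore."""
    return re.findall(r"\w+", text)


print(split_words("kjsdfhpkohfd_kkdkhdfkh"))  # ['kjsdfhpkohfd_kkdkhdfkh']
print(split_words("reoaeoay-poiypoy"))        # ['reoaeoay', 'poiypoy']
```

This convention alone does not account for the site_map/ observation, where "map" was found inside an underscore-joined name.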

So this shows Google is doing many, many things when analysing content. I am sure PageRank also depends on this analysis, and that it is going to become more and more important as Google uses better and better tools to do it. Remember, the goal is to find the best content ;)

Then, if you add the fact that Google is also performing such deep analysis on the web sites you are linked from, you understand there is no way to be as accurate as the example you present.

I think the whole process must not be far from chaotic.
Chaos theory tells us almost all dynamic (self-regulated) equilibria are chaotic: population growth, blood sugar regulation, the earth's orbit around the sun, etc. And this is far from meaning it must be a total mess. Actually, it's possible to demonstrate that the earth would have left the solar system long, long ago if its path were not chaotic.

Chaos means we cannot accurately predict anything, as small changes in the system can lead to major ones in the future, but it also allows pretty stable regulation of self-regulated, dynamic systems. It has lasted for about 4 000 000 000 years with the earth's trajectory so far ;)

The Google indexing phenomenon is highly dynamic (time-dependent) and depends on many parameters, with many filters and regulation algorithms. This can be said without looking at its code; it's stating the obvious.
Then, since PageRank is calculated recursively, your PageRank depends on the PageRank of all the web sites that are linked together with at least one of them linked to yours. I am sure this can extend to all listed web sites in many cases. For example, a link from DMOZ will give you a PageRank based on DMOZ's PageRank, which is based on the PageRank of all the web sites linking to it, which depends on the PageRank of the web sites linking to them...
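
The recursive dependency described above can be sketched with the simplified formula PR = (1 - d) + d * Sum(PRn / Cn) quoted earlier in the thread, iterated over a tiny made-up link graph. The page names, the graph and the `pagerank` helper are all hypothetical, and this is a toy model, not Google's implementation.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 100) -> dict[str, float]:
    """Iterate PR = (1 - d) + d * sum(PRn / Cn) over a link graph."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    pr = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Each pass recomputes every page from the previous pass's values.
        pr = {
            page: (1 - damping) + damping * sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            for page in pages
        }
    return pr


graph = {
    "directory": ["my_home", "other_site"],
    "other_site": ["directory"],
    "my_home": ["my_article"],
    "my_article": ["my_home"],
}
before = pagerank(graph)

# A page that never links to my site starts linking to other_site.
graph_changed = {**graph, "far_page": ["other_site"]}
after = pagerank(graph_changed)

print(f"my_home before: {before['my_home']:.4f}")
print(f"my_home after:  {after['my_home']:.4f}")
```

Even though far_page has no link to my_home, its new link raises other_site, then directory, then my_home, which is the chain of dependencies described above.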

So talking about chaos theory in such a matter is far from crazy.

This means we can follow basic and general principles to optimize our web sites for search engines, but cannot make any prediction of the type you suggested. Things are way more complex, and simpler as well in a way.

All you said could be summed up as:

The more backlinks and deep backlinks you get from the highest-ranked and most related pages with the best content, the better your PageRank. Simple, isn't it? :D

Then we should keep in mind that Google must use (or will soon use) some kind of categories of its own to sort web sites, for example depending on the language analysis performed on them, their backlinks, the links in them, etc.

Same story as always, a never-ending loop between all the parameters: PageRank depends on the number, type and quality of backlinks, the content and the way it is analysed (thus rated), the number of pages your web site has, the way it is internally linked, the number of duplicates, the PageRank of your web site's pages (thus your web site's PageRank), the PageRank of the web sites you're linked from (thus the PageRank of the web sites linking to them)...
All we get in the end, when PageRank is set for a given page, is a snapshot of this never-ending procedure that started the day Google first started PageRanking.

So you cannot say 3000 posts will give this PageRank, but you can easily work on optimizing it.

You can also read this post, in which I showed, among other things, that you could have a PR 6 with 44 backlinks ;)

++
Useful links :
SEO Forum || SEO Directory || SEO phpBB || Search
____________________

Liens Utiles :
Forum référencement || Annuaire référencement || Référencement phpBB || Recherche
dcz
Admin
Admin
 
Posts: 21325
Joined: Fri Apr 28, 2006 9:03 pm

Post by Darth Pincho » Mon Oct 16, 2006 12:52 pm

Thank you for taking the time to explain such interesting things so well. I will read the post you suggested.

Post by dcz » Mon Oct 16, 2006 2:11 pm

You're welcome :D

After all, if I did not like to elaborate on such matters, I would not have started this project :D

Then, for your link, I understand the issue. Maybe you could just give the title of the book and its author, just for reference. If some users would like to know more about it, they'll still be able to search for it this way ;)

++

Post by nims » Fri Nov 24, 2006 7:20 am

Hi dcz,

I was going through the discussion above and found it really interesting and informative.

You really have a good grasp of SEO techniques, spiders, SEs, etc., so why don't you write a book on this? It would be a great help to others and would get you more name, fame and money :wink:

I just keep studying your posts all day long... I guess I am addicted to SEO now, or maybe to your explanations :shock: Just great!!!
Hire Seo Staff overseas and Website designing and development in India for best results at affordable prices.
nims
PR2
PR2
 
Posts: 245
Joined: Wed Oct 11, 2006 9:31 am
Location: New Delhi, India

Post by dcz » Fri Nov 24, 2006 2:38 pm

Well thanks ;)

As I said, I really want to build up a good SEO sharing community. So I am willing to share all I know, to talk about it, and of course, the most important, to experiment and find out things about search engines.

As you know by now, I like experiments, and I really think SEO is a great topic to start many.
This part of the project has not started yet; experiments will start after team recruiting.
I have several domains and unlimited subdomains as a playground for the future team's wonders.

This part, sharing and experimenting, is the part I prefer, actually more than the idea of writing one more web marketing book. Besides, SEO standards are always moving, and books don't. What is true when written may be less relevant when read.

Anyway, happy to be useful :D

++