PageRank is an algorithm used by Google Search to rank websites in their search engine results. PageRank is a way to measure the importance of website pages.
This is not the only algorithm used by Google to order search engine results, but it is the first algorithm used by the company it is best known.
The PageRank of a page is calculated from the sum of the PageRank of pages with a link entering the calculated page that is divided by the number of outgoing pages of the page, a mitigating factor is applied to symbolize the probability that The user surfs another page.
I install networkx, it is a Python package for the creation, manipulation and study of structure, dynamics and complex network functions.
Networkx provides data structures and methods for storing graphs that I use for the pagerank algorithm.
import networkx as nx import numpy as np graphe-nx. DiGraph() TablePages - ["A","B","C"]page rank #Exemple with 3 pages graph.add_nodes_from (tablePages) #Ajout tops of the graph #on adds bows, we have: #la Page A has a link to B #la page B has a link to C #la Page C has a link to B #la page C has a link to A Page B has 2 incoming link Page C has an incoming link 2 links out Page A has a link entering an outgoing link graph.add_edges_from([('A','B'), ('C','A'),('B','C'), ('C','B')]) print ("Graphe Summits:") print (graphe.nodes)) print ("Stop the graph:") print (graph.edges) #Si an attenuation factor of 0.85 'd' is considered The page rank formula is: #PR (1-d)/n - Sum of all pages (PR(i) of incoming links to p/number of link coming out of the page that reference p) PR(A) - (1-0.85)/3 - 0.85 - (PR(C)/2) PR(B) - (1-0.85)/3 - 0.85 - (PR(A)/1 - PR(C)/2) PR(C) - (1-0.85)/3 - 0.85 - (PR(B)/1) pagerank - nx.pagerank print (pagerank)