Home > Computers > Software > Information Retrieval > Ranking > By Context
Link popularity ranking. These algorithms are mostly eigenvector methods for identifying "authoritative" or "influential" articles from hyperlink or citation information.
http://www2003.org/cdrom/papers/refereed/p007/p7-abiteboul.html
A good explanation of the convergence of various algorithms. The paper also describes an adaptive, on-line algorithm for computing page importance, which can be used for focused crawling as well as for search engine ranking.
http://www.cs.cornell.edu/home/kleinber/auth.pdf
HITS is a link-structure analysis algorithm that ranks pages as "authorities" (pages which have many incoming links and provide the best source of information on a given topic) and "hubs" (pages which have many outgoing links and provide useful lists of possibly relevant pages). Ranking is performed at query time.
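The hub and authority idea can be illustrated with a minimal sketch. It assumes the link graph is given as a plain dictionary; the function name `hits`, the fixed iteration count and the L2 normalization are illustrative choices, not details taken from the paper listed above.

```python
import math


def hits(links, iterations=50):
    """Sketch of hub/authority scoring.

    links: dict mapping each page to the list of pages it links to
    (a hypothetical input format chosen for this example).
    Returns (hub, auth), two dicts of scores.
    """
    pages = set(links) | {q for targets in links.values() for q in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # A page's authority is the sum of the hub scores of pages linking to it.
        new_auth = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                new_auth[q] += hub[p]

        # A page's hub score is the sum of the authority scores it links to.
        new_hub = {p: sum(new_auth[q] for q in links.get(p, ())) for p in pages}

        # Normalize so the scores stay bounded across iterations.
        a_norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        auth = {p: v / a_norm for p, v in new_auth.items()}
        hub = {p: v / h_norm for p, v in new_hub.items()}

    return hub, auth


# Toy graph: h1 and h2 are list-like pages, a and b are content pages.
toy_links = {"h1": ["a", "b"], "h2": ["a"], "a": [], "b": []}
hub_scores, auth_scores = hits(toy_links)
```

In this toy graph, `a` receives links from both list pages and so ends up as the strongest authority, while `h1` links to both content pages and ends up as the strongest hub.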
http://www.research.rutgers.edu/~davison/discoweb/
This paper describes a prototype system, later known as the Teoma search engine. It performs a link analysis, loosely based on Kleinberg's method, computed at query time.
http://www10.org/cdrom/papers/314/
A survey of PageRank, HITS and SALSA. It also describes two Bayesian statistical algorithms for ranking hyperlinked documents and the concepts of monotonicity and locality, as well as various concepts of distance and similarity between ranking algorithms.
http://www-db.stanford.edu/~backrub/pageranksub.ps
PostScript-format slides introducing citation importance ranking, by Larry Page, co-founder of Google.
http://www-db.stanford.edu/~taherh/papers/encoding-pagerank.pdf
Lossy encoding for large scale PageRank calculation.
http://patft.uspto.gov/netacgi/nph-Parser?patentnumber=6285999
Lawrence Page's PageRank Patent.
http://www.cs.technion.ac.il/~moran/r/PS/lm-feb01.ps
A focused search algorithm (SALSA) based on Markov chains. It starts with a query on a broad topic, discards useless links, and then weights the remaining terms. A stochastic crawl is used to discover the authorities on this topic. [PS format]
http://pr.efactory.de/
Information on the algorithm, how to increase PageRank, what diminishes it and how to distribute PageRank within a website.
http://www.almaden.ibm.com/cs/k53/clever.html
The CLEVER search engine incorporates several algorithms that make use of hyperlink structure for discovering information on the Web. It is an extension of the HITS method.
http://www.cs.washington.edu/homes/pedrod/papers/nips01b.pdf
This method uses query-dependent importance scores and a probabilistic approach to improve upon PageRank. It pre-computes importance scores offline for every possible text query.
http://www.cs.cmu.edu/~cohn/papers/nips00.pdf
This paper describes a joint probabilistic model for modeling the contents and inter-connectivity of document collections such as sets of web pages or research paper archives.
http://ilpubs.stanford.edu:8090/422/
First Stanford paper about PageRank. It is a static ranking, performed at indexing time, which interprets a link from page A to page B as a vote, by page A, for page B. The Web is seen as a directed graph, and votes propagate recursively from node to node. Used by Google.
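The recursive vote propagation can be sketched as a simple power iteration. This is a minimal illustration, not the implementation from the paper: the dictionary input format, the function name `pagerank`, the damping value of 0.85, the fixed iteration count and the uniform handling of pages without outgoing links are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Sketch of static link-vote ranking by power iteration.

    links: dict mapping each page to the list of pages it links to
    (a hypothetical input format chosen for this example).
    Returns a dict of scores that sum to 1.
    """
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Pages with no outgoing links spread their score over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}

        # Each page splits its current score evenly among the pages it votes for.
        for p, targets in links.items():
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share

        rank = new_rank

    return rank


# Toy graph: A is voted for by both C and D; nobody votes for D.
toy_links = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]}
scores = pagerank(toy_links)
```

Because the scores depend only on the link graph and not on any query, they can be computed once at indexing time, which is what makes the ranking static.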
http://trec.nist.gov/pubs/trec8/papers/acsys.pdf
On the use of PageRank with the "large" and "small" datasets of the TREC-8 Web Track.
http://trec.nist.gov/pubs/trec9/papers/unine9.pdf
On the use of link popularity with the TREC-9 Web Track datasets.
http://www.cs.ualberta.ca/~drafiei/papers/www9.ps
A generalization of PageRank and of hubs and authorities based on the topic of Web pages. It defines a model in which a surfer can move forward (following an outgoing link) or backward (following an incoming link in the reverse direction). [PS format]
http://www2003.org/cdrom/papers/refereed/p185/html/p185-jeh.html
Presentation paper. Link popularity algorithms biased toward a user-specified set of interesting pages.
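One common way to bias a link popularity score toward a set of preferred pages is to restrict the random jump to that set. The sketch below illustrates that idea only; it is not the algorithm from the paper listed above, and the function name `personalized_pagerank`, the dictionary input format, the damping value and the iteration count are assumptions made for the example.

```python
def personalized_pagerank(links, preferred, damping=0.85, iterations=100):
    """Sketch of link popularity biased toward a preferred set of pages.

    links: dict mapping each page to the list of pages it links to.
    preferred: iterable of pages the user considers interesting.
    Both input formats are hypothetical choices for this example.
    """
    preferred = set(preferred)
    if not preferred:
        raise ValueError("preferred must contain at least one page")

    pages = set(links) | {q for targets in links.values() for q in targets} | preferred

    # The random jump lands only on preferred pages, uniformly.
    teleport = {p: (1.0 / len(preferred) if p in preferred else 0.0) for p in pages}
    rank = dict(teleport)

    for _ in range(iterations):
        # Score held by pages without outgoing links also returns to the preferred set.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        jump = 1.0 - damping + damping * dangling
        new_rank = {p: jump * teleport[p] for p in pages}

        for p, targets in links.items():
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share

        rank = new_rank

    return rank


# Two disconnected pairs of pages; the user is interested in C only.
toy_links = {"A": ["B"], "B": ["A"], "C": ["D"], "D": ["C"]}
biased = personalized_pagerank(toy_links, preferred={"C"})
```

With the jump restricted to `C`, the pages `A` and `B` are unreachable from the preferred set and receive no score, while `C` and `D` share all of it.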