Arnetminer

Overview

Arnetminer is designed to search and perform data mining operations against academic publications on the Internet, using social network analysis to identify connections between researchers, conferences, and publications. This allows it to provide services such as expert finding, association search, course search, academic evaluation, and topic modeling.

Arnetminer was created as a research project in social influence analysis, social network ranking, and social network extraction. Its development has led to a number of peer-reviewed papers. It has been in operation for more than three years and has indexed 700,000 researchers and more than three million publications. The research was funded by the Chinese National High-tech R&D Program and the National Science Foundation of China.

Arnetminer is commonly used in academia to identify relationships among researchers and their work, and to draw statistical correlations about them. The product was used in a study aimed at verifying the popular notion that no more than six degrees of separation connect any two people on Earth.
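
As an illustration only, the degree of separation between two researchers can be read as the length of the shortest path between them in a co-authorship graph. The sketch below is not Arnetminer's implementation; the function names, the paper list, and the author names are all invented for the example.

```python
from collections import deque
from itertools import combinations

def build_coauthor_graph(papers):
    """Map each author to the set of people they have co-authored with."""
    graph = {}
    for authors in papers:
        for a in authors:
            graph.setdefault(a, set())
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def degrees_of_separation(graph, source, target):
    """Breadth-first search; returns the hop count, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph[node]:
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None

# Hypothetical data: each inner list is the author list of one paper.
papers = [["Ann", "Bob"], ["Bob", "Cy"], ["Cy", "Dee"], ["Eve"]]
graph = build_coauthor_graph(papers)
print(degrees_of_separation(graph, "Ann", "Dee"))
```

Breadth-first search is used because every co-authorship link counts as one step, so the first time the target is reached the path is guaranteed to be a shortest one.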

Operation

Arnetminer automatically extracts researcher profiles from the web. It collects and identifies the relevant pages, then uses a unified approach to extract data from the identified documents. It also extracts publications from online digital libraries using heuristic rules.

It integrates the extracted researcher profiles and the extracted publications, using the researcher name as the identifier. A probabilistic framework has been proposed to deal with the name ambiguity problem that arises during integration. The integrated data is stored in a researcher network knowledge base (RNKB).
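
To show why using the name as the identifier leads to ambiguity, the sketch below groups publications that share one author name into presumed distinct people. It uses a simple co-author overlap heuristic (Jaccard similarity with greedy clustering), which is a deliberately simpler stand-in and not the probabilistic framework mentioned above. The function names, the threshold value, and the sample records are assumptions made for the example.

```python
def jaccard(a, b):
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def group_by_person(publications, threshold=0.2):
    """Greedily cluster same-name publications by co-author overlap.

    Each cluster stands for one presumed real person. The threshold is
    an arbitrary illustrative value, not a tuned parameter.
    """
    clusters = []  # each cluster: {"coauthors": set, "titles": list}
    for pub in publications:
        coauthors = set(pub["coauthors"])
        best, best_score = None, threshold
        for cluster in clusters:
            score = jaccard(coauthors, cluster["coauthors"])
            if score >= best_score:
                best, best_score = cluster, score
        if best is None:
            # No existing cluster is similar enough: assume a new person.
            clusters.append({"coauthors": coauthors, "titles": [pub["title"]]})
        else:
            best["coauthors"] |= coauthors
            best["titles"].append(pub["title"])
    return clusters

# Hypothetical records, all attributed to the same author name.
pubs = [
    {"title": "Paper A", "coauthors": ["Ann", "Bob"]},
    {"title": "Paper B", "coauthors": ["Bob", "Cy"]},
    {"title": "Paper C", "coauthors": ["Dee", "Eve"]},
]
clusters = group_by_person(pubs)
print([c["titles"] for c in clusters])
```

In this toy data the first two papers share a co-author and end up in one cluster, while the third has no overlap and is treated as a different person with the same name.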

The other principal products in this area are Google Scholar, Elsevier's Scirus, and the open source project CiteSeer.

See also

  • CiteSeerX
  • Digital Bibliography & Library Project
  • Google Scholar
  • Libra (a public search engine for academic papers and literature)
  • List of academic databases and search engines
  • Scirus
  • Scopus

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 