ChemXSeer
The ChemXSeer project, funded by the National Science Foundation, is a public integrated digital library, database, and search engine for scientific papers in chemistry. It is being developed by a multidisciplinary team of researchers at the Pennsylvania State University. ChemXSeer was conceived by Dr. Prasenjit Mitra, Dr. Lee Giles and Dr. Karl Mueller as a way to integrate the chemical scientific literature with experimental, analytical, and simulation data from different types of experimental systems. The goal of the project is to create an intelligent search and database system that will provide relevant data to a diverse community of users who need chemical information. It is hosted on the World Wide Web at the College of Information Sciences and Technology, The Pennsylvania State University.

Features

To give users access to relevant data, ChemXSeer provides features that are not available in traditional search engines or digital libraries:
  1. Chemical Entity Search: A tool that identifies chemical formulae and chemical names in documents, extracts them, and disambiguates them from general terms. The disambiguated terms are then used to perform searches.
  2. TableSeer: In scholarly articles, tables are used to present, list, summarize, and structure important data. TableSeer automatically identifies tables in digital documents, extracts the table metadata as well as the cell contents, and stores them in a way that allows users either to query the table content or to search for tables in a large set of documents.
  3. Dataset search: ChemXSeer provides tools to incorporate datasets from different experimental sources. The system can handle results in multiple formats, such as XML, Microsoft Excel, Gaussian, and CHARMM; create databases that allow direct queries over the data; create metadata, using an annotation tool, so that users can search over the datasets; and create links among datasets and/or between datasets and documents.
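
The idea behind the first feature, separating chemical formulae from ordinary words, can be illustrated with a small Python sketch. This is not ChemXSeer's actual method, which the article does not describe. It is a simple heuristic written for illustration, and the names ELEMENTS, parse_formula and find_formulas are hypothetical. It assumes that a token is a formula if it consists only of element symbols and either contains a count or combines at least two symbols, so that words such as "He" or "As" are treated as general terms.

```python
import re

# Illustrative subset of element symbols (assumption: not the full periodic table).
ELEMENTS = {
    "H", "He", "Li", "B", "C", "N", "O", "F", "Na", "Mg", "Al", "Si",
    "P", "S", "Cl", "K", "Ca", "Fe", "Cu", "Zn", "As", "Br", "I", "In",
    "At", "No",
}

# One element symbol (capital letter plus optional lowercase letter)
# followed by an optional count, e.g. "H2" or "Cl".
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(word):
    """Return [(symbol, count), ...] if word is made only of known symbols, else None."""
    pos = 0
    parts = []
    while pos < len(word):
        m = TOKEN.match(word, pos)
        if m is None or m.group(1) not in ELEMENTS:
            return None
        parts.append((m.group(1), int(m.group(2) or 1)))
        pos = m.end()
    return parts or None


def find_formulas(text):
    """Return tokens that look like formulae rather than general terms (heuristic)."""
    found = []
    for word in re.findall(r"[A-Za-z0-9]+", text):
        parts = parse_formula(word)
        if parts is None:
            continue
        has_count = any(ch.isdigit() for ch in word)
        # A lone symbol without a count ("He", "As", "In") is ambiguous with
        # an English word, so this sketch treats it as a general term.
        if has_count or len(parts) >= 2:
            found.append(word)
    return found


print(find_formulas("He dissolved NaCl in H2O. As expected, no CO2 formed."))
```

A heuristic of this kind misses single-element mentions such as "Fe" and can still mistake capitalized words like "NO" for formulae, which is why a real system would need to use the surrounding context to disambiguate.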


In addition to these tools, ChemXSeer will integrate the advances made by its sister project CiteSeerX to provide:
  • Full text search
  • Author, affiliation, title and venue search
  • Citation and acknowledgement search
  • Citation linking and statistics

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 