Dendral
Encyclopedia
Dendral was an influential pioneer project in artificial intelligence
Artificial intelligence
Artificial intelligence is the intelligence of machines and the branch of computer science that aims to create it. AI textbooks define the field as "the study and design of intelligent agents" where an intelligent agent is a system that perceives its environment and takes actions that maximize its...

 (AI) of the 1960s, and the computer software
Computer software
Computer software, or just software, is a collection of computer programs and related data that provide the instructions for telling a computer what to do and how to do it....

 expert system
Expert system
In artificial intelligence, an expert system is a computer system that emulates the decision-making ability of a human expert. Expert systems are designed to solve complex problems by reasoning about knowledge, like an expert, and not by following the procedure of a developer as is the case in...

 that it produced. Its primary aim was to study hypothesis formation and discovery in science. For that, a specific task in science was chosen: help organic chemists
Organic chemistry
Organic chemistry is a subdiscipline within chemistry involving the scientific study of the structure, properties, composition, reactions, and preparation of carbon-based compounds, hydrocarbons, and their derivatives...

 in identifying unknown organic molecules, by analyzing their mass spectra
Mass spectrometry
Mass spectrometry is an analytical technique that measures the mass-to-charge ratio of charged particles.It is used for determining masses of particles, for determining the elemental composition of a sample or molecule, and for elucidating the chemical structures of molecules, such as peptides and...

 and using knowledge of chemistry. It was done at Stanford University
Stanford University
The Leland Stanford Junior University, commonly referred to as Stanford University or Stanford, is a private research university on an campus located near Palo Alto, California. It is situated in the northwestern Santa Clara Valley on the San Francisco Peninsula, approximately northwest of San...

 by Edward Feigenbaum
Edward Feigenbaum
Edward Albert Feigenbaum is a computer scientist working in the field of artificial intelligence. He is often called the "father of expert systems."...

, Bruce Buchanan, Joshua Lederberg
Joshua Lederberg
Joshua Lederberg ForMemRS was an American molecular biologist known for his work in microbial genetics, artificial intelligence, and the United States space program. He was just 33 years old when he won the 1958 Nobel Prize in Physiology or Medicine for discovering that bacteria can mate and...

, and Carl Djerassi
Carl Djerassi
Carl Djerassi is an Austrian-American chemist, novelist, and playwright best known for his contribution to the development of the first oral contraceptive pill . Djerassi is emeritus professor of chemistry at Stanford University.He participated in the invention in 1951, together with Mexican Luis E...

, along with a team of highly creative research associates and students. It began in 1965 and spans approximately half the history of AI research.

The software program Dendral is considered the first expert system because it automated the decision-making process and problem-solving behavior of organic chemists. The project consisted of research on two main programs Heuristic Dendral and Meta-Dendral, and several sub-programs. It was written in Lisp (programming language), which was considered the language of AI because of its flexibility.

Many systems were derived from Dendral, including MYCIN
Mycin
In artificial intelligence, MYCIN was an early expert system designed to identify bacteria causing severe infections, such as bacteremia and meningitis, and to recommend antibiotics, with the dosage adjusted for patient's body weight — the name derived from the antibiotics themselves, as many...

, MOLGEN, MACSYMA
Macsyma
Macsyma is a computer algebra system that was originally developed from 1968 to 1982 at MIT as part of Project MAC and later marketed commercially...

, PROSPECTOR, XCON
Xcon
The R1 program was a production-rule-based system written in OPS5 by John P. McDermott of CMU in 1978 to assist in the ordering of DEC's VAX computer systems by automatically selecting the computer system components based on the customer's requirements...

, and STEAMER.

The name Dendral is a portmanteau of the term "Dendritic Algorithm".

Heuristic Dendral

Heuristic Dendral is a program that uses mass spectra or other experimental data together with knowledge base of chemistry, to produce a set of possible chemical structures that may be responsible for producing the data. A mass spectrum of a compound is produced by a mass spectrometer, and is used to determine its molecular weight, the sum of the masses of its atomic constituents. For example, the compound water (H2O), has a molecular weight of 18 since hydrogen has a mass of 1.01 and oxygen 16.00, and its mass spectrum has a peak at 18 units. Heuristic Dendral would use this input mass and the knowledge of atomic mass numbers and valence rules, to determine the possible combinations of atomic constituents whose mass would add up to 18. As the weight increases and the molecules become more complex, the number of possible compounds increases drastically. Thus, a program that is able to reduce this number of candidate solutions through the process of hypothesis formation is essential.

New graph-theoretic algorithms were invented by Lederberg, Harold Brown, and others that generate all graphs with a specified set of nodes and connection-types (chemical atoms and bonds) -- with or without cycles. Moreover, the team was able to prove mathematically that the generator is complete, in that it produces all graphs with the specified nodes and edges, and that it is non-redundant, in that the output contains no equivalent graphs (e.g., mirror images). The CONGEN program, as it became known, was developed largely by computational chemists Ray Carhart, Jim Nourse, and Dennis Smith. It was useful to chemists as a stand-alone program to generate chemical graphs showing a complete list of structures that satisfy the constraints specified by a user.

Meta-Dendral

Meta-Dendral is a machine learning system that receives the set of possible chemical structures and corresponding mass spectra as input, and proposes a set of rules of mass spectrometry that correlate structural features with processes that produce the mass spectrum. These rules would be fed back to Heuristic Dendral (in the planning and testing programs described below) to test their applicability. Thus, "Heuristic Dendral is a performance system and Meta-Dendral is a learning system". The program is based on two important features: the plan-generate-test paradigm and knowledge engineering.

Plan-generate-test paradigm

The plan-generate-test paradigm is the basic organization of the problem-solving method, and is a common paradigm used by both Heuristic Dendral and Meta-Dendral systems. The generator (later named CONGEN) generates potential solutions for a particular problem, which are then expressed as chemical graphs in Dendral. However, this is feasible only when the number of candidate solutions is minimal. When there are large numbers of possible solutions, Dendral has to find a way to put constraints that rules out large sets of candidate solutions. This is the primary aim of Dendral planner, which is a “hypothesis-formation” program that employs “task-specific knowledge to find constraints for the generator”. Last but not least, the tester analyzes each proposed candidate solution and discards those that fail to fulfill certain criteria. This mechanism of plan-generate-test paradigm is what holds Dendral together.

Knowledge Engineering

The primary aim of knowledge engineering is to attain a productive interaction between the available knowledge base and problem solving techniques. This is possible through development of a procedure in which large amounts of task-specific information is encoded into heuristic programs. Thus, the first essential component of knowledge engineering is a large “knowledge base.” Dendral has specific knowledge about the mass spectrometry technique, a large amount of information that forms the basis of chemistry and graph theory, and information that might be helpful in finding the solution of a particular chemical structure elucidation problem. This “knowledge base” is used both to search for possible chemical structures that match the input data, and to learn new “general rules” that help prune searches. The benefit Dendral provides the end user, even a non-expert, is a minimized set of possible solutions to check manually.

Heuristics

A heuristic
Heuristic
Heuristic refers to experience-based techniques for problem solving, learning, and discovery. Heuristic methods are used to speed up the process of finding a satisfactory solution, where an exhaustive search is impractical...

 is a rule of thumb, an algorithm that does not guarantee a solution, but reduces the number of possible solutions by discarding unlikely and irrelevant solutions. The use of heuristics to solve problems is called "heuristics programming", and was used in Dendral to allow it to replicate in machines the process through which human experts induce the solution to problems via rules of thumb and specific information.

Heuristics programming was a major approach and a giant step forward in artificial intelligence, as it allowed scientists to finally automate certain traits of human intelligence. It became prominent among scientists in the late 1940s through George Polya
George Pólya
George Pólya was a Hungarian mathematician. He was a professor of mathematics from 1914 to 1940 at ETH Zürich and from 1940 to 1953 at Stanford University. He made fundamental contributions to combinatorics, number theory, numerical analysis and probability theory...

’s book, How to Solve It: A New Aspect of Mathematical Method. As Herbert Simon
Herbert Simon
Herbert Alexander Simon was an American political scientist, economist, sociologist, and psychologist, and professor—most notably at Carnegie Mellon University—whose research ranged across the fields of cognitive psychology, cognitive science, computer science, public administration, economics,...

 said in The Sciences of the Artificial, "if you take a heuristic conclusion as certain, you may be fooled and disappointed; but if you neglect heuristic conclusions altogether you will make no progress at all."

History

During mid 20th century, the question "can machines think?" became intriguing and popular among scientists, primarily to add humanistic characteristics to machine behavior. John McCarthy
John McCarthy (computer scientist)
John McCarthy was an American computer scientist and cognitive scientist. He coined the term "artificial intelligence" , invented the Lisp programming language and was highly influential in the early development of AI.McCarthy also influenced other areas of computing such as time sharing systems...

, who was one of the prime researchers of this field, termed this concept of machine intelligence as "artificial intelligence
Artificial intelligence
Artificial intelligence is the intelligence of machines and the branch of computer science that aims to create it. AI textbooks define the field as "the study and design of intelligent agents" where an intelligent agent is a system that perceives its environment and takes actions that maximize its...

" (AI) during the Dartmouth summer in 1956. AI is usually defined as the capacity of a machine to perform operations that are analogous to human cognitive capabilities. Much research to create AI was done during the 20th century.

Also around mid 20th century, science, especially biology, faced a fast-increasing need to develop a "man-computer symbiosis", to aid scientists in solving problems. For example, the structural analysis of myogoblin, hemoglobin
Hemoglobin
Hemoglobin is the iron-containing oxygen-transport metalloprotein in the red blood cells of all vertebrates, with the exception of the fish family Channichthyidae, as well as the tissues of some invertebrates...

, and other protein
Protein
Proteins are biochemical compounds consisting of one or more polypeptides typically folded into a globular or fibrous form, facilitating a biological function. A polypeptide is a single linear polymer chain of amino acids bonded together by peptide bonds between the carboxyl and amino groups of...

s relentlessly needed instrumentation development due to its complexity.

In the early 1960s, Joshua Lederberg started working with computers and quickly became tremendously interested in creating interactive computers to help him in his exobiology research. Specifically, he was interested in designing computing systems that to help him study alien organic compounds. As he was not an expert in either chemistry or computer programming, he collaborated with Stanford chemist Carl Djerassi to help him with chemistry, and Edward Feigenbaum with programming, to automate the process of determining chemical structures from raw mass spectrometry data. Feigenbaum was an expert in programming language
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....

s and heuristics, and helped Lederberg design a system that replicated the way Carl Djerassi solved structure elucidation problems. They devised a system called Dendritic Algorithm (Dendral) that was able to generate possible chemical structures corresponding to the mass spectrometry data as an output.

Dendral then was still very inaccurate in assessing spectra of ketone
Ketone
In organic chemistry, a ketone is an organic compound with the structure RCR', where R and R' can be a variety of atoms and groups of atoms. It features a carbonyl group bonded to two other carbon atoms. Many ketones are known and many are of great importance in industry and in biology...

s, alcohols, and isomer
Isomer
In chemistry, isomers are compounds with the same molecular formula but different structural formulas. Isomers do not necessarily share similar properties, unless they also have the same functional groups. There are many different classes of isomers, like stereoisomers, enantiomers, geometrical...

s of chemical compounds. Thus, as seen in figure 1, Djerassi "taught" general rules to Dendral that could help eliminate most of the "chemically implausible" structures, and produce a set of structures that could now be analyzed by a "non-expert" user to determine the right structure. Figure 2 shows how Dendral operates without an expert, after all the general rules were entered into Dendral's knowledge base.

The Dendral team recruited Bruce Buchanan to extend the Lisp program initially written by Georgia Sutherland. Buchanan had similar ideas and interests as Feigenbaum and Lederberg, but his special interests were scientific discovery and hypothesis formation. As Joseph November said in Digitizing Life: The Introduction of Computers to Biology and Medicine, "(Buchanan) wanted the system (Dendral) to make discoveries on its own, not just help humans make them". Buchanan, Lederberg and Feigenbaum designed "Meta-Dendral", which was a "hypothesis maker". Heuristic Dendral "would serve as a template for similar knowledge-based systems in other areas" rather than just concentrating in the field of organic chemistry. Meta-Dendral was a model for knowledge-rich learning systems that was later codified in Tom Mitchell's influential Version Space Model
Version space
A version space in concept learning or induction is the subset of all hypotheses that are consistent with the observed training examples. This set contains all hypotheses that have not been eliminated as a result of being in conflict with observed data....

of learning.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK