Ray Solomonoff
Ray Solomonoff was the inventor of algorithmic probability and a founder of algorithmic information theory. He was an originator of the branch of artificial intelligence based on machine learning, prediction and probability. He circulated the first report on non-semantic machine learning in 1956.

Solomonoff first described algorithmic probability in 1960, publishing the crucial theorem that launched Kolmogorov complexity and algorithmic information theory. He first described these results at a conference at Caltech in 1960, and in a February 1960 report, "A Preliminary Report on a General Theory of Inductive Inference." He clarified these ideas more fully in his 1964 publications, "A Formal Theory of Inductive Inference," Part I and Part II.

Algorithmic probability is a mathematically formalized combination of Occam's razor and the Principle of Multiple Explanations. It is a machine-independent method of assigning a probability value to each hypothesis (algorithm/program) that explains a given observation, with the simplest hypothesis (the shortest program) having the highest probability and increasingly complex hypotheses receiving increasingly small probabilities.
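
In modern notation (a standard later formulation, not Solomonoff's original symbols), this universal a priori probability of a string x, with respect to a fixed universal prefix machine U, can be written as

    M(x) = \sum_{p \,:\, U(p) = x*} 2^{-|p|}

where the sum runs over every program p whose output begins with x, and |p| is the length of p in bits; the shortest program contributes the largest term.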

Although he is best known for algorithmic probability and his general theory of inductive inference, he made many other important discoveries throughout his life, most of them directed toward his goal in artificial intelligence: to develop a machine that could solve hard problems using probabilistic methods.

Life History Through 1964

Ray Solomonoff was born on July 25, 1926, in Cleveland, Ohio, son of the Russian immigrants Phillip Julius and Sarah Mashman Solomonoff. He attended Glenville High School, graduating in 1944. In 1944 he joined the United States Navy as Instructor in Electronics. From 1947 to 1951 he attended the University of Chicago, studying under professors such as Rudolf Carnap and Enrico Fermi, and graduated with an M.S. in Physics in 1951.

From his earliest years he was motivated by the pure joy of mathematical discovery and by the desire to explore where no one had gone before. At the age of 16, in 1942, he began to search for a general method to solve mathematical problems.

In 1952 he met Marvin Minsky, John McCarthy and others interested in machine intelligence. In 1956 Minsky, McCarthy and others organized the Dartmouth Summer Research Conference on Artificial Intelligence, where Ray was one of the original 10 participants; he and Minsky were the only ones to stay all summer. It was for this group that artificial intelligence was first named as a science. Computers at the time could solve very specific mathematical problems, but not much else. Ray wanted to pursue a bigger question: how to make machines more generally intelligent, and how computers could use probability for this purpose.

Work History Through 1964

He wrote three papers, two with Anatol Rapoport, in 1950-52, that are regarded as the earliest statistical analysis of networks.

He was one of the 10 attendees at the 1956 Dartmouth Summer Research Conference on Artificial Intelligence, the seminal event for artificial intelligence as a field. He wrote and circulated a report among the attendees: "An Inductive Inference Machine". It viewed machine learning as probabilistic, with an emphasis on the importance of training sequences, and on the use of parts of previous solutions to problems in constructing trial solutions for new problems. He published a version of his findings in 1957. These were the first papers to be written on probabilistic Machine Learning.

In the late 1950s, he invented probabilistic languages and their associated grammars. A probabilistic language assigns a probability value to every possible string.
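
As a toy illustration (a sketch of the idea, not Solomonoff's own construction), the following probabilistic grammar assigns a probability to every string over the alphabet {a, b}, and the probabilities of all finite strings sum to 1:

    # Toy probabilistic language: S -> "a" S (0.4) | "b" S (0.3) | stop (0.3)
    RULES = {"a": 0.4, "b": 0.3}
    STOP = 0.3

    def string_probability(s):
        """Probability that the grammar generates exactly the string s."""
        prob = 1.0
        for symbol in s:
            if symbol not in RULES:
                return 0.0          # symbol outside the alphabet
            prob *= RULES[symbol]   # probability of extending with this symbol
        return prob * STOP          # probability of stopping here

    print(string_probability("ab"))   # 0.4 * 0.3 * 0.3 = 0.036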

Generalizing the concept of probabilistic grammars led him to his breakthrough discovery in 1960 of Algorithmic Probability.

Prior to the 1960s, the usual method of calculating probability was based on frequency: taking the ratio of favorable results to the total number of trials. In his 1960 publication, and, more completely, in his 1964 publications, Solomonoff seriously revised this definition of probability. He called this new form of probability "Algorithmic Probability."

The basic theorem of what was later called Kolmogorov Complexity was part of his General Theory. Writing in 1960, he begins: "Consider a very long sequence of symbols ... We shall consider such a sequence of symbols to be 'simple' and have a high a priori probability, if there exists a very brief description of this sequence - using, of course, some sort of stipulated description method. More exactly, if we use only the symbols 0 and 1 to express our description, we will assign the probability 2^(-N) to a sequence of symbols if its shortest possible binary description contains N digits."
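
The shortest description is itself incomputable, but a real compressor gives a computable stand-in that makes the 2^(-N) idea concrete. In this rough sketch, zlib's compressed length is used as a crude proxy for the length of the shortest binary description:

    import zlib

    def descriptive_probability(data):
        """Assign probability 2^(-N), where N is the bit length of a
        compressed description of data. zlib is only a crude, computable
        proxy: the true shortest description would be no longer, so the
        true a priori probability would be at least this large."""
        n_bits = 8 * len(zlib.compress(data))
        return 2.0 ** -n_bits

    # A highly regular ("simple") string compresses well, so it receives
    # a much higher a priori probability than an irregular one.
    print(descriptive_probability(b"01010101010101010101010101010101"))
    print(descriptive_probability(b"01101000011100101101001110100101"))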

The probability is with reference to a particular Universal Turing machine. Solomonoff showed, and in 1964 proved, that the choice of machine, while it could introduce a constant factor, would not change the probability ratios very much. In this sense, these probabilities are machine independent.
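
In later terminology this is the invariance theorem: for any two universal machines U and V there is a constant c_{UV}, depending only on the two machines and not on the string x, such that

    K_U(x) \le K_V(x) + c_{UV}

so the induced probabilities 2^{-K(x)} agree up to a fixed multiplicative factor (a standard modern statement, not Solomonoff's original notation).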

In 1965, the Russian mathematician Kolmogorov independently published similar ideas. When he became aware of Solomonoff's work, he acknowledged Solomonoff, and for several years, Solomonoff's work was better known in the Soviet Union than in the Western World. The general consensus in the scientific community, however, was to associate this type of complexity with Kolmogorov, who was more concerned with randomness of a sequence. Algorithmic Probability became associated with Solomonoff, who was focused on prediction - the extrapolation of a sequence.

Later in the same 1960 publication Solomonoff describes his extension of the single-shortest-code theory. This is Algorithmic Probability. He states: "It would seem that if there are several different methods of describing a sequence, each of these methods should be given some weight in determining the probability of that sequence." He then shows how this idea can be used to generate the universal a priori probability distribution and how it enables the use of Bayes rule in inductive inference. Inductive inference, by adding up the predictions of all models describing a particular sequence, using suitable weights based on the lengths of those models, gets the probability distribution for the extension of that sequence. This method of prediction has since become known as Solomonoff induction.
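
In modern notation (again a standard later formulation), the prediction takes the form of a ratio of universal probabilities: the probability that a sequence x continues with the symbol a is

    P(a \mid x) = \frac{M(xa)}{M(x)}

where M is the universal distribution in which every program describing x contributes a weight, shorter programs contributing more.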

He enlarged his theory, publishing a number of reports leading up to the publications in 1964. The 1964 papers give a more detailed description of Algorithmic Probability, and Solomonoff Induction, presenting 5 different models, including the model popularly called the Universal Distribution.

Work History from 1964 to 1984

Other scientists who had been at the 1956 Dartmouth Summer Conference (such as Newell and Simon) were developing the branch of Artificial Intelligence that used fact-based machines governed by if-then rules. Solomonoff was developing the branch of Artificial Intelligence that focused on probability and prediction; his specific view of A.I. described machines governed by the Algorithmic Probability distribution. Such a machine generates theories, together with their associated probabilities, to solve problems, and updates the probability distribution on the theories as new problems and theories develop.

In 1968 he found a proof for the efficacy of Algorithmic Probability, but, mainly because of a lack of general interest at that time, did not publish it until 10 years later. In his report, he published the proof for the convergence theorem.

In the years following his discovery of Algorithmic Probability he focused on how to use this probability and Solomonoff Induction in actual prediction and problem solving for A.I. He also wanted to understand the deeper implications of this probability system.

One important aspect of Algorithmic Probability is that it is complete and incomputable.

In the 1968 report he shows that Algorithmic Probability is complete; that is, if there is any describable regularity in a body of data, Algorithmic Probability will eventually discover that regularity, requiring a relatively small sample of that data. Algorithmic Probability is the only probability system known to be complete in this way. As a necessary consequence of its completeness it is incomputable. The incomputability is because some algorithms - a subset of those that are partially recursive - can never be evaluated fully because it would take too long. But these programs will at least be recognized as possible solutions. On the other hand, any computable system is incomplete. There will always be descriptions outside that system's search space which will never be acknowledged or considered, even in an infinite amount of time. Computable prediction models hide this fact by ignoring such algorithms.
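
A minimal sketch of why the distribution is only approximable from below: dovetail over programs with growing time budgets, crediting 2^(-|p|) whenever a program is observed to output the data. Here interpreter(program, budget) stands for a hypothetical prefix-free universal machine that returns the output produced within budget steps, or None if the program has not yet halted:

    from itertools import product

    def lower_bound_M(x, interpreter, max_len=12, max_budget=1 << 10):
        """Approximate the universal probability M(x) from below.
        The running total only ever increases toward M(x), but no finite
        computation can tell how close it is -- the incomputability
        described above."""
        total = 0.0
        found = set()
        budget = 1
        while budget <= max_budget:                   # growing time budgets
            for length in range(1, max_len + 1):      # all programs up to max_len bits
                for bits in product("01", repeat=length):
                    p = "".join(bits)
                    if p not in found and interpreter(p, budget) == x:
                        found.add(p)                  # credit each program once
                        total += 2.0 ** -length
            budget *= 2
        return total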

In many of his papers he described how to search for solutions to problems and in the 1970s and early 1980s developed what he felt was the best way to update the machine.

The use of probability in A.I., however, did not have a completely smooth path. In the early years of A.I., the relevance of probability was problematic. Many in the A.I. community felt probability was not usable in their work. The area of pattern recognition did use a form of probability, but because there was no broadly based theory of how to incorporate probability in any A.I. field, most fields did not use it at all.

There were, however, researchers such as Judea Pearl and Peter Cheeseman who argued that probability could be used in artificial intelligence.

About 1984, at an annual meeting of the American Association for Artificial Intelligence (AAAI), it was decided that probability was in no way relevant to A.I.

A protest group formed, and the next year there was a workshop at the AAAI meeting devoted to "Probability and Uncertainty in AI." This yearly workshop has continued to the present day.

As part of the protest at the first workshop, Solomonoff gave a paper on how to apply the universal distribution to problems in A.I. This was an early version of the system he continued to develop from that time on.

In that report, he described the search technique he had developed. In search problems, the best order of search is time T_i/P_i, where T_i is the time needed to test the trial and P_i is the probability of success of that trial. He called this the "Conceptual Jump Size" of the problem. Levin's search technique approximates this order, and so Solomonoff, who had studied Levin's work, called this search technique Lsearch.
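
A minimal sketch of this ordering (an illustration of the T_i/P_i rule, not Solomonoff's or Levin's actual implementation): given candidate trials with estimated test times and success probabilities, run them in order of increasing time-to-probability ratio:

    def lsearch_order(trials):
        """Sort trials by conceptual jump size T_i / P_i: estimated test
        time divided by estimated probability of success. trials is a
        list of (name, test_fn, time_estimate, success_probability)."""
        return sorted(trials, key=lambda t: t[2] / t[3])

    def solve(trials):
        """Run each trial in Lsearch order; return the first that succeeds."""
        for name, test_fn, time_est, prob in lsearch_order(trials):
            if test_fn():          # run the (possibly expensive) test
                return name
        return None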

Work History — The Later Years

In other papers he explored how to limit the time needed to search for solutions, writing on resource bounded search. The search space is limited by available time or computation cost rather than by cutting out search space as is done in some other prediction methods, such as Minimum Description Length.

Throughout his career Solomonoff was concerned with the potential benefits and dangers of A.I., discussing it in many of his published reports. In 1985 he analyzed a likely evolution of A.I., giving a formula predicting when it would reach the "Infinity Point". This Infinity Point is an early version of the "Singularity" later made popular by Ray Kurzweil.

Originally algorithmic induction methods extrapolated ordered sequences of strings. Methods were needed for dealing with other kinds of data.

A 1999 report generalizes the Universal Distribution and associated convergence theorems to unordered sets of strings, and a 2008 report to unordered pairs of strings.

In 1997, 2003 and 2006 he showed that incomputability and subjectivity are both necessary and desirable characteristics of any high performance induction system.

In 1970 he formed his own one-man company, Oxbridge Research, and continued his research there except for periods at other institutions such as MIT, the University of Saarland in Germany and the Dalle Molle Institute for Artificial Intelligence in Lugano, Switzerland. In 2003 he was the first recipient of the Kolmogorov Award by the Computer Learning Research Center at Royal Holloway, University of London, where he gave the inaugural Kolmogorov Lecture. Solomonoff was most recently a visiting professor at the CLRC.

In 2006 he spoke at AI@50, the "Dartmouth Artificial Intelligence Conference: the Next Fifty Years", commemorating the fiftieth anniversary of the original Dartmouth summer study group. Solomonoff was one of five original participants to attend.

In Feb. 2008, he gave the keynote address at the Conference "Current Trends in the Theory and Application of Computer Science" (CTTACS), held at Notre Dame University in Lebanon. He followed this with a short series of lectures, and began research on new applications of Algorithmic Probability.

Algorithmic Probability and Solomonoff Induction have many advantages for Artificial Intelligence. Algorithmic Probability gives extremely accurate probability estimates. These estimates can be revised by a reliable method so that they continue to be acceptable. It utilizes search time in a very efficient way. In addition to probability estimates, Algorithmic Probability "has for AI another important value: its multiplicity of models gives us many different ways to understand our data;

A very conventional scientist understands his science using a single 'current paradigm' --- the way of understanding that is most in vogue at the present time. A more creative scientist understands his science in very many ways, and can more easily create new theories, new ways of understanding, when the 'current paradigm' no longer fits the current data".

A description of Solomonoff's life and work prior to 1997 is in "The Discovery of Algorithmic Probability", Journal of Computer and System Sciences, Vol. 55, No. 1, pp. 73-88, August 1997. This paper, as well as most of the others mentioned here, is available on his website at the publications page.

See also

  • Ming Li and Paul Vitanyi, An Introduction to Kolmogorov Complexity and Its Applications. Springer-Verlag, N.Y., 2008, includes historical notes on Solomonoff as well as a description and analysis of his work.
  • Marcus Hutter's Universal Artificial Intelligence
