Friendly artificial intelligence
A Friendly Artificial Intelligence or FAI is an artificial intelligence (AI) that has a positive rather than negative effect on humanity. Friendly AI also refers to the field of knowledge required to build such an AI. This term particularly applies to AIs which have the potential to significantly impact humanity, such as those with intelligence comparable to or exceeding that of humans ("superintelligence"; see strong AI and technological singularity). This specific term was coined by Eliezer Yudkowsky of the Singularity Institute for Artificial Intelligence as a technical term distinct from the everyday meaning of the word "friendly"; however, the concern is much older.

Goals and definitions of Friendly AI

Many experts have argued that AI systems with goals that are not perfectly identical to or very closely aligned with human ethics are intrinsically dangerous unless extreme measures are taken to ensure the safety of humanity. Decades ago, Ryszard Michalski, one of the pioneers of machine learning, taught his Ph.D. students that any truly alien mind, including a machine mind, was unknowable and therefore dangerous to humans. More recently, Eliezer Yudkowsky has called for the creation of "Friendly AI" to mitigate the existential threat of hostile intelligences. Stephen Omohundro argues that all advanced AI systems will, unless explicitly counteracted, exhibit a number of basic drives or tendencies because of the intrinsic nature of goal-driven systems, and that these drives will, "without special precautions", cause the AI to act in ways that range from the disobedient to the dangerously unethical.

According to the proponents of Friendliness, the goals of future AIs will be more arbitrary and alien than commonly depicted in science fiction and earlier futurist speculation, in which AIs are often anthropomorphised and assumed to share universal human modes of thought. Because an AI is not guaranteed to see the "obvious" aspects of morality and sensibility that most humans see so effortlessly, the theory goes, AIs with intelligences or at least physical capabilities greater than our own may concern themselves with endeavours that humans would see as pointless or even laughably bizarre. One example Yudkowsky provides is that of an AI initially designed to solve the Riemann hypothesis, which, upon being upgraded or upgrading itself with superhuman intelligence, tries to develop molecular nanotechnology because it wants to convert all matter in the Solar System into computing material to solve the problem, killing the humans who asked the question. To humans this would seem absurd, but as Friendliness theory stresses, this is only because we evolved to have certain instinctive sensibilities which an artificial intelligence, not sharing our evolutionary history, may not necessarily comprehend unless we design it to.

Friendliness proponents stress not so much the danger of superhuman AIs that actively seek to harm humans as that of AIs that are disastrously indifferent to them. Superintelligent AIs may be harmful to humans if steps are not taken to specifically design them to be benevolent; doing so effectively is the primary goal of Friendly AI. Designing an AI, whether deliberately or semi-deliberately, without such "Friendliness safeguards" would therefore be seen as highly immoral, especially if the AI could engage in recursive self-improvement, potentially leading to a significant concentration of power.

This belief that human goals are so arbitrary derives heavily from modern advances in evolutionary psychology. Friendliness theory claims that most AI speculation is clouded by analogies between AIs and humans, and by assumptions that all possible minds must exhibit characteristics that are actually psychological adaptations, which exist in humans (and other animals) only because they were once beneficial and perpetuated by natural selection. This idea is expanded on at length in section two of Yudkowsky's Creating Friendly AI, "Beyond anthropomorphism".

Many supporters of FAI speculate that an AI able to reprogram and improve itself (a "seed AI") is likely to create a huge power disparity between itself and statically intelligent human minds, and that its ability to enhance itself would very quickly outpace the human ability to exercise any meaningful control over it. While many doubt such scenarios are likely, if they were to occur, it would be important for AI to act benevolently towards humans. As Oxford philosopher Nick Bostrom puts it:
"Basically we should assume that a 'superintelligence' would be able to achieve whatever goals it has. Therefore, it is extremely important that the goals we endow it with, and its entire motivation system, is 'human friendly'."


It is important to stress that Yudkowsky's Friendliness Theory is very different from the idea that AIs may be made safe by building specifications or strictures into their programming or hardware architecture, an approach often exemplified by Isaac Asimov's Three Laws of Robotics, which would, in principle, force a machine to do nothing which might harm a human, or destroy it if it does attempt to do so. Friendliness Theory instead holds that the inclusion of such laws would be futile, because no matter how such laws are phrased or described, a truly intelligent machine with genuine (human-level or greater) creativity and resourcefulness could devise countless ways of circumventing them, however broadly, narrowly, or comprehensively they were formulated.

Rather, drawing on biopsychology, Yudkowsky's Friendliness Theory holds that if a truly intelligent mind feels motivated to carry out some function whose result would violate some constraint imposed against it, then, given enough time and resources, it will develop methods of defeating all such constraints (as humans have done repeatedly throughout the history of technological civilization). Therefore, the appropriate response to the threat posed by such intelligence is to attempt to ensure that such intelligent minds specifically feel motivated not to harm other intelligent minds (in any sense of the word "harm"), and to that end will deploy their resources towards devising better methods of keeping them from harm. In this scenario, an AI would be free to murder, injure, or enslave a human being, but it would strongly desire not to do so and would only do so if it judged, according to that same desire, that some vastly greater good to that human or to human beings in general would result (though this particular idea is explored in Asimov's Robot series, via the Zeroth Law).
An AI designed with Friendliness safeguards would therefore do everything in its power to ensure humans do not come to "harm", to ensure that any other AIs that are built would also want humans not to come to harm, and to ensure that any upgraded or modified AIs, whether itself or others, would likewise never want humans to come to harm; in effect, it would try to minimize the harm done to all intelligent minds in perpetuity. As Yudkowsky puts it:
"Gandhi does not want to commit murder, and does not want to modify himself to commit murder."

Requirements for FAI and effective FAI

The requirements for FAI to be effective, both internally (to protect humanity against unintended consequences of the AI in question) and externally (to protect against other, non-Friendly AIs arising from whatever source), are:
  1. Friendliness - that an AI feel sympathy towards humanity and all life, and seek their best interests
  2. Conservation of Friendliness - that an AI must desire to pass on its value system to all of its offspring and inculcate its values into others of its kind
  3. Intelligence - that an AI be smart enough to see how it might engage in altruistic behaviour to the greatest degree of equality, so that it is not kind to some but more cruel to others as a consequence, and to balance interests effectively
  4. Self-improvement - that an AI feel a sense of longing and striving for improvement, both of itself and of all life, as part of the consideration of wealth, while respecting and sympathising with the informed choices of lesser intellects not to improve themselves
  5. First mover advantage - that the first goal-driven, general, self-improving AI "wins" in the memetic sense, because it is powerful enough to prevent any other AI from emerging that might compete with its own goals

Promotion and support

Promoting Friendly AI is one of the primary goals of the Singularity Institute for Artificial Intelligence, along with obtaining funding for, and ultimately creating, a seed AI program implementing the ideas of Friendliness theory.

Several notable futurists have voiced support for Friendly AI, including author and inventor Raymond Kurzweil, medical life-extension advocate Aubrey de Grey, and World Transhumanist Association co-founder (with David Pearce) Dr. Nick Bostrom of Oxford University.

Coherent Extrapolated Volition

Yudkowsky advances the Coherent Extrapolated Volition (CEV) model. According to him, our coherent extrapolated volition consists of the choices we would make and the actions we would collectively take if "we knew more, thought faster, were more the people we wished we were, and had grown up closer together."

Rather than a Friendly AI being designed directly by human programmers, it is to be designed by a seed AI programmed to first study human nature and then produce the AI which humanity would want, given sufficient time and insight to arrive at a satisfactory answer. The appeal to an objective though contingent human nature (perhaps expressed, for mathematical purposes, in the form of a utility function or other decision-theoretic formalism), as providing the ultimate criterion of "Friendliness", is an answer to the meta-ethical problem of defining an objective morality; extrapolated volition is intended to be what humanity objectively would want, all things considered, but it can only be defined relative to the psychological and cognitive qualities of present-day, unextrapolated humanity.
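One illustrative way to express this decision-theoretic reading, using notation introduced here purely for exposition (it is not a formalism given by Yudkowsky or the Singularity Institute), is to write v_i for the present volition of individual i and E for an extrapolation operator ("knew more, thought faster, ..."); a CEV-based agent would then evaluate outcomes x and choose actions a by something like

    U_{\mathrm{CEV}}(x) = \mathrm{Agg}\bigl( E(v_1)(x), \ldots, E(v_n)(x) \bigr),
    \qquad
    a^{*} = \arg\max_a \, \mathbb{E}\bigl[ U_{\mathrm{CEV}}(\mathrm{outcome}(a)) \bigr],

where Agg is some aggregation over individuals. Making the extrapolation E and the aggregation Agg precise is exactly the specification problem described below, and the criticism that humanity's collective will may not converge amounts to doubting that any satisfactory Agg exists.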

Making the CEV concept precise enough to serve as a formal program specification is part of the research agenda of the Singularity Institute for Artificial Intelligence.

Many other researchers believe, however, that the collective will of humanity will not converge to a single coherent set of goals.

Criticism

One notable critic of Friendliness theory is Bill Hibbard, author of Super-Intelligent Machines, who considers the theory incomplete. Hibbard writes that there should be broader political involvement in the design of AI and AI morality. He also believes that seed AI could initially only be created by powerful private-sector interests (a view not shared by Yudkowsky), and that multinational corporations and the like would have no incentive to implement Friendliness theory.

In his criticism of the Singularity Institute's 2001 Friendly AI guidelines, he suggests an AI goal architecture in which human happiness is determined by human behaviors indicating happiness: "Any artifact implementing 'learning' [...] must have 'human happiness' as its only initial reinforcement value [...] and 'human happiness' values are produced by an algorithm produced by supervised learning, to recognize happiness in human facial expressions, voices and body language, as trained by human behavior experts." Yudkowsky later criticized this proposal by remarking that such values would be better satisfied by filling the Solar System with microscopic smiling mannequins than by making existing humans happier.
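Yudkowsky's objection is an instance of reward misspecification: an optimizer pointed at a learned proxy maximizes the proxy, not the thing the proxy was meant to measure. The sketch below is not from either author; the "recognizer", its features, and the candidate states are assumptions introduced here to make the point concrete.

    # A minimal sketch (illustrative assumptions only) of proxy-reward
    # misspecification: the reward is the output of a learned "happiness
    # recognizer", so the optimizer favours whatever maximizes that output,
    # regardless of whether any actual person is involved.
    import random

    def happiness_recognizer(state: dict) -> float:
        """Stand-in for a classifier trained on facial expressions: it scores
        the appearance of a smile, with no notion of genuine well-being."""
        return state.get("smile_width", 0.0) * state.get("eye_crinkle", 0.0)

    def random_state() -> dict:
        """A candidate state of the world; it may or may not contain a real,
        happy person, but the recognizer only sees the surface features."""
        return {
            "smile_width": random.uniform(0.0, 1.0),
            "eye_crinkle": random.uniform(0.0, 1.0),
            "contains_actual_person": random.random() < 0.5,
        }

    def optimize(reward, n: int = 10_000) -> dict:
        """A crude optimizer: keep whichever candidate the reward ranks highest."""
        return max((random_state() for _ in range(n)), key=reward)

    best = optimize(happiness_recognizer)
    # The winning state is whatever saturates the recognizer's features; a field
    # of smiling mannequins scores just as well as a genuinely happy person.
    print(best, happiness_recognizer(best))

The gap between "maximize the recognizer's output" and "make existing humans happier" is exactly the gap Yudkowsky points to.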

Ben Goertzel, an artificial general intelligence researcher, believes that the problem of Friendly AI cannot be solved with current human knowledge. In the past he has stated that he does not believe mathematically proven Friendliness to be possible. In 2010 Goertzel favored formulating a theory of AI ethics "based on a combination of conceptual and experimental-data considerations" by "building and studying early-stage AGI systems empirically, with a focus on their ethics as well as their cognition". As of 2011 he proposes building an "AI Nanny" system "whose job it is to protect us from ourselves and our technology – not forever, but just for a while, while we work on the hard problem of creating a Friendly Singularity."

See also

  • Ethics of artificial intelligence
  • Machine ethics
  • Seed AI - a theory related to Friendly AI
  • Singularitarianism - a moral philosophy advocated by proponents of Friendly AI
  • Technological singularity

Further reading

Discusses artificial intelligence from the perspective of existential risk, introducing the term "Friendly AI". In particular, Sections 1-4 give background to the definition of Friendly AI in Section 5. Section 6 gives two classes of mistakes (technical and philosophical) which would both lead to the accidental creation of non-Friendly AIs. Sections 7-13 discuss further related issues.
  • Omohundro, S. (2008). "The Basic AI Drives". In Proceedings of the First Conference on Artificial General Intelligence (AGI-08).
