From test@demedici.ssec.wisc.edu Mon May 12 05:19:26 2003
Date: Sun, 11 May 2003 18:14:37 -0500 (CDT)
From: Bill Hibbard
Reply-To: agi@v2.listbox.com
To: sl4@sl4.org, agi@v2.listbox.com
Subject: [agi] Re: SIAI's flawed friendliness analysis

On Sat, 10 May 2003, Eliezer S. Yudkowsky wrote:

> Bill Hibbard wrote:
> > This critique refers to the following documents:
> >
> > GUIDELINES: http://www.singinst.org/friendly/guidelines.html
> > FEATURES: http://www.singinst.org/friendly/features.html
> > CFAI: http://www.singinst.org/CFAI/index.html
> >
> > 1. The SIAI analysis fails to recognize the importance of
> > the political process in creating safe AI.
> >
> > This is a fundamental error in the SIAI analysis. CFAI 4.2.1
> > says "If an effort to get Congress to enforce any set of
> > regulations were launched, I would expect the final set of
> > regulations adopted to be completely unworkable." It further
> > says that government regulation of AI is unnecessary because
> > "The existing force tending to ensure Friendliness is that
> > the most advanced projects will have the brightest AI
> > researchers, who are most likely to be able to handle the
> > problems of Friendly AI." History vividly teaches the danger
> > of trusting the good intentions of individuals.
>
> ...and, of course, the good intentions and competence of governments.

Absolutely. I never claimed that AI safety is a sure thing. But without a broad political movement for safe AI and its success in elective democratic government, unsafe AI is a sure thing.

> . . .
> Your political recommendations appear to be based on an extremely
> different model of AI. Specifically:
>
> 1) "AIs" are just very powerful tools that amplify the short-term goals
> of their users, like any other technology.

I never said "short-term" - you are putting words into my mouth. A key property of intelligence is understanding the long-term effects of behavior on satisfying values (goals).

> 2) AIs have power proportional to the computing resources invested in
> them, and everyone has access to pretty much the same theoretical model
> and class of AI.

AI power does depend on computing resources and the efficiency of algorithms. Important algorithms have proved impossible to keep secret for any length of time. Whether or not this continues in the future, essential algorithms for intelligence will not be secret from powerful organizations.

> 3) There is no seed AI, no rapid recursive self-improvement, no hard
> takeoff, no "first" AI. AIs are just new forces in existing society,
> coming into play a bit at a time, as everyone's AI technology improves at
> roughly the same rate.

Where did this come from? I am very clear in my book about the importance of proper training for young AIs, and the issues involved in AI evolution.

> 4) Anyone can make an AI that does anything. AI morality is an easy
> problem with fully specifiable arbitrary solutions that are reliable and
> humanly comprehensible.

I never said building safe AI is easy.

> 5) Government workers can look at an AI design and tell what the AI's
> morality does and whether it's safe.

We certainly expect government workers to regulate nuclear energy designs and operations to ensure their safety. And because of their doubts about safety, people in the US have decided through their democratic political process to stop building new nuclear energy plants. Nowhere do I claim that regulation of safe AI will be simple.
But if we don't have government workers implementing regulation of AI under democratic political control, then we will have unsafe AIs.

> 6) There are variables whose different values correlate to socially
> important differences in outcomes, such that government workers can
> understand the variables and their correlation to the outcomes, and such
> that society expects to have a conflict of interest with individuals or
> organizations as to the values of those variables, with the value to
> society of this conflict of interest exceeding the value to society of the
> outcome differentials that depend on the greater competence of those
> individuals or organizations. Otherwise there's nothing worth voting on.

There will be organizations with motives to build AIs with values that will correlate with important social differences in outcome. AIs with values to maximize profits may end up empowering their owners at everyone else's expense. AIs with values for military victory may end up killing lots of people.

> I disagree with all six points, due to a different model of AI.

I think we do have different models of AI. I think an AI is an information process that has some values that it tries to satisfy (positive values) and avoid (negative values). It does this via reinforcement learning and a simulation model of the world that it uses to solve the credit assignment problem (i.e., to understand the long-term consequences of its behaviors on its values). Of course, actually doing this in general circumstances is very difficult, requiring pattern recognition to greatly reduce the volume of sensory information, and the equivalent of human conscious thought to reflect on situations and find analogies.

The SIAI guidelines involve digging into the AI's reflective thought process and controlling the AI's thoughts, in order to ensure safety. My book says that the only concern for AI learning and reasoning is to ensure they are accurate, and that the teachers of young AIs should be well-adjusted people (subject to public monitoring and the same kind of screening used for people who control major weapons). Beyond that, the proper domain for ensuring AI safety is the AI's values rather than the AI's reflective thought processes.

In my second and third points I described the lack of rigorous standards for certain terms in the SIAI Guidelines and for initial AI values. Those rigorous standards can only come from the AI's values. I think that in your AI model you feel the need to control how those standards are derived via the AI's reflective thought process. This is the wrong domain for addressing AI safety.

Clear and unambiguous initial values are elaborated in the learning process, forming connections via the AI's simulation model with many other values. Human babies love their mothers based on simple values about touch, warmth, milk, smiles and sounds (happy Mother's Day). But as the baby's mind learns, those simple values get connected to a rich set of values about the mother, via a simulation model of the mother and surroundings. This elaboration of simple values will happen in any truly intelligent AI.

I think initial AI values should be for simple measures of human happiness. As the AI develops these will be elaborated into a model of long-term human happiness, and connected to many derived values about what makes humans happy generally and particularly. The subtle point is that this links AI values with human values, and enables AI values to evolve as human values evolve.
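To make this model of AI concrete, here is a toy sketch in Python (purely illustrative, not code from my book or from the SIAI documents; the state names, the "smiles" reward signal, and the constants are all invented for the example). The agent holds simple values as rewards attached to states, learns a crude simulation model of the world from experience, and uses that model for credit assignment, so that states which merely lead toward valued states acquire derived values of their own.

# Toy sketch only: a value-driven agent with a learned world model used for
# credit assignment. All state names, rewards and constants are invented.

from collections import defaultdict

class ModelBasedAgent:
    def __init__(self, actions, gamma=0.9, sweeps=50):
        self.actions = actions
        self.gamma = gamma        # weight on long-term consequences
        self.sweeps = sweeps      # value-propagation passes over the model
        self.reward = {}          # simple values: state -> observed reward
        self.model = defaultdict(lambda: defaultdict(int))  # (s, a) -> {s': count}
        self.V = defaultdict(float)                         # derived (elaborated) values

    def observe(self, state, action, reward, next_state):
        # Record experience: the simple value signal plus the simulation model.
        self.reward[state] = reward
        self.model[(state, action)][next_state] += 1

    def _expected_value(self, state, action):
        counts = self.model[(state, action)]
        total = sum(counts.values())
        if total == 0:
            return None
        return sum(n / total * self.V[s2] for s2, n in counts.items())

    def elaborate_values(self):
        # Credit assignment: propagate simple rewards backward through the
        # learned model, so states that lead toward rewarding states also
        # acquire value.
        states = set(self.reward) | {s for (s, _a) in self.model}
        for _ in range(self.sweeps):
            for s in states:
                futures = [self._expected_value(s, a) for a in self.actions]
                futures = [v for v in futures if v is not None]
                self.V[s] = self.reward.get(s, 0.0) + self.gamma * max(futures, default=0.0)

    def act(self, state):
        # Choose the action whose modeled consequences have the highest derived value.
        return max(self.actions,
                   key=lambda a: self._expected_value(state, a) or 0.0)

# Hypothetical use, with smiles as a crude measure of human happiness:
agent = ModelBasedAgent(actions=["wave", "speak"])
agent.observe("person_frowning", "speak", 0.0, "person_smiling")
agent.observe("person_smiling", "wave", 1.0, "person_smiling")
agent.elaborate_values()
print(agent.act("person_frowning"))   # "speak": its model says that leads to smiles

A real intelligence needs vastly more than this tabular toy - pattern recognition, reflection, analogy - but the division of labor is the point: safety lives in the simple values, and the elaboration into derived values is just intelligence doing its job.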
We do see a gradual evolution of human values, and the singularity will accelerate it. Morality has its roots in values, especially social values for shared interests. Complex moral systems are elaborations of such values via learning and reasoning. The right place to control an AI's moral system is in its values. All we can do for an AI's learning and reasoning is make sure they are accurate and efficient.

> . . .
> > 3. CFAI defines "friendliness" in a way that can only
> > be determined by an AI after it has developed super-
> > intelligence, and fails to define rigorous standards
> > for the values that guide its learning until it reaches
> > super-intelligence
> >
> > The actual definition of "friendliness" in CFAI 3.4.4
> > requires the AI to know most humans sufficiently well
> > to decompose their minds into "panhuman", "gaussian" and
> > "personality" layers, and to "converge to normative
> > altruism" based on collective content of the "panhuman"
> > and "gaussian" layers. This will require the development
> > of super-intelligence over a large amount of learning.
> > The definition of friendliness values to reinforce that
> > learning is left to "programmers". As in the previous
> > point, this will allow wealthy organizations to define
> > initial learning values for their AIs as they like.
>
> I don't believe a young Friendly AI should be meddling in the real world
> at all. If for some reason this becomes necessary, it might as well do
> what the programmer says, maybe with its own humane veto. I'd trust a
> programmer more than I'd trust an infant Friendly AI, because regardless
> of its long-term purpose, during infancy the FAI is likely to have neither
> a better approximation to humaneness, nor a better understanding of the
> real world.

I agree that young AIs should have limited access to senses and actions. But in order to "converge to normative altruism" based on collective content of the "panhuman" and "gaussian" layers as described in CFAI 3.4.4, the AI is going to need access to large numbers of humans.

> . . .
> > 4. The CFAI analysis is based on a Bayesian reasoning
> > model of intelligence, which is not a sufficient model
> > for producing intelligence.
> >
> > While Bayesian reasoning has an important role in
> > intelligence, it is not sufficient. Sensory experience
> > and reinforcement learning are fundamental to
> > intelligence. Just as symbols must be grounded in
> > sensory experience, reasoning must be grounded in
> > learning and emerges from it because of the need to
> > solve the credit assignment problem, as discussed at:
> >
> > http://www.mail-archive.com/agi@v2.listbox.com/msg00390.html
>
> Non-Bayesian? I don't think you're going to find much backing on this
> one. If you've really discovered a non-Bayesian form of reasoning, write
> it up and collect your everlasting fame. Personally I consider such a
> thing almost exactly analogous to a perpetual motion machine. Except that
> a perpetual motion machine is merely physically impossible, while
> "non-Bayesian reasoning" appears to be mathematically impossible. Though
> of course I could be wrong.

I never said "non-Bayesian", although I find Pei's and Ben's examples of non-Bayesian logic in their systems interesting. What I really meant by my fourth point is that because your model of intelligence is incomplete, there are things in your model of friendliness that really belong in your model of intelligence.
For example, recommendation 5 from GUIDELINES 3 "requires that the AI model the causal process that led to the AI's creation and that the AI use its existing cognitive complexity (or programmer assistance) to make judgements about the validity or invalidity of factors in that causal process." Any sufficiently intelligent brain will have a simulation model of the world that includes the events that led to its creation, and will make value judgements about those events. The failure to do so would be a failure of intelligence rather than a failure of safety. I think this confusion between model of intelligence and model of safety leads to the difficulty of finding rigorous standards for terms described in my second point, and the difficulty of finding initial values described in my third point.

> Reinforcement learning emerges from Bayesian reasoning, not the other way
> around. Sensory experience likewise.
>
> For more about Bayesian reasoning, see:
> http://yudkowsky.net/bayes/bayes.html
> http://bayes.wustl.edu/etj/science.pdf.html
>
> Reinforcement, specifically, emerges in a Bayesian decision system:
> http://singinst.org/CFAI/design/clean.html#reinforcement

This describes a Bayesian mechanism for reinforcement learning, but does not show that reinforcement learning emerges from Bayesian reasoning. In fact, learning precedes reasoning in brain evolution. Reasoning (i.e., a simulation model of the world) evolved to solve the credit assignment problem of learning.

----------------------------------------------------------
Bill Hibbard, SSEC, 1225 W. Dayton St., Madison, WI 53706
test@demedici.ssec.wisc.edu  608-263-4427  fax: 608-263-6738
http://www.ssec.wisc.edu/~billh/vis.html