From test@demedici.ssec.wisc.edu Mon Jun 5 13:13:38 2006
Date: Mon, 5 Jun 2006 13:01:05 -0500 (CDT)
From: Bill Hibbard
Reply-To: sl4@sl4.org
To: sl4@sl4.org
Cc: wta-talk@transhumanism.org, extropy-chat@lists.extropy.org, agi@v2.listbox.com
Subject: Re: Two draft papers: AI and existential risk; heuristics and biases

Eliezer,

> These are drafts of my chapters for Nick Bostrom's forthcoming edited
> volume _Global Catastrophic Risks_. I may not have much time for
> further editing, but if anyone discovers any gross mistakes, then
> there's still time for me to submit changes.
>
> The chapters are:
> . . .
> _Artificial Intelligence and Global Risk_
> http://singinst.org/AIRisk.pdf
> The new standard introductory material on Friendly AI. Any links to
> _Creating Friendly AI_ should be redirected here.

In Section 6.2 you quote my ideas, written in 2001, for hard-wiring recognition of expressions of human happiness as values for super-intelligent machines. I have three problems with your critique:

1. Immediately after my quote you discuss problems with neural network experiments by the US Army. But I never said that hard-wired learning of recognition of expressions of human happiness should be done using neural networks like those used by the Army. You are conflating my idea with another, and then explaining how the other failed.

2. In your Section 6.2 you write:

  If an AI "hard-wired" to such code possessed the power - and [Hibbard,
  B. 2001. Super-intelligent machines. ACM SIGGRAPH Computer Graphics,
  35(1).] spoke of superintelligence - would the galaxy end up tiled with
  tiny molecular pictures of smiley-faces?

When it is feasible to build a super-intelligence, it will be feasible to build hard-wired recognition of "human facial expressions, human voices and human body language" (to use the words of mine that you quote) that exceeds the recognition accuracy of current humans such as you and me, and will certainly not be fooled by "tiny molecular pictures of smiley-faces." You should not assume so poor an implementation of my idea that it cannot make discriminations that are trivial for current humans.

3. I have moved beyond my idea of hard-wired recognition of expressions of human emotions, and you should critique my recent ideas where they supersede my earlier ones. In my 2004 paper:

  Reinforcement Learning as a Context for Integrating AI Research,
  Bill Hibbard, 2004 AAAI Fall Symposium on Achieving Human-Level
  Intelligence through Integrated Systems and Research
  http://www.ssec.wisc.edu/~billh/g/FS104HibbardB.pdf

I say:

  Valuing human happiness requires abilities to recognize humans and to
  recognize their happiness and unhappiness. Static versions of these
  abilities could be created by supervised learning. But given the
  changing nature of our world, especially under the influence of
  machine intelligence, it would be safer to make these abilities
  dynamic. This suggests a design of interacting learning processes.
  One set of processes would learn to recognize humans and their
  happiness, reinforced by agreement from the currently recognized set
  of humans. Another set of processes would learn external behaviors,
  reinforced by human happiness according to the recognition criteria
  learned by the first set of processes. This is analogous to humans,
  whose reinforcement values depend on expressions of other humans,
  where the recognition of those humans and their expressions is
  continuously learned and updated.
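A rough illustration of this design of interacting learning processes, as a minimal Python sketch of the two sets of processes and one training step; the class names, update rules, and constants below are placeholders chosen for illustration here, not details from the 2004 paper:

# Illustrative only: names, update rules, and constants are placeholders,
# not details from the 2004 paper.

import random


class HumanRecognizer:
    """First set of processes: learns to recognize humans and their
    happiness, reinforced by agreement from the currently recognized
    set of humans."""

    def __init__(self, threshold=0.5, rate=0.05):
        self.threshold = threshold  # placeholder recognition criterion
        self.rate = rate

    def is_human(self, percept):
        # Stand-in for a learned perceptual model.
        return percept.get("human_score", 0.0) > self.threshold

    def happiness(self, percept):
        # Happiness only counts when the percept is recognized as a human.
        return percept.get("happiness", 0.0) if self.is_human(percept) else 0.0

    def update(self, percept, agreement):
        # Stand-in for learning: adjust the criterion when the currently
        # recognized humans disagree with this judgment (agreement in [0, 1]).
        if agreement < 0.5:
            self.threshold += self.rate if self.is_human(percept) else -self.rate


class BehaviorLearner:
    """Second set of processes: learns external behaviors, reinforced by
    human happiness as judged by the recognizer's learned criteria."""

    def __init__(self, actions, rate=0.1, explore=0.1):
        self.values = {a: 0.0 for a in actions}
        self.rate = rate
        self.explore = explore

    def act(self):
        # Epsilon-greedy choice over learned action values.
        if random.random() < self.explore:
            return random.choice(list(self.values))
        return max(self.values, key=self.values.get)

    def update(self, action, reward):
        self.values[action] += self.rate * (reward - self.values[action])


def training_step(recognizer, learner, environment_step, human_agreement):
    """One interaction in which both sets of processes keep learning."""
    action = learner.act()
    percept = environment_step(action)                 # the world's response
    reward = recognizer.happiness(percept)             # recognized human happiness
    learner.update(action, reward)                     # behaviors reinforced by that happiness
    recognizer.update(percept, human_agreement(percept))  # recognition reinforced by human agreement

The point of the sketch is only that the recognition criteria and the behaviors are learned by separate, continuously updated processes, each reinforcing the other, rather than the recognition being frozen at design time.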
And I further clarify and update my ideas in a 2005 on-line paper:

  The Ethics and Politics of Super-Intelligent Machines
  http://www.ssec.wisc.edu/~billh/g/SI_ethics_politics.doc

Please adjust your discussion of my ideas to:

1. Not conflate my ideas with others.
2. Not assume a poor implementation of my ideas.
3. Not critique my old ideas when they have been replaced by newer ideas in my publications.

Thank you,
Bill