[Note this critique of SIAI's more recent Collective Volition (CV) theory of friendliness.]

Critique of the SIAI Guidelines on Friendly AI

Bill Hibbard     9 May 2003

This critique refers to the following Singularity Institute documents:

1. The SIAI analysis fails to recognize the importance of the political process in creating safe AI.

This is a fundamental error in the SIAI analysis. CFAI 4.2.1 says "If an effort to get Congress to enforce any set of regulations were launched, I would expect the final set of regulations adopted to be completely unworkable." It further says that government regulation of AI is unnecessary because "The existing force tending to ensure Friendliness is that the most advanced projects will have the brightest AI researchers, who are most likely to be able to handle the problems of Friendly AI." History vividly teaches the danger of trusting the good intentions of individuals.

The singularity will completely change power relations in human society. People and institutions that currently have great power and wealth will know this, and will try to manipulate the singularity to protect and enhance their positions. The public generally protects its own interests against the narrow interests of such powerful people and institutions via widespread political movements and the actions of democratically elected government. Such political action has never been more important than it will be in the singularity.

The reinforcement learning values of the largest (and hence most dangerous) AIs will be defined by the corporations and governments that build them, not by the AI researchers working for those organizations. Those organizations will give their AIs values that reflect the organizations' values: profits in the case of corporations, and political and military power in the case of governments. Only a strong public movement driving government regulation will be able to coerce these organizations to design AI values to protect the interests of all humans. This government regulation must include an agency to monitor AI development and enforce regulations.

The breakthrough ideas for achieving AI will come from individual researchers, many of whom will want their AI to serve the broad human interest. But their breakthrough ideas will become known to wealthy organizations. Their research will be in the public domain, be done for hire by wealthy organizations, or be sold to such organizations. Breakthrough research may simply be seized by governments and the researchers prohibited from publishing, as was done for research on effective cryptography during the 1970s. The most powerful AIs won't exist on the $5,000 computers on researchers' desktops, but on the $5,000,000,000 computers owned by wealthy organizations. The dangerous AIs will be the ones capable of developing close personal relations with huge numbers of people. Such AIs will be operated by wealthy organizations, not individuals.

Individuals working toward the singularity may resist regulation as interference with their research, as was evident in the SL4 discussion of testimony before Congressman Brad Sherman's committee. But such regulation will be necessary to coerce the wealthy organizations that will own the most powerful AIs. These will be much like the regulations that restrain powerful organizations from building dangerous products (cars, household chemicals, etc.), polluting the environment, and abusing citizens.

2. The design recommendations in GUIDELINES 3 fail to define rigorous standards for "changes to supergoal content" in recommendation 3, for "valid" and "good" in recommendation 4, for "programmers' intentions" in recommendation 5, and for "mistaken" in recommendation 7.

These recommendations are about the AI learning its own supergoal. But even digging into corresponding sections of CFAI and FEATURES fails to find rigorous standards for defining critical terms in these recommendations. Determination of their meanings is left to "programmers" or the AI itself. Without rigorous standards for these terms, wealthy organizations constructing AIs will be free to define them in any way that serves their purposes and hence to construct AIs that serve their narrow interests rather than the general public interest.

3. CFAI defines "friendliness" in a way that can only be determined by an AI after it has developed super-intelligence, and fails to define rigorous standards for the values that guide its learning until it reaches super-intelligence.

The actual definition of "friendliness" in CFAI 3.4.4 requires the AI to know most humans sufficiently well to decompose their minds into "panhuman", "gaussian" and "personality" layers, and to "converge to normative altruism" based on collective content of the "panhuman" and "gaussian" layers. This will require the development of super-intelligence over a large amount of learning. The definition of friendliness values to reinforce that learning is left to "programmers". As in the previous point, this will allow wealthy organizations to define initial learning values for their AIs as they like.

4. The CFAI analysis is based on a Bayesian reasoning model of intelligence, which is not a sufficient model for producing intelligence.

While Bayesian reasoning has an important role in intelligence, it is not sufficient. Sensory experience and reinforcement learning are fundamental to intelligence. Just as symbols must be grounded in sensory experience, reasoning must be grounded in learning and emerges from it because of the need to solve the credit assignment problem, as discussed at:

Effective and general reinforcement learning requires simulation models of the world, and sets of competing agents. Furthermore, intelligence requires a general ability to extract patterns from sense data and internal information. An analysis of safe AI should be based on a sufficient model of intelligence.

[As an example to make this criticism clear, design recommendation 5 in GUIDELINES 3 says "Causal validity semantics requires that the AI model the causal process that led to the AI's creation and ..." But an adequate model of intelligence would include modelling the world. Recommendation 5 puts this modelling requirement in the friendliness model because of the inadequacy of the CFAI intelligence model.]

I offer an alternative analysis of producing safe AI in my book at


This critique was posted as a message to the AGI and SL4 mailing lists.

Here is the response from the SIAI and my reply back to the SIAI.

Here is a defense that the SIAI guidelines are about minds that derive their own values and my response that values can only be derived from values.

There followed a lively discussion that can be found on the SL4 mailing list archive.

Ben Goertzel suggested that a key point in the discussion was differences about how quickly the singularity would develop, and here is my response, which included answers to many of the questions raised during the discussion.

The SIAI requested this draft enforceable regulation for safe AI, along with a clarification.

One thing that stands out in the debate with the SIAI is their contempt for the democratic process, based on contempt for the general public. This reminds me of debates with Marxists during the 1960s and 1970s. For example, here is a message from Samantha Atkins and another from Eliezer Yudkowsky. There is a real irony in movements that claim to be in humanity's best interests, but have contempt for ordinary humans. History teaches that when such movements get power, they always lead to human misery. If the general public doesn't assert its interests via democratic processes, then the singularity will certainly serve narrow interests.