On benevolence and friendly AI theory

Jame5 is not just a science fiction novel; it is a science fiction novel with a cause. Ensuring the creation of a friendly AI is hard for many reasons:

  • Creating an AGI is hard
  • Goal retention is hard
  • Recursive self-improvement is hard

The question of what friendliness means, however, existed before all of those problems; it is a separate question and needs to be answered before the creation of a friendly AI can be attempted. Coherent Extrapolated Volition, CEV for short, is Eliezer S. Yudkowsky’s take on friendliness.

While CEV is great for describing what a friendly AGI will do, my critique of CEV is that it postpones answering the question of what friendliness specifically is until after we have an AGI that will answer that question for us.

Yes, a successfully implemented friendly AGI will do ‘good’ stuff and act in our ‘best interest’. But what is good, and what is our best interest? In Jame5 I provide a different solution to the friendliness issue and suggest skipping right to the end of chapter 9 for anyone who would like to get straight to the meat.

In addition, I have summarized my core friendliness concepts in a paper called ‘Benevolence – a Materialist Philosophy of Goodness‘ (2007/11/09 UPDATE: latest version here), in which I formulate the following friendly AGI supergoal:

Definitions:

  • Suffering: negative subjective experience equivalent to the subjective departure from an individual’s model of its optimal fitness state as encoded in its genome/memome
  • Growth: absolute increase in an individual’s fitness
  • Joy: positive subjective experience equivalent to the subjective contribution to moving closer towards an individual’s model of its optimal fitness state as encoded in its genome/memome

Derived friendly AGI supergoal: “Minimize all involuntary human suffering, direct all unavoidable suffering towards growth, and reward all voluntary suffering contributing to an individual’s growth with an equal or greater amount of joy.”
