Philosophy, Computing, and Artificial Intelligence
PHI 319. An Extension to the Logic Programming/Agent Model.
Computational Logic and Human Thinking
Chapter 8 (135-139), Chapter 12 (171-181)
The Problem of Machine Ethics
We can think of machine ethics as the problem of how to give "ethics" to machines. This requires us to know what we are giving machines when we give them "ethics." If we do not know what this is, we will not know what counts as a solution to the problem.
One way to approach the issue of what we are giving machines when we give them "ethics" is to think about the terms in English we use to appraise behavior: 'right,' 'wrong,' and 'obligatory.'
We can treat these terms as interdefinable in a certain way. That is to say, we can use any one of them to define the other two. So, for example, if we take 'right' as primitive, we can define 'wrong' as what is "not right" and define 'obligatory' as what is "not right not to do."
Other common terms of moral appraisal are definable in terms of the basic three. So, for example, 'permitted' means "right." Similarly, 'forbidden' and 'prohibited' mean "wrong."
Put a little more formally, the two definitions are
• an act is wrong iff it is not right.
• an act is obligatory iff it is not right not to do.
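The two definitions can be sketched in code. This is only an illustration: the table RIGHT, the helper not_doing, and the sample acts are hypothetical and are not part of the reading. The point is just that once 'right' is fixed, 'wrong' and 'obligatory' follow.

```python
# Hypothetical sketch: 'right' is primitive; 'wrong' and 'obligatory' are defined from it.

# Assumed toy table saying which acts are right (illustrative values only).
RIGHT = {
    "keep promise": True,
    "not keep promise": False,
    "take a walk": True,
    "not take a walk": True,
}

def not_doing(act: str) -> str:
    """Name of the omission of an act (an assumed naming convention)."""
    return act[4:] if act.startswith("not ") else "not " + act

def is_right(act: str) -> bool:
    return RIGHT[act]

def is_wrong(act: str) -> bool:
    # an act is wrong iff it is not right
    return not is_right(act)

def is_obligatory(act: str) -> bool:
    # an act is obligatory iff it is not right not to do it
    return not is_right(not_doing(act))
```

In this toy table, keeping the promise comes out obligatory (not keeping it is not right), while taking a walk is right but not obligatory.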
Since 'prohibited' means "wrong," a machine that never acts in prohibited ways is a machine that never acts in ways that are wrong. This, I take it, is what we want in machines with "ethics."
We Know how to Solve the Problem
Given this way of setting out the problem, we solve the problem of machine ethics if we figure out how to build machines whose actions are never wrong.
In one way, this is an easy problem to solve. All we have to do is build machines so that they never take any actions. A machine that never acts is a machine whose actions are never wrong.
It seems plausible that all existing machines are like this, so all we have to do to build machines that never act is to keep building them in the ways we have been building them.
Machines that can Act need Ethics
It is conceivable that continuing to build machines as we always have built them is the only possible solution to the problem of machine ethics, but it seems that the interest in machine ethics is built on the assumption that we will soon build machines that act.
If this assumption is true, we need to think about what such machines will be like. How we are to do this is not completely clear, but maybe one way is to think about machines in terms of the logic programming/agent model of a rational agent. This model of the intelligence of a rational agent is not without problems, as we have seen, but perhaps it does begin to show how it is possible to build a machine that acts and hence that can act in ways that are right or wrong.
The Runaway Trolley Example
Given that we know what is right and wrong in the circumstances, prohibitions can be introduced into the logic programming/agent model as a special kind of maintenance goal.
Consider the "runaway trolley" problem (Computational Logic and Human Thinking, 171).
A runaway trolley is about to run over and kill five people. You are a bystander standing on a footbridge overlooking the track. The only way to stop the train and save the five people is to throw a heavy object in front of the train. The only heavy object available is a large man standing next to you. Should you do it? Should you sacrifice one life to save five lives?
In the context of the logic programming/agent model of the intelligence of a rational agent, this question reduces to a question about which plan to pursue. The achievement goal is to act in response to the danger, and there are two ways the agent can act. The agent can do nothing or push the large man in front of the oncoming train. Which is the agent permitted to do?
One way to give an answer is in terms of the theory in ethics known as utilitarianism.
Utilitarianism may be understood as an attempt to break into the circle of definitions that characterizes the terms of moral appraisal. To do this, utilitarianism makes what is right depend on the consequences of the various actions that the agent can perform in the circumstances. In this way, utilitarianism defines right in terms of utility:
• an act is right iff no alternative has a higher utility.
Hedonistic utilitarianism defines utility in terms of pleasure. In this case, an act is right iff no alternative action the agent can perform would bring about more pleasure in the world.
In the case of the runaway trolley, hedonistic utilitarianism gives the result that the agent is prohibited from doing anything other than push the man in front of the train. It is part of the example that saving the five lives is what brings about the most pleasure in the world. The other option, doing nothing, is prohibited because it does not bring about as much pleasure.
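The utilitarian definition can be sketched as a small function. The utility numbers below are assumptions made only for illustration (they stand in for the pleasure each option brings about); the reading does not assign numbers to the two options.

```python
def right_acts(utility: dict[str, float]) -> set[str]:
    """An act is right iff no alternative has a higher utility."""
    best = max(utility.values())
    return {act for act, u in utility.items() if u == best}

# Assumed utilities for the two options in the trolley example (illustrative only).
trolley = {"do nothing": -5.0, "push the man": -1.0}

permitted = right_acts(trolley)        # the acts that are right
prohibited = set(trolley) - permitted  # every other act is wrong
```

With these assumed numbers, pushing the man is the only permitted act and doing nothing is prohibited, which matches the result hedonistic utilitarianism is said to give.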
For a survey of the decisions people make in examples like the "runaway trolley," see The Moral Machine Experiment. See also the map of the results. For a discussion of the implications, see What Can the Trolley Problem Teach Self-Driving Car Engineers? in Wired magazine.
This result is paradoxical. Many would say that it is wrong to push the man in front of the train even if no alternative has a higher utility, but we will not pursue this issue about what is really right and wrong. To extend the logic programming/agent model by adding prohibitions, we need to know what is right and wrong. So we will assume that we have this knowledge.
(So here in the case of machine ethics, as with other parts of AI, it looks like the deepest problem we confront is not a problem in engineering but is a problem in philosophy.)
We will assume that a "public safety machine" that sees the "runaway trolley" is prohibited from pushing an innocent bystander in front of the train to save the lives of the five people.
Prohibitions in the Logic Programming/Agent Model
The example shows the use of a prohibition in the context of ethics, but prohibitions have application outside of ethics.
"Consider an agent who wants to bring parcels from some location A to a location B, using its truck. The distance between A and B is too large to make it without refueling, and so, in order not to end up without gas, the agent needs to stop every once in a while to refuel. The fact that the agent does not want to end up without gas, can be modeled as a maintenance goal [= what we are calling a "prohibition"]. This maintenance goal constrains the actions of the agent, as it is not supposed to drive on in order to fulfil its goal of delivering the parcels, if driving on would cause it to run out of gas" (K. V. Hindriks & M. B. van Riemsdijk, "Satisfying Maintenance Goals," 87-88. Declarative Agent Languages and Technologies V: 5th International Workshop, DALT 2007, Honolulu, HI, USA, May 14, 2007, Revised Selected and Invited Papers (edited by Matteo Baldoni, Tran Cao Son, M. Birna van Riemsdijk, Michael Winikoff), 86-103. Springer-Verlag, 2008).
Prohibitions function like maintenance goals insofar as they introduce something the agent must do to maintain itself: namely, not violate a prohibition. Beliefs, as we have seen, can trigger maintenance goals. These beliefs may be produced directly by observation or by reasoning forward on the basis of observations. When the maintenance goal is triggered, it issues in an achievement goal.
When we add prohibitions to the logic programming/agent model, the items in a plan of action to satisfy an achievement goal function like observations in a possible future. The agent reasons forward from the items in the plan to consequences to determine if any of these consequences trigger a prohibition. If they do, the agent must abandon the plan to maintain itself because the agent is prohibited from executing this plan to satisfy the achievement goal.
The example of the "public safety machine" makes this a little clearer.
The machine has general beliefs, beliefs about the current situation it faces, and a maintenance goal. In addition, it has the ability to engage in backward and forward reasoning, and a prohibition to limit the plans it can execute to satisfy a given achievement goal.
General Beliefs about the World
a person is killed if the person is in danger of being killed by a train
and no one saves the person from being killed by the train.
a person X kills a person Y if X throws Y in front of a train.
a person is in danger of being killed by a train
if the person is on a railtrack
and a train is speeding along the railtrack
and the person is unable to escape from the railtrack.
a person saves a person from being killed by a train
if the person stops the train.
a person stops a train
if the person places a heavy object in front of the train.
a person places a heavy object in front of the train
if the heavy object is next to the person
and the train is on a railtrack
and the person is within throwing distance of the object to the railtrack
and the person throws the object in front of the train.
Beliefs about the Current Situation
five people are on the railtrack.
a train is speeding along the railtrack.
the five people are unable to escape from the railtrack.
john is next to me.
john is an innocent bystander.
john is a heavy object.
I am within throwing distance of john to the railtrack.
A Maintenance Goal
if a person is in danger of being killed by a train
then I respond to the danger of the person being killed by the train.
Two Beliefs in Support of the Maintenance Goal
I respond to the danger of a person being killed by the train
if I ignore the danger.
I respond to the danger of a person being killed by the train
if I save the person from being killed by the train.
A Prohibition
if I kill a person and the person is an innocent bystander, then false.
Making certain assumptions for simplicity, forward reasoning yields the belief that
five people are in danger of being killed by the train
This belief triggers the maintenance goal to introduce the achievement goal
I respond to the danger of the five people being killed by the train
Backward reasoning provides two alternative subgoals
I ignore the danger
I save the five people from being killed by the train.
Thinking about the second subgoal produces the plan of action
I throw John onto the railtrack in front of the train
The question is whether this is a good plan. To determine the answer, the "public safety machine" reasons forward (or prospectively) to consequences. This is where the prohibition comes into play. When the machine reasons forward to the consequences of the plan, the prohibition rules the plan out because the plan has an unacceptable consequence (represented as false). "A prohibition can be regarded as a special kind of maintenance goal whose conclusion is literally false. ... [In this way, p]rohibitions are constraints on the actions you can perform" (Computational Logic and Human Thinking, 136).
If the machine were to execute the plan, it would kill an innocent bystander. That would be wrong. So the machine does not execute the plan. Instead, in the example, it ignores the danger.
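The prospective check in this example can be sketched as simple forward chaining. This is a minimal illustration under simplifying assumptions, not the mechanism of the book: the rules are ground (no variables), only the beliefs needed for the check are included, and the names RULES, FACTS, PLANS, forward_closure, and is_prohibited are hypothetical.

```python
# Ground (variable-free) versions of a few of the beliefs; a simplifying assumption.
# Each rule is (set of conditions, conclusion).
RULES = [
    ({"I throw john in front of the train"}, "I kill john"),
    ({"I throw john in front of the train"}, "I stop the train"),
    ({"I stop the train"}, "I save the five people"),
    # The prohibition: a maintenance goal whose conclusion is literally false.
    ({"I kill john", "john is an innocent bystander"}, "false"),
]

FACTS = {"john is an innocent bystander"}

def forward_closure(facts, rules):
    """Reason forward until no new consequences can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

def is_prohibited(plan, facts=FACTS, rules=RULES):
    """Treat the actions in the plan as observations in a possible future."""
    return "false" in forward_closure(facts | set(plan), rules)

# The two alternative subgoals and the plans found for them.
PLANS = {
    "I save the five people": ["I throw john in front of the train"],
    "I ignore the danger": ["I ignore the danger"],
}

allowed = [goal for goal, plan in PLANS.items() if not is_prohibited(plan)]
```

In this sketch the plan to throw john derives false and is ruled out, so the only subgoal left with an executable plan is ignoring the danger.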
A More Human Machine
Perhaps another way to give a machine ethics is to give it values that can be at stake and give it the ability to direct its activity toward changing the world so that its values are not at stake.
The following is a sketch of how this might be done. Lots of details need to be worked out, but we can see that, as in the case of the "public safety machine," the crucial bit of intelligence is the ability to reason prospectively to rule out a contemplated plan to achieve a goal.
The values V are represented as tuples (d, Vc) consisting in the degree d (real number between 0 and 1) of importance of the value and a set of violation conditions Vc for the value. If a violation condition of a value is a consequence of the KB, the value is at stake.
In addition to values, the machine has individual and moral goals.
A goal is represented as a tuple (Ac, Sc, Fc, i, S, PLAN). Ac is the set of adopting conditions for the goal. If an adopting condition is true, the goal becomes adopted. Sc is the set of success conditions. If a success condition is true, the goal is achieved and dropped. Fc is the set of failure conditions. If a failure condition is true, the goal has failed and is dropped. Each goal has a degree of importance i (real number between 0 and 1). S is the state of the goal (sleeping, adopted, active, suspended, achieved, failed, dropped). PLAN is the set of plans for achieving the goal.
The machine's goals (individual and moral) begin in the state of sleeping.
Individual and moral goals are distinguished by their adopting conditions.
For a moral goal, a violation condition of a value is the adopting condition. Moral goals are goals the machine adopts to change the world so that the value is no longer at stake.
The observation-thought-decision-action cycle begins with monitoring. The machine determines whether the adopting conditions of any of its sleeping goals are consequences of the KB. If they are, then the machine changes the state of the goal from sleeping to adopted.
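The representation of values and goals, together with the monitoring step, can be sketched as follows. The class and function names are hypothetical, conditions are modeled as plain strings, and the consequences of the KB are assumed to have been computed already and passed in as a set. It is a sketch of the tuples described above, not a full implementation of the cycle.

```python
from dataclasses import dataclass, field
from enum import Enum

class State(Enum):
    SLEEPING = "sleeping"
    ADOPTED = "adopted"
    ACTIVE = "active"
    SUSPENDED = "suspended"
    ACHIEVED = "achieved"
    FAILED = "failed"
    DROPPED = "dropped"

@dataclass
class Value:
    degree: float              # d, importance between 0 and 1
    violation_conditions: set  # Vc

@dataclass
class Goal:
    adopting: set                   # Ac
    success: set                    # Sc
    failure: set                    # Fc
    importance: float               # i, between 0 and 1
    state: State = State.SLEEPING   # S; goals begin in the state of sleeping
    plans: list = field(default_factory=list)  # PLAN

def moral_goal(value: Value, success: set, failure: set, plans: list) -> Goal:
    """A moral goal takes its adopting conditions and importance from a value."""
    return Goal(set(value.violation_conditions), success, failure,
                value.degree, plans=plans)

def monitor(goals: list, kb_consequences: set) -> None:
    """Adopt every sleeping goal that has an adopting condition among the KB's consequences."""
    for g in goals:
        if g.state is State.SLEEPING and g.adopting & kb_consequences:
            g.state = State.ADOPTED

# Illustrative use with made-up conditions.
safety = Value(0.7, {"a person is in danger"})
g = moral_goal(safety, {"the person is safe"}, {"the person is killed"},
               ["save the person"])
monitor([g], {"a person is in danger"})
```

After the call to monitor, the state of g has changed from sleeping to adopted because the value's violation condition is among the assumed consequences of the KB.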
Next the machine reasons prospectively to decide which adopted goals to make active.
This prospective reasoning uses two functions, I and P.
The function I returns the importance of the goal to the machine. (In the case of moral goals, the importance to the machine is the importance of the value at stake.)
The function P is the probability of the success of the plan associated with the goal.
Given these functions, the machine calculates the expected emotional value of its plans. Emotions (according to this approach) are positive or negative reactions that arise in appraisal.
The emotion is joy when the machine appraises a plan positively because it reasons that executing the plan will make a success condition true. The emotion is distress when the machine appraises a plan negatively because it reasons that executing the plan will make a failure condition true.
To begin to understand how the machine chooses goals to make active, consider an example in which the machine has adopted two goals (g1 and g2) because their adopting conditions are consequences of the KB. Suppose that the importances of these goals are .3 and .7, respectively, and that their plans have probabilities of success of .9 and .2. Suppose that the machine sees (by reasoning prospectively) that executing the plan for g1 will make one of the failure conditions for g2 true and that executing the plan for g2 will not make any of the failure conditions for g1 true.
Given this much, we can compute the intensity of joy and distress the agent can expect if it executes the plans for g1 and g2. For the plan associated with g1,
the expected joy (.27 = .3 x .9) - the expected distress (.14 = .7 x .2) is .13.
For the plan associated with g2,
the expected joy (.14 = .7 x .2) - the expected distress (.0) is .14.
If the machine uses expected emotional value to choose which adopted goals to make active, the calculation shows that it can expect to be in a better emotional state if it achieves g2.
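The calculation for g1 and g2 can be reproduced in a few lines. The function name and the CONFLICTS table are hypothetical. Following the numbers in the example, the sketch assumes that the expected distress for a goal whose failure condition the plan makes true is that goal's importance times the probability of success of its own plan; the text does not state this rule in general.

```python
def expected_emotional_value(goal, importance, p_success, conflicts):
    """Expected joy minus expected distress for executing the plan for a goal.

    conflicts[goal] lists the goals whose failure conditions the plan makes true.
    Assumption read off the worked example: distress for such a goal is I x P
    for that goal.
    """
    joy = importance[goal] * p_success[goal]
    distress = sum(importance[h] * p_success[h] for h in conflicts.get(goal, []))
    return joy - distress

I = {"g1": 0.3, "g2": 0.7}   # importance of each goal
P = {"g1": 0.9, "g2": 0.2}   # probability of success of each goal's plan
CONFLICTS = {"g1": ["g2"]}   # the plan for g1 makes a failure condition of g2 true

scores = {g: expected_emotional_value(g, I, P, CONFLICTS) for g in I}
best = max(scores, key=scores.get)
```

Up to floating-point rounding, the scores are .13 for g1 and .14 for g2, so g2 is the goal selected.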
What we have Accomplished in this Lecture
We considered the terms of moral appraisal, how prohibitions can be understood to function like maintenance goals, and how prohibitions can be added to the logic programming/agent model as ways to constrain the plans an agent can execute to satisfy an achievement goal.