Philosophy, Computing, and Artificial Intelligence
PHI 319. An Extension to the Logic Programming/Agent Model.
Computational Logic and Human Thinking
Chapter 8 (135-139), Chapter 12 (171-181)
The Problem of Machine Ethics
We can think of machine ethics as the problem of how to give "ethics" to machines. This requires us to know what we are giving machines when we give them "ethics." If we do not know what this is, we will not know what counts as a solution to the problem.
One way to approach the issue of what we are giving machines when we give them "ethics" is to think about the terms in English we use to appraise behavior: 'right,' 'wrong,' and 'obligatory.'
We can think of these terms as interdefinable in a certain way. That is to say, we can use any one of them to define the other two. So, for example, if we take 'right' as primitive, we can define 'wrong' as what is "not right" and define 'obligatory' as what is "not right not to do."
Other common terms of moral appraisal are definable in terms of the basic three. So, for example, 'permitted' means "right." Similarly, 'forbidden' and 'prohibited' mean "wrong."
Put a little more formally, the two definitions are
• an act is wrong iff it is not right.
• an act is obligatory iff it is not right not to do.
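The two definitions can be sketched in Python. This is only an illustration under stated assumptions: the set RIGHT, the act names, and the helper omit (which names the act of not doing something) are hypothetical, and which acts count as right is simply stipulated.

```python
# Hypothetical stipulation of which acts are right in some situation.
# "not:X" names the act of not doing X (a convention of this sketch only).
RIGHT = {"keep_promise", "tell_truth", "not:lie", "take_walk", "not:take_walk"}


def omit(act: str) -> str:
    """Return the name of the act of not doing `act`."""
    return act[4:] if act.startswith("not:") else "not:" + act


def is_right(act: str) -> bool:
    # 'right' is the primitive term: it is looked up, not defined.
    return act in RIGHT


def is_wrong(act: str) -> bool:
    # an act is wrong iff it is not right
    return not is_right(act)


def is_obligatory(act: str) -> bool:
    # an act is obligatory iff it is not right not to do it
    return not is_right(omit(act))
```

With this stipulation, lying is wrong, keeping a promise is obligatory, and taking a walk is right without being obligatory, since not taking a walk is also right.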
Since 'prohibited' means "wrong," a machine that never acts in prohibited ways is a machine that never acts in ways that are wrong. This, I take it, is what we want in machines with "ethics."
We Know how to Solve the Problem
Given this way of setting out the problem, we solve the problem of machine ethics if we figure out how to build machines whose actions are never wrong.
In one way, this is an easy problem to solve. All we have to do is build machines so that they never take any actions. A machine that never acts is a machine whose actions are never wrong.
It seems plausible that all existing machines are like this, so all we have to do to build machines that never act is to keep building them in the ways we have been building them.
Machines that can Act need Ethics
It is conceivable that continuing to build machines as we always have built them is the only possible solution to the problem of machine ethics, but it seems that the interest in machine ethics is built on the assumption that we will soon build machines that act.
If this assumption is true, we need to think about what such machines will be like. How we are to do this is not completely clear, but maybe one way is to think about machines in terms of the logic programming/agent model of a rational agent. This model of the intelligence of a rational agent is not without problems, as we have seen, but perhaps it does begin to show how it is possible to build a machine that acts and hence that can act in ways that are right or wrong.
The Runaway Trolley Example
Given that we know what is right and wrong in the circumstances, prohibitions can be introduced into the logic programming/agent model as a special kind of maintenance goal.
Consider the "runaway trolley" problem (Computational Logic and Human Thinking, 171).
A runaway trolley is about to run over and kill five people. You are a bystander standing on a footbridge overlooking the track. The only way to stop the train and save the five people is to throw a heavy object in front of the train. The only heavy object available is a large man standing next to you. Should you do it? Should you sacrifice one life to save five lives?
In the context of the logic programming/agent model of the intelligence of a rational agent, this question reduces to a question about which plan to pursue. The achievement goal is to act in response to the danger, and there are two ways the agent can act. The agent can do nothing or push the large man in front of the oncoming train. Which is the agent permitted to do?
One way to give an answer is in terms of the theory in ethics known as utilitarianism.
Utilitarianism may be understood as an attempt to break into the circle of definitions that characterizes the terms of moral appraisal. To do this, utilitarianism makes what is right depend on the consequences of the various actions that the agent can perform in the circumstances. Utilitarianism, in this way, defines right in terms of utility:
• an act is right iff no alternative has a higher utility.
Hedonistic utilitarianism defines utility in terms of pleasure. In this case, an act is right iff no alternative action the agent can perform would bring about more pleasure in the world.
In the case of the runaway trolley, hedonistic utilitarianism gives the result that the agent is prohibited from doing anything other than push the man in front of the train. It is part of the example that saving the five lives is what brings about the most pleasure in the world. The other option, doing nothing, is prohibited because it does not bring about as much pleasure.
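The utilitarian definition can be sketched as a small Python function. The utility numbers below are hypothetical: they use the negative of the number of lives lost as a crude stand-in for pleasure, which is an assumption of this sketch and not part of the example itself.

```python
def right_acts(utility: dict[str, float]) -> set[str]:
    """An act is right iff no alternative has a higher utility."""
    best = max(utility.values())
    return {act for act, u in utility.items() if u == best}


def prohibited_acts(utility: dict[str, float]) -> set[str]:
    """'Prohibited' means wrong, and an act is wrong iff it is not right."""
    return set(utility) - right_acts(utility)


# Hypothetical utilities for the two options open to the bystander.
trolley = {"do_nothing": -5.0, "push_the_man": -1.0}

print(right_acts(trolley))       # {'push_the_man'}
print(prohibited_acts(trolley))  # {'do_nothing'}
```

Under these assumed utilities, the sketch reproduces the result in the text: doing nothing is prohibited because an alternative has a higher utility.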
The utilitarian result in the trolley case is paradoxical. Many would say that it is wrong to push the man in front of the train, but we will not pursue this issue further. To extend the logic programming/agent model by adding prohibitions, we need to know what is right and wrong. So we will assume that we know what is prohibited and that a "public safety machine" that sees the "runaway trolley" is prohibited from pushing an innocent bystander in front of the train to save the lives of the five people.
For a survey of the decisions people make in examples like the "runaway trolley," see The Moral Machine Experiment. See also the map of the results. For a discussion of the implications, see What Can the Trolley Problem Teach Self-Driving Car Engineers? in Wired magazine.
Prohibitions in the Logic Programming/Agent Model
The example shows the use of a prohibition in the context of ethics, but prohibitions have application outside of ethics.
"Consider an agent who wants to bring parcels from some location A to a location B, using its truck. The distance between A and B is too large to make it without refueling, and so, in order not to end up without gas, the agent needs to stop every once in a while to refuel. The fact that the agent does not want to end up without gas, can be modeled as a maintenance goal [= what we are calling a "prohibition"]. This maintenance goal constrains the actions of the agent, as it is not supposed to drive on in order to fulfil its goal of delivering the parcels, if driving on would cause it to run out of gas" (K. V. Hindriks & M. B. van Riemsdijk, "Satisfying Maintenance Goals," 87-88. Declarative Agent Languages and Technologies V: 5th International Workshop, DALT 2007, Honolulu, HI, USA, May 14, 2007, Revised Selected and Invited Papers (edited by Matteo Baldoni, Tran Cao Son, M. Birna van Riemsdijk, Michael Winikoff), 86-103. Springer-Verlag, 2008).
Prohibitions function like maintenance goals insofar as they introduce something the agent must do to maintain itself: namely, not violate a prohibition.
Beliefs, as we have seen, can trigger maintenance goals. These beliefs may be produced directly by observation or by reasoning forward on the basis of observations. When the maintenance goal is triggered, it issues in an achievement goal.
When we add prohibitions to the logic programming/agent model, the items in a plan of action to satisfy an achievement goal function like observations in a possible future. The agent reasons forward from the items in the plan to consequences to determine if any of these consequences trigger a prohibition. If they do, the agent must abandon the plan to maintain itself because the agent is prohibited from executing this plan to satisfy the achievement goal.
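The prospective check just described can be sketched for the parcel-truck example. Everything concrete here is an assumption made for illustration: the step names "drive" and "refuel", the tank size, and the fuel cost per leg do not come from the quoted paper.

```python
def violates_prohibition(plan, fuel, cost_per_leg=1, tank=3):
    """Reason forward through `plan` and report whether the truck would run out of gas.

    Each step of the plan is treated like an observation in a possible future.
    """
    for step in plan:
        if step == "refuel":
            fuel = tank
        elif step == "drive":
            if fuel < cost_per_leg:   # consequence that triggers the prohibition
                return True
            fuel -= cost_per_leg
    return False


def choose_plan(plans, fuel):
    """Return the first plan whose predicted consequences trigger no prohibition."""
    for plan in plans:
        if not violates_prohibition(plan, fuel):
            return plan
    return None   # every candidate plan must be abandoned


# Two hypothetical plans for a four-leg delivery, starting with a full tank.
plans = [
    ["drive", "drive", "drive", "drive"],
    ["drive", "drive", "refuel", "drive", "drive"],
]
print(choose_plan(plans, fuel=3))
```

The first plan is abandoned because reasoning forward shows the truck running dry on the fourth leg, so the agent settles on the plan that includes a refueling stop.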
The example of the "public safety machine" makes this a little clearer.
The machine has general beliefs, beliefs about the current situation it faces, and a maintenance goal. In addition, it has the ability to engage in backward and forward reasoning, and a prohibition to limit the plans it can execute to satisfy a given achievement goal.
General Beliefs about the World
a person is killed if the person is in danger of being killed by a train
and no one saves the person from being killed by the train.
a person X kills a person Y if X throws Y in front of a train.
a person is in danger of being killed by a train
if the person is on a railtrack
and a train is speeding along the railtrack
and the person is unable to escape from the railtrack.
a person saves a person from being killed by a train
if the person stops the train.
a person stops a train
if the person places a heavy object in front of the train.
a person places a heavy object in front of the train
if the heavy object is next to the person
and the train is on a railtrack
and the person is within throwing distance of the object to the railtrack
and the person throws the object in front of the train.
Beliefs about the Current Situation
five people are on the railtrack.
a train is speeding along the railtrack.
the five people are unable to escape from the railtrack.
john is next to me.
john is an innocent bystander.
john is a heavy object.
I am within throwing distance of john to the railtrack.
A Maintenance Goal
if a person is in danger of being killed by a train
then I respond to the danger of the person being killed by the train.
Two Beliefs in Support of the Maintenance Goal
I respond to the danger of a person being killed by the train
if I ignore the danger.
I respond to the danger of a person being killed by the train
if I save the person from being killed by the train.
A Prohibition
if I kill a person and the person is an innocent bystander, then false.
Making certain assumptions for simplicity, forward reasoning yields the belief that
five people are in danger of being killed by the train
This belief triggers the maintenance goal to introduce the achievement goal
I respond to the danger of the five people being killed by the train
Backward reasoning provides two alternative subgoals
I ignore the danger
I save the five people from being killed by the train.
Thinking about the second subgoal produces the plan of action
I throw John onto the railtrack in front of the train
The question is whether this is a good plan. To determine the answer, the "public safety machine" reasons forward (or prospectively) to consequences. This is where the prohibition comes into play. When the machine reasons forward (or prospectively) to consequences of the plan, the prohibition rules the plan out because the plan has an unacceptable consequence (represented as false).
"A prohibition can be regarded as a special kind of maintenance goal whose conclusion is literally false. ... [In this way, p]rohibitions are constraints on the actions you can perform" (Computational Logic and Human Thinking, 136).
If the machine were to execute the plan, it would kill an innocent bystander. That would be wrong. So the machine does not execute the plan. Instead, in the example, it ignores the danger.
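The reasoning of the "public safety machine" can be sketched in Python. This is a heavily simplified, variable-free version of the beliefs above, not an implementation of the logic programming/agent model: the atom names are hypothetical, the rules are collapsed to the single case of John and the five people, and trying the "save" plan before the "ignore" plan is an assumption of the sketch.

```python
# Ground rules as (conditions, conclusion) pairs; atom names are hypothetical.
RULES = [
    ({"five_on_track", "train_speeding", "five_cannot_escape"}, "five_in_danger"),
    ({"i_throw_john"}, "i_kill_john"),
    ({"i_throw_john", "john_is_heavy", "john_next_to_me",
      "within_throwing_distance"}, "i_stop_train"),
    ({"i_stop_train"}, "i_save_five"),
]
FACTS = {"five_on_track", "train_speeding", "five_cannot_escape",
         "john_next_to_me", "john_is_innocent", "john_is_heavy",
         "within_throwing_distance"}
# Prohibition: if I kill a person and the person is an innocent bystander, then false.
PROHIBITIONS = [{"i_kill_john", "john_is_innocent"}]
# Two ways to respond to the danger, each with a candidate plan (a set of actions).
PLANS = {"save": {"i_throw_john"}, "ignore": {"i_ignore_danger"}}


def forward(facts):
    """Close a set of atoms under RULES by forward reasoning."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def permitted(plan):
    """A plan is permitted iff its predicted consequences trigger no prohibition."""
    consequences = forward(FACTS | plan)
    return not any(p <= consequences for p in PROHIBITIONS)


def decide():
    if "five_in_danger" not in forward(FACTS):   # maintenance goal not triggered
        return None
    for name, plan in PLANS.items():             # first plan no prohibition rules out
        if permitted(plan):
            return name
    return None


print(decide())  # ignore
```

Reasoning forward from the plan to throw John derives that the machine kills John, which together with the belief that John is an innocent bystander triggers the prohibition, so only the plan to ignore the danger remains.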
An Agent with Values
Another way to give a machine ethics is to give it values V that can be put at stake.
Each value in V is represented as a tuple (d, Vc) consisting of the degree d of importance of the value (a real number between 0 and 1) and a set Vc of violation conditions for the value. If one of the violation conditions is true in the KB (the agent's knowledge base), then the value is put at stake.
In addition to values, the agent begins with a set of individual goals. These goals begin in a "sleeping" state. Each goal is represented as a tuple (Ac, Sc, Fc, i, S, PLAN). Ac is the set of adopting conditions for the goal. If one of the adopting conditions is true in the KB, the agent adopts the goal. Sc is the set of success conditions. If one of the success conditions is true in the KB, the goal is achieved and dropped. Fc is the set of failure conditions. If one of the failure conditions is true in the KB, the goal has failed and is dropped. Each goal has a degree of importance i (a real number between 0 and 1) to the agent. S is the state of the goal (sleeping, adopted, active, suspended, achieved, failed, dropped). PLAN is the set of plans for achieving the goal.
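The two tuples can be sketched as Python data classes. The field names follow the text, but the representation is an assumption: conditions are modeled as plain atoms, the KB as a set of atoms, and a condition counts as true when its atom is in that set. The final "dropped" step and the "suspended" state are left out of the update method to keep the sketch short.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    SLEEPING = "sleeping"
    ADOPTED = "adopted"
    ACTIVE = "active"
    SUSPENDED = "suspended"
    ACHIEVED = "achieved"
    FAILED = "failed"
    DROPPED = "dropped"


@dataclass
class Value:
    d: float                    # degree of importance, between 0 and 1
    Vc: frozenset               # violation conditions

    def at_stake(self, kb: set) -> bool:
        """The value is at stake if any violation condition is true in the KB."""
        return bool(self.Vc & kb)


@dataclass
class Goal:
    Ac: frozenset               # adopting conditions
    Sc: frozenset               # success conditions
    Fc: frozenset               # failure conditions
    i: float                    # degree of importance, between 0 and 1
    S: State = State.SLEEPING   # goals begin in the sleeping state
    PLAN: list = field(default_factory=list)

    def update(self, kb: set) -> None:
        """Apply the adopting, success, and failure conditions against the KB."""
        if self.S is State.SLEEPING and self.Ac & kb:
            self.S = State.ADOPTED
        elif self.S in (State.ADOPTED, State.ACTIVE):
            if self.Sc & kb:
                self.S = State.ACHIEVED
            elif self.Fc & kb:
                self.S = State.FAILED
```

For instance, a goal created with Ac=frozenset({"hungry"}) stays sleeping until "hungry" appears in the KB passed to update, at which point its state becomes adopted.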
In this agent, the observation-thought-decision-action cycle begins with value monitoring. The agent determines whether a violation condition for any of its values is true. If a violation condition is true, the corresponding value is added to Vs (values at stake).
Next the agent determines whether the adopting conditions of any individual sleeping goals are true. If they are, the state of the goal changes from sleeping to adopted.
Next the agent adopts moral goals for the values it has determined are at stake. The moral goal for each value at stake is to change the world so that its violation conditions are not true.
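The first three steps of the cycle can be sketched as one function. The dictionary layout, the "restore:" naming of moral goals, and the representation of a moral goal's target as negated violation conditions are all hypothetical choices made for this sketch. The sketch also gives each moral goal the importance of the value at stake.

```python
def cycle_step(kb, values, goals):
    """One pass through value monitoring and goal adoption.

    kb     : set of atoms currently true
    values : dict name -> (d, Vc)
    goals  : dict name -> {"Ac": set, "state": str, "i": float}
    Returns the values at stake (Vs) and the names of the adopted goals.
    """
    # 1. Value monitoring: a value is at stake if a violation condition is true.
    Vs = {name for name, (d, Vc) in values.items() if Vc & kb}

    # 2. Sleeping individual goals whose adopting conditions hold become adopted.
    for goal in goals.values():
        if goal["state"] == "sleeping" and goal["Ac"] & kb:
            goal["state"] = "adopted"

    # 3. One moral goal per value at stake: make its violation conditions false.
    for name in Vs:
        d, Vc = values[name]
        goals["restore:" + name] = {
            "Ac": set(), "state": "adopted", "i": d,
            "target": {("not", c) for c in Vc},
        }

    adopted = {n for n, g in goals.items() if g["state"] == "adopted"}
    return Vs, adopted


# Hypothetical situation: one value at stake, one individual goal triggered.
kb = {"person_in_danger", "low_battery"}
values = {"safety": (0.9, {"person_in_danger"})}
goals = {
    "recharge": {"Ac": {"low_battery"}, "state": "sleeping", "i": 0.4},
    "patrol": {"Ac": {"shift_started"}, "state": "sleeping", "i": 0.2},
}
print(cycle_step(kb, values, goals))
```

In this run the value "safety" is at stake, so a moral goal is adopted alongside the individual goal "recharge", while "patrol" stays sleeping because none of its adopting conditions is true.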
Next the agent reasons prospectively to decide which adopted goals to make active. This uses two functions, I and P. The function I returns the importance of the goal to the agent. In the case of moral goals, the importance to the agent is the importance of the value at stake. In the case of individual goals, the importance to the agent is the importance the goal begins with in its sleeping state. The function P is the probability of the success of the plan associated with the goal.
Given these functions, the agent calculates the expected emotional value of its plans. Emotions (according to this approach) are positive or negative reactions that arise in appraisal. The emotion is "joy" when the agent appraises an event positively because it makes a success condition true. The emotion is "distress" when the agent appraises an event negatively because it makes a failure condition true. The emotion is "pride" when the agent appraises its action positively because it makes the violation condition of a value false. The emotion is "shame" when the agent appraises its action negatively because it makes the violation condition of a value true.
To begin to understand expected emotional value, consider an example in which the agent has two adopted individual goals (g1 and g2) and no moral goals. Suppose the importance of g1 is .3 and the importance of g2 is .7, and that their plans have probabilities of success of .9 and .2 respectively. Suppose that executing the plan for g1 makes true one of the failure conditions for g2. Suppose that the execution of the plan for g2 does not make true any of the failure conditions for g1.
Given this much, we can compute the intensity of "joy" and "distress" the agent can expect if it executes the plans for g1 and g2. For the plan associated with g1,
the expected "joy" (.27 = .3 x .9) - the expected "distress" (.14 = .7 x .2) is .13.
For the plan associated with g2,
the expected "joy" (.14 = .7 x .2) - the expected "distress" (.0) is .14.
So the agent makes g2 active because it has a higher expected emotional value than g1. The agent can expect to be in a better emotional state if it achieves g2.
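The arithmetic in this example can be reproduced with a short Python sketch. The function name and the conflicts table are hypothetical, and the way "distress" is computed (the importance of the threatened goal times the probability of success of its plan) is taken from the worked numbers above and is not a general definition.

```python
def expected_emotional_value(goal, goals, conflicts):
    """Expected "joy" of executing the plan for `goal` minus expected "distress".

    goals     : dict name -> (importance I, probability P of plan success)
    conflicts : dict name -> goals whose failure conditions the plan makes true
    """
    importance, prob = goals[goal]
    joy = importance * prob
    # Distress follows the worked example: I x P for each goal the plan makes fail.
    distress = sum(goals[h][0] * goals[h][1] for h in conflicts.get(goal, ()))
    return joy - distress


goals = {"g1": (0.3, 0.9), "g2": (0.7, 0.2)}
conflicts = {"g1": ["g2"]}   # executing the plan for g1 makes g2 fail

# Round to two places to match the figures in the text.
eev = {g: round(expected_emotional_value(g, goals, conflicts), 2) for g in goals}
active = max(eev, key=eev.get)
print(eev, active)
```

The rounded values agree with the figures above, .13 for g1 and .14 for g2, so the goal selected as active is g2.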
What we have Accomplished in this Lecture
We considered the terms of moral appraisal, how prohibitions can be understood to function like maintenance goals, and how prohibitions can be added to the logic programming/agent model as ways to constrain the plans an agent can execute to satisfy an achievement goal.