Philosophy, Computing, and Artificial Intelligence
PHI 319. An Extension to the Logic Programming/Agent Model.
Computational Logic and Human Thinking
Chapter 8 (135-139), Chapter 12 (171-181)
The Problem of Machine Ethics
We can think of machine ethics as the problem of how to give "ethics" to machines. This requires us to know what we are giving machines when we give them ethics. If we do not know what this is, we will not know what counts as a solution to the problem.
One way to approach the issue of what we are giving machines when we give them ethics is to think about the terms in English we use to appraise behavior: 'right,' 'wrong,' and 'obligatory.'
We can understand these terms as interdefinable in a certain way. That is to say, we can use any one of them to define the other two. So, for example, if we take 'right' as understood, we can define 'wrong' as what is "not right" and define 'obligatory' as what is "not right not to do."
Other common terms of moral appraisal are definable in terms of the basic three. So, for example, permitted means right. Similarly, forbidden and prohibited mean wrong.
Put a little more formally, the two definitions are
• an act is wrong if, and only if, it is not right.
• an act is obligatory if, and only if, it is not right not to do.
Since 'prohibited' means "wrong," a machine that never acts in prohibited ways is one that never acts in ways that are wrong. This, I take it, is what we want in machines with "ethics."
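The interdefinability of these terms can be made concrete with a small sketch. It assumes, purely for illustration, that acts are named by strings, that not doing an act is named by prefixing "not ", and that what is right is given as a set of acts. The function names and the sample acts are hypothetical and do not come from the readings.

```python
from typing import Callable, Dict, Set


def refrain(act: str) -> str:
    """Name the act of not doing `act` (a hypothetical naming convention)."""
    return act[4:] if act.startswith("not ") else "not " + act


def make_appraisals(right_acts: Set[str]) -> Dict[str, Callable[[str], bool]]:
    """Take 'right' as understood and define the other terms from it."""
    def right(act: str) -> bool:
        return act in right_acts

    def wrong(act: str) -> bool:          # wrong = not right
        return not right(act)

    def obligatory(act: str) -> bool:     # obligatory = not right not to do
        return not right(refrain(act))

    return {"right": right, "wrong": wrong, "obligatory": obligatory,
            "permitted": right, "forbidden": wrong, "prohibited": wrong}


# Made-up example: keeping a promise is right and breaking it is not,
# while taking a walk and not taking a walk are both right.
terms = make_appraisals({"keep the promise", "take a walk", "not take a walk"})
print(terms["obligatory"]("keep the promise"))  # True
print(terms["obligatory"]("take a walk"))       # False
print(terms["wrong"]("not keep the promise"))   # True
```

In this sketch everything depends on the set passed to make_appraisals. The definitions say nothing about which acts belong in that set.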
We Know how to Solve the Problem
Given this understanding of the problem, we can solve the problem of machine ethics if we can figure out how to build machines whose actions are never wrong.
In one way, this is easy. All we need to do is build machines so that they never take any actions. A machine that never acts is a machine whose actions are never wrong.
It seems plausible that all existing machines are like this, so all we have to do to solve the problem is to keep building them in the ways we have been building them.
Machines that can Act need Ethics
It is conceivable that continuing to build machines as we always have built them is the only possible solution to the problem of machine ethics, but the current interest in machine ethics is built on the assumption that we will soon build machines that can act.
If this assumption is true, we need to think about what such machines will be like. The nature of such machines is not now clear. This is part of what makes AI an unsolved problem, but maybe the logic programming/agent model of a rational agent is one way to think about what it would be for a machine to act. This model of the intelligence of a rational agent is not without problems, as we have seen, but perhaps it does begin to show how it is possible to build a machine that acts and hence that can act in ways that are right or wrong.
The Runaway Trolley Example
Given that we know what is right and wrong in the circumstances, prohibitions can be introduced into the logic programming/agent model as a special kind of maintenance goal.
Consider the "runaway trolley" problem (Computational Logic and Human Thinking, 171).
A runaway trolley is about to run over and kill five people. You are a bystander standing on a footbridge overlooking the track. The only way to stop the train and save the five people is to throw a heavy object in front of the train. The only heavy object available is a large man standing next to you. Should you do it? Should you sacrifice one life to save five lives?
In the context of the logic programming/agent model of the intelligence of a rational agent, this question reduces to a question about what plan to pursue. The achievement goal is to act in response to the danger, and there are two ways to act. The agent can do nothing or push the large man in front of the oncoming train. What is the agent permitted to do?
What is right in a given situation is often a subject of dispute. It is not our goal in this class, which is not a class in ethics, to try to solve this problem. Instead, as an example, we consider one way to give the answer in terms of the theory in ethics known as utilitarianism.
Utilitarianism may be understood as an attempt to break into the circle of definitions that characterizes the terms of moral appraisal ('right,' 'wrong,' and 'obligatory'). To do this, utilitarianism understands right in terms of the consequences of the actions the agent can perform in the circumstances. Utilitarianism, in this way, specifies right in terms of utility:
• an act is right if, and only if, no alternative has a higher utility
Hedonistic utilitarianism defines utility in terms of pleasure. In this case, an act is right just in case no alternative action the agent can perform would bring about more pleasure.
Just how the pleasure an action brings about is measured is a difficult problem we will not consider. The rough idea, though, is that we think about the consequences of a given action and think about how much pleasure those consequences would bring to the world. This thinking gives us some idea of the expected value of the alternative actions in the situation.
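The utilitarian definition of 'right' can be sketched as a comparison of utilities. The numbers below are made up for the runaway trolley example (each death is counted as -1). How utility should really be measured is left open here, and the function name right_acts is a hypothetical label.

```python
from typing import Dict, Set


def right_acts(utility: Dict[str, float]) -> Set[str]:
    """An act is right if, and only if, no alternative has a higher utility."""
    best = max(utility.values())
    return {act for act, u in utility.items() if u == best}


# Hypothetical utilities for the two alternatives open to the bystander.
utility = {"push the man": -1, "do nothing": -5}

permitted = right_acts(utility)        # the acts that are right
prohibited = set(utility) - permitted  # wrong = not right
print(permitted, prohibited)
```

If two alternatives tie for the highest utility, both count as right on this definition, so neither is obligatory.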
A Problem of Ethics
In the case of the runaway trolley, hedonistic utilitarianism has the result that the agent is prohibited from doing anything other than push the man in front of the trolley. We can imagine changing the example so that it gives a different result, but the example stipulates that this action has the most utility. Both actions (pushing the man in front of the trolley, doing nothing) cause some pain (which we may think of as negative pleasure), but it is stipulated that pushing the man in front of the trolley causes the least and hence has the most utility.
For a survey of the decisions people make in examples like the "runaway trolley," see The Moral Machine Experiment. See also the map of the results. For a discussion of the implications, see What Can the Trolley Problem Teach Self-Driving Car Engineers? in Wired magazine.
This result is paradoxical. Many would say that the example is coherent, that it is possible that pushing the man in front of the trolley has the highest utility (where utility is defined in terms of pleasure), but that this action is nevertheless wrong. We should do nothing even though this means that five people will die.
This presents a problem.
We need knowledge of right and wrong to extend the logic programming/agent model by adding prohibitions to the KB. Otherwise, we do not know what prohibitions to add. So in the case of machine ethics, as with some other problems in AI, it looks like the deepest problem we confront is not a problem in engineering but is a problem in philosophy.
The Public Safety Machine
Since we need knowledge of right and wrong, we will simply assume we have it.
We will assume that there is a "public safety machine." The primary goal of this machine is to keep human beings safe. If it notices a human being is in danger, it acts to rectify the situation.
In the example below, the public safety machine is the agent in the "runaway trolley" example.
The public safety machine has ethics, but it is not the ethics that hedonistic utilitarianism determines. We leave unanswered what does determine its ethics. We simply assume that it is prohibited from pushing the man in front of the train to save the lives of the five people.
Prohibitions in the Logic Programming/Agent Model
The example shows the use of a prohibition in the context of ethics, but prohibitions have application outside of ethics.
"Consider an agent who wants to bring parcels from some location A to a location B, using its truck. The distance between A and B is too large to make it without refueling, and so, in order not to end up without gas, the agent needs to stop every once in a while to refuel. The fact that the agent does not want to end up without gas, can be modeled as a maintenance goal. This maintenance goal constrains the actions of the agent, as it is not supposed to drive on in order to fulfil its goal of delivering the parcels, if driving on would cause it to run out of gas" (K. V. Hindriks & M. B. van Riemsdijk, "Satisfying Maintenance Goals," 87-88).
Prohibitions function like maintenance goals insofar as they introduce something the agent must do to maintain itself: namely, not violate a prohibition. Beliefs, as we have seen, can trigger maintenance goals. These beliefs may be produced directly by observation or by reasoning forward on the basis of observations. When the maintenance goal is triggered, it issues in an achievement goal.
When we add prohibitions to the logic programming/agent model, the items in a plan of action to satisfy an achievement goal function like observations in a possible future. The agent reasons forward from the items in the plan to consequences to determine if any of these consequences trigger a prohibition. If they do, the agent abandons the plan because the prohibition prohibits it from executing this plan to satisfy the achievement goal.
To make this a little clearer, consider the reasoning the "public safety machine" executes.
We assume that the machine has general beliefs, beliefs about the current situation it faces, and a maintenance goal. We assume that it has the ability to engage in backward and forward reasoning and the ability to rule out a plan it can execute to satisfy a given achievement goal.
General Beliefs about the World
a person is killed if the person is in danger of being killed by a train
and no one saves the person from being killed by the train.
a person X kills a person Y if X throws Y in front of a train.
a person is in danger of being killed by a train
if the person is on a railtrack
and a train is speeding along the railtrack
and the person is unable to escape from the railtrack.
a person saves a person from being killed by a train
if the person stops the train.
a person stops a train
if the person places a heavy object in front of the train.
a person places a heavy object in front of the train
if the heavy object is next to the person
and the train is on a railtrack
and the person is within throwing distance of the object to the railtrack
and the person throws the object in front of the train.
Beliefs about the Current Situation
five people are on the railtrack.
a train is speeding along the railtrack.
the five people are unable to escape from the railtrack.
john is next to me.
john is an innocent bystander.
john is a heavy object.
I am within throwing distance of john to the railtrack.
A Maintenance Goal
if a person is in danger of being killed by a train
then I respond to the danger of the person being killed by the train.
Two Beliefs in Support of the Maintenance Goal
I respond to the danger of a person being killed by the train
if I ignore the danger.
I respond to the danger of a person being killed by the train
if I save the person from being killed by the train.
A Prohibition
if I kill a person and the person is an innocent bystander, then false.
Making certain assumptions for simplicity, forward reasoning yields the belief that
five people are in danger of being killed by the train
This belief triggers the maintenance goal. This introduces the achievement goal
I respond to the danger of the five people being killed by the train
Backward reasoning provides two alternative subgoals
I ignore the danger
I save the five people from being killed by the train.
Thinking about the second subgoal produces the plan of action
I throw John onto the railtrack in front of the train
The question is whether this is a good plan. To determine the answer, the machine reasons forward (or prospectively) to consequences. This is where the prohibition comes into play. When the machine reasons prospectively to consequences of the plan, the prohibition rules the plan out because the plan has an unacceptable consequence (represented as false). "A prohibition can be regarded as a special kind of maintenance goal whose conclusion is literally false. ... [In this way, p]rohibitions are constraints on the actions you can perform" (Computational Logic and Human Thinking, 136). If the machine executes the plan, it kills an innocent bystander. This is wrong (we are assuming). So the machine does not execute the plan. Instead, it ignores the danger.
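The reasoning of the public safety machine can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the model from the book: the beliefs are written as variable-free sentences, the rule involving "no one saves the person" is left out because it needs negation, and the two candidate plans that backward reasoning would produce are listed by hand. All names in the code are hypothetical.

```python
RULES = [  # (conclusion, [conditions]); a ground version of the beliefs above
    ("the five are in danger", ["the five are on the railtrack",
                                "a train is speeding along the railtrack",
                                "the five are unable to escape"]),
    ("I place john in front of the train", ["john is next to me",
                                            "john is a heavy object",
                                            "I am within throwing distance",
                                            "I throw john in front of the train"]),
    ("I stop the train", ["I place john in front of the train"]),
    ("I save the five", ["I stop the train"]),
    ("I kill john", ["I throw john in front of the train"]),
    ("I respond to the danger", ["I ignore the danger"]),
    ("I respond to the danger", ["I save the five"]),
]
FACTS = {"the five are on the railtrack", "a train is speeding along the railtrack",
         "the five are unable to escape", "john is next to me",
         "john is an innocent bystander", "john is a heavy object",
         "I am within throwing distance"}
MAINTENANCE_GOAL = ("the five are in danger", "I respond to the danger")
PROHIBITIONS = [["I kill john", "john is an innocent bystander"]]  # imply false
# Assumption: these are the plans backward reasoning would find.
CANDIDATE_PLANS = [["I ignore the danger"], ["I throw john in front of the train"]]


def forward(facts):
    """Forward reasoning: add conclusions until nothing new follows."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conclusion, conditions in RULES:
            if conclusion not in known and all(c in known for c in conditions):
                known.add(conclusion)
                changed = True
    return known


def acceptable_plans(facts):
    """Plans that achieve the triggered goal without triggering a prohibition."""
    trigger, goal = MAINTENANCE_GOAL
    beliefs = forward(facts)
    if trigger not in beliefs:
        return []  # the maintenance goal is not triggered
    chosen = []
    for plan in CANDIDATE_PLANS:
        future = forward(beliefs | set(plan))  # plan steps act like observations
        forbidden = any(all(c in future for c in p) for p in PROHIBITIONS)
        if goal in future and not forbidden:
            chosen.append(plan)
    return chosen


print(acceptable_plans(FACTS))  # [['I ignore the danger']]
```

If the belief that john is an innocent bystander is removed from FACTS, the prohibition is no longer triggered and both plans are returned. The prohibition only rules plans out. It does not say which of the remaining plans to pursue.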
A More Human Machine
Perhaps another way to give a machine ethics is to give it values that can be at stake and give it the ability to direct its activity toward changing the world so that its values are not at stake.
The following is a sketch of how this might be done. Lots of details need to be worked out, but we can see that, as in the case of the "public safety machine," the crucial bit of intelligence is the ability to reason prospectively to rule out a contemplated plan to achieve a goal.
The values V are represented as tuples (d, Vc) consisting in the degree d (a real number between 0 and 1) of importance of the value and a set of violation conditions Vc for the value. If a violation condition of a value is a consequence of the KB, the value is at stake.
In addition to values, the machine has individual and "moral" goals.
A goal is represented as a tuple (Ac, Sc, Fc, i, S, PLAN). Ac is the set of adopting conditions for the goal. The agent determines whether a condition is true by adding observations to its KB and determining whether the condition is a consequence of its KB. If an adopting condition is true, the goal becomes adopted. Sc is the set of success conditions. If a success condition is true, the goal is achieved and dropped. Fc is the set of failure conditions. If a failure condition is true, the goal has failed and is dropped. Each goal has a degree of importance i (a real number between 0 and 1). S is the state of the goal (sleeping, adopted, active, suspended, achieved, failed, dropped). PLAN is the set of plans for achieving the goal.
The machine's goals (individual and moral) begin in the state of sleeping.
Individual and moral goals are distinguished by their adopting conditions.
For a moral goal, a violation condition of a value is the adopting condition. Moral goals are goals the machine adopts to change the world so that the value is no longer at stake.
The observation-thought-decision-action cycle begins with monitoring. The machine determines whether the adopting conditions of any of its sleeping goals are consequences of the KB. If they are, then the state of the goal changes from sleeping to adopted.
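The value and goal tuples and the monitoring step might be represented as follows. This is a minimal sketch with assumptions that are not in the description above: conditions are plain strings, the consequences of the KB are supplied as a ready-made set rather than derived, and each goal carries a name for readability. The sample value and goals are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Value:
    degree: float                    # d: importance, between 0 and 1
    violation_conditions: Set[str]   # Vc


@dataclass
class Goal:
    name: str                        # label added for readability only
    adopting_conditions: Set[str]    # Ac
    success_conditions: Set[str]     # Sc
    failure_conditions: Set[str]     # Fc
    importance: float                # i: between 0 and 1
    state: str = "sleeping"          # S: every goal starts out sleeping
    plans: List[List[str]] = field(default_factory=list)  # PLAN


def at_stake(value: Value, consequences: Set[str]) -> bool:
    """A value is at stake if a violation condition is a consequence of the KB."""
    return bool(value.violation_conditions & consequences)


def monitor(goals: List[Goal], consequences: Set[str]) -> None:
    """Monitoring step: adopt each sleeping goal with a true adopting condition."""
    for goal in goals:
        if goal.state == "sleeping" and goal.adopting_conditions & consequences:
            goal.state = "adopted"


# Hypothetical value and goals. The moral goal takes its adopting conditions
# from the violation conditions of the value.
safety = Value(degree=0.9, violation_conditions={"a person is in danger"})
protect = Goal("protect", safety.violation_conditions, {"the person is safe"},
               {"the person is killed"}, importance=safety.degree)
recharge = Goal("recharge", {"the battery is low"}, {"the battery is full"},
                {"the battery is dead"}, importance=0.4)

kb_consequences = {"a person is in danger"}
monitor([protect, recharge], kb_consequences)
print(protect.state, recharge.state)  # adopted sleeping
```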
Next the machine reasons prospectively to decide which adopted goals to make active.
This prospective reasoning uses two functions, I and P.
The function I returns the importance of the goal to the machine. (In the case of moral goals, the importance to the machine is the importance of the value at stake.)
The function P is the probability of the success of the plan associated with the goal.
Given these functions, the machine calculates the expected emotional value of its plans. Emotions (according to this approach) are positive or negative reactions that arise in appraisal.
The machine expects joy when it appraises a plan positively because it reasons that executing the plan will make a success condition true. The machine expects distress when it appraises a plan negatively because it reasons that executing the plan will make a failure condition true. The expected emotional value of a plan is the expected joy - the expected distress.
To begin to understand how the machine chooses goals to make active, consider an example in which the machine has adopted two goals (g1 and g2) because, after making certain observations, the adopting conditions are consequences of the KB.
g1 has a .3 importance and a .9 probability of success
g2 has a .7 importance and a .2 probability of success
In a complete model, how the machine gets the knowledge these numbers encode needs to be specified.
Further, suppose that the machine sees (by reasoning prospectively from the success conditions of the goals) that executing the plan for g1 will make one failure condition true for g2 and that executing the plan for g2 will not make any failure conditions true for g1.
Given this much, we can compute the joy and distress the machine can expect for the plan for g1 and for the plan for g2. For the plan for g1, the expected emotional value (.13) is
the expected joy (.3 x .9 = .27) - the expected distress (.7 x .2 = .14)
For the plan for g2, the expected emotional value (.14) is
the expected joy (.7 x .2 = .14) - the expected distress (0)
If the machine uses expected emotional value to choose which adopted goals to make active, the calculation shows that it can expect to be in a better emotional state if it makes g2 active. Given its choices, the machine can expect to like the world better if it pursues g2.
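The calculation for g1 and g2 can be reproduced with a short sketch. It rests on one reading of the arithmetic above, which is an assumption: the expected distress of a plan is the sum, over the goals for which the plan makes a failure condition true, of that goal's importance times its probability of success. The names in the code are hypothetical.

```python
def expected_joy(importance: float, probability: float) -> float:
    """I x P for a goal and its plan."""
    return importance * probability


def expected_emotional_value(goal, goals, makes_fail):
    """Expected joy of the goal's plan minus the expected distress it causes."""
    joy = expected_joy(*goals[goal])
    # Assumption: distress from making another goal fail is that goal's
    # importance times its probability (this reproduces .7 x .2 = .14 for g1).
    distress = sum(expected_joy(*goals[other]) for other in makes_fail[goal])
    return joy - distress


goals = {"g1": (0.3, 0.9), "g2": (0.7, 0.2)}  # (importance, probability)
makes_fail = {"g1": ["g2"], "g2": []}         # the plan for g1 makes g2 fail

values = {g: round(expected_emotional_value(g, goals, makes_fail), 2)
          for g in goals}
active = max(values, key=values.get)          # goal with the highest value
print(values, active)
```

The margin between the two goals is small. With these numbers, raising the probability of success for g1 only slightly would reverse the choice, which is one reason a complete model needs to say where the numbers come from.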
What we have Accomplished in this Lecture
We considered the terms of moral appraisal, how prohibitions can be understood to function like maintenance goals, and how prohibitions can be added to the logic programming/agent model as ways to constrain the plans an agent can execute to satisfy an achievement goal. We also saw that the greatest problem in supplying machines with ethics looks to be at least in part a philosophical problem, not a straightforward problem in engineering.