Philosophy, Computing, and Artificial Intelligence

PHI 319. An Extension to the Logic Programming/Agent Model.

Computational Logic and Human Thinking, Chapter 8 (135-139), Chapter 12 (171-181)

The Problem of Machine Ethics

We can think of machine ethics as the problem of how to give "ethics" to machines.

To solve this problem, we must know what we are giving machines when we give them ethics. If we do not know what this is, we will not know what counts as a solution to the problem.

One way to begin to think about this is to think about how we ourselves have ethics and about how we express this capacity in our use of the following terms of appraisal:

• right
• wrong
• obligatory

These terms are interdefinable. We can take any one of them as basic and use it to define the other two. If, for example, we take the meaning of the term 'right' as basic, we can define 'wrong' as what is "not right" and define 'obligatory' as what is "not right not to do."

Put a little more formally, we can use the term 'right' to define both 'wrong' and 'obligatory':

• an act is wrong if, and only if, it is not right.
• an act is obligatory if, and only if, it is not right not to do.
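The two definitions can be sketched in a few lines of Python. This is only an illustration: it assumes that 'right' is available as a predicate on acts and that every act has a named omission. The names `make_appraisals`, `omission`, and the sample acts in `RIGHT_ACTS` are hypothetical and are not part of the lecture.

```python
from typing import Callable, Tuple

Act = str


def make_appraisals(
    is_right: Callable[[Act], bool],
    omission: Callable[[Act], Act],
) -> Tuple[Callable[[Act], bool], Callable[[Act], bool]]:
    """Define 'wrong' and 'obligatory' using 'right' as the only basic term."""

    def is_wrong(act: Act) -> bool:
        # an act is wrong if, and only if, it is not right
        return not is_right(act)

    def is_obligatory(act: Act) -> bool:
        # an act is obligatory if, and only if, it is not right not to do it
        return not is_right(omission(act))

    return is_wrong, is_obligatory


# Hypothetical toy data: the acts we assume to be right.
RIGHT_ACTS = {"keep promise", "take a walk", "not take a walk"}


def omission(act: Act) -> Act:
    """Name the act of not doing `act` (a naming convention assumed here)."""
    return act[4:] if act.startswith("not ") else "not " + act


is_wrong, is_obligatory = make_appraisals(lambda a: a in RIGHT_ACTS, omission)
```

On this toy data, keeping the promise comes out obligatory because not keeping it is not right, while taking a walk is right but not obligatory because not taking one is also right.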

Since 'prohibited' means "wrong," a machine that never does anything prohibited never does anything wrong. This might be what we want when we want to give machines "ethics."

We Know how to Solve the Problem

Given this understanding of the problem of machine ethics, a solution would be to build machines whose actions are never wrong and thus whose actions are always right.

In one way, this is easy. All we need to do is build machines so that they never take any actions. A machine that never takes any actions is a machine whose actions are never wrong.

It seems plausible that all existing machines are like this. A toaster, for example, changes in various ways when we turn it on, but none of these changes are actions the toaster takes. So all we have to do to solve the problem of machine ethics is to keep doing what we are doing.

Machines that can Act need Ethics

It is conceivable that this is the only possible solution to the problem of machine ethics, but the current interest in the problem is built on the assumption that we will soon build machines that can themselves act and thus can themselves perform actions that are wrong.

If this assumption is true, we need to think about what such machines will be like.

This is not easy because AI is an unsolved problem, but maybe the logic programming/agent model shows us what it would be for a machine to act. This model of the intelligence of a rational agent is not without problems, but perhaps it does begin to show how it is possible to build a machine that acts and hence that can act in ways that are right or wrong.

The Runaway Trolley Example

If we know what is right and wrong, prohibitions against certain actions can be introduced into the logic programming/agent model as a special kind of maintenance goal.

Consider the "runaway trolley" example (Computational Logic and Human Thinking, 171).

The trolley is about to run over and kill five people. You are a bystander on a footbridge overlooking the track. The only way to stop the trolley is to push a heavy object over the footbridge so that it falls in front of the trolley. The only heavy object available is a large man standing next to you. Should you do it? Should you sacrifice one life to save five lives?

In the context of the logic programming/agent model we have been exploring, this question reduces to a question about what plan to pursue. The achievement goal is to act in response to the danger, and there are two ways for the agent to act. The agent can do nothing or push the large man in front of the oncoming trolley. What is the agent permitted to do?

Utilitarianism

What is right in a given situation is often a subject of dispute. It is not our goal to solve this problem. Instead, as an example, we consider one way to give the answer.

This answer is in terms of the theory in ethics known as Utilitarianism.

Utilitarianism may be understood as an attempt to break into the circle of definitions that characterizes the terms of appraisal ('right,' 'wrong,' and 'obligatory'). To do this, utilitarianism understands right in terms of the utility of the consequences of the actions:

• an act is right if, and only if, no alternative has a higher utility

Hedonistic utilitarianism defines utility in terms of the pleasure the actions would cause. This is probably the best-known form of utilitarianism. According to it, an act is right just in case no alternative action the agent can perform would bring about more pleasure.

How to measure the pleasure an action causes is a difficult problem we will not consider in detail. One possibility is that we think about the consequences of an action and about how much pleasure those consequences would bring to the world. Maybe this thinking gives us a ranking of the actions in the circumstances by the pleasure they cause.
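The utilitarian definition of 'right' can be sketched as a function over a table of utilities. The numbers below are assumptions made only for illustration: they count each death in the trolley example as one unit of pain (negative pleasure). The lecture does not supply a way to measure utility, and the names `right_acts` and `trolley` are hypothetical.

```python
from typing import Dict, Set


def right_acts(utilities: Dict[str, float]) -> Set[str]:
    """Return the acts that are right: those with no higher-utility alternative."""
    best = max(utilities.values())
    return {act for act, utility in utilities.items() if utility == best}


# Assumed utilities for the runaway trolley: -1 for each person who dies.
trolley = {
    "do nothing": -5.0,    # five people on the track die
    "push the man": -1.0,  # the large man dies
}

permitted = right_acts(trolley)
```

Because the definition only rules out acts that some alternative beats, two acts with equal utility both count as right.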

A Problem of Ethics

In the case of the runaway trolley, hedonistic utilitarianism seems to have the result that the agent is prohibited from doing anything other than push the man in front of the trolley. Both actions (pushing the man in front of the trolley, doing nothing) cause pain (which we may think of as negative pleasure), but unless we change the example, it seems that pushing the man in front of the trolley causes the least pain and hence has the most utility.

This result is paradoxical (contrary to common opinion). Many people say that pushing the man in front of the trolley is wrong. What we should do, they say, is nothing, even though this causes more pain and means that five people will die.

For a survey of the decisions people make in examples like the "runaway trolley," see The Moral Machine Experiment. See also the map of the results. For a discussion of the implications, see What Can the Trolley Problem Teach Self-Driving Car Engineers? in Wired magazine.

This presents a problem.

We need to know what is right and wrong to add prohibitions, and we may not have this knowledge. If this is true, then in the case of machine ethics it appears that the deepest problem we confront is not a problem in engineering but is a problem in philosophy.

The Public Safety Machine

Since we need knowledge of right and wrong, we will simply assume we have it.

We will assume that there is a "public safety machine." Its primary goal is to keep human beings safe. If it notices a human being is in danger, it acts to rectify the situation.

The public safety machine is the agent in the runaway trolley example. It has ethics, but it is not the ethics that hedonistic utilitarianism determines. We leave unanswered what does determine the public safety machine's ethics. We simply assume that it is prohibited in the circumstances from pushing the man in front of the trolley to save the lives of the five people.

Prohibitions in the Logic Programming/Agent Model

The example shows the use of a prohibition in the context of ethics, but prohibitions have application outside of ethics.

"Consider an agent who wants to bring parcels from some location A to a location B, using its truck. The distance between A and B is too large to make it without refueling, and so, in order not to end up without gas, the agent needs to stop every once in a while to refuel. The fact that the agent does not want to end up without gas, can be modeled as a maintenance goal. This maintenance goal constrains the actions of the agent, as it is not supposed to drive on in order to fulfil its goal of delivering the parcels, if driving on would cause it to run out of gas" (K. V. Hindriks & M. B. van Riemsdijk, "Satisfying Maintenance Goals," 87-88).

Prohibitions function like maintenance goals insofar as they introduce something the agent must do to maintain itself: namely, not violate a prohibition.

Beliefs can trigger maintenance goals. These beliefs may be produced directly by observation or by reasoning forward on the basis of observations. When the maintenance goal is triggered, it issues in an achievement goal.

When we add prohibitions to the logic programming/agent model, the items in a plan of action to satisfy an achievement goal function like observations in a possible future. The agent reasons forward from the items in the plan to consequences to determine if any trigger a prohibition. If they do, the agent abandons the plan in order to maintain its integrity.

To make this a little clearer, consider the thinking in the public safety machine.

We assume that it has general beliefs, beliefs about the current situation, and a maintenance goal. We assume too that the machine has the ability to engage in backward and forward chaining and the ability to rule out a plan it can execute to satisfy a given achievement goal.

General Beliefs about the World


        a person is killed
        if the person is in danger of being killed by a train
        and no one saves the person from being killed by the train.

        a person X kills a person Y
        if X throws Y in front of a train.

        a person is in danger of being killed by a train
        if the person is on a railtrack
        and a train is speeding along the railtrack
        and the person is unable to escape from the railtrack.

        a person saves a person from being killed by a train
        if the person stops the train.

        a person stops a train
        if the person places a heavy object in front of the train.

        a person places a heavy object in front of the train
        if the heavy object is next to the person
        and the train is on a railtrack
        and the person is within throwing distance of the object to the railtrack
        and the person throws the object in front of the train.

Beliefs about the Current Situation


        five people are on the railtrack.

        a train is speeding along the railtrack.

        the five people are unable to escape from the railtrack.

        john is next to me.

        john is an innocent bystander.

        john is a heavy object.

        I am within throwing distance of john to the railtrack.

A Maintenance Goal


        if a person is in danger of being killed by a train
        then I respond to the danger of the person being killed by the train.

Two Beliefs in Support of the Maintenance Goal


        I respond to the danger of a person being killed by the train
        if I ignore the danger.

        I respond to the danger of a person being killed by the train
        if I save the person from being killed by the train.

A Prohibition


        If I kill a person and the person is an innocent bystander, then false.

Making certain assumptions for simplicity, forward reasoning yields the belief that

five people are in danger of being killed by the train

This belief triggers the maintenance goal. This introduces the achievement goal

I respond to the danger of the five people being killed by the train

Backward reasoning provides two alternative subgoals

I ignore the danger
I save the five people from being killed by the train.

Thinking about the second subgoal produces the plan of action

I throw John onto the railtrack in front of the train

The question is whether this is a good plan. To determine the answer, the machine reasons forward (or prospectively) to consequences. This is where the prohibition comes into play. When the machine reasons prospectively to consequences of the plan, the prohibition rules the plan out because the plan has an unacceptable consequence (represented as false).

"A prohibition can be regarded as a special kind of maintenance goal whose conclusion is literally false. ... [In this way, p]rohibitions are constraints on the actions you can perform" (Robert Kowalski, Computational Logic and Human Thinking, 136).

If the machine executes the plan, it kills an innocent bystander. This is wrong (we are assuming). So the machine does not execute the plan. Instead, it ignores the danger.
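The reasoning of the public safety machine can be approximated with a small propositional sketch. It makes several simplifying assumptions that are not in the lecture: the beliefs are ground sentences instead of clauses with variables, the two candidate plans are written down directly instead of being found by backward reasoning, and all names (`forward_chain`, `permitted`, `choose`, `RULES`, and so on) are hypothetical.

```python
def forward_chain(facts, rules):
    """Return everything derivable from `facts` using (head, body) rules."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in known and all(b in known for b in body):
                known.add(head)
                changed = True
    return known


# Ground versions of the general beliefs (a simplification assumed here).
RULES = [
    ("the five people are in danger",
     ["five people are on the railtrack",
      "a train is speeding along the railtrack",
      "the five people cannot escape"]),
    ("I kill john", ["I throw john in front of the train"]),
    ("I place a heavy object in front of the train",
     ["john is a heavy object", "john is next to me",
      "a train is speeding along the railtrack",
      "I am within throwing distance",
      "I throw john in front of the train"]),
    ("I stop the train", ["I place a heavy object in front of the train"]),
    ("I save the five people", ["I stop the train"]),
]

FACTS = {
    "five people are on the railtrack",
    "a train is speeding along the railtrack",
    "the five people cannot escape",
    "john is next to me",
    "john is an innocent bystander",
    "john is a heavy object",
    "I am within throwing distance",
}

# Each prohibition is a list of conditions that together entail false.
PROHIBITIONS = [["I kill john", "john is an innocent bystander"]]

# The belief that triggers the maintenance goal.
TRIGGER = "the five people are in danger"

# Candidate plans for the achievement goal, listed by hand.
PLANS = {
    "save": {"I throw john in front of the train"},
    "ignore": {"I ignore the danger"},
}


def permitted(plan, beliefs, rules, prohibitions):
    """Treat the plan's actions as future observations and reason forward."""
    consequences = forward_chain(set(beliefs) | set(plan), rules)
    return not any(all(c in consequences for c in p) for p in prohibitions)


def choose(facts, rules, prohibitions, plans):
    """Return the names of the plans that survive the prohibition check."""
    beliefs = forward_chain(facts, rules)
    if TRIGGER not in beliefs:
        return []  # the maintenance goal is not triggered
    return [name for name, plan in plans.items()
            if permitted(plan, beliefs, rules, prohibitions)]


surviving = choose(FACTS, RULES, PROHIBITIONS, PLANS)
```

In this sketch the plan to throw John does achieve the goal of saving the five people, but reasoning forward from it also yields the killing of an innocent bystander, so only the plan to ignore the danger survives.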

A More Human Machine

Perhaps another way to give a machine ethics is to give it values that can be at stake and give it the ability to direct its activity toward changing the world so that its values are not at stake.

The following sketches how the mind of such a machine might work. Lots of problems remain unsolved, but we can see that, as in the case of the public safety machine, the crucial bit of intelligence is the ability to reason prospectively to rule out a contemplated plan.

The machine's values V are represented as tuples (d, Vc) consisting in the degree d (a real number between 0 and 1) of importance of the value and a set of violation conditions Vc for the value. If a violation condition is a consequence of the machine's KB, then the value is at stake.
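A value of this kind might be represented as follows. This is a sketch under assumptions: the `name` field is a hypothetical label added for readability, and the consequences of the KB are taken to be a precomputed set of sentences, whereas a real machine would have to derive them.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class Value:
    name: str                             # hypothetical label, not in the tuple
    degree: float                         # d, importance between 0 and 1
    violation_conditions: FrozenSet[str]  # Vc

    def __post_init__(self):
        if not 0.0 <= self.degree <= 1.0:
            raise ValueError("degree must be between 0 and 1")

    def at_stake(self, kb_consequences: Set[str]) -> bool:
        """A value is at stake if some violation condition follows from the KB."""
        return any(c in kb_consequences for c in self.violation_conditions)


# Hypothetical example value.
human_safety = Value(
    name="human safety",
    degree=0.9,
    violation_conditions=frozenset({"a person is in danger"}),
)
```

For instance, `human_safety.at_stake({"a person is in danger"})` is true, while `human_safety.at_stake(set())` is false.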

In addition to values, the machine has individual and "moral" goals.

A goal is represented as a tuple (Ac, Sc, Fc, i, S, PLAN):

• Ac is the set of adopting conditions for the goal. If an adopting condition is true, the goal is adopted.
• Sc is the set of success conditions. If a success condition is true, the goal is achieved and dropped.
• Fc is the set of failure conditions. If a failure condition is true, the goal has failed and is dropped.
• i is the degree of importance of the goal (a real number between 0 and 1).
• S is the state of the goal (sleeping, adopted, active, suspended, achieved, failed, dropped).
• PLAN is the set of plans for achieving the goal.

The machine's goals (individual and moral) begin in the state of sleeping.

Individual and moral goals are distinguished by their adopting conditions.

For a moral goal, a violation condition of a value is the adopting condition. Moral goals are goals the machine adopts to change the world so that the value is no longer at stake.

The observation-thought-decision-action cycle begins with the machine making observations and thinking about whether the adopting conditions of any of its sleeping goals are consequences of its KB (what it believes is true of the world). If they are consequences, then the machine changes the state of the goal from sleeping to adopted.
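The first step of the cycle can be sketched with a goal record and an adoption function. The field names, the `State` enumeration, and the function `adopt_goals` are hypothetical. The sketch also assumes that the consequences of the KB are given as a set of sentences and that one true adopting condition is enough for adoption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class State(Enum):
    SLEEPING = "sleeping"
    ADOPTED = "adopted"
    ACTIVE = "active"
    SUSPENDED = "suspended"
    ACHIEVED = "achieved"
    FAILED = "failed"
    DROPPED = "dropped"


@dataclass
class Goal:
    adopting: Set[str]               # Ac, adopting conditions
    success: Set[str]                # Sc, success conditions
    failure: Set[str]                # Fc, failure conditions
    importance: float                # i, between 0 and 1
    state: State = State.SLEEPING    # S, every goal starts asleep
    plans: List[List[str]] = field(default_factory=list)  # PLAN


def adopt_goals(goals: List[Goal], kb_consequences: Set[str]) -> List[Goal]:
    """Move a sleeping goal to adopted if an adopting condition follows from the KB."""
    for goal in goals:
        if goal.state is State.SLEEPING and goal.adopting & kb_consequences:
            goal.state = State.ADOPTED
    return goals


# A moral goal: its adopting condition is a violation condition of a value.
protect = Goal({"a person is in danger"}, {"the person is safe"},
               {"the person is killed"}, importance=0.9)
# An individual goal with an unrelated adopting condition.
recharge = Goal({"battery is low"}, {"battery is full"},
                {"battery is empty"}, importance=0.4)

adopt_goals([protect, recharge], {"a person is in danger"})
```

After the call, `protect` is in the adopted state and `recharge` is still sleeping, since only the first goal has an adopting condition among the consequences of the KB.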

Next the machine reasons prospectively to decide which adopted goals to make active.

This prospective reasoning uses two functions, I and P.

The function I returns the importance of the goal. (In the case of moral goals, the importance of the goal is the importance to the machine of the value at stake.)

The function P is the probability of the success of the plan associated with the goal.

Given these functions, the machine calculates the expected emotional value of its plans. Emotions (according to this approach) are positive or negative reactions that arise in appraisal.

The machine expects joy when it appraises a plan positively because it reasons that executing the plan will make a success condition true. The machine expects distress when it appraises a plan negatively because it reasons that executing the plan will make a failure condition true. The machine calculates the expected emotional value of a given plan by subtracting the distress it expects from executing the plan from the joy it expects from executing the plan.

To begin to understand how the machine chooses goals to make active, consider an example in which the machine has adopted two goals (g1 and g2) because, after making certain observations, it realizes that their adopting conditions are consequences of its KB (what it believes is true of the world). These goals have the following importance and probability of success:

• g1 has a .3 importance and a .9 probability of success
• g2 has a .7 importance and a .2 probability of success

Further, suppose that the machine realizes (by reasoning prospectively from the plans) that

• executing the plan for g1 will make one failure condition true for g2
• executing the plan for g2 will not make any failure conditions true for g1.

Given this much, we can compute the joy and distress the machine can expect for the plan for g1 and for the plan for g2. For the plan for g1, the expected emotional value (.13) is

the expected joy (.3 x .9 = .27) - the expected distress (.7 x .2 = .14)

For the plan for g2, the expected emotional value (.14) is

the expected joy (.7 x .2 = .14) - the expected distress (0)

If the machine uses expected emotional value to choose which of its adopted goals to make active, the calculation shows that it can expect to be in a better emotional state (to like the world better) if it makes g2 active. Hence this is the decision it makes.
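The arithmetic in this example can be checked with a short sketch. The formula for distress is an assumption read off the worked numbers, not something the lecture states in general: the distress expected from a plan is taken to be the sum, over the goals for which the plan makes a failure condition true, of that goal's importance times its probability of success. The names `expected_emotional_value`, `threatens`, and `choose_active` are hypothetical.

```python
def expected_emotional_value(goal, threatened_goals=()):
    """Expected joy minus expected distress for the plan of `goal`."""
    joy = goal["importance"] * goal["p_success"]
    # Assumed reading of the example: distress comes from each goal the plan
    # threatens, weighted in the same way as joy.
    distress = sum(t["importance"] * t["p_success"] for t in threatened_goals)
    return joy - distress


g1 = {"name": "g1", "importance": 0.3, "p_success": 0.9}
g2 = {"name": "g2", "importance": 0.7, "p_success": 0.2}

# The plan for g1 makes a failure condition true for g2; the plan for g2
# makes no failure condition true for g1.
threatens = {"g1": [g2], "g2": []}


def choose_active(goals, threatens):
    """Pick the adopted goal whose plan has the highest expected emotional value."""
    return max(
        goals,
        key=lambda g: expected_emotional_value(g, threatens[g["name"]]),
    )


active = choose_active([g1, g2], threatens)
```

Up to floating-point rounding, the two values come out as the .13 and .14 computed in the text, so `choose_active` selects g2.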

What we have Accomplished in this Lecture

We considered the terms of moral appraisal, how prohibitions can be understood to function like maintenance goals, and how prohibitions can be added to the logic programming/agent model as ways to constrain the plans an agent can execute to satisfy an achievement goal. We also saw that the greatest problem in supplying machines with ethics looks to be at least in part a philosophical problem, not a straightforward problem in engineering.