# Logic and Logic Programming

## The Technical Background

*Computational Logic and Human Thinking*, Appendix A1,
Appendix A2, Appendix A3, Appendix A5

**This material looks more difficult than it is. Be patient.
Spend some time thinking about it. It is interesting and beautiful
in a certain way, but it takes some time and effort to appreciate
its interest and beauty.
Don't worry if you don't understand every detail of this lecture.
For the purposes of doing well in the course, you only need to
understand enough to answer the questions posed in the
assignments.
Remember too that you can post questions about assignments to make sure you understand them.
**

Logic programming was developed in an effort to construct a better computer language. Almost all modern computers (computing machines) are based on the work of John von Neumann (1903-1957) and his colleagues in the 1940s. As a practical matter, thinking in terms of a von Neumann computing machine is not particularly natural for most people. This led to an attempt to design languages that abstracted away from the underlying machine so that the language would be a more convenient medium of thought for human beings. Many of the languages developed in the mainstream of early computer science (such as the C programming language) remained heavily influenced by the architecture of the machine, but logic programming is completely different in this respect. It is based on logic, which traditionally has been thought to have an intimate connection with thought.

# Logic Programming and Logic

To understand the relationship between
*logic programming* and *logic*,
the first step is to understand the relation between the two underlying
languages.

**The language of logic programming**

A logic program is itself really just a formula. The formula is a conjunction of clauses, but typically it is written in a way that can make this hard to see.

So, for example, in the logic program from the first lecture

**a ← b, c.
a ← f.
b.
b ← g.
c.
d.
e.**

each line is a clause. The program itself is the conjunction of these clauses.

Each *clause* is a disjunction of literals. **a ∨
¬b ∨ ¬c** is a clause. It is logically equivalent
to **a ← (b ∧ c)** (in Prolog notation, **a ← b, c.**), which is the backward-arrow
way to write the more familiar **(b ∧ c) → a**.

Clauses are one of
the many so-called "normal" forms in logic.

*Literals* are atomic formulas and the negations of atomic formulas.

**a**, **¬b**, **¬c**

Atomic formulas are *positive literals*.

**a**, **b**, **c**

Negations of atomic formulas are *negative literals*.

**¬b**, **¬c**

A *definite clause* contains exactly one positive literal
and zero or more negative literals.

**a** ∨ **¬b ∨
¬c**

A *positive unit clause* is a definite clause containing no
negative literals.

A *negative clause* contains zero or more negative literals
and no positive literals.

The *empty clause* is a negative clause containing no
literals. It is designated by the special symbol **⊥**.

A *Horn clause* is a definite clause or a negative clause.
(Alfred Horn was a
mathematician who described what are now known as "Horn"
clauses.)

An *indefinite clause* is a clause containing at least two
positive literals.

Positive unit clauses are *assertions* or *facts*.
All other definite clauses are *conditional clauses* or
*rules*.

A set of definite clauses whose positive literals share the same
predicate is a *definition* of the predicate. It is also
called a *procedure* for the predicate.

Negative clauses are *queries* or *goal clauses*.

A *logic program* or *knowledge base* (KB) is a conjunction (or set) of
non-negative clauses.

A *definite logic program* is a conjunction (or set) of definite clauses. Any other
program is an *indefinite logic program*. In this course, we are primarily concerned with definite logic programs.
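The definitions above can be checked mechanically. The following is a minimal sketch in Python, not part of the course material. It assumes a clause is given as a collection of literal strings, with a leading `¬` marking a negative literal. The function name `classify` is hypothetical.

```python
def classify(clause):
    """Return the names (from the definitions above) that apply to a clause.

    Assumption for this sketch: a literal is a string; "a" is a positive
    literal and "¬a" is a negative literal.
    """
    literals = set(clause)
    positives = {lit for lit in literals if not lit.startswith("¬")}
    negatives = literals - positives
    labels = []
    if not literals:
        labels.append("empty clause")
    if len(positives) == 1:
        labels.append("definite clause")
        # A positive unit clause is a fact; other definite clauses are rules.
        labels.append("fact" if not negatives else "rule")
    if not positives:
        labels.append("negative clause")
    if len(positives) <= 1:
        labels.append("Horn clause")
    if len(positives) >= 2:
        labels.append("indefinite clause")
    return labels


print(classify(["a", "¬b", "¬c"]))  # ['definite clause', 'rule', 'Horn clause']
print(classify(["b"]))              # ['definite clause', 'fact', 'Horn clause']
print(classify(["¬a"]))             # ['negative clause', 'Horn clause']
```

Note that the empty clause comes out as both a negative clause and a Horn clause, which agrees with the definitions.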

**The language of logic: the propositional calculus**

The definitions of "atomic formulas" and "negations of atomic
formulas" are part of a description of the propositional calculus.

(The propositional calculus is a simplified form of the first-order
predicate calculus, so it helps for understanding (and is traditional) to consider this calculus first.)

Formulas in the propositional calculus are constructed from atomic formulas and truth-functional connectives (¬, ∧, ∨, →). The so-called "atomic" formulas have no parts, hence their name. The atomic formulas correspond to declarative sentences. A key is necessary to define the relationship. We will talk more about the key later.

Given the atomic
formulas, compound formulas are defined as follows. If **φ** and **ψ** are
formulas, then so are

**¬φ**

The formula **¬φ** is the negation of **φ**

Read **¬φ** as "not **φ**"

**(φ ∧ ψ)**

The formula **(φ ∧ ψ)** is the conjunction of **φ** and **ψ**

Read **(φ ∧ ψ)** as "**φ** and **ψ**"

**(φ ∨ ψ)**

The formula **(φ ∨ ψ)** is the disjunction of **φ** and **ψ**.

Read **(φ ∨ ψ)** as "**φ** or **ψ**"

**(φ → ψ)**

The formula **(φ → ψ)** is the implication of **ψ** from **φ**

Read **(φ → ψ)** as "if **φ**, then **ψ**"

In these compound formulas, **φ** and **ψ** may be atomic or compound.
Parentheses are needed to eliminate ambiguity. Outside parentheses
are typically dropped to increase readability.

**An example logic program**

Given all this, we can return again to the example logic program (or knowledge base) we considered in the first lecture:

**
a ← b, c.
a ← f.
b.
b ← g.
c.
d.
e.**

In this program, there are three *rules* and four *facts*. The first
(**a ← b, c**), second (**a ←
f**), and fourth (**b ← g**) entries in the
program are the *rules* in the knowledge base. The other
entries are all *facts*.

From a logical point of view, this logic program is a conjunction of the following clauses

**
a ∨ ¬b ∨ ¬c
a ∨ ¬f
b
b ∨ ¬g
c
d
e**

In this way, a logic program is really just a bunch of formulas of logic. (In this
case, they are formulas in the propositional calculus. A formula
of the form (φ → ψ) is truth-functionally equivalent to (¬φ ∨ ψ), so formulas of these
forms have the same truth conditions. We will consider
formulas of the first-order predicate calculus later.) This way of understanding logic programs is the first and perhaps **most important thing** to know about logic programs.
**A logic program is a way of expressing formulas that themselves are ways of representing the world.**

**Semantics for the propositional calculus**

To know what state of the world a formula represents, it is necessary
to have a *key* or *interpretation* of the symbols of the language.

An
*interpretation*, *f*, is a function from the atomic
formulas to *true* or *false* that is extended to all
the formulas in a way that respects the truth-functional meaning of
the connectives (¬, ∧, ∨, →). The following table displays this function.
It shows, e.g., that
φ is true just in case ¬φ is false. This is what one would expect given that ¬ represents negation.

| φ | ψ | ¬φ | φ ∧ ψ | φ ∨ ψ | φ → ψ |
|---|---|---|---|---|---|
| true | true | false | true | true | true |
| true | false | false | false | true | false |
| false | true | true | false | true | true |
| false | false | true | false | false | true |

Given a set of formulas, an interpretation may be what is called a model. A *model* of a set of formulas is an interpretation in
which all the formulas are true.

We give such interpretations a special name because we are especially interested in them.
The main reason we are interested in them here is that we want to know whether backward chaining can reach a false
output on the basis of true inputs.
(Another reason to take an interest in models (which will be familiar to many of you who have taken a logic class) is that they
characterize certain classes of formulas. So, for example, a formula for which all interpretations
are models is a *tautology*. Its truth is independent of the way the world is.)

Now consider again the example logic program (in the left column) and the corresponding formulas in the propositional calculus (in the right column)

| Logic program | Propositional calculus |
|---|---|
| a ← b, c. | a ∨ ¬b ∨ ¬c |
| a ← f. | a ∨ ¬f |
| b. | b |
| b ← g. | b ∨ ¬g |
| c. | c |
| d. | d |
| e. | e |

To specify a model, we need to specify an interpretation
function that makes all the formulas true. Here is a partial
specification of such an interpretation function, *f*:

*f*(**a**) = true

*f*(**b**) = true

*f*(**c**) = true

*f*(**d**) = true

*f*(**e**) = true

This interpretation makes the clauses in the logic program (the entries in the KB) all true. The world is the way the agent with this KB thinks the world is.
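The notions of interpretation, model, and tautology can be illustrated with a short Python sketch. It is only an illustration under stated assumptions. Formulas are represented as nested tuples, the names `evaluate`, `is_model`, and `is_tautology` are hypothetical, and the values chosen for **f** and **g** (which the partial specification above leaves open) are arbitrary.

```python
from itertools import product

# Assumed representation: an atom is a string; a compound formula is a tuple
# ("not", φ), ("and", φ, ψ), ("or", φ, ψ) or ("implies", φ, ψ).

def evaluate(formula, f):
    """Extend the interpretation f (a dict from atoms to bool) to any formula."""
    if isinstance(formula, str):
        return f[formula]
    op, *parts = formula
    values = [evaluate(part, f) for part in parts]
    if op == "not":
        return not values[0]
    if op == "and":
        return values[0] and values[1]
    if op == "or":
        return values[0] or values[1]
    if op == "implies":
        return (not values[0]) or values[1]
    raise ValueError("unknown connective: " + op)

def atoms(formula):
    """Collect the atomic formulas that occur in a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(atoms(part) for part in formula[1:]))

def is_model(f, formulas):
    """An interpretation is a model of a set of formulas iff all are true in it."""
    return all(evaluate(formula, f) for formula in formulas)

def is_tautology(formula):
    """A tautology is true in every interpretation of its atoms."""
    names = sorted(atoms(formula))
    return all(evaluate(formula, dict(zip(names, values)))
               for values in product([True, False], repeat=len(names)))

# The example KB in clausal form.
kb = [
    ("or", "a", ("or", ("not", "b"), ("not", "c"))),
    ("or", "a", ("not", "f")),
    "b",
    ("or", "b", ("not", "g")),
    "c", "d", "e",
]

# The partial specification above, completed arbitrarily for f and g.
f = {"a": True, "b": True, "c": True, "d": True, "e": True,
     "f": False, "g": False}

print(is_model(f, kb))                            # True
print(is_tautology(("or", "a", ("not", "a"))))    # True
print(is_tautology(("implies", "a", "b")))        # False
```

Setting **f** or **g** to true instead would also give a model, since **a** and **b** are already true.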

**Backward chaining and proofs**

When we pose a query to a KB, we want a positive answer
only if there is a certain relationship between the query and the beliefs in the KB. We want a positive answer
only if it is *rational* to believe the query given the beliefs in the KB, but
we don't know the procedure to compute the answer in these terms.
So we settle for the relation of *logical
consequence*. This is something we can compute.

(This settling for the relation of logical consequence may not seem important, but it is. Logical deduction (deducing logical consequences) is an instance of reasoning, but there is more to reasoning than this. Whether the intelligence of a rational agent can be understood in terms of a model built on logical consequence is an unanswered question and focus of the course.)

Suppose that **P** is a set of the definite clauses constituting
a KB and that the question is whether a query
**a** is a logical consequence of **P**.
In the context of logic programming, the way to answer this question is
to "ask" whether **a** is a logical consequence of the KB. We ask this question by posing the
query **a** to the KB. The query corresponds to the negative clause
**¬a**. This negative clause functions logically as an assumption for a proof by *reductio ad absurdum*.
From a logical point of view, the computation (the process of matching) is an
attempt to derive the empty
clause, ⊥.

(⊥ is an empty disjunction. It is a disjunction with no disjuncts. Since a disjunction is true just in case at least one disjunct is true, ⊥ is false.)

This way of "asking" the KB for an answer is *sound*:
if **P** ∪ {**¬a**} ⊢ ⊥, then **P** ⊨
**a**. If the beliefs in the KB are true, a positive answer means that the query is true.

**P** ∪ {**¬a**} ⊢ ⊥ says that
⊥ is a *logical consequence* of **P** and
**¬a**. This means there is a logical deduction of ⊥ from
**¬a** and the clauses in **P**.

**P** ⊨ **a** says that **P** *logically entails* **a**. This means
that **a** is true in every
model that makes all the clauses in **P** true.

To get a feel for why backward chaining issues in a positive answer only if the KB (or program) entails the truth of the query, consider the following simple logic program

**a ← b.
b.**

The logical form (in the propositional calculus) is

**a ∨ ¬b
b**

Suppose the query posed to the logic program is

**?-a.**

This query is answered by determining whether the corresponding negative clause

**¬a.**

can be added to the KB without contradiction. If it can, then the query is answered negatively. If it cannot, then the query is answered positively.

In this example,
the query **a** matches the head of the rule **a ←
b**. This produces the derived query **b**. The
derived query **b** matches
the fact **b**. Now the list of derived queries is empty. (The empty list
represents the empty clause, which is designated as ⊥.) This causes backward chaining to stop and the query
to be answered positively.

Further, in this example, it is easy to see that any interpretation that makes both
**a ∨ ¬b** and **b** true also makes **a** true. The KB logically entails that **a** is true.
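The matching procedure and the semantic claim can be compared in a small Python sketch. This is an illustration, not the implementation used by any Prolog system. It assumes a program is a list of `(head, body)` pairs, and the names `backward_chain` and `entails` are hypothetical. The naive recursion can loop forever on programs such as `a ← a`.

```python
from itertools import product

def backward_chain(program, goals):
    """Try to reduce the list of derived queries to the empty list.

    program: list of (head, body) pairs; a fact has an empty body.
    goals:   list of atoms still to be solved.
    """
    if not goals:
        return True  # empty list of derived queries = the empty clause ⊥
    first, rest = goals[0], goals[1:]
    for head, body in program:
        # Match the first query against a head; the body replaces it.
        if head == first and backward_chain(program, list(body) + rest):
            return True
    return False

def entails(program, query):
    """Brute-force check that the query is true in every model of the program."""
    names = sorted({head for head, _ in program}
                   | {atom for _, body in program for atom in body}
                   | {query})
    for values in product([True, False], repeat=len(names)):
        f = dict(zip(names, values))
        # A clause head ← body is true iff the head is true or some body atom is false.
        is_model = all(f[head] or not all(f[atom] for atom in body)
                       for head, body in program)
        if is_model and not f[query]:
            return False
    return True

small = [("a", ["b"]), ("b", [])]
print(backward_chain(small, ["a"]), entails(small, "a"))  # True True

kb = [("a", ["b", "c"]), ("a", ["f"]), ("b", []), ("b", ["g"]),
      ("c", []), ("d", []), ("e", [])]
print(backward_chain(kb, ["a"]), entails(kb, "a"))  # True True
print(backward_chain(kb, ["f"]), entails(kb, "f"))  # False False
```

In these examples the two functions agree. Soundness is the claim that a positive answer from the first is always matched by a positive answer from the second.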

**The corresponding proof in the propositional calculus**

Backward chaining that issues in a positive response to a query corresponds to the existence of a
proof in classical logic (= a deduction in classical logic). In the case of the example, the corresponding proof (which looks harder to understand than it is) is set out below
in the form of Gentzen-style natural
deduction.
(Gerhard Gentzen
(November 24, 1909 - August 4, 1945) was a German mathematician and
logician who did the pioneering work on natural deduction.
For an introduction to logic that uses Gentzen-style proofs, see
Neil Tennant's *Natural
Logic*.)

Red marks the rule ( a ∨ ¬b) and the fact (b).
Blue marks the negative clause (¬a) corresponding to the query.
This negative clause is the assumption for *reductio*.

           [¬a] assumption 1    [a] assumption 2
           ------------------------------------- ¬ elimination
                             ⊥
                          ------ absurdity (ex falso quodlibet)
    a ∨ ¬b                  ¬b                      [¬b] assumption 3
    ---------------------------------------------------------------------- ∨ elimination, discharge assumptions 2 and 3
                                   ¬b                                  b
                                   --------------------------------------- ¬ elimination
                                                      ⊥
                                                    ----- ¬ introduction, discharge assumption 1
                                                     ¬¬a
                                                    ------ double negation elimination
                                                      a

This proof is a little easier to understand if it is separated into its three
primary parts.
The first of these parts shows that from the premises a ∨ ¬b (which is the logical form of the first
entry in the KB) and ¬a (which is the negative clause that corresponds to the query),
the conclusion **¬b** is a logical consequence:

                     [¬a] assumption 1
                     --------------------
                            . . .
    a ∨ ¬b
    --------------------------------------------------
                            ¬b

The second part of the proof extends the first. It shows that given the first part of the proof and given b (which is the fact in the KB), it follows that ⊥:

                     [¬a] assumption 1
                     --------------------
                            . . .
    a ∨ ¬b
    ---------------------------------------------------
                            ¬b                            b
                            ----------------------------------------
                                               ⊥

The pattern in the proof (which is used in backward chaining) is that the negation of the query together with a rule or a fact are premises in a proof of a derived query:

    (queries)     (rules and facts)

       ¬a         a ← b  (or: a ∨ ¬b)
        |        /
        |       /
        |      /
        |     /
       ¬b         b
        |        /
        |       /
        |      /
        ⊥

If the derived query is ⊥ (which represents the empty clause), then the initial
query is successful. This success is specified in the final part of the proof by the derivation of **a**.

    . . .
      ⊥
    -----
     ¬¬a
    ------
      a

This shows that backward chaining is sound. It is really just a way of searching for a
*reductio ad absurdum* proof that
a given query is a logical consequence of the KB.

**The representation in clausal form**

One might think that the representation in clausal form is unnecessary. Consider again the logic program

**a ← b.
b.**

with the query

**?-a.**

We can readily see that if backward chaining is successful, there is a proof of the query using conditional elimination:

    b     b → a
    ------------ →E
         a

So one might wonder why it is traditional to think of the proof underlying backward chaining in terms of the program represented as a conjunction of clauses

**a ∨ ¬b
b**

The reason, as I understand it, is primarily historical. Logic programming comes out of work
in automated theorem-proving. In this tradition, the development of a
technique called "resolution" was a major breakthrough. (The seminal paper is
J. A. Robinson's "A Machine-Oriented Logic Based on the Resolution Principle."
*Journal of the ACM*, vol. 12, 1965, 23-41.)

**The resolution rule of deduction**

It is possible to understand the computation in backward chaining in terms of resolution.

The resolution rule in
propositional logic is a derived deduction rule that produces
a new clause from two clauses with complementary
literals. (Literals are complementary
if one is the negation of the other.)
The following is a simple instance of resolution. In the clauses, **a** and **¬a** are
complementary literals.

    a ∨ b     ¬a ∨ c
    ---------------
         b ∨ c

Because the resolution rule is a derived rule, the proofs are shorter. Here is a simple example. The logic program

**a ← b.
b.**

has the logical form

**a ∨ ¬b
b**

The query

**?-a.**

corresponds to the negative clause

**¬a**

Resolution may be applied to the negative clause corresponding to the query and the logical form of the first rule

    (resolution)           |   (queries)    (rules and facts)
                           |
                           |      ¬a        a ← b  (or: a ∨ ¬b)
    ¬a     a ∨ ¬b          |       |       /
    -------------          |       |      /
         ¬b                |      ¬b

The conclusion ¬b represents the derived query. Resolution may be applied again to this derived query and to the fact in the KB

                           |      ¬b        b
    ¬b     b               |       |       /
    ----------             |       |      /
        ⊥                  |       ⊥

Now we have reached the empty query ⊥. The initial query is a consequence of the logic program or KB.

In automated theorem-proving, the question is whether a conclusion is a logical consequence of some set of premises. The first step in determining the answer is to rewrite the premises and conclusion as sets of clauses. (In the automated theorem-proving tradition, clauses are represented as sets.)

The rewriting occurs according to the following rules, which need to be applied in order.

1. Conditionals (C):

φ → ψ ⇒ ¬φ ∨ ψ

2. Negations (N):

¬¬φ ⇒ φ

¬(φ ∧ ψ) ⇒ ¬φ ∨ ¬ψ

¬(φ ∨ ψ) ⇒ ¬φ ∧ ¬ψ

3. Distribution (D):

φ ∨ (ψ ∧ χ) ⇒ (φ ∨ ψ) ∧ (φ ∨ χ)

(φ ∧ ψ) ∨ χ ⇒ (φ ∨ χ) ∧ (ψ ∨ χ)

φ ∨ (φ_{1} ∨ ... ∨ φ_{n}) ⇒ φ ∨ φ_{1} ∨ ... ∨ φ_{n}

(φ_{1} ∨ ... ∨ φ_{n}) ∨ φ ⇒ φ_{1} ∨ ... ∨ φ_{n} ∨ φ

φ ∧ (φ_{1} ∧ ... ∧ φ_{n}) ⇒ φ ∧ φ_{1} ∧ ... ∧ φ_{n}

(φ_{1} ∧ ... ∧ φ_{n}) ∧ φ ⇒ φ_{1} ∧ ... ∧ φ_{n} ∧ φ

4. Sets (S):

φ_{1} ∨ ... ∨ φ_{n} ⇒ {φ_{1}, ... , φ_{n}} (Sets cannot have
a member multiple times. This means, e.g., that **a ∨ b ∨ a** rewrites as **{a, b}**.)

φ_{1} ∧ ... ∧ φ_{n} ⇒ {φ_{1}}, ... , {φ_{n}}
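The rewrite rules above can be sketched in Python. The sketch assumes formulas are nested tuples (atoms are strings) and writes a negative literal as a string with a leading `¬`. The function names are hypothetical. The distribution and set steps are merged into one function for brevity.

```python
def eliminate_conditionals(f):
    """Rule C: rewrite φ → ψ as ¬φ ∨ ψ throughout the formula."""
    if isinstance(f, str):
        return f
    op, *parts = f
    parts = [eliminate_conditionals(part) for part in parts]
    if op == "implies":
        return ("or", ("not", parts[0]), parts[1])
    return (op, *parts)

def push_negations(f):
    """Rule N: remove double negations and apply De Morgan's laws."""
    if isinstance(f, str):
        return f
    op, *parts = f
    if op != "not":
        return (op, *[push_negations(part) for part in parts])
    inner = parts[0]
    if isinstance(inner, str):
        return f  # a negated atom is already a literal
    inner_op, *inner_parts = inner
    if inner_op == "not":
        return push_negations(inner_parts[0])
    if inner_op in ("and", "or"):
        dual = "or" if inner_op == "and" else "and"
        return (dual, *[push_negations(("not", part)) for part in inner_parts])
    raise ValueError("apply eliminate_conditionals first")

def clause_sets(f):
    """Rules D and S: distribute ∨ over ∧ and collect the clauses as sets."""
    if isinstance(f, str):
        return {frozenset([f])}
    op, *parts = f
    if op == "not":
        return {frozenset(["¬" + parts[0]])}
    left, right = clause_sets(parts[0]), clause_sets(parts[1])
    if op == "and":
        return left | right
    # op == "or": every clause on the left joins every clause on the right
    return {l | r for l in left for r in right}

def to_clauses(f):
    return clause_sets(push_negations(eliminate_conditionals(f)))

# ¬(a ∧ b) → c rewrites to the clauses {a, c} and {b, c}.
formula = ("implies", ("not", ("and", "a", "b")), "c")
print(to_clauses(formula) == {frozenset({"a", "c"}), frozenset({"b", "c"})})  # True
```

Because clauses are built as sets, a repeated literal such as the second **a** in **a ∨ b ∨ a** disappears automatically.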

Consider for example the formula **a ∧ (b → c)**.
Based on the rewrite rules, the sets of clauses are **{a}, {¬b, c}**. Suppose we wanted to know if **a**
follows from **a ∧ (b → c)**. We can easily see that it does follow. The proof in classical logic consists in
one application of the deduction rule and-elimination.

    a ∧ (b → c)
    ------------ ∧E
         a

Resolution does not find this proof. Instead, it works as a refutation procedure. We rewrite the premise and the negation of the
conclusion. If
the empty clause **{}** is derivable using the resolution rule

{φ_{1}, ... , χ, ... , φ_{m}}

{ψ_{1}, ... , ¬χ, ... , ψ_{n}}

----------------------------------

{φ_{1}, ... , φ_{m}, ψ_{1}, ..., ψ_{n}}

then the clauses are inconsistent. In this example, the clauses are **{a}, {¬b, c}, {¬a}**.
So it is easy to see that the empty clause is derivable. The proof in classical logic is

    a ∧ (b → c)
    ----------- ∧E
         a          [¬a]^{1}
    --------------------- ¬E
              ⊥
           ------- ¬I,1
             ¬¬a
           ------- ¬¬E
              a

To use resolution in automated theorem-proving, it is necessary to have a control procedure for the steps to determine whether the empty clause is
derivable. Consider, for example, the argument

p

p → q

(p → q) → (q → r)

---------------------

r

One way to construct a resolution proof that the conclusion is a logical consequence of the premises is

1. {p} Premise

2. {¬p, q} Premise

3. {p, ¬q, r} Premise (Note that this is not a definite clause.)

4. {¬q, r} Premise

5. {¬r} Negation of the conclusion

6. {q} 1, 2

7. {r} 4, 6

8. {} 5, 7

This, however, is not the only way to apply the resolution rule. We could have first applied it to 2 and 3, and we
could have done this in different ways. Logic programming was born out of reflection on the question of the control procedure in the case
in which the clauses are definite clauses. The way a query is solved in logic programming incorporates one possible control procedure.
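To make the point about control procedures concrete, here is a deliberately naive one sketched in Python. It applies the resolution rule to every pair of clauses until the empty clause appears or nothing new can be derived. This saturation strategy is an assumption of the sketch, not the control procedure that logic programming uses. The names `negate`, `resolvents`, and `refutable` are hypothetical.

```python
def negate(literal):
    """Return the complementary literal ("a" and "¬a" are complementary)."""
    return literal[1:] if literal.startswith("¬") else "¬" + literal

def resolvents(c1, c2):
    """All clauses obtainable from c1 and c2 by one use of the resolution rule."""
    return {(c1 - {lit}) | (c2 - {negate(lit)})
            for lit in c1 if negate(lit) in c2}

def refutable(clauses):
    """Return True iff the empty clause {} is derivable by resolution."""
    known = {frozenset(clause) for clause in clauses}
    while True:
        new = set()
        for c1 in known:
            for c2 in known:
                for resolvent in resolvents(c1, c2):
                    if not resolvent:
                        return True  # derived the empty clause
                    new.add(resolvent)
        if new <= known:
            return False  # saturated without reaching the empty clause
        known |= new

premises = [{"p"}, {"¬p", "q"}, {"p", "¬q", "r"}, {"¬q", "r"}]
print(refutable(premises + [{"¬r"}]))  # True: r follows from the premises
print(refutable(premises))             # False: the premises alone are consistent
```

The search terminates because only finitely many clauses can be built from a finite set of literals, but it may derive many clauses that play no role in the final refutation.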

**The successful query and its proof**

For the **purposes of this course, it is not important to know all the details** about how
the correspondence between successful queries and the existence of a proof works. The important point is to
know that there is a correspondence. If we pose a query **a** to a KB, and the query is successful,
then there exists a logical deduction of **a** from premises in the KB. The deduction is a
*reductio ad absurdum*. The negation of the query is the assumption for *reductio*. Because
a contradiction
follows from this assumption and premises in the KB, the query is a logical consequence of premises in the KB.

**The language of logic: the first-order predicate calculus**

The language of the propositional calculus is not very expressive. This is a problem because, in the model, the propositional calculus is the language in which the agent has his beliefs. This means that the contents of the agent's beliefs are too simple to be very realistic. Fortunately this shortcoming is easily addressed if we use the language of the first-order predicate calculus.

The *first-order predicate calculus* is more expressive than the
propositional calculus. The first-order calculus allows for the
representation of parts (names, predicates) of sentences. It also allows for the
representation of quantity. So the description of the language, the models, and backward chaining will be correspondingly
more complicated.

The vocabulary of the first-order predicate calculus subdivides
into two parts, a logical and a nonlogical part. The logical part
is common to all first-order theories. It does not change. The
nonlogical part varies from theory to theory.
The logical part of the vocabulary consists in

• the connectives and quantifiers: ¬ ∧ ∨ → ∀ (universal quantifier) ∃ (existential quantifier)

• the comma and the left and right paren: **, ( )**

• a denumerable list of variables: **x_{1}, x_{2}, x_{3}, x_{4}, . . .**

The nonlogical part of the vocabulary consists in

• a denumerable list of constants: **a_{1}, a_{2}, a_{3}, a_{4}, . . .**

• for each *n*, a denumerable list of *n*-place predicates:

**P^{1}_{1}, P^{1}_{2}, P^{1}_{3}, . . .**

**P^{2}_{1}, P^{2}_{2}, P^{2}_{3}, . . .**

**P^{3}_{1}, P^{3}_{2}, P^{3}_{3}, . . .**

.

.

.

Given the vocabulary, a well-formed formula is defined
inductively on the number of connectives. (A *term* is
either a variable or a constant.)

• If **P^{n}** is an *n*-place predicate, and **t_{1}, ..., t_{n}** are terms, then **P^{n}t_{1}, ..., t_{n}** is a well-formed formula.

• If **A** and **B** are well-formed formulas, and **v** is a variable, then **¬A**, **(A ∧ B)**, **(A ∨ B)**, **(A → B)**, **∀vA**, **∃vA** are all well-formed formulas.

• Nothing else is a well-formed formula.

Models for the first-order predicate calculus are a formal representation of what the formulas are about. This allows for the statement of truth-conditions.

• A *model* is an ordered pair <*D*,
*F*>, where *D* is a *domain* and
*F* is an *interpretation*.

The *domain* *D* is a non-empty set. This set
contains the things the formulas are about.

The *interpretation*, *F*, is a function on the
non-logical vocabulary. It gives the meaning of this vocabulary relative to the domain.

For every constant **c**, *F*(**c**) is in *D*. *F*(**c**) is the *referent*
of **c** in the model.

For every **n**-place predicate **P**^{n},
*F*(**P**^{n}) is a subset of
*D*^{n}. *F*(**P**^{n}) is the
*extension* of **P**^{n} in the model.

• An *assignment* is a function from variables to elements
of *D*. A **v***-variant of an assignment*
*g* is an assignment that agrees with *g* except
possibly on **v**.

(Assignments are primarily technical devices. They are required to
provide the truth conditions for the quantifiers, ∀ and ∃.)

The *truth* of a formula relative to a model and an
assignment is defined inductively. The base case uses the composite
function [ ]^{F}_{g} on terms, defined as follows:

[**t**]^{F}_{g} = *F*(**t**) if **t** is a constant. Otherwise, [**t**]^{F}_{g} = *g*(**t**) if **t** is a variable.

The clauses in the inductive definition of *truth relative to M and
g* are as follows:

**P^{n}t_{1}, ..., t_{n}** is *true relative to M and g* iff <[**t_{1}**]^{F}_{g}, ..., [**t_{n}**]^{F}_{g}> is in *F*(**P^{n}**).

**¬A** is *true relative to M and g* iff **A** is not true relative to *M* and *g*.

**A ∧ B** is *true relative to M and g* iff **A** and **B** are true relative to *M* and *g*.

**A ∨ B** is *true relative to M and g* iff **A** or **B** is true relative to *M* and *g*.

**A → B** is *true relative to M and g* iff **A** is not true relative to *M* and *g* or **B** is true relative to *M* and *g*.

**∃vA** is *true relative to M and g* iff **A** is true relative to *M* and *g*\*, for some **v**-variant *g*\* of *g*.

**∀vA** is *true relative to M and g* iff **A** is true relative to *M* and *g*\*, for every **v**-variant *g*\* of *g*.

A formula is *true relative to a model* *M* iff it
is true relative to *M* for every assignment *g*.

A formula is
*first-order valid* iff it is true relative to every model.
∀*xFx* → ∃*xFx* is an example. The double
turnstile (⊨) is used to assert truth in all models. So ⊨
∀*xFx* → ∃*xFx* means that ∀*xFx* →
∃*xFx* is true in every model. The truth of this formula
does not depend on any particular *D* or *F*. It is
true in every model.
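For a finite domain, the inductive definition of truth can be turned directly into a recursive function. The Python sketch below is only an illustration under several assumptions. Formulas are nested tuples, the interpretation is split into a dictionary for constants and a dictionary for predicates, and the small model at the end (with hypothetical names `a1`, `a2`, `a3`, and `P21`) is made up for the example.

```python
def denote(term, F, g):
    """[t]^F_g: F(t) if t is a constant, otherwise g(t) for a variable t."""
    return F["constants"][term] if term in F["constants"] else g[term]

def true_in(formula, D, F, g):
    """Truth of a formula relative to the model <D, F> and the assignment g."""
    op = formula[0]
    if op == "pred":
        _, name, *terms = formula
        return tuple(denote(t, F, g) for t in terms) in F["predicates"][name]
    if op == "not":
        return not true_in(formula[1], D, F, g)
    if op == "and":
        return true_in(formula[1], D, F, g) and true_in(formula[2], D, F, g)
    if op == "or":
        return true_in(formula[1], D, F, g) or true_in(formula[2], D, F, g)
    if op == "implies":
        return (not true_in(formula[1], D, F, g)) or true_in(formula[2], D, F, g)
    if op in ("exists", "forall"):
        _, v, body = formula
        # The v-variants of g are the assignments {**g, v: d} for each d in D.
        results = (true_in(body, D, F, {**g, v: d}) for d in D)
        return any(results) if op == "exists" else all(results)
    raise ValueError("unknown operator: " + op)

# A made-up three-element model with one two-place predicate.
D = {1, 2, 3}
F = {
    "constants": {"a1": 1, "a2": 2, "a3": 3},
    "predicates": {"P21": {(1, 3), (2, 3)}},
}

print(true_in(("pred", "P21", "a2", "a3"), D, F, {}))                   # True
print(true_in(("exists", "x", ("pred", "P21", "x", "a3")), D, F, {}))   # True
print(true_in(("forall", "x", ("pred", "P21", "x", "a3")), D, F, {}))   # False
```

The empty assignment suffices here because the example formulas have no free variables. Checking first-order validity is a different matter, since it quantifies over all models and cannot be done by enumeration.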

**An Example in Prolog Notation**

This description of the first-order predicate calculus is obviously much more complicated than the previous one of the propositional calculus, but it is not necessary to understand every detail. An example stated in the Prolog notation helps show the relationship between the first-order predicate calculus and logic programming.

A *variable* (in Prolog) is a word starting with an upper-case letter. A
*constant* is a word that starts with a lower-case letter. A
*predicate* is a word that starts with a lower-case letter.
Constants and predicate symbols are distinguishable by their
context in a knowledge base. An atomic formula has the form
**p(t_{1},...,t_{n})**, where **p** is a predicate symbol and each **t_{i}** is a term.

Consider the following example based on the movie *Pulp
Fiction*. (This example is from *Representation and
Inference for Natural Language*: *A First Course in
Computational Semantics*, Patrick Blackburn and Johan Bos.)
In the example, various people love other people. Further, there is a rule defining
jealousy. In Prolog notation

*loves(vincent, mia).
loves(marcellus, mia).
loves(pumpkin, honey_bunny).
loves(honey_bunny, pumpkin).*

*jealous(X, Y) :- loves(X, Z), loves(Y, Z).*

There is no significance to the space between the facts and the rule. It is there for readability. The rule is universally quantified. From a logical point of view, it is

∀X ∀Y ∀Z ( ( loves (X, Z) ∧ loves (Y, Z) ) → jealous (X, Y) )

(Note that this sentence is not a formula of Prolog or the first-order predicate calculus. It is a mixed form, meant to be suggestive.)

The facts and rules constitute a knowledge base. Among the
facts, **vincent** and **marcellus** both love **mia**. **pumpkin** and **honey_bunny** love
each other. In the rule, the symbols X, Y, and Z are variables. The
rule itself is general. It says that for every x, y, and z, x is
jealous of y if x loves z and y loves z. Obviously, jealousy in the
real world is different.

To express this knowledge base in the first-order predicate calculus, a key is necessary. The key specifies the meanings of the constants and predicates:

| English | Symbol |
|---|---|
| Vincent | **a_{1}** |
| Marcellus | **a_{2}** |
| Mia | **a_{3}** |
| Pumpkin | **a_{4}** |
| Honey Bunny | **a_{5}** |
| __ loves __ | **P^{2}_{1}** |

Given this key, it is possible to express the entries in the knowledge base. So, for example, the fact that Marcellus loves Mia

*loves(marcellus, mia)*.

is expressed in the first-order predicate calculus (relative to the key) by the formula
**P^{2}_{1}a_{2}, a_{3}**

Next consider the following query:

*?- loves(mia, vincent).*

The response is

*no* (or *false*, depending on
the particular implementation of Prolog)

This response means (when understood in terms of classical logic) that the query is not a logical consequence of the knowledge base. (It does not mean that the query is false. True and false are semantic notions. Logic programming tells whether there exists a proof of the query on the basis of premises taken from the KB.)

For an example of a slightly less trivial query, consider

*?- jealous(marcellus, W).*

This query asks whether Marcellus is jealous of someone, i.e., whether

*∃W jealous (marcellus,W)*

is true. Since Marcellus is jealous of Vincent (given the knowledge base), the response is

*W = vincent*

From a logical point of view, the query is understood (and the answer to the query is computed) in terms of the corresponding negative clause

*¬jealous(marcellus,W)*

This negative clause is read as its universal closure

*∀W ¬jealous(marcellus,W)*

which (by the equivalence of ∀ to ¬∃¬ in classical logic) is equivalent to

*¬∃W jealous(marcellus,W)*

The computation (to answer the query) corresponds to the attempt to refute the universal closure (in the context of the knowledge base or program) by trying to derive the empty clause. Given the KB in the example, the empty clause is derivable

{KB, ∀W ¬jealous(marcellus,W)} ⊢ ⊥

This means that

*∃W jealous (marcellus,W)*

is a consequence of the KB (or logic program). Moreover, the computation results in a witness to this existential truth. So the response to the initial query is

*W = vincent*

Here is the corresponding classical proof:

Red marks the rule and two facts from the KB. Blue marks the negative clause corresponding to the query.
This is the assumption for *reductio*.

    ∀x∀y∀z((loves(x,z) ∧ loves(y,z)) → jealous(x,y))
    ---------------------------------------------------- ∀ E
    ∀y∀z((loves(marcellus,z) ∧ loves(y,z)) → jealous(marcellus,y))
    ---------------------------------------------------------------- ∀ E
    ∀z((loves(marcellus,z) ∧ loves(vincent,z)) → jealous(marcellus,vincent))            loves(marcellus,mia)   loves(vincent,mia)
    ------------------------------------------------------------------------- ∀ E       ------------------------------------------ ∧ I
    (loves(marcellus,mia) ∧ loves(vincent,mia)) → jealous(marcellus,vincent)            loves(marcellus,mia) ∧ loves(vincent,mia)               [∀x¬jealous(marcellus,x)] assumption 1
    ------------------------------------------------------------------------------------------------------------------------------ → E          -------------------------- ∀ E
                                                      jealous(marcellus,vincent)                                                                ¬jealous(marcellus,vincent)
    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ ¬ E
                                                                                        ⊥
                                                                                      ----- ¬ I, discharge assumption 1
                                                                            ¬∀x¬jealous(marcellus,x)

Once again, the proof is easier to understand when it is presented in parts. The first part instantiates the rule (in the KB) defining jealousy to Marcellus, Vincent, and Mia:

```
∀x∀y∀z((loves(x,z) ∧ loves(y,z)) → jealous(x,y))
---------------------------------------------------- ∀ E
∀y∀z((loves(marcellus,z) ∧ loves(y,z)) → jealous(marcellus,y))
---------------------------------------------------------------- ∀ E
∀z((loves(marcellus,z) ∧ loves(vincent,z)) → jealous(marcellus,vincent))
------------------------------------------------------------------------- ∀ E
(loves(marcellus,mia) ∧ loves(vincent,mia)) → jealous(marcellus,vincent)
. .
```

Given the facts (in the KB) that both Marcellus and Vincent love Mia, it follows that Marcellus is jealous of Vincent:

```
                                                                                    loves(marcellus,mia)   loves(vincent,mia)
                                                                                    ------------------------------------------ ∧ I
(loves(marcellus,mia) ∧ loves(vincent,mia)) → jealous(marcellus,vincent)            loves(marcellus,mia) ∧ loves(vincent,mia)
------------------------------------------------------------------------------------------------------------------------------ → E
                                                  jealous(marcellus,vincent)
```

Now, given the query (which takes the form of the assumption for *reductio*), it follows that there is someone such that Marcellus is jealous of him or her:

```
                                  [∀x¬jealous(marcellus,x)] assumption 1
                                  -------------------------- ∀ E
jealous(marcellus,vincent)        ¬jealous(marcellus,vincent)
-------------------------------------------------------------- ¬ E
                            ⊥
                          ----- ¬ I, discharge assumption 1
                ¬∀x¬jealous(marcellus,x)
```

**Unification is Part of the Procedure**

The instantiation of variables is the one complicating factor in logic programming with the first-order
predicate calculus. Now the backward chaining procedure includes what is called
*unification*.

Unification is a procedure for making two terms match.

- A *unifier* of two formulas φ and ψ is any substitution σ such that φσ = ψσ.
- A *substitution* is a replacement of variables by terms. A substitution σ has the form {U₁/t₁, ... , Uₙ/tₙ}, where each Uᵢ is a variable and each tᵢ is a term.
- For a formula φ, φσ is the result of replacing every free occurrence of Uᵢ in φ with tᵢ. φσ is a *substitution instance* of φ.
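These definitions can be made concrete with a small Python sketch. It is an illustration under assumptions, not the algorithm of any particular Prolog system: the term encoding (strings for constants and variables, tuples for compound terms) and the function names `is_var`, `apply_subst`, `occurs`, and `unify` are choices made here.

```python
# Illustrative term encoding (an assumption of this sketch, not from the text):
#   constants and variables are strings; a variable starts with an upper-case
#   letter, as in Prolog. A compound term is a tuple (functor, arg1, ..., argn).

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def apply_subst(t, s):
    """Return the substitution instance of t under s (a dict variable -> term)."""
    if is_var(t):
        return apply_subst(s[t], s) if t in s else t
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply_subst(a, s) for a in t[1:])
    return t  # a constant is left unchanged

def occurs(v, t):
    """True if variable v occurs anywhere inside term t."""
    return v == t or (isinstance(t, tuple) and any(occurs(v, a) for a in t[1:]))

def unify(a, b, s=None):
    """Return a unifier of a and b that extends s, or None if there is none."""
    s = {} if s is None else s
    a, b = apply_subst(a, s), apply_subst(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b) else {**s, a: b}
    if is_var(b):
        return None if occurs(b, a) else {**s, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None  # different constants or functors cannot be unified

phi = ("loves", "X", "mia")
psi = ("loves", "vincent", "Y")
sigma = unify(phi, psi)
print(sigma)                                               # {'X': 'vincent', 'Y': 'mia'}
print(apply_subst(phi, sigma) == apply_subst(psi, sigma))  # True
```

The last line checks the defining property of a unifier directly: applying σ to both formulas yields the same formula.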

Unification is easier to understand in the context of an example.

Consider the two-place relation "above" in the following
simple blocks-world program or KB:

on(b1,b2).

on(b3,b4).

on(b4,b5).

on(b5,b6).

above(X,Y) :- on(X,Y).

above(X,Y) :- on(X,Z), above(Z,Y).

This logic program consists of four facts (about which block is on top of which block) and two rules (that define the "above" relation).

Corresponding to the facts and rules is the following world (use your imagination to see
blocks on top of one another):

          b3
          b4
    b1    b5
    b2    b6

Now suppose the query is whether block *b3* is above block *b5*:

?- above(b3,b5).

The computation to answer (or solve) this query runs roughly as follows. The query is not identical to any fact in the KB, and it is not identical to the head of any rule. There is, though, a substitution that unifies the query with the head of the first rule. The unifying substitution is

{X/b3, Y/b5}

This substitution produces

above(b3,b5) :- on(b3,b5).

So the derived query is

on(b3,b5).

This derived query fails. So now it is necessary to backtrack to see if another match is possible further down in the knowledge base. Another match is possible. The query can be made to match the head of the second rule. The unifying substitution is

{X/b3, Y/b5}

This produces

above(b3,b5) :- on(b3,Z), above(Z,b5).

The derived query is

on(b3,Z), above(Z,b5).

Now the question is whether the first conjunct in this query can be unified with anything in the knowledge base. It can. The unifying substitution for the first conjunct in the derived query is

{Z/b4}

The substitution has to be made throughout the derived query. So, given that the first conjunct has been made to match, the derived query becomes

above(b4,b5).

This can be made to match the head of the first rule for
*above*. The unifying substitution is

{X/b4, Y/b5}

and the derived query is now

on(b4,b5).

This query matches one of the facts in the knowledge base. So the computation is a success!

Here is a more abstract form of the successful computation that indicates the instances of resolution in an effort to derive the empty clause:

    (queries)                       (rules and facts)                         (substitutions)

    ¬above(b3,b5)                   above(X,Y) :- on(X,Z), above(Z,Y)         {X/b3, Y/b5}
                                    above(X,Y) ∨ ¬on(X,Z) ∨ ¬above(Z,Y)
              \                    /
               \                  /
                \                /
    ¬on(b3,Z) ∨ ¬above(Z,b5)        on(b3,b4)                                 {Z/b4}
              \                    /
               \                  /
                \                /
    ¬above(b4,b5)                   above(X,Y) :- on(X,Y)                     {X/b4, Y/b5}
                                    above(X,Y) ∨ ¬on(X,Y)
              \                    /
               \                  /
                \                /
    ¬on(b4,b5)                      on(b4,b5)
              \                    /
               \                  /
                \                /
                        ⊥
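The whole procedure (match a query against the KB from top to bottom, unify, replace the query with the body, and backtrack on failure) can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not how a real Prolog engine is implemented: the tuple encoding of terms, the names `walk`, `unify`, `rename`, `solve`, and `provable`, and the omission of an occurs check are all choices made for this sketch.

```python
import itertools

def is_var(t):
    # Prolog convention: a variable starts with an upper-case letter.
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    """Apply substitution s (dict variable -> term) to term t."""
    if is_var(t):
        return walk(s[t], s) if t in s else t
    if isinstance(t, tuple):
        return (t[0],) + tuple(walk(a, s) for a in t[1:])
    return t

def unify(a, b, s):
    """Extend s so that a and b become equal, or return None (no occurs check)."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

# The blocks-world KB: each clause is (head, body); a fact has an empty body.
KB = [
    (("on", "b1", "b2"), []),
    (("on", "b3", "b4"), []),
    (("on", "b4", "b5"), []),
    (("on", "b5", "b6"), []),
    (("above", "X", "Y"), [("on", "X", "Y")]),
    (("above", "X", "Y"), [("on", "X", "Z"), ("above", "Z", "Y")]),
]

fresh = itertools.count()

def rename(clause):
    """Give the clause's variables fresh names for each use of the clause."""
    n = next(fresh)
    def r(t):
        if isinstance(t, tuple):
            return (t[0],) + tuple(r(a) for a in t[1:])
        return f"{t}_{n}" if is_var(t) else t
    head, body = clause
    return r(head), [r(g) for g in body]

def solve(goals, s):
    """Yield each substitution that proves all goals, trying clauses in KB order."""
    if not goals:
        yield s                      # nothing left to prove: the empty clause
        return
    first, rest = goals[0], goals[1:]
    for clause in KB:                # moving on to the next clause is backtracking
        head, body = rename(clause)
        s2 = unify(first, head, s)
        if s2 is not None:
            yield from solve(body + rest, s2)

def provable(goal):
    return next(solve([goal], {}), None) is not None

print(provable(("above", "b3", "b5")))  # True
print(provable(("above", "b1", "b5")))  # False
```

For the query `above(b3,b5)` the sketch follows the same path as the trace: the first rule leads to the failing query `on(b3,b5)`, and the second rule succeeds through `on(b3,b4)` and `above(b4,b5)`.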

Consider another, even simpler query. The question is whether there is a block
above block *b5*:

?- above(Block,b5).

(A variable in Prolog is a word starting with an upper-case letter or an underscore.)

It is clear that this query can be made to match the head of the
first rule for *above*. The unifying substitution is

{X/Block, Y/b5}.

The derived query is

on(Block,b5).

This can be made to match one of the facts. The unifying
substitution is

{Block/b4}.

Again, the computation is a success! There is something above block
*b5*, and the substitution provides the witness:

Block = b4

    (queries)                       (rules and facts)                         (substitutions)

    ¬above(Block,b5)                above(X,Y) :- on(X,Y)                     {X/Block, Y/b5}
                                    above(X,Y) ∨ ¬on(X,Y)
              \                    /
               \                  /
                \                /
    ¬on(Block,b5)                   on(b4,b5)                                 {Block/b4}
              \                    /
               \                  /
                \                /
                        ⊥
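The witness reported for this query comes from combining the substitutions made along the way and then keeping only the binding for the query's own variable. The following Python sketch shows that bookkeeping. The dictionary encoding and the helper names `apply_subst` and `compose` are assumptions of the sketch.

```python
def apply_subst(t, s):
    """Apply substitution s (dict variable -> term) to term t in a single pass."""
    if isinstance(t, tuple):                      # compound term: (functor, args...)
        return (t[0],) + tuple(apply_subst(a, s) for a in t[1:])
    return s.get(t, t)                            # bound variable, or unchanged

def compose(s1, s2):
    """Return the substitution that has the effect of applying s1, then s2."""
    out = {v: apply_subst(t, s2) for v, t in s1.items()}
    for v, t in s2.items():
        out.setdefault(v, t)                      # keep s2's bindings for other variables
    return out

step1 = {"X": "Block", "Y": "b5"}   # query unified with the head of the first rule
step2 = {"Block": "b4"}             # derived query unified with the fact on(b4,b5)
answer = compose(step1, step2)

query = ("above", "Block", "b5")
print(answer)                              # {'X': 'b4', 'Y': 'b5', 'Block': 'b4'}
print(apply_subst(query, answer))          # ('above', 'b4', 'b5')
print({v: answer[v] for v in ["Block"]})   # {'Block': 'b4'}
```

Only `Block` occurs in the original query, so only its binding is reported as the answer.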

(To test your understanding, set out the tree for the computation for the
query **?- jealous(marcellus,W)**.)

# What we have accomplished

We looked at the relation between logic and logic programming.

We saw that if a query is successful, then the query is a logical consequence of the KB because there is a logical deduction of the query from premises in the KB. To see this, we looked at the language of logic programming and the language of the propositional and first-order predicate calculus. We looked at models and proofs to understand the difference between logical entailment and logical consequence. We looked at the relationship between the backward chaining process in a successful query and the underlying proof in classical logic.