# Logic and Logic Programming

## The Technical Background

*Computational Logic and Human Thinking*, Appendix A1,
Appendix A2, Appendix A3, Appendix A5

**This material looks more difficult than it is. Be patient.
Spend some time thinking about it. It is interesting and beautiful
in a certain way, but it takes some time and effort to appreciate
its interest and beauty.
Don't worry if you don't understand every detail of this lecture.
For the purposes of doing well in the course, you only need to
understand enough to answer the questions posed in the
assignments.
Remember too that you can post questions about assignments to make sure you understand them.
**

Logic programming was developed in an effort to construct a better computer language. Almost all modern computers (computing machines) are based on the work of John von Neumann (1903-1957) and his colleagues in the 1940s. As a practical matter, thinking in terms of a von Neumann computing machine is not particularly natural for most people. This led to attempts to design languages that abstract away from the underlying machine, so that the language is a more convenient medium of thought for human beings. Many of the languages developed in the mainstream of early computer science (such as the C programming language) remained heavily influenced by the architecture of the machine, but logic programming is completely different in this respect. It is based on logic, which has traditionally been taken to have an intimate connection with thought.

# Logic Programming and Logic

To understand the relationship between *logic programming* and
*logic*, the first step is to understand the relation between the two underlying
languages.

**The language of logic programming**

A logic program is itself really just a formula. It is a conjunction of clauses, but typically it is written in a way that can make this hard to see.

So, for example, in the logic program from the first lecture

**a ← b, c.
a ← f.
b.
b ← g.
c.
d.
e.**

each line is a clause. The program itself is the conjunction of these clauses.

Each *clause* is a disjunction of literals. **a ∨ ¬b ∨ ¬c** is a clause. It is logically equivalent
to **a ← (b ∧ c)** (in Prolog notation, **a ← b, c.**), which is the backward-arrow
way to write the more familiar **(b ∧ c) → a**.

Clauses are one of
the many normal forms of the first-order predicate calculus. The
first-order predicate calculus is the standard language of logic.

*Literals* are atomic formulas and the negations of atomic formulas.

**a**, **¬b**, **¬c**

Atomic formulas are *positive literals*.

**a**, **b**, **c**

Negations of atomic formulas are *negative literals*.

**¬b**, **¬c**

A *definite clause* contains exactly one positive literal
and zero or more negative literals.

**a** ∨ **¬b ∨
¬c**

A *positive unit clause* is a definite clause containing no
negative literals.

A *negative clause* contains zero or more negative literals
and no positive literals.

The *empty clause* is a negative clause containing no
literals. It is designated by the special symbol **⊥**.

A *Horn clause* is a definite clause or a negative clause.
(Alfred Horn was a
mathematician who described what are now known as "Horn"
clauses.)

An *indefinite clause* is a clause containing at least two
positive literals.

Positive unit clauses are *assertions* or *facts*.
All other definite clauses are *conditional clauses* or
*rules*.

A set of definite clauses whose positive literals share the same
predicate is a *definition* of the predicate. It is also
called a *procedure* for the predicate.

Negative clauses are *queries* or *goal clauses*.

A *logic program* or *knowledge base* (KB) is a conjunction (or set) of
non-negative clauses.

A *definite logic program* is a conjunction (or set) of definite clauses. Any other
program is an *indefinite logic program*. In this course, we are primarily concerned with definite logic programs.
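
The definitions above can be checked mechanically. The following is a minimal Python sketch, not part of the lecture's material: it assumes a clause is represented as a set of strings, with a leading "¬" marking a negative literal, and the function name `classify` is made up for illustration.

```python
# Assumption of this sketch: a clause is a set of literal strings,
# and a negative literal is written with a leading "¬".

def positives(clause):
    return {lit for lit in clause if not lit.startswith("¬")}

def negatives(clause):
    return {lit for lit in clause if lit.startswith("¬")}

def classify(clause):
    """Return the names from the definitions above that apply to a clause."""
    pos, neg = positives(clause), negatives(clause)
    kinds = []
    if len(pos) == 1:                      # exactly one positive literal
        kinds.append("definite")
        kinds.append("fact" if not neg else "rule")
    if not pos:                            # no positive literals
        kinds.append("negative")
        if not neg:
            kinds.append("empty")
    if len(pos) <= 1:                      # definite or negative
        kinds.append("Horn")
    if len(pos) >= 2:
        kinds.append("indefinite")
    return kinds

print(classify({"a", "¬b", "¬c"}))   # ['definite', 'rule', 'Horn']
print(classify({"b"}))               # ['definite', 'fact', 'Horn']
print(classify({"¬a"}))              # ['negative', 'Horn']
print(classify(set()))               # ['negative', 'empty', 'Horn']
print(classify({"a", "b", "¬c"}))    # ['indefinite']
```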

**The language of logic: the propositional calculus**

The definitions of "atomic formulas" and "negations of atomic
formulas" are part of a description of the propositional calculus.

(The propositional calculus is a simplified form of the first-order
predicate calculus, so it is helpful (and traditional) to consider it first.)

Formulas in the propositional calculus are constructed from (possibly atomic) formulas and truth-functional connectives (¬, ∧, ∨, →). The so-called "atomic" formulas have no parts, hence their name. The atomic formulas correspond to declarative sentences. A key is necessary to define the relationship. We will talk more about the key later.

Given the atomic
formulas, compound formulas are defined as follows. If **φ** and **ψ** are
formulas, then so are

**¬φ**

The formula **¬φ** is the negation of **φ**

Read **¬φ** as "not **φ**"

**(φ ∧ ψ)**

The formula **(φ ∧ ψ)** is the conjunction of **φ** and **ψ**

Read **(φ ∧ ψ)** as "**φ** and **ψ**"

**(φ ∨ ψ)**

The formula **(φ ∨ ψ)** is the disjunction of **φ** and **ψ**.

Read **(φ ∨ ψ)** as "**φ** or **ψ**"

**(φ → ψ)**

The formula **(φ → ψ)** is the implication of **ψ** from **φ**

Read **(φ → ψ)** as "if **φ**, then **ψ**"

In these compound formulas, **φ** and **ψ** may be atomic or compound.
Parentheses are needed to eliminate ambiguity. Outside parentheses
are typically dropped to increase readability.

**An example logic program**

Given all this, we can return again to the example logic program (or knowledge base) we considered in the first lecture:

**
a ← b, c.
a ← f.
b.
b ← g.
c.
d.
e.**

In this program, there are three *rules* and four *facts*. The first
(**a ← b, c**), second (**a ←
f**), and fourth (**b ← g**) entries in the
program are the *rules* in the knowledge base. The other
entries are all *facts*.

From a logical point of view, this logic program is a conjunction of the following clauses

**
a ∨ ¬b ∨ ¬c
a ∨ ¬f
b
b ∨ ¬g
c
d
e**

In this way, a logic program is really just a bunch of formulas of logic. (In this
case, they are formulas in the propositional calculus. A formula
of the form (φ → ψ) is equivalent to (¬φ ∨ ψ), so formulas of these
forms are ways to say the same thing. We will consider
formulas of the first-order predicate calculus later.) This way of understanding logic programs is the first and perhaps **most important thing** to know about logic programs.
**A logic program is a way of expressing formulas that themselves are ways of representing the world.**
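
The translation from the arrow notation to clauses is mechanical. Here is a small Python sketch of it, assuming (for illustration only) that each entry of the program is stored as a pair of a head and a list of body atoms; the name `rule_to_clause` is hypothetical.

```python
def rule_to_clause(head, body):
    """Translate head ← b1, ..., bn into the clause head ∨ ¬b1 ∨ ... ∨ ¬bn."""
    return " ∨ ".join([head] + ["¬" + atom for atom in body])

# The example program: a fact is an entry with an empty body.
program = [
    ("a", ["b", "c"]), ("a", ["f"]), ("b", []), ("b", ["g"]),
    ("c", []), ("d", []), ("e", []),
]

for head, body in program:
    print(rule_to_clause(head, body))
```

Running it prints the seven clauses listed above, one per line.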

**Semantics for the propositional calculus**

To know how the formulas represent the world, it is necessary
to have a *key* or *interpretation* of the symbols of the language.

An
*interpretation*, *f*, is a function from the atomic
formulas to *true* or *false* that is extended to all
the formulas in a way that respects the truth-functional meaning of
the connectives (¬, ∧, ∨, →). The following table displays this function. It shows, e.g., that if
φ is true, then ¬φ is false. This is what one would expect given that ¬ represents negation.

| φ | ψ | ¬φ | φ ∧ ψ | φ ∨ ψ | φ → ψ |
|---|---|---|---|---|---|
| true | true | false | true | true | true |
| true | false | false | false | true | false |
| false | true | true | false | true | true |
| false | false | true | false | false | true |
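
The table can be regenerated from the truth-functional meaning of the connectives. The following Python sketch does so; the function names are chosen here for illustration.

```python
from itertools import product

def neg(p):
    return not p

def conj(p, q):
    return p and q

def disj(p, q):
    return p or q

def implies(p, q):
    # φ → ψ is false only when φ is true and ψ is false
    return (not p) or q

def truth_table():
    """Rows of (φ, ψ, ¬φ, φ ∧ ψ, φ ∨ ψ, φ → ψ), in the order used in the table."""
    return [
        (p, q, neg(p), conj(p, q), disj(p, q), implies(p, q))
        for p, q in product([True, False], repeat=2)
    ]

for row in truth_table():
    print(row)
```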

Given a set of formulas, an interpretation may be such that all the formulas are true. A *model* of a set of formulas is an interpretation in
which all the formulas are true.

We give such interpretations a special name because we are especially interested in them. (If the interpretations tell us how the world is, and the formulas are our beliefs, then all our beliefs are true on an interpretation that makes all the formulas true.)

Now consider again the example logic program (in the left column) and the corresponding formulas in the propositional calculus (in the right column)

| Logic program | Propositional calculus |
|---|---|
| a ← b, c. | a ∨ ¬b ∨ ¬c |
| a ← f. | a ∨ ¬f |
| b. | b |
| b ← g. | b ∨ ¬g |
| c. | c |
| d. | d |
| e. | e |

To specify a model, we need to specify an interpretation
function that makes all the formulas true. Here is a partial
specification of such an interpretation function, *f*:

*f*(**a**) = true

*f*(**b**) = true

*f*(**c**) = true

*f*(**d**) = true

*f*(**e**) = true

This interpretation makes the clauses in the logic program (the entries in the KB) all true.
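
As a check, the following Python sketch tests whether an interpretation is a model of the example KB. The representation of the program as (head, body) pairs is an assumption of the sketch, and so are the values given to **f** and **g**: the specification above is only partial and does not fix them.

```python
# The example KB as (head, body) pairs; a fact has an empty body.
PROGRAM = [
    ("a", ["b", "c"]), ("a", ["f"]), ("b", []), ("b", ["g"]),
    ("c", []), ("d", []), ("e", []),
]

def clause_true(head, body, interp):
    """head ∨ ¬b1 ∨ ... ∨ ¬bn is true iff head is true or some bi is false."""
    return interp[head] or any(not interp[atom] for atom in body)

def is_model(interp, program):
    return all(clause_true(head, body, interp) for head, body in program)

# Values for a..e follow the text. The values for f and g are assumptions
# of this sketch; the clauses come out true whichever values they take.
interp = {"a": True, "b": True, "c": True, "d": True, "e": True,
          "f": False, "g": False}

print(is_model(interp, PROGRAM))                  # True
print(is_model({**interp, "a": False}, PROGRAM))  # False
```

The second call fails because **a ∨ ¬b ∨ ¬c** is false when **a** is false and **b** and **c** are true.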

**Backward Chaining**

When we pose a query to a KB, we want a positive answer
only if there is a certain relation between the query and the beliefs in the KB. We want a positive answer
only if it is rational to believe the query given the beliefs in the KB, but
we don't know the procedure to compute the answer in these terms.
So we settle for the relation of *logical
consequence*. This is something we can compute.

(The extent to which the intelligence of a rational agent can be understood in terms of logical consequence is an unanswered question and the focus of the course.)

Suppose that **P** is a set of definite clauses constituting
a KB and that the question is whether a query
**a** is a logical consequence of **P**.
In the context of logic programming, the way to answer this question is
to "ask" whether **a** is a logical consequence of the KB. We ask this question by posing the
query **a** to the KB. The query corresponds to the negative clause
**¬a**. This negative clause functions logically as an assumption for a proof by *reductio ad absurdum*.
From a logical point of view, the computation (the process of matching) is an
attempt to derive the empty
clause, ⊥.

This way of "asking" the KB for the answer has the property of being *logically sound*:
if **P** ∪ {**¬a**} ⊢ ⊥, then **P** ⊨
**a**.

**P** ∪ {**¬a**} ⊢ ⊥ says that
⊥ is a *logical consequence* of **P** and
**¬a**: there is a logical deduction of ⊥ from
**¬a** and the clauses in **P**.

**P** ⊨ **a** says that **P** *logically entails* **a** and
that **a** is true in every
model that makes all the clauses in **P** true.
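
For a propositional KB, entailment can be checked by brute force, since there are only finitely many interpretations. The sketch below is one way to do this in Python for the example program from the first lecture; it is an illustration of the definition of ⊨, not the procedure a logic programming system uses, and the function names are made up here.

```python
from itertools import product

PROGRAM = [
    ("a", ["b", "c"]), ("a", ["f"]), ("b", []), ("b", ["g"]),
    ("c", []), ("d", []), ("e", []),
]

def atoms_of(program):
    found = set()
    for head, body in program:
        found.add(head)
        found.update(body)
    return found

def models(program, extra_atoms=()):
    """Yield every interpretation that makes all clauses of the program true."""
    atoms = sorted(atoms_of(program) | set(extra_atoms))
    for values in product([True, False], repeat=len(atoms)):
        interp = dict(zip(atoms, values))
        if all(interp[h] or any(not interp[b] for b in body)
               for h, body in program):
            yield interp

def entails(program, query):
    """P ⊨ query: the query is true in every model of P."""
    return all(interp[query] for interp in models(program, [query]))

print(entails(PROGRAM, "a"))   # True
print(entails(PROGRAM, "f"))   # False
```

Saying that **a** is true in every model of **P** is the same as saying that no model of **P** makes **¬a** true, which is why adding **¬a** to the KB and looking for a contradiction answers the question.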

To get a feel for why backward chaining issues in a positive answer only if the KB (or program) entails the query, consider the following simple logic program

**a ← b.
b.**

The logical form (in the propositional calculus) is

**a ∨ ¬b
b**

Suppose the query posed to the logic program is

**?-a.**

This query is answered by determining whether the corresponding negative clause

**¬a.**

can be added to the KB without contradiction. If it can, then the query is answered negatively. If it cannot, then the query is answered positively.

In this example,
the query **a** matches the head of the rule **a ←
b**. This produces the derived query **b**. The
derived query **b** matches
the fact **b**. This produces the empty list as the derived query. (The empty list
represents the empty clause, which is designated as ⊥.) So backward chaining stops, and the query
is answered positively.

Further, in this example, it is easy to see that any interpretation that makes
**a ∨ ¬b** and **b** true makes **a** true. The KB logically entails **a**.
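
The matching procedure just described can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the program is a list of (head, body) pairs tried from top to bottom, the query is a list of atoms, and there is no protection against programs that loop.

```python
def solve(goals, program):
    """Return True if the list of goals can be reduced to the empty list."""
    if not goals:
        return True                      # the empty query: success
    first, rest = goals[0], goals[1:]
    for head, body in program:
        # The first goal matches the head of an entry; the derived query is
        # the body of that entry followed by the remaining goals.
        if head == first and solve(body + rest, program):
            return True
    return False                         # nothing matched: backtrack or fail

simple = [("a", ["b"]), ("b", [])]
print(solve(["a"], simple))              # True

lecture = [
    ("a", ["b", "c"]), ("a", ["f"]), ("b", []), ("b", ["g"]),
    ("c", []), ("d", []), ("e", []),
]
print(solve(["a"], lecture))             # True
print(solve(["f"], lecture))             # False
```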

**The corresponding proof in the propositional calculus**

Backward chaining that issues in a positive response to a query corresponds to the existence of a
proof in logic. In the case of the example, the corresponding proof (which looks harder to understand than it is) is set out below
in the form of Gentzen-style natural
deduction.
(Gerhard Gentzen
(November 24, 1909 - August 4, 1945) was a German mathematician and
logician who did the pioneering work on natural deduction.
For an introduction to logic that uses Gentzen-style proofs, see
Neil Tennant's *Natural
Logic*.)

In the proof, the rule (a ∨ ¬b) and the fact (b) are the premises taken from the KB.
The negative clause (¬a) corresponding to the query
is the assumption for *reductio*.

                [¬a] assumption 1    [a] assumption 2
                ------------------------------------- ¬ elimination
                                  ⊥
                                ------ absurdity (ex falso quodlibet)
    a ∨ ¬b                        ¬b                          [¬b] assumption 3
    ---------------------------------------------------------------------- ∨ elimination, discharge assumptions 2 and 3
                                  ¬b                                    b
                        --------------------------------------------------- ¬ elimination
                                                  ⊥
                                                ----- ¬ introduction, discharge assumption 1
                                                 ¬¬a
                                                ------ double negation elimination
                                                  a

This proof is easier to grasp if it is broken down into its primary parts.

The first is that from a ∨ ¬b (which is the logical form of the first
entry in the KB) and ¬a (which is the negative clause that corresponds to the query),
it follows that **¬b**:

                [¬a] assumption 1
                --------------------
                         .
                         .
                         .
    a ∨ ¬b
    --------------------------------------------------
                         ¬b

In the next part of the proof, we show that given b (which is the fact in the KB), it follows that ⊥:

                [¬a] assumption 1
                --------------------
                         .
                         .
                         .
    a ∨ ¬b
    ---------------------------------------------------
                         ¬b                    b
                         ----------------------------------------
                                          ⊥

The pattern in the proof is that the negation of the query together with a rule or a fact are premises in a proof of a derived query:

    (queries)      (rules and facts)

       ¬a          a ← b (or: a ∨ ¬b)
        |         /
        |        /
        |       /
        |      /
       ¬b          b
        |         /
        |        /
        |       /
        ⊥

If the derived query is ⊥ (which represents the empty query), then the initial
query is successful. This success is specified in the final part of the proof by the derivation of **a**.

      .
      .
      .
      ⊥
    -----
     ¬¬a
    ------
      a

This shows that backward chaining is sound because it is really just a way of searching for a proof that a given query is a logical consequence of the KB.

**The resolution rule of deduction**

Another way to think of the computation in backward chaining
is in terms of the derived deduction rule called *resolution*.

The resolution rule in
propositional logic is a (derived) deduction rule that produces
a new clause from two clauses with complementary
literals. (Literals are complementary
if one is the negation of the other.)
The following is a simple instance of resolution. In the clauses, **a** and **¬a** are
complementary literals.

    a ∨ b    ¬a ∨ c
    ---------------
         b ∨ c

Because the resolution rule is a derived rule, the proofs are shorter. Here is a simple example. The logic program

**a ← b.
b.**

has the logical form

**a ∨ ¬b
b**

The query

**?-a.**

corresponds to the negative clause

**¬a**

Resolution may be applied to the negative clause corresponding to the query and the logical form of the first rule

    (resolution)          |   (queries)     (rules and facts)
                          |
                          |      ¬a         a ← b (or: a ∨ ¬b)
    ¬a      a ∨ ¬b        |       |        /
    -------------         |       |       /
          ¬b              |      ¬b

The conclusion ¬b represents the derived query. Resolution may be applied again to this derived query and to the fact in the KB

                          |      ¬b         b
    ¬b      b             |       |        /
    ----------            |       |       /
         ⊥                |       ⊥

Now we have reached the empty query ⊥. The initial query is a consequence of the logic program or KB.
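
The two resolution steps can be replayed in code. The following Python sketch assumes clauses are represented as frozensets of literal strings (a leading "¬" marks a negative literal); the names `complement` and `resolve` are chosen here for illustration.

```python
def complement(literal):
    """The complementary literal: a for ¬a, and ¬a for a."""
    return literal[1:] if literal.startswith("¬") else "¬" + literal

def resolve(c1, c2):
    """Return every resolvent of two clauses on a pair of complementary literals."""
    resolvents = []
    for literal in c1:
        if complement(literal) in c2:
            resolvents.append(
                frozenset((c1 - {literal}) | (c2 - {complement(literal)}))
            )
    return resolvents

rule = frozenset({"a", "¬b"})    # a ← b
fact = frozenset({"b"})          # b
query = frozenset({"¬a"})        # negative clause for ?- a.

derived = resolve(query, rule)[0]
print(set(derived))              # {'¬b'}
print(resolve(derived, fact))    # [frozenset()]
```

The empty frozenset plays the role of the empty clause ⊥.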

**The First-Order Predicate Calculus**

The first shortcoming concerns the expressiveness of the language of the propositional calculus. This is the language in which the agent has his beliefs. If the language of belief is the propositional calculus, the contents of the agent's beliefs are too simple to be very realistic. Fortunately, this shortcoming is easily addressed if we use the language of the first-order predicate calculus.

The *first-order predicate calculus* is more expressive than the
propositional calculus. The first-order calculus allows for the
representation of parts (names, predicates) of sentences. It also allows for the
representation of quantity. So the description of the language, the models, and backward chaining will be correspondingly
more complicated.

The vocabulary of the first-order predicate calculus subdivides
into two parts, a logical and a nonlogical part. The logical part
is common to all first-order theories. It does not change. The
nonlogical part varies from theory to theory.
The logical part of the vocabulary consists in

• the connectives and quantifiers: ¬ ∧ ∨ → ∀ (universal quantifier) ∃ (existential quantifier)

• the comma and the left and right paren: **, ( )**

• a denumerable list of variables: $x_1, x_2, x_3, x_4, \ldots$

The nonlogical part of the vocabulary consists in

• a denumerable list of constants: $a_1, a_2, a_3, a_4, \ldots$

• for each $n$, a denumerable list of $n$-place predicates:

$P^1_1, P^1_2, P^1_3, \ldots$

$P^2_1, P^2_2, P^2_3, \ldots$

$P^3_1, P^3_2, \ldots$

$\vdots$

Given the vocabulary, a well-formed formula is defined
inductively on the number of connectives. (A *term* is
either a variable or a constant.)

• If $P^n$ is an $n$-place predicate, and $t_1, \ldots, t_n$ are terms, then $P^n t_1, \ldots, t_n$ is a well-formed formula.

• If **A** and **B** are well-formed formulas, and **v** is a variable, then **¬A**, **(A ∧ B)**, **(A ∨ B)**, **(A → B)**, **∀vA**, **∃vA** are all well-formed formulas.

• Nothing else is a well-formed formula.

Models for the first-order predicate calculus are a formal representation of what the formulas are about. This allows for the statement of truth-conditions.

• A *model* is an ordered pair <*D*,
*F*>, where *D* is a *domain* and
*F* is an *interpretation*.

The *domain* *D* is a non-empty set. This set
contains the things the formulas are about.

The *interpretation*, *F*, is a function on the
non-logical vocabulary. It gives the meaning of this vocabulary relative to the model.

*F*(**c**) is in *D* and is the *referent*
of **c** in the model.

For every $n$-place predicate $P^n$,
$F(P^n)$ is a subset of
$D^n$. $F(P^n)$ is the
*extension* of $P^n$ in the model.

• An *assignment* is a function from variables to elements
of *D*. A **v**-*variant* of an assignment
*g* is an assignment that agrees with *g* except
possibly on **v**.

Assignments are primarily technical devices. They are required to
provide the truth conditions for the quantifiers, ∀ and ∃.

The *truth* of a formula relative to a model and an
assignment is defined inductively. The base case uses the composite
function $[\ ]^F_g$ on terms,
defined as follows:

$[t]^F_g = F(t)$ if $t$ is a constant. Otherwise, if
$t$ is a variable, $[t]^F_g = g(t)$.

The clauses in the inductive definition of *truth relative to M and
g* are as follows:

$P^n t_1, \ldots, t_n$ is *true relative to M and g* iff $\langle [t_1]^F_g, \ldots, [t_n]^F_g \rangle$ is in $F(P^n)$.

**¬A** is *true relative to M and g* iff **A** is not true relative to *M* and *g*.

**A ∧ B** is *true relative to M and g* iff **A** and **B** are true relative to *M* and *g*.

**A ∨ B** is *true relative to M and g* iff **A** or **B** is true relative to *M* and *g*.

**A → B** is *true relative to M and g* iff **A** is not true relative to *M* and *g* or **B** is true relative to *M* and *g*.

**∃vA** is *true relative to M and g* iff **A** is true relative to *M* and $g^*$, for some **v**-variant $g^*$ of *g*.

**∀vA** is *true relative to M and g* iff **A** is true relative to *M* and $g^*$, for every **v**-variant $g^*$ of *g*.

A formula is *true relative to a model* *M* iff it
is true relative to *M* for every assignment *g*.

A formula is
*first-order valid* iff it is true relative to every model.
∀*xFx* → ∃*xFx* is an example. The double
turnstile (⊨) is used to assert truth in all models. So ⊨
∀*xFx* → ∃*xFx* means that ∀*xFx* →
∃*xFx* is true in every model. The truth of this formula
does not depend on any particular *D* or *F*. It is
true in every model.
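
For a finite domain, the truth definition can be turned directly into a small evaluator. The Python sketch below is an illustration under several assumptions that are not part of the lecture: formulas are nested tuples such as `("forall", "x", ("pred", "F", ("x",)))`, a single dictionary plays the role of *F* for both constants and predicates, and the final check only searches models with at most three elements, so it can refute validity but cannot establish it.

```python
from itertools import product

def denote(term, interp, g):
    """[t]^F_g: constants are looked up in F, variables in the assignment g."""
    return interp[term] if term in interp else g[term]

def is_true(formula, domain, interp, g):
    """Truth of a formula relative to the model <domain, interp> and assignment g."""
    op = formula[0]
    if op == "pred":
        _, name, args = formula
        return tuple(denote(t, interp, g) for t in args) in interp[name]
    if op == "not":
        return not is_true(formula[1], domain, interp, g)
    if op == "and":
        return (is_true(formula[1], domain, interp, g)
                and is_true(formula[2], domain, interp, g))
    if op == "or":
        return (is_true(formula[1], domain, interp, g)
                or is_true(formula[2], domain, interp, g))
    if op == "imp":
        return (not is_true(formula[1], domain, interp, g)
                or is_true(formula[2], domain, interp, g))
    if op in ("exists", "forall"):
        _, v, body = formula
        # {**g, v: d} ranges over the v-variants of g as d ranges over the domain
        results = (is_true(body, domain, interp, {**g, v: d}) for d in domain)
        return any(results) if op == "exists" else all(results)
    raise ValueError(f"unknown operator: {op}")

Fx = ("pred", "F", ("x",))
formula = ("imp", ("forall", "x", Fx), ("exists", "x", Fx))

def true_in_all_small_models(formula, max_size=3):
    """Try every extension of F over domains of size 1..max_size."""
    for n in range(1, max_size + 1):
        domain = list(range(n))
        for bits in product([False, True], repeat=n):
            extension = {(d,) for d, b in zip(domain, bits) if b}
            if not is_true(formula, domain, {"F": extension}, {}):
                return False
    return True

print(true_in_all_small_models(formula))   # True
```

The requirement that the domain be non-empty matters here: the search starts at domains of size 1.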

**An Example in Prolog Notation**

This description of the first-order predicate calculus is obviously much more complicated than the previous one of the propositional calculus, but it is not necessary to understand every detail. An example stated in the Prolog notation helps show the relation between the first-order predicate calculus and logic programming.

A *variable* (in Prolog) is a word starting with an upper-case letter. A
*constant* is a word that starts with a lower-case letter. A
*predicate* is a word that starts with a lower-case letter.
Constants and predicate symbols are distinguishable by their
context in a knowledge base. An atomic formula has the form
$p(t_1, \ldots, t_n)$, where $p$ is a predicate symbol and each $t_i$ is a term.

Consider the following example based on the movie *Pulp
Fiction*. (This example is from *Representation and
Inference for Natural Language: A First Course in
Computational Semantics*, Patrick Blackburn and Johan Bos.)
In the example, various people love other people. Further, there is a rule defining
jealousy. In Prolog notation

*loves(vincent, mia).
loves(marcellus, mia).
loves(pumpkin, honey_bunny).
loves(honey_bunny, pumpkin).*

*jealous(X, Y) :- loves(X, Z), loves(Y, Z).*

There is no significance to the space between the facts and the rule. It is there for readability. The rule is universally quantified. From a logical point of view, it is

∀X ∀Y ∀Z ( ( loves (X, Z) ∧ loves (Y, Z) ) → jealous (X, Y) )

(Note that this sentence is not a formula of Prolog or the first-order predicate calculus. It is a mixed form, meant to be suggestive.)

The facts and rules constitute a knowledge base. Among the facts, Vincent and Marcellus both love Mia. Pumpkin and Honey-Bunny love each other. In the rule, the symbols X, Y, and Z are variables. The rule itself is general. It says that for every x, y, and z, x is jealous of y if x loves z and y loves z. Obviously, jealousy in the real world is different.

To express this knowledge base in the first-order predicate calculus, a key is necessary. The key specifies the meanings of the constants and predicates:

| Expression | Symbol |
|---|---|
| Vincent | $a_1$ |
| Marcellus | $a_2$ |
| Mia | $a_3$ |
| Pumpkin | $a_4$ |
| Honey Bunny | $a_5$ |
| \_\_ loves \_\_ | $P^2_1$ |

Given this key, it is possible to express the entries in the knowledge base. So, for example, the fact that Marcellus loves Mia

*loves(marcellus, mia).*

is expressed in the first-order predicate calculus (relative to the key) by the formula

$P^2_1 a_2, a_3$

Consider the following query:

*?- loves(mia, vincent).*

The response is

*no* (or *false*, depending on
the particular implementation of Prolog)

This response means (when understood in terms of classical logic) that the query is not a logical consequence of the knowledge base. (It does not mean that the query is false. True and False are semantic notions. Logic programming tells whether there exists a proof of the query on the basis of premises taken from the KB.)

For an example of a slightly less trivial query, consider

*?- jealous(marcellus, W).*

This query asks whether Marcellus is jealous of someone, i.e., whether

*∃W jealous (marcellus,W)*

is true. Since Marcellus is jealous of Vincent (given the knowledge base), the response is

*W = vincent*
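
The way the rule produces this answer can be imitated in Python. The sketch below is only an illustration: it hard-codes the *loves* facts as a list of pairs and reads the rule for *jealous* as two nested searches through that list, tried in the order in which the facts are written.

```python
LOVES = [
    ("vincent", "mia"),
    ("marcellus", "mia"),
    ("pumpkin", "honey_bunny"),
    ("honey_bunny", "pumpkin"),
]

def jealous_of(x):
    """Yield each W such that x loves some Z and W loves the same Z."""
    for lover, z in LOVES:
        if lover == x:                 # loves(x, Z)
            for w, z2 in LOVES:
                if z2 == z:            # loves(W, Z)
                    yield w

print(next(jealous_of("marcellus")))   # vincent
print(list(jealous_of("marcellus")))   # ['vincent', 'marcellus']
```

The first witness found is *vincent*. Continuing the search also yields *marcellus*, because nothing in the rule requires X and Y to be different people.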

From a logical point of view, the query corresponds to (and the answer to the query is computed in terms of) the negative clause

*¬jealous(marcellus,W)*

This negative clause is read as its universal closure

*∀W ¬jealous(marcellus,W)*

which (by the equivalence of ∀ to ¬∃¬ in classical logic) is equivalent to

*¬∃W jealous(marcellus,W)*

The computation (to answer the query) corresponds to the attempt to refute the universal closure (in the context of the knowledge base or program) by trying to derive the empty clause. Given the KB in the example, the empty clause is derivable

{KB, ∀W ¬jealous(marcellus,W)} ⊢ ⊥

This means that

*∃W jealous (marcellus,W)*

is a consequence of the KB (or logic program). Moreover, the computation results in a witness to this existential truth. So the response to the initial query is

*W = vincent*

Here is the corresponding classical proof:

The rule and the two facts from the KB are the premises. The negative clause corresponding to the query
is the assumption for *reductio*.

    ∀x∀y∀z((loves(x,z) ∧ loves(y,z)) → jealous(x,y))
    ---------------------------------------------------- ∀ E
    ∀y∀z((loves(marcellus,z) ∧ loves(y,z)) → jealous(marcellus,y))
    ---------------------------------------------------------------- ∀ E
    ∀z((loves(marcellus,z) ∧ loves(vincent,z)) → jealous(marcellus,vincent))        loves(marcellus,mia)  loves(vincent,mia)
    ------------------------------------------------------------------------- ∀ E   ------------------------------------------ ∧ I
    (loves(marcellus,mia) ∧ loves(vincent,mia)) → jealous(marcellus,vincent)        loves(marcellus,mia) ∧ loves(vincent,mia)        [∀x¬jealous(marcellus,x)] assumption 1
    ---------------------------------------------------------------------------------------------------------------------------- → E   -------------------------- ∀ E
                                    jealous(marcellus,vincent)                                                                          ¬jealous(marcellus,vincent)
                                    ---------------------------------------------------------------------------------------------------- ¬ E
                                                                                 ⊥
                                                                               ----- ¬ I, discharge assumption 1
                                                                     ¬∀x¬jealous(marcellus,x)

Once again, the proof is easier to understand when it is presented in parts. The first part instantiates the rule (in the KB) defining jealousy to Marcellus, Vincent, and Mia:

```
∀x∀y∀z((loves(x,z) ∧ loves(y,z)) → jealous(x,y))
---------------------------------------------------- ∀ E
∀y∀z((loves(marcellus,z) ∧ loves(y,z)) → jealous(marcellus,y))
---------------------------------------------------------------- ∀ E
∀z((loves(marcellus,z) ∧ loves(vincent,z)) → jealous(marcellus,vincent))
------------------------------------------------------------------------- ∀ E
(loves(marcellus,mia) ∧ loves(vincent,mia)) → jealous(marcellus,vincent)
. .
```

Given the facts (in the KB) that both Marcellus and Vincent love Mia, it follows that Marcellus is jealous of Vincent:

```
                                                                              loves(marcellus,mia)  loves(vincent,mia)
                                                                              ------------------------------------------ ∧ I
(loves(marcellus,mia) ∧ loves(vincent,mia)) → jealous(marcellus,vincent)      loves(marcellus,mia) ∧ loves(vincent,mia)
---------------------------------------------------------------------------------------------------------------------------- → E
                                        jealous(marcellus,vincent)
```

Now, given the query (which takes the form of the assumption for *reductio*), it follows that there is someone such that Marcellus is jealous of him or her:

```
                                   [∀x¬jealous(marcellus,x)] assumption 1
                                   -------------------------------------- ∀ E
jealous(marcellus,vincent)         ¬jealous(marcellus,vincent)
---------------------------------------------------------------------------------------------------- ¬ E
                                   ⊥
                                 ----- ¬ I, discharge assumption 1
                       ¬∀x¬jealous(marcellus,x)
```

**Unification is Part of the Procedure**

The variables are the one complicating factor in logic programming with the first-order
predicate calculus. Now the backward chaining procedure includes what is called
*unification*.

Unification is a procedure for making two terms match.

- A *unifier* of two formulas φ and ψ is any substitution, σ, such that φσ = ψσ.
- A *substitution* is a replacement of variables by terms. A substitution σ has the following form $\{U_1/t_1, \ldots, U_n/t_n\}$, where $U_i$ is a variable and $t_i$ is a term, for all $i$.
- For a formula φ, φσ is the replacement of every free occurrence of $U_i$ in φ with $t_i$. φσ is a *substitution instance* of φ.
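
These definitions are short enough to state in code. The following Python sketch assumes an atomic formula is a tuple such as `("above", "X", "Y")`, a substitution is a dictionary from variables to terms, and variables are the words that start with an upper-case letter (the Prolog convention). The function names are hypothetical.

```python
def is_variable(term):
    return term[0].isupper()

def apply_subst(atom, sigma):
    """Return the substitution instance of the atom: each variable U_i becomes t_i."""
    predicate, *args = atom
    return (predicate, *[sigma.get(t, t) if is_variable(t) else t for t in args])

def is_unifier(sigma, atom1, atom2):
    """σ is a unifier of φ and ψ iff φσ = ψσ."""
    return apply_subst(atom1, sigma) == apply_subst(atom2, sigma)

sigma = {"X": "b3", "Y": "b5"}
print(apply_subst(("above", "X", "Y"), sigma))                        # ('above', 'b3', 'b5')
print(is_unifier(sigma, ("above", "X", "Y"), ("above", "b3", "b5")))  # True
print(is_unifier({}, ("above", "X", "Y"), ("above", "b3", "b5")))     # False
```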

Unification is easier to understand in the context of an example. Consider the two-place relation "above" in the following
simple blocks-world program or KB:

on(b1,b2).

on(b3,b4).

on(b4,b5).

on(b5,b6).

above(X,Y) :- on(X,Y).

above(X,Y) :- on(X,Z), above(Z,Y).

This logic program consists in four facts (about which block is on top of which block) and two rules (that define the "above" relation).

Corresponding to the facts and rules is the following world (use your imagination to see
blocks on top of one
another):

          b3
          b4
    b1    b5
    b2    b6

Now suppose the query is whether block *b3* is above
block *b5*

?- above(b3,b5).

The computation to answer (or solve) this query runs roughly as
follows. The query does not match the head of any fact. Nor does it match the head
of any rule.
It is clear, though, that there is a substitution that unifies this query
and the head of the first rule. The unifying substitution is

{X/b3, Y/b5}

and the derived query is

on(b3,b5).

This derived query fails. So now it is necessary to backtrack to
see if another match is possible further down in the knowledge
base. Another match is possible. The query can be made
to match the head of the second rule. The unifying substitution
is

{X/b3, Y/b5}

and the derived query is

on(b3,Z), above(Z,b5).

Now the question is whether the first conjunct in this query can be
unified with anything in the knowledge base. It can. The
unifying substitution for the first conjunct in the derived query
is

{Z/b4}

The substitution has to be made throughout the derived query. So,
given that the first conjunct has been made to match, the derived
query becomes

above(b4,b5).

This can be made to match the head of the first rule for
*above*. The unifying substitution is

{X/b4, Y/b5}

and the derived query is now

on(b4,b5).

This query matches one of the facts in the knowledge base. So the
computation is a success!

Here is a more abstract form of the successful computation that indicates the instances of resolution in an effort to derive the empty clause:

    (queries)                      (rules and facts)                        (substitutions)

    ¬above(b3,b5)                  above(X,Y) :- on(X,Z), above(Z,Y)        {X/b3, Y/b5}
                                   above(X,Y) ∨ ¬on(X,Z) ∨ ¬above(Z,Y)
          \                       /
           \                     /
            \                   /
    ¬on(b3,Z) ∨ ¬above(Z,b5)       on(b3,b4)                                {Z/b4}
          \                       /
           \                     /
            \                   /
    ¬above(b4,b5)                  above(X,Y) :- on(X,Y)                    {X/b4, Y/b5}
                                   above(X,Y) ∨ ¬on(X,Y)
          \                       /
           \                     /
            \                   /
    ¬on(b4,b5)                     on(b4,b5)
          \                       /
           \                     /
            \                   /
                  ⊥

Consider another even simpler query, whether there is a block
above block *b5*

?- above(Block,b5). (A variable in Prolog is a word starting with an upper-case letter.)

It is clear that this query can be made to match the head of the
first rule for *above*. The unifying substitution is

{X/Block, Y/b5}.

The derived query is

on(Block,b5).

This can be made to match one of the facts. The unifying
substitution is

{Block/b4}.

Again, the computation is a success! There is something above block
*b5*, and the substitution provides the witness:

Block = b4

    (queries)              (rules and facts)            (substitutions)

    ¬above(Block,b5)       above(X,Y) :- on(X,Y)        {X/Block, Y/b5}
                           above(X,Y) ∨ ¬on(X,Y)
          \               /
           \             /
            \           /
    ¬on(Block,b5)          on(b4,b5)                    {Block/b4}
          \               /
           \             /
            \           /
                 ⊥
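
Both computations can be reproduced with a very small backward chainer. The Python sketch below is an illustration, not an account of how a real Prolog system is implemented: atoms are tuples, terms are only constants and variables (no function symbols), variables start with an upper-case letter, and each clause is renamed with a numeric suffix before use so that its variables do not clash with those in the query. All names are made up for the sketch.

```python
from itertools import count

# The blocks-world KB as (head, body) pairs; a fact has an empty body.
KB = [
    (("on", "b1", "b2"), []),
    (("on", "b3", "b4"), []),
    (("on", "b4", "b5"), []),
    (("on", "b5", "b6"), []),
    (("above", "X", "Y"), [("on", "X", "Y")]),
    (("above", "X", "Y"), [("on", "X", "Z"), ("above", "Z", "Y")]),
]

def is_variable(term):
    return term[0].isupper()

def walk(term, subst):
    """Follow variable bindings until a constant or an unbound variable is reached."""
    while is_variable(term) and term in subst:
        term = subst[term]
    return term

def unify(atom1, atom2, subst):
    """Extend subst so that the two atoms match, or return None if they cannot."""
    if atom1[0] != atom2[0] or len(atom1) != len(atom2):
        return None
    subst = dict(subst)
    for s, t in zip(atom1[1:], atom2[1:]):
        s, t = walk(s, subst), walk(t, subst)
        if s == t:
            continue
        if is_variable(s):
            subst[s] = t
        elif is_variable(t):
            subst[t] = s
        else:
            return None          # two different constants
    return subst

fresh = count(1)

def rename(head, body):
    """Give the clause fresh variable names for this use."""
    n = next(fresh)
    def r(atom):
        return (atom[0], *[f"{t}_{n}" if is_variable(t) else t for t in atom[1:]])
    return r(head), [r(a) for a in body]

def solve(goals, subst):
    """Backward chaining: yield each substitution that empties the goal list."""
    if not goals:
        yield subst
        return
    first, rest = goals[0], goals[1:]
    for clause in KB:            # entries are tried from top to bottom
        head, body = rename(*clause)
        unified = unify(first, head, subst)
        if unified is not None:
            yield from solve(body + rest, unified)

def answers(query, variable=None):
    """Yield True per proof, or the value found for the named variable."""
    for subst in solve([query], {}):
        yield walk(variable, subst) if variable else True

print(any(answers(("above", "b3", "b5"))))                # True
print(next(answers(("above", "Block", "b5"), "Block")))   # b4
```

Asking for all the answers to the second query gives *b4* and then *b3*, since *b3* is also above *b5*.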

(To test your understanding, set out the tree for the computation for the
query *?- jealous(marcellus, W)*.)