Logic and Logic Programming

The Technical Background


Computational Logic and Human Thinking, A1-A3 (251-283), A5 (290-300)
Thinking as Computation, Chapter 2 (23-39)



This material looks more difficult than it is. Be patient. Spend some time thinking about it. It is interesting and beautiful in a certain way, but it takes some time and effort to appreciate its interest and beauty.

Don't worry if you don't understand every detail of this lecture. For the purposes of doing well in the course, you only need to understand enough to answer the questions posed in the assignments.

Remember too that you can post questions about assignments to make sure you understand them.


Logic programming was developed in an effort to construct a better computer language. Almost all modern computers (computing machines) are based on the work of John von Neumann (1903-1957) and his colleagues in the 1940s. As a practical matter, thinking in terms of a von Neumann computing machine is not particularly natural for most people. This led to an attempt to design languages that abstracted away from the underlying machine so that the language would be a more convenient medium of thought for human beings. Many of the languages developed in the mainstream of early computer science (such as the C programming language) remained heavily influenced by the architecture of the machine, but logic programming is completely different in this respect. It is based on logic, which traditionally has been thought to be connected with rational thought.

(It is worth thinking about what logic is, what thinking is in a rational agent, and how they are connected.)

The term 'logic' here refers to the primary example of modern logic: the first-order predicate calculus. This is the logic that comes out of the work of Gottlob Frege (1848-1925) and others who developed it to understand mathematics. Logic programming has its basis in this logic, and we are using logic programming to model (a part of) the "mind" of a rational agent.


The Languages of Logic and Logic Programming

To understand the relationship between logic and logic programming, a first step is to understand the relation between the two underlying languages.


The language of logic programming

A logic program is itself really just a formula of logic. This formula is a conjunction of clauses, but typically it is written in a way that can make this hard to see.

So, for example, in the logic program I set out in the first lecture

a ← b, c.
a ← f.
b.
b ← g.
c.
d.
e.

each line is a clause. The program itself is the formula that is the conjunction of these clauses.


Some Definitions

Logic programming employs a host of definitions. In this course, it is not necessary to memorize these definitions. I provide the list because I use some of the terms in the lectures, and they are common in the literature on logic programming. The only really important terms to understand are "logic program" and the terms used in its definition.

• A clause is a disjunction of literals.
a ∨ ¬b ∨ ¬c is a clause. It is logically equivalent to a ← (b ∧ c) (in Prolog notation, a ← b, c.), which is the backward-arrow way of writing (b ∧ c) → a.

• Literals are atomic formulas and the negations of atomic formulas.
a is an atomic formula. ¬b, ¬c are negations of atomic formulas.

• Atomic formulas are positive literals.
a, b, c

• Negations of atomic formulas are negative literals.
¬b, ¬c

• A definite clause contains exactly one positive literal and zero or more negative literals.
a ∨ ¬b ∨ ¬c

• A positive unit clause is a definite clause containing no negative literals.

• A negative clause contains zero or more negative literals and no positive literals.

• An empty clause is a negative clause containing no literals. It is designated by the special symbol ⊥.

• A Horn clause is a definite clause or a negative clause. (Alfred Horn was a mathematician who described what are now known as "Horn" clauses.)

• An indefinite clause is a clause containing at least two positive literals.

• Positive unit clauses are facts. All other definite clauses are rules.

• A set of definite clauses whose positive literals share the same predicate is a definition of the predicate. It is also called a procedure for the predicate.

• Negative clauses are queries or goal clauses.

• A logic program is a conjunction (or set) of non-negative clauses.

• A definite logic program is a conjunction (or set) of definite clauses. Any other program is an indefinite logic program.
(In this course, we are primarily concerned with definite logic programs.)
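For those who are interested, here is a small sketch in Python of some of these definitions. The representation is my own (a clause is a set of literal strings, with negative literals written with ¬); it is only meant to make the vocabulary concrete.

def is_positive(lit):
    return not lit.startswith("¬")

def is_definite(clause):
    # exactly one positive literal
    return sum(is_positive(l) for l in clause) == 1

def is_fact(clause):
    # a positive unit clause
    return is_definite(clause) and len(clause) == 1

def is_rule(clause):
    # any other definite clause
    return is_definite(clause) and len(clause) > 1

def is_goal(clause):
    # a negative clause (a query)
    return not any(is_positive(l) for l in clause)

print(is_rule({"a", "¬b", "¬c"}))   # True: a ← b, c
print(is_fact({"b"}))               # True: b
print(is_goal({"¬a"}))              # True: the query ?- a.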


The language of logic: the propositional calculus

The definitions of "atomic formulas" and "negations of atomic formulas" are part of a description of the propositional calculus. The propositional calculus is a simplified form of the first-order predicate calculus, so it is traditional in philosophy to consider the propositional calculus as an introduction to the first-order predicate calculus.

Formulas in the propositional calculus are constructed from atomic formulas and truth-functional connectives (¬, ∧, ∨, →). The so-called "atomic" formulas have no parts, hence their name. The atomic formulas represent declarative sentences. A key is necessary to define the relationship between the atomic formulas and the sentences they represent. We will talk more about the key later.

It is a theory in the philosophy of language that declarative sentences express propositions. (We saw propositions in the first lecture. In the sentence "Tom knows that Socrates died in 399 BCE," the nominalized sentence "that Socrates died in 399 BCE" refers to the proposition that the sentence says Tom knows.)

Given the atomic formulas, compound formulas are defined as follows. If φ and ψ are formulas, then so are

¬φ
The formula ¬φ is the negation of φ
Read ¬φ as "not φ"

(φ ∧ ψ)
The formula (φ ∧ ψ) is the conjunction of φ and ψ
Read (φ ∧ ψ) as "φ and ψ"

(φ ∨ ψ)
The formula (φ ∨ ψ) is the disjunction of φ and ψ.
Read (φ ∨ ψ) as "φ or ψ"

(φ → ψ)
The formula (φ → ψ) is the conditional (implication) with antecedent φ and consequent ψ
Read (φ → ψ) as "if φ, then ψ"

In these compound formulas, φ and ψ may be atomic or compound. Parentheses are needed to eliminate ambiguity. Outside parentheses are typically dropped to increase readability. In this course, we will not consider in detail how to use these rules to construct formulas.


An example logic program

Given this information about the two languages, we can return again to the example logic program we considered in the first lecture:

a ← b, c.
a ← f.
b.
b ← g.
c.
d.
e.

In this program, there are three rules and four facts. The first entry (a ← b, c) is a rule, and so are the second (a ← f) and the fourth (b ← g). The other entries are all facts.

This logic program is written as a list, but really it is a conjunction of clauses. Further, each clause is a formula in the propositional calculus. We can see this more clearly if we keep in mind that (φ → ψ) is truth-functionally equivalent to (¬φ ∨ ψ). Given this equivalence, the logic program from the first lecture (stated above) is the conjunction of the following formulas:

a ∨ ¬b ∨ ¬c
a ∨ ¬f
b
b ∨ ¬g
c
d
e

So a logic program is really just a bunch of formulas of logic. (In this case, they are formulas in the propositional calculus. We will consider formulas of the first-order predicate calculus later.) This is the first and most important thing to know about logic programs. A logic program is a conjunction of formulas that themselves are ways of representing the world.

Here is the second most important thing to know about logic programs. A rational agent has beliefs about the world in terms of which he or she acts. The collection of these beliefs is what we call the agent's "knowledge base" (or "KB"). In the model of the intelligence of a rational agent we are developing, we model the agent's knowledge base as a logic program.


Semantics for the propositional calculus

To know what state of the world a formula represents, it is necessary to have a key for the symbols of the language. A key assigns the symbols meanings. This is necessary to model how the agent's beliefs, understood as a logic program, are true or false relative to the way the world is. This is accomplished by constructing an interpretation function relative to the key.

An interpretation function is a function, f, from the atomic formulas to true or false that is extended to all the formulas in a way that respects the truth-functional meanings the key assigns to the connective symbols (¬, ∧, ∨, →). The following table displays a part of four such functions. It shows, e.g., that φ is true just in case ¬φ is false.


φ        ψ      ¬φ      φ ∧ ψ   φ ∨ ψ   φ → ψ     

true    true    false   true    true    true    
true    false   false   false   true    false   
false   true    true    false   true    true    
false   false   true    false   false   true

Given values for φ and ψ, formulas constructed out of these formulas receive values according to the truth-functions for the connectives (¬, ∧, ∨, →).

Consider the simplest case in which φ and ψ are atomic formulas. Suppose, according to the key, φ and ψ stand for "The sun is shining" and "It is raining." An interpretation function arbitrarily assigns truth-values (true and false) to these atomic sentences and assigns truth-values to formulas constructed out of these atomic sentences according to the truth-functions displayed in the table. Suppose it assigns true to the atomic formula (φ) that corresponds to "The sun is shining." Then it assigns false to the negation (¬φ) that corresponds to "It is not the case that the sun is shining."
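For those who are interested, here is a minimal sketch in Python of how an interpretation function is extended from the atomic formulas to the compound formulas. The representation (nested tuples) and the names are mine; the point is only to show that the extension is fixed by the truth-functions in the table.

interp = {"sun": True, "rain": False}      # an arbitrary assignment to the atomic formulas

def value(formula):
    # extend the assignment to compound formulas via the truth-functions
    if isinstance(formula, str):           # an atomic formula
        return interp[formula]
    op, *parts = formula
    if op == "not":
        return not value(parts[0])
    if op == "and":
        return value(parts[0]) and value(parts[1])
    if op == "or":
        return value(parts[0]) or value(parts[1])
    if op == "imp":
        return (not value(parts[0])) or value(parts[1])

print(value(("not", "sun")))               # False: the negation of a true formula is false
print(value(("imp", "rain", "sun")))       # True: a conditional with a false antecedent is true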


A model of a set of formulas

Given a set of formulas, an interpretation may be what in logic is called a model. A model of a set of formulas is an interpretation function that makes all the formulas true. (This is a use of the term 'model' in the context of logic. It is not what we mean when we say that a logic program is a model of an agent's knowledge base.)

Notice that a model need not correspond to reality. A model may assign true to an atomic formula that, given the key, is false in the actual world. So, e.g., it might assign true to the atomic formula corresponding to "The sun is shining," even though "The sun is shining" is false because it is cloudy and raining outside.

We give such interpretation functions a special name because we are especially interested in them. The reason we are interested in them here is that we want to know whether backward chaining can reach a false output on the basis of true inputs. To know this, we need to know the conditions under which the formulas in a logic program are true or false.

(Another reason to take an interest in models (which will be familiar to many of you who have taken a class in logic) is that they characterize certain classes of formulas. So, for example, a formula for which all interpretations are models is a tautology. Its truth is independent of the way the world is. a ∨ ¬a is an example. It is true no matter the truth-value assigned to a.)

Why are we interested in whether backward chaining can reach a false output on the basis of true inputs?

In thinking about what to do, rational agents reason in terms of their beliefs. In the model of the intelligence of a rational agent we are developing, we model this reasoning as backward chaining. (We represent the agent's beliefs as a logic program, and we represent the agent's reasoning in terms of his or her beliefs in terms of backward chaining on this logic program.) If we know whether backward chaining can reach a false output on the basis of true inputs, we have some information about how reliable backward chaining is and hence how good the model is.


A model of the formulas in a logic program

Now consider again the example logic program (on one side of the vertical line) and the corresponding formulas in the propositional calculus (on the other side):


a ← b, c.     |      a ∨ ¬b  ∨ ¬c 
a ← f.        |      a ∨ ¬f
b.            |      b
b ← g.        |      b ∨ ¬g
c.            |      c
d.            |      d
e.            |      e

To specify a model, we need to specify an interpretation function that makes all the formulas true. Here is a partial specification of such an interpretation function, f:

f(a) = true
f(b) = true
f(c) = true
f(d) = true
f(e) = true

This interpretation, f, makes all the clauses in the logic program true. (Given the interpretation function, the world is the way the agent with this KB (represented in terms of the logic program) thinks the world is.) We can see that this interpretation function makes all the clauses in the program true by looking at the form the clauses take in the propositional calculus. So, e.g., we can see that the first clause (a ← b, c) is true because, according to the way an interpretation function is constructed, we know that a disjunction (a ∨ ¬b ∨ ¬c) with at least one true disjunct (a) is true.
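For those who are interested, here is a quick sketch in Python of this check. Each clause is represented as a set of literal strings, and an interpretation is a model just in case every clause contains at least one literal the interpretation makes true. The specification of f above is partial (it says nothing about f and g), so the sketch completes it arbitrarily; any completion would do, since a and b are already true.

interp = {"a": True, "b": True, "c": True, "d": True, "e": True,
          "f": False, "g": False}          # f and g completed arbitrarily

clauses = [{"a", "¬b", "¬c"}, {"a", "¬f"}, {"b"}, {"b", "¬g"},
           {"c"}, {"d"}, {"e"}]

def literal_true(lit):
    return not interp[lit[1:]] if lit.startswith("¬") else interp[lit]

# every clause has at least one true disjunct, so interp is a model of the program
print(all(any(literal_true(l) for l in clause) for clause in clauses))   # True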


Backward chaining answers questions

When we pose a query to a KB, we get a positive answer only if there is a certain relationship between the query and the beliefs in the KB. We want a positive answer only if it is rational to believe the query given the beliefs in the KB, but unfortunately we do not know of a procedure to compute the answer in these terms.

So we settle for the relation of logical consequence between the beliefs in the KB and the query. This is something we can compute.

(This settling for the relation of logical consequence may not seem important, but it is. Logical deduction (deducing logical consequences) is an instance of reasoning, but there is more to reasoning than this. Whether the intelligence of a rational agent can be understood in terms of a model built on logical consequence is an unanswered question and focus of the course.)

Suppose that P is a set of the definite clauses constituting a KB and that the question is whether a query a is a logical consequence of P. In the context of logic programming, the way to answer this question is to "ask" whether a is a logical consequence of the KB. We ask this question by posing the query a to the KB. The query corresponds to the negative clause ¬a. This negative clause functions logically as an assumption for a proof by reductio ad absurdum. From a logical point of view, the computation (the process of matching) is an attempt to derive the empty clause, ⊥.

(⊥ is an empty disjunction. It is a disjunction with no disjuncts. Since a disjunction is true just in case at least one disjunct is true, ⊥ is false.)


How good are the answers to the question?

This way of "asking" and getting an "answer" from the KB is sound. This "asking" gets a positive "answer" only if the KB logically entails the truth of the query. A positive answer does not mean that the query is true. It means that the query is true on all the interpretations that make all the beliefs in the KB true.

More formally: if P ∪ {¬a} ⊢ ⊥, then P ⊨ a.

P ∪ {¬a} ⊢ ⊥ says that ⊥ is derivable from P together with ¬a. This means there is a logical deduction of ⊥ on the basis of ¬a and the clauses in P.

P ⊨ a says that P logically entails a. This means that a is true on every interpretation that makes all the clauses in P true.

So if the agent has all true beliefs (= every belief in his KB is true), then a query is answered positively only if it is true too. Given the way an answer to the query is computed, there is no case in which an interpretation makes the beliefs in the KB true but fails to make a positively answered query true.

It is not necessary for us in this course to understand the proof that backward chaining is sound. The proof is technical and beyond the scope of this course, but it is easy enough to get a feel for why backward chaining issues in a positive answer only if the KB logically entails the truth of the query. Consider the following simple logic program

a ← b.
b.

The logical form (in the propositional calculus) is

a ∨ ¬b
b

Suppose the query posed to the logic program is

?-a.

This query is answered by determining whether the corresponding negative clause

¬a.

can be added to the KB without contradiction. If it can, then the query is answered negatively. If it cannot, then the query is answered positively.

The matching procedure we saw in the first lecture determines whether the negative clause can be added to the KB without contradiction. The first step is to see whether the query matches the head of any entry in the KB. It does match a head. The query a matches the head of the rule a ← b. This produces the derived query b. The derived query b matches the fact b. Now the list of derived queries is empty. (The empty list represents the empty clause, which is designated as ⊥.) This causes backward chaining to stop and the query to be answered positively.

Further, in this example, it is easy to see that any interpretation that makes both a ∨ ¬b and b true also makes a true. The KB logically entails that a is true.
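For those who are interested, here is a rough sketch in Python of the matching procedure for propositional definite clauses. The representation is mine (each entry is a head paired with a possibly empty body); it is an illustration of the idea, not Prolog itself.

program = [("a", ["b", "c"]),   # a ← b, c.
           ("a", ["f"]),        # a ← f.
           ("b", []),           # b.
           ("b", ["g"]),        # b ← g.
           ("c", []),           # c.
           ("d", []),           # d.
           ("e", [])]           # e.

def solve(queries):
    # succeed when the list of derived queries becomes empty
    if not queries:
        return True                                # the empty clause ⊥ has been reached
    first, rest = queries[0], queries[1:]
    for head, body in program:                     # try the entries from the top down
        if head == first and solve(body + rest):   # the body becomes the derived query
            return True
    return False                                   # no match anywhere: fail

print(solve(["a"]))   # True: the query ?- a. succeeds
print(solve(["g"]))   # False: g matches no head and no fact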


The corresponding proof in the propositional calculus

Backward chaining that issues in a positive response to a query corresponds to the existence of a proof in classical logic (= a deduction in classical logic). In the case of the example, the corresponding proof (which looks harder to understand than it is) is set out below in the form of Gentzen-style natural deduction. (Gerhard Gentzen (1909-1945) was a German mathematician and logician who did the pioneering work on natural deduction. For an introduction to logic that uses Gentzen-style proofs, see Neil Tennant's Natural Logic.) For this course, it is not necessary to understand the corresponding classical proof. The proof, though, is not all that hard to understand, so I set it out for those who are interested.

In the proof, red marks the rule (a ∨ ¬b) and the fact (b). Blue marks the negative clause (¬a) corresponding to the query. This negative clause is the assumption for reductio.



  [¬a] assumption 1  [a] assumption 2          
  -------------------------------------  ¬ elimination      
                       ⊥
                    ------ absurdity (ex falso quodlibet)
  a ∨ ¬b              ¬b                                     [¬b] assumption, 3          
  ----------------------------------------------------------------------  ∨ elimination, discharge assumptions 2 and 3
                                 ¬b                            b
                           --------------------------------------- ¬ elimination 
                                                    ⊥
                                                  ----- ¬ introduction, discharge assumption 1
                                                   ¬¬a
                                                  ------ double negation elimination
                                                     a


This proof may be divided into three parts. The first of these parts shows that from the premises a ∨ ¬b (which is the logical form of the first entry in the logic program) and ¬a (which is the negative clause that corresponds to the query), the conclusion ¬b is a logical consequence:



                  [¬a]  assumption 1           
                 --------------------       
                            .
                            .
                            .
                    
  a ∨ ¬b                                       
  --------------------------------------------------  
                                ¬b             
 
                             

The second part of the proof extends the first. It shows that given the first part of the proof and given b (which is the fact in the KB), it follows that ⊥:



              [¬a]  assumption 1           
            --------------------       
                        .
                        .
                        .
                    
  a ∨ ¬b                                       
  ---------------------------------------------------
                                 ¬b                           b        
                            ----------------------------------------
                                                  ⊥
                                              


The pattern in these two parts of the proof corresponds to the matching procedure in backward chaining. The negative clause (¬a) that is the logical form of the query (a) together with a rule or a fact are premises in a proof of a negative clause (¬b) that is the logical form of the derived query (b):



             (rules and facts)

    ¬a      a ← b  (or: a ∨ ¬b)
     |     /
     |    /
     |   /
     |  /
    ¬b     b
     |    /
     |   /
     |  / 
     ⊥

   

If the negative clause is ⊥ (which corresponds to having no more derived queries), then the initial query is successful. That is to say, the backward chaining process stops when there are no more derived queries and returns a positive answer to the initial query (a). This success is specified in the final part of the proof by the derivation of a.

	
	                                          .
	                                          .
	                                          .
	                                          
	                                          ⊥
	                                        ----- 
                                                 ¬¬a
                                                ------ 
                                                  a
	
	

In this way, backward chaining is really just a way of searching for a reductio ad absurdum proof that a given query is a logical consequence of the logic program.



Logic Programming and Automated Theorem-Proving

Logic programming comes out of work in automated theorem-proving. In this tradition, the development of a technique called "resolution" was a major breakthrough. (The seminal paper is J. A. Robinson's "A Machine-Oriented Logic Based on the Resolution Principle." Journal of the ACM, vol. 12, 1965, 23-41.)

(For this course, it is not necessary to understand how logic programming comes out of automated theorem-proving. It is illuminating, though, and not hard to understand.)

The resolution rule in propositional logic is a derived deduction rule that produces a new clause from two clauses with complementary literals. (Literals are complementary if one is the negation of the other.) The following is a simple instance of resolution. In the clauses, a and ¬a are complementary literals.


  a ∨ b   ¬a ∨ c
  ---------------
      b ∨ c

Because the resolution rule is a derived rule, the proofs are shorter. Here is a simple example. The logic program

a ← b.
b.

has the logical form

a ∨ ¬b
b

The query

?-a.

corresponds to the negative clause

¬a

Resolution may be applied to the negative clause corresponding to the query and to the first rule

      
(resolution)                     (rules and facts)
      
                    |      ¬a    a ← b  (or: a ∨ ¬b)
¬a     a ∨ ¬b       |       |   / 
-------------       |       |  /
    ¬b              |       ¬b

The conclusion ¬b represents the derived query. Resolution may be applied again to ¬b and to the fact in the KB


                     |       ¬b    b
¬b       b           |        |   / 
----------           |        |  /
    ⊥                |        ⊥

Now we have reached ⊥, so the initial query is a consequence of the logic program.
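For those who are interested, here is a minimal sketch of the resolution rule in Python, with a clause represented as a set of literal strings (the representation is mine). The two calls reproduce the two resolution steps just described.

def complement(lit):
    return lit[1:] if lit.startswith("¬") else "¬" + lit

def resolve(c1, c2, lit):
    # resolve c1 and c2 on the complementary pair lit / complement(lit)
    return (c1 - {lit}) | (c2 - {complement(lit)})

step1 = resolve({"¬a"}, {"a", "¬b"}, "¬a")   # ¬a with a ∨ ¬b gives ¬b
step2 = resolve(step1, {"b"}, "¬b")          # ¬b with b gives the empty clause
print(step1, step2)                          # {'¬b'} set()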


Resolution and automated theorem-proving

In the context of automated theorem-proving, the question is whether a given conclusion is a logical consequence of a given set of premises. The first step in determining the answer is to rewrite the premises and conclusion as sets of clauses. (In the automatic theorem-proving tradition, clauses are represented as sets.)

The rewriting occurs according to the following rules, which need to be applied in order. (A code sketch of the whole rewriting appears after the list.)

1. Conditionals (C):
       φ → ψ   ⇒   ¬φ ∨ ψ

2. Negations (N):
      ¬¬φ   ⇒   φ
      ¬(φ ∧ ψ)   ⇒   ¬φ ∨ ¬ψ
      ¬(φ ∨ ψ)   ⇒   ¬φ ∧ ¬ψ

3. Distribution (D):
      φ ∨ (ψ ∧ χ)   ⇒   (φ ∨ ψ) ∧ (φ ∨ χ)
      (φ ∧ ψ) ∨ χ   ⇒   (φ ∨ χ) ∧ (ψ ∨ χ)
      φ ∨ (φ1 ∨ ... ∨ φn)   ⇒   φ ∨ φ1 ∨ ... ∨ φn
      (φ1 ∨ ... ∨ φn) ∨ φ   ⇒   φ1 ∨ ... ∨ φn ∨ φ
      φ ∧ (φ1 ∧ ... ∧ φn)   ⇒   φ ∧ φ1 ∧ ... ∧ φn
      (φ1 ∧ ... ∧ φn) ∧ φ   ⇒   φ1 ∧ ... ∧ φn ∧ φ

4. Sets (S):
      φ1 ∨ ... ∨ φn   ⇒   {φ1, ... , φn}       (Sets cannot have a member multiple times. This means, e.g., that a ∨ b ∨ a rewrites as {a, b}.)
      φ1 ∧ ... ∧ φn   ⇒   {φ1}, ... , {φn}
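For those who are interested, here is a rough sketch of the rewriting in code (Python, with formulas represented as nested tuples; the representation and names are my own). It implements the rules for the binary connectives; the flattening steps in D fall out of the set representation in S.

def to_clauses(f):
    # return the set-of-clauses form of the formula f
    return as_sets(push_negations(eliminate_conditionals(f)))

def eliminate_conditionals(f):                 # rule C:  φ → ψ  ⇒  ¬φ ∨ ψ
    if isinstance(f, str):
        return f
    op, *args = f
    args = [eliminate_conditionals(a) for a in args]
    if op == "imp":
        return ("or", ("not", args[0]), args[1])
    return (op, *args)

def push_negations(f):                         # rule N: drive ¬ inward, cancel ¬¬
    if isinstance(f, str):
        return f
    op, *args = f
    if op == "not" and not isinstance(args[0], str):
        gop, *gargs = args[0]
        if gop == "not":
            return push_negations(gargs[0])
        if gop == "and":
            return ("or", push_negations(("not", gargs[0])), push_negations(("not", gargs[1])))
        if gop == "or":
            return ("and", push_negations(("not", gargs[0])), push_negations(("not", gargs[1])))
    if op == "not":
        return f
    return (op, *[push_negations(a) for a in args])

def as_sets(f):                                # rules D and S: distribute ∨ over ∧, collect sets
    if isinstance(f, str) or f[0] == "not":
        return {frozenset({f})}                # a single unit clause
    op, a, b = f
    ca, cb = as_sets(a), as_sets(b)
    if op == "and":
        return ca | cb
    return {x | y for x in ca for y in cb}     # "or": join every clause of a with every clause of b

# For instance, (p → q) → (q → r) rewrites to the clauses {p, ¬q, r} and {¬q, r}
for clause in to_clauses(("imp", ("imp", "p", "q"), ("imp", "q", "r"))):
    print(set(clause))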


Consider, for example, the formula a ∧ (b → c). Based on the rewrite rules, the sets of clauses are {a}, {¬b, c}. Suppose we wanted to know if a follows from a ∧ (b → c). We can easily see that it does follow. The proof in classical logic consists in one application of the deduction rule called and-elimination.

	
    a ∧ (b → c) 
   ------------ ∧E
	 a   
	

Automated theorem-proving uses resolution to find the answer, but it does not find the answer by finding the proof in terms of and-elimination. Instead, it uses resolution in a refutation procedure. It rewrites the premise and the negation of the conclusion. If the empty clause {} is derivable using the resolution rule

         {φ1, ... , χ, ... , φm}
         {ψ1, ... , ¬χ, ... , ψn}
         ----------------------------------
         {φ1, ... , φm, ψ1, ..., ψn}

then the sets of clauses that come from rewriting the premises and the negation of the conclusion are inconsistent. In this example, the clauses are

{a}, {¬b, c}, {¬a}.

So it is easy to see that the empty clause is derivable.

	
	{a}
	{¬a}
       ------
	{ }
	

This proves that the conclusion a is a logical consequence of the premise a ∧ (b → c). The corresponding "refutation-style" proof in classical logic is

	
  a ∧ (b → c)
  ----------- ∧E
	 a             [¬a]1
	--------------------- ¬E
	         ⊥
	      ------- ¬I,1
	        ¬¬a
	       ------- ¬¬E
	          a          
	

To use resolution in automated theorem-proving, it is necessary to have a control procedure for the steps to determine whether the empty clause is derivable.
Consider, for example, the argument

          p
          p → q
          (p → q) → (q → r)
          ---------------------
          r

One way to construct a resolution proof that the conclusion is a logical consequence of the premises is

          1. {p}               Premise
          2. {¬p, q}         Premise
          3. {p, ¬q, r}      Premise      (Note that this premise is not a definite clause.)
          4. {¬q, r}          Premise
          5. {¬r}              Negation of the conclusion
          6. {q}               1, 2
          7. {r}                4, 6
          8. {}                 5, 7

This, however, is not the only way to apply the resolution rule. We could have first applied it to 2 and 3, and we could have done this in different ways. Logic programming was born out of reflection on the question of the control procedure in the case in which the clauses are definite clauses. The way a query is solved in logic programming incorporates one possible control procedure.
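For those who are interested, here is a rough sketch in Python of the most naive control procedure imaginable (the representation is mine, and this is emphatically not Prolog's strategy): resolve every pair of clauses against every other, over and over, until the empty clause appears or nothing new can be derived.

def complement(lit):
    return lit[1:] if lit.startswith("¬") else "¬" + lit

def resolvents(c1, c2):
    # every clause obtainable by resolving c1 with c2 on one complementary pair
    return {frozenset((c1 - {lit}) | (c2 - {complement(lit)}))
            for lit in c1 if complement(lit) in c2}

def refutable(clauses):
    # True if the empty clause is derivable from the given clauses
    clauses = set(clauses)
    while True:
        new = set()
        for c1 in clauses:
            for c2 in clauses:
                new |= resolvents(c1, c2)
        if frozenset() in new:
            return True
        if new <= clauses:                 # saturation without refutation
            return False
        clauses |= new

# the premises of the example together with the negation of the conclusion
clauses = [frozenset({"p"}), frozenset({"¬p", "q"}), frozenset({"p", "¬q", "r"}),
           frozenset({"¬q", "r"}), frozenset({"¬r"})]
print(refutable(clauses))                  # True: the empty clause is derivable

The point of the sketch is that this blind search does far more work than necessary; a control procedure decides which resolutions to perform and in what order.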


The language of logic: the first-order predicate calculus

The language of the propositional calculus is not very expressive. This is a problem because, in the model, the propositional calculus is the language in which the agent has his or her beliefs. This means that the contents of the agent's beliefs are too simple to be very realistic. Fortunately, this shortcoming is easily addressed if we use the language of the first-order predicate calculus.

The first-order predicate calculus is more expressive (and more complicated) than the propositional calculus. It allows for the representation of parts (names, predicates) of sentences. It also allows for the representation of quantity. So the description of the language, the models, and backward chaining will be correspondingly more complicated.

(Again, for this course, it is not necessary to understand all the details.)

The vocabulary of the first-order predicate calculus subdivides into two parts, a logical and a nonlogical part. The logical part is common to all first-order theories. It does not change. The nonlogical part varies from theory to theory. The logical part of the vocabulary consists of

• the connectives and quantifiers: ¬, ∧, ∨, →, ∀ (the universal quantifier), ∃ (the existential quantifier)
• the comma and the left and right parentheses: , ( )
• a denumerable list of variables: x1 x2 x3 x4. . .

The nonlogical part of the vocabulary consists of

• a denumerable list of constants: a1 a2 a3 a4. . .
• for each n, a denumerable list of n-place predicates:

P¹₁, P¹₂, P¹₃, . . .

P²₁, P²₂, P²₃, . . .

P³₁, P³₂, . . .
.
.
.

Given the vocabulary, a well-formed formula is defined inductively on the number of connectives. (A term is either a variable or a constant.)

• If Pⁿ is an n-place predicate and t1, ..., tn are terms, then Pⁿt1, ..., tn is a well-formed formula.
• If A and B are well-formed formulas, and v is a variable, then ¬A, (A ∧ B), (A ∨ B), (A → B), ∀vA, ∃vA are all well-formed formulas.
• Nothing else is a well-formed formula.

Models for the first-order predicate calculus are a formal representation of what the formulas are about. This allows for the statement of truth-conditions.

• A model is an ordered pair <D, F>, where D is a domain and F is an interpretation.

The domain D is a non-empty set. This set contains the things the formulas are about.

The interpretation, F, is a function on the non-logical vocabulary. It gives the meaning of this vocabulary relative to the domain.
For every constant c, F(c) is in D. F(c) is the referent of c in the model.
For every n-place predicate Pⁿ, F(Pⁿ) is a subset of Dⁿ. F(Pⁿ) is the extension of Pⁿ in the model.

• An assignment is a function from variables to elements of D. A v-variant of an assignment g is an assignment that agrees with g except possibly on v.
(Assignments are primarily technical devices. They are required to provide the truth conditions for the quantifiers, ∀ and ∃.)

The truth of a formula relative to a model M and an assignment g is defined inductively. The base case uses the function [ ]F,g on terms (composed from F and g), defined as follows:

[t]F,g = F(t) if t is a constant. Otherwise, [t]F,g = g(t) if t is a variable.

The clauses in the inductive definition of truth relative to M and g are as follows:

Pⁿt1, ..., tn is true relative to M and g iff <[t1]F,g, ..., [tn]F,g> is in F(Pⁿ).

¬A is true relative to M and g iff A is not true relative to M and g.
A ∧ B is true relative to M and g iff A and B are true relative to M and g.
A ∨ B is true relative to M and g iff A or B is true relative to M and g.
A → B is true relative to M and g iff A is not true relative to M and g or B is true relative to M and g.
∃vA is true relative to M and g iff A is true relative to M and g*, for some v-variant g* of g.
∀vA is true relative to M and g iff A is true relative to M and g*, for every v-variant g* of g.

A formula is true relative to a model M iff it is true relative to M for every assignment g.

A formula is first-order valid iff it is true relative to every model. ∀xFx → ∃xFx is an example. The double turnstile (⊨) is used to assert truth in all models. So ⊨ ∀xFx → ∃xFx means that ∀xFx → ∃xFx is true in every model. The truth of this formula does not depend on any particular D or F. It is true in every model.
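For those who are interested, here is a small sketch in Python of a finite model <D, F> and the inductive truth definition, with the quantifiers evaluated by running through the domain. The representation of formulas (nested tuples) and all the names are my own; the predicate is written P to avoid clashing with the interpretation F.

D = {1, 2, 3}                                    # the domain
F = {"a1": 1,                                    # a constant naming 1
     "P": {(1,), (2,)}}                          # a 1-place predicate true of 1 and 2

def denote(t, g):
    # the value of the term t relative to F and the assignment g
    return F[t] if t in F else g[t]

def true(formula, g):
    op = formula[0]
    if op == "pred":                             # e.g. ("pred", "P", "x")
        _, p, *terms = formula
        return tuple(denote(t, g) for t in terms) in F[p]
    if op == "not":
        return not true(formula[1], g)
    if op == "and":
        return true(formula[1], g) and true(formula[2], g)
    if op == "or":
        return true(formula[1], g) or true(formula[2], g)
    if op == "imp":
        return (not true(formula[1], g)) or true(formula[2], g)
    if op == "exists":                           # ("exists", "x", body): some x-variant works
        _, v, body = formula
        return any(true(body, {**g, v: d}) for d in D)
    if op == "all":                              # ("all", "x", body): every x-variant works
        _, v, body = formula
        return all(true(body, {**g, v: d}) for d in D)

# ∀x Px → ∃x Px comes out true in this model (as it does in every model)
f = ("imp", ("all", "x", ("pred", "P", "x")), ("exists", "x", ("pred", "P", "x")))
print(true(f, {}))                               # True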


An example in Prolog notation

An example stated in the Prolog notation helps show the relationship between the first-order predicate calculus and logic programming.

A variable (in Prolog) is a word starting with an upper-case letter. A constant is a word that starts with a lower-case letter, and so is a predicate symbol; constants and predicate symbols are distinguishable by their context in a knowledge base. An atomic formula has the form p(t1,...,tn), where p is a predicate symbol and each ti is a term.

Consider the following example based on the movie Pulp Fiction. (This example is from Representation and Inference for Natural Language: A First Course in Computational Semantics, Patrick Blackburn and Johan Bos.) In the example, various people love other people. Further, there is a rule defining jealousy. In Prolog notation

loves(vincent, mia).
loves(marcellus, mia).
loves(pumpkin, honey_bunny).
loves(honey_bunny, pumpkin).

jealous(X, Y) :- loves(X, Z), loves(Y, Z).

There is no significance to the blank line between the facts and the rule. It is there for readability. The rule is universally quantified. From a logical point of view, it is

∀X ∀Y ∀Z ( ( loves (X, Z) ∧ loves (Y, Z) ) → jealous (X, Y) )

(Note that this sentence is not a formula of Prolog or the first-order predicate calculus. It is a mixed form, meant to be suggestive.)

The facts and rules constitute a knowledge base. Among the facts, Vincent and Marcellus both love Mia. Pumpkin and Honey_bunny love each other. In the rule, the symbols X, Y, and Z are variables. The rule itself is general. It says that for every x, y, and z, x is jealous of y if x loves z and y loves z. Obviously, jealousy in the real world is different.

To express this knowledge base in the first-order predicate calculus, a key is necessary. The key specifies the meanings of the constants and predicates:

Vincent           a1
Marcellus        a2
Mia                 a3
Pumpkin         a4
Honey Bunny   a5

__ loves ___    P²₁

Given this key, it is possible to express the entries in the knowledge base. So, for example, the fact that Marcellus loves Mia

loves(marcellus, mia).

is expressed in the first-order predicate calculus (relative to the key) by the formula P²₁a2,a3.


Next consider the following query:

?- loves(mia, vincent).

The response is

false (or no, depending on the particular implementation of Prolog)


For an example of a slightly less trivial query, consider

?- jealous(marcellus, W).

This query asks whether Marcellus is jealous of someone, i.e., whether

∃W jealous(marcellus,W)

is true. Since Marcellus is jealous of Vincent (given the knowledge base), the response is

W = vincent


From a logical point of view, the query corresponds to (and the answer to the query is computed in terms of) the negative clause

¬jealous(marcellus,W)

This negative clause is read as its universal closure

∀W ¬jealous(marcellus,W)

which (by the equivalence of ∀ to ¬∃¬ in classical logic) is equivalent to

¬∃W jealous(marcellus,W)

The computation (to answer the query) corresponds to the attempt to refute the universal closure (in the context of the knowledge base or program) by trying to derive the empty clause. Given the KB in the example, the empty clause is derivable

KB ∪ {∀W ¬jealous(marcellus,W)} ⊢ ⊥

This means that

∃W jealous(marcellus,W)

is a consequence of the KB (or logic program). Moreover, the computation results in a witness to this existential truth. So the response to the initial query is

W = vincent


Here is how this looks running Prolog on my machine:

tom:arch [~/Desktop]
% swipl
Welcome to SWI-Prolog (threaded, 64 bits, version 7.6.4)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit http://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

?- consult('pulp_fiction.pl').
true.

?- loves(mia,vincent).
false.

?- jealous(marcellus,W).
W = vincent .

?- 
	  

The corresponding proof in the predicate calculus

Red marks the rule and two facts. Blue marks the negative clause corresponding to the query. The predicates "loves" and "jealous" are abbreviated.

   
 ∀x∀y∀z((l(x,z) ∧ l(y,z)) → j(x,y))
 ------------------------------------------------ ∀ E
 ∀y∀z((l(marcellus,z) ∧ l(y,z)) → j(marcellus,y))
---------------------------------------------------- ∀ E
 ∀z((l(marcellus,z) ∧ l(vincent,z)) → j(marcellus,vincent))       l(marcellus,mia)   l(vincent,mia)
----------------------------------------------------------- ∀ E   --------------------------------- ∧ I
 (l(marcellus,mia) ∧ l(vincent,mia)) → j(marcellus,vincent)       l(marcellus,mia) ∧ l(vincent,mia)      [∀x¬j(marcellus,x)] assumption 1
-------------------------------------------------------------------------------------------------- → E   ------------------------------- ∀ E
                                                             j(marcellus,vincent)                        ¬j(marcellus,vincent)
                                                             ----------------------------------------------------------------- ¬ E
                                                                                              ⊥
                                                                                           ----- ¬ I, discharge assumption 1                                       
                                                                                             ¬∀x¬j(marcellus,x)
                                                                                                          

Once again, the proof may be divided into three parts. The first part instantiates the rule defining jealousy to Marcellus, Vincent, and Mia. There is one step for each quantifier and person:

 
        
 ∀x∀y∀z((l(x,z) ∧ l(y,z)) → j(x,y))
------------------------------------ ∀ E
 ∀y∀z((l(marcellus,z) ∧ l(y,z)) → j(marcellus,y))
--------------------------------------------------- ∀ E
 ∀z((l(marcellus,z) ∧ l(vincent,z)) → j(marcellus,vincent))            
----------------------------------------------------------  ∀ E     
 (l(marcellus,mia) ∧ l(vincent,mia)) → j(marcellus,vincent) 
                                 .                             .                                                                                                                                                         

Given the facts that both Marcellus and Vincent love Mia, it follows that Marcellus is jealous of Vincent:


          
                                                                 l(marcellus,mia)   l(vincent,mia)
                                                                 ---------------------------------- ∧ I
 (l(marcellus,mia) ∧ l(vincent,mia)) → j(marcellus,vincent)      l(marcellus,mia) ∧ l(vincent,mia)             
--------------------------------------------------------------------------------------------------- → E          
                                               j(marcellus,vincent)                                               
                                                                    
                                                                                                             

Now, given the negative clause (that is the logical form of the query), it follows that there is someone such that Marcellus is jealous of him or her.
(This conclusion is expressed with ¬∀x¬, which is equivalent to ∃ in the context of classical logic.)

  
                                                                 [∀x¬j(marcellus,x)] assumption 1
                                                                 ------------------------------- ∀ E
                           j(marcellus,vincent)                  ¬j(marcellus,vincent)
                          ------------------------------------------------------------- ¬ E
                                                         ⊥
                                                      ----- ¬ I, discharge assumption 1                                       
                                                        ¬∀x¬j(marcellus,x)
 
 
 
                                                                                                            

Unification is part of the computation

The instantiation of variables is a complicating factor. Now the backward chaining procedure includes what is called unification.

Notice that in the "Pulp Fiction" example, the query

jealous(marcellus, W)

matches the head of no entry in the logic program. It can, however, be unified with the head of the rule defining jealousy.

Unification, in this way, is the process of finding a substitution that unifies two terms, i.e., makes them identical. A substitution is a replacement of variables by terms. A substitution σ has the following form

{V1/t1, ... , Vn/tn}, where each Vi is a variable and each ti is a term.

For a formula φ, φσ is the result of replacing every free occurrence of Vi in φ with ti. φσ is a substitution instance of φ.
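For those who are interested, here is a small sketch in Python of substitutions and unification for Prolog-style terms. The representation is mine (variables are strings that start with an upper-case letter; compound terms are tuples), and, like Prolog itself, the sketch omits refinements such as the occurs check.

def is_var(t):
    return isinstance(t, str) and t[0].isupper()

def apply(sub, t):
    # apply the substitution sub (a dict {variable: term}) throughout the term t
    if is_var(t):
        return apply(sub, sub[t]) if t in sub else t
    if isinstance(t, tuple):
        return tuple(apply(sub, x) for x in t)
    return t

def unify(t1, t2, sub=None):
    # return a substitution that makes t1 and t2 identical, or None if there is none
    sub = dict(sub or {})
    t1, t2 = apply(sub, t1), apply(sub, t2)
    if t1 == t2:
        return sub
    if is_var(t1):
        return {**sub, t1: t2}
    if is_var(t2):
        return {**sub, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):
            sub = unify(a, b, sub)
            if sub is None:
                return None
        return sub
    return None                                  # e.g. two different constants

# the head of the jealousy rule unified with the query jealous(marcellus, W)
print(unify(("jealous", "X", "Y"), ("jealous", "marcellus", "W")))
# {'X': 'marcellus', 'Y': 'W'}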


An example of backward chaining with unification

An example makes unification easier to understand. Consider the following blocks-world program:

on(b1,b2).
on(b3,b4).
on(b4,b5).
on(b5,b6).

above(X,Y) :- on(X,Y).
above(X,Y) :- on(X,Z), above(Z,Y).

An agent with this logic program as his or her KB thinks the world looks like this:

        b3
        b4
 b1     b5
 b2     b6
 

Now suppose the query is whether block b3 is above block b5

?- above(b3,b5).

The computation to answer (or solve) this query runs roughly as follows. The query does not match the head of any fact. Nor does it match the head of any rule. It is clear, though, that there is a substitution that unifies this query and the head of the first rule. The unifying substitution is

{X/b3, Y/b5}

This substitution produces

above(b3,b5) :- on(b3,b5).

So the derived query is

on(b3,b5).

This derived query fails. So now it is necessary to backtrack to see if another match is possible further down in the knowledge base. Another match is possible. The query can be made to match the head of the second rule. The unifying substitution is

{X/b3, Y/b5}

This produces

above(b3,b5) :- on(b3,Z), above(Z,b5).

The derived query is

on(b3,Z), above(Z,b5).

Now the question is whether the first conjunct in this query can be unified with anything in the knowledge base. It can. The unifying substitution for the first conjunct in the derived query is

{Z/b4}

The substitution has to be made throughout the derived query. So, given that the first conjunct has been made to match, the derived query becomes

above(b4,b5).

This can be made to match the head of the first rule for above. The unifying substitution is

{X/b4, Y/b5}

and the derived query is now

on(b4,b5).

This query matches one of the facts in the knowledge base. So the computation is a success!
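For those who are interested, here is a rough sketch in Python that pulls the pieces together: backward chaining with unification over the blocks-world program. The representation is mine, and it is an illustration of the idea, not SWI-Prolog's actual engine. It tries the entries from the top down, replaces a matched goal with the body of the matching clause (the derived query), and backtracks on failure.

import itertools

program = [
    (("on", "b1", "b2"), []),
    (("on", "b3", "b4"), []),
    (("on", "b4", "b5"), []),
    (("on", "b5", "b6"), []),
    (("above", "X", "Y"), [("on", "X", "Y")]),
    (("above", "X", "Y"), [("on", "X", "Z"), ("above", "Z", "Y")]),
]

def is_var(t):
    return isinstance(t, str) and t[0].isupper()

def apply(sub, t):
    if is_var(t):
        return apply(sub, sub[t]) if t in sub else t
    if isinstance(t, tuple):
        return tuple(apply(sub, x) for x in t)
    return t

def unify(t1, t2, sub):
    t1, t2 = apply(sub, t1), apply(sub, t2)
    if t1 == t2:
        return sub
    if is_var(t1):
        return {**sub, t1: t2}
    if is_var(t2):
        return {**sub, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):
            sub = unify(a, b, sub)
            if sub is None:
                return None
        return sub
    return None

counter = itertools.count()

def rename(term, mapping):
    # give a clause fresh variable names so they cannot clash with the query's
    if is_var(term):
        return mapping.setdefault(term, term + str(next(counter)))
    if isinstance(term, tuple):
        return tuple(rename(x, mapping) for x in term)
    return term

def solve(goals, sub):
    # try to solve the list of goals; return a substitution or None
    if not goals:
        return sub                                 # no derived queries left: success
    first, rest = goals[0], goals[1:]
    for head, body in program:                     # try the entries from the top down
        mapping = {}
        head = rename(head, mapping)
        body = [rename(b, mapping) for b in body]
        new_sub = unify(first, head, dict(sub))
        if new_sub is not None:
            result = solve(body + rest, new_sub)   # the body replaces the matched goal
            if result is not None:
                return result
    return None                                    # failure: backtrack

print(solve([("above", "b3", "b5")], {}) is not None)       # True: ?- above(b3,b5). succeeds
# a query with a variable (taken up in the next example) also yields a witness:
print(apply(solve([("above", "Block", "b5")], {}), ("above", "Block", "b5")))   # ('above', 'b4', 'b5')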


Here is a more abstract form of the successful computation that indicates the instances of resolution in an effort to derive the empty clause:


                      (rules and facts)                        (substitutions)


    ¬above(b3,b5)     above(X,Y) :- on(X,Z), above(Z,Y)         {X/b3, Y/b5}
    
                      above(X,Y) ∨ ¬on(X,Z) ∨ ¬above(Z,Y)
        \                     
         \                   /
          \                 /
           \               /
            \             /
             \           /
    ¬on(b3,Z) ∨ ¬above(Z,b5)            on(b3,b4)               {Z/b4}
               \                           /
                \                         /
                 \                       / 
                  \                     /
                   \                   /
                    \                 /
                     \               /
                      \             /
                       \           /
                      ¬above(b4,b5)    above(X,Y) :- on(X,Y)     {X/b4, Y/b5}
                      
                                       above(X,Y) ∨ ¬on(X,Y)
                         \                   /
                          \                 /
                           \               / 
                            \             /
                             \           /
                              \         /
                               \       /
                                \     /
                             ¬on(b4,b5)       on(b4,b5)
                                  \                /
                                   \              /
                                    \            / 
                                     \          /
                                      \        /
                                       \      /
                                        \    /
                                         \  /
                                          ⊥


Another example of backward chaining with unification

Consider another query to the same logic program. The question this time is whether there is a block above block b5

?- above(Block,b5).      (A variable in Prolog is a word starting with an upper-case letter.)

It is clear that this query can be made to match the head of the first rule for above. The unifying substitution is

{X/Block, Y/b5}.

The derived query is

on(Block,b5).

This can be made to match one of the facts. The unifying substitution is

{Block/b4}.

Again, the computation is a success! There is something above block b5, and the substitution provides the witness:

Block = b4


                        (rules and facts)                (substitutions)


¬above(Block,b5)        above(X,Y) :- on(X,Y)            {X/Block, Y/b5}

			above(X,Y) ∨ ¬on(X,Y) 
        \                 /
         \               /
          \             /
           \           /
            \         /
             \       /
        ¬on(Block,b5)    on(b4,b5)                        {Block/b4}
               \             /
                \           /
                 \         / 
                  \       /
                   \     /
                    \   /
                     \ /
                      ⊥



What we have accomplished in this lecture

We looked at the relation between logic and logic programming. We saw that if a query is successful, then the query is a logical consequence of premises taken from the logic program. To see this, we considered the connection between the backward chaining process in computing a successful query and the underlying proof in the propositional and first-order predicate calculus.







