The seminal paper on CSP is [Mackworth, 1977]. In this paper he describes the central concepts of CSP and gives the basic consistency algorithms described below. [Mackworth and Freuder, 1985] and, later, [Mohr and Henderson, 1986] improved on the basic algorithms. [Freuder, 1978] introduces solution synthesis and [Tsang and Foster, 1990] present improved methods. [Tsang, 1993] is an indispensable resource for anyone interested in CSP.
The simple Constraint Satisfaction Problem presented in the introduction is a good example of the pitfalls of uninformed backtracking. Figure 2 displays the ever-present fact-of-life inherent in uninformed search. In that problem, seven combinations of A,B and C were tested before one was found that met the input constraints. If all answers to the problem were required, an exhaustive search (18 combinations in this case) would be required. All of this despite the fact that no search was required at all! By applying the basic principles of CSPs, the single correct answer falls out without search. This is not to say that search is never required; in most problems it is. But for many types of problems, using CSP techniques can drastically reduce the amount of processing needed.
DEFINITIONS
The following terminology will be used throughout this paper.
A CSP consists of a set of variables, also called
nodes . Each variable can take on a value taken from a
set of values, called its domain. A variable's domain will
always be presented between curly braces { }. An assignment of a
value to a variable will be written <A,2>, meaning variable A has
value 2. There are two types of
constraints. Unary constraints restrict the domain of a
variable without reference to any other variable. For instance,
[A > 3] is a unary constraint for variable A. It will be assumed that
there is one (or zero) unary constraint per variable.
Binary constraints
restrict the values a variable can take by comparing
it to another variable. For instance, [A < B] is a binary constraint
between A and B. Actual constraints will always be presented between
straight brackets [ ]. They will also be represented as follows:
is a unary constraint for A,
is a binary constraint between
the arc AB. An arc is defined to be any pair of variables for
which there is a binary constraint. It is assumed there is only one binary
constraint per arc. In general, capital letters at the beginning of
the alphabet (A,B,C ...) will represent variables, while those at the
end (X,Y,Z) will represent unknown values.
A solution to a CSP is an assignment to each variable of a value taken
from that variable's domain, such that all the unary and binary
constraints are satisfied. A complete solution is the set of all such
solutions for a CSP. A solution will be represented as a set of
values enclosed within parentheses, such as (1,2,4,0), where 1 is the
value of the first variable (generally A in these abstract examples),
2 is the value of the second variable, etc.. Alternatively, a
solution sometimes is represented more explicitly as a set of
variable-value assignments such as (<A,1>,<B,2>,<C,4>,<D,0>).
A set of solutions will
be a list of solutions enclosed in curly brackets:
{(1,2,4,0),(1,2,4,1), ... }. Partial solutions will be represented
similarly, with the assignment of values to variables determined by
context. The fact that a partial solution satisfies a given
constraint will be represented (X,Y) SAT
for binary
constraints,
and X SAT
for unary constraints.
Figure 6: Constraint Satisfaction Problem - Consistency
Figure 6A is a slightly more complex CSP. A few more possible values have been added to the domains of each variable to more clearly demonstrate the types of consistency discussed below. Unary constraints are also included. The three central ``consistency'' checks used for CSPs are node consistency, arc consistency and path consistency.
Node consistency (NC) is simply a state where the domains of each variable have been reduced to the set of all possible values that satisfy the unary constraints. In Figure 6A-B, the results of node consistency processing are displayed. For instance, the values 11 and 15 are removed from the domain of C since they do not satisfy C's unary constraint that C < 10.
Node consistency can often be assumed, either because there are no unary constraints, or because the unary constraints were used in setting up the problem. For instance, in the Mikrokosmos semantic analyzer, the unary constraints correspond to picking the correct lexicon entries based on the root word and surrounding syntax.
A simple algorithm for enforcing node consistency is as follows:
1 PROCEDURE NC-12 FOR each variable, V
3
<-- unary constraint for V
4 IF ¯
is null ;; no unary constraints
5 THEN do nothing
6 ELSE
7 FOR each value in V's domain, X
8 IF X SAT
![]()
9 THEN do nothing
10 ELSE remove X from the domain of V
This algorithm has time-complexity O(an), where a is the maximum
size of the domains and n is the number of variables to be examined.
In all that follows, our primary goal will be to reduce the time
complexity with respect to n, the number of
variables.
Thus, in this
respect, NC-1 is linear in time.
Arc consistency (AC) ensures that the binary constraints connecting any two nodes are satisfied. There have been various implementations of AC. AC-1, AC-2 and AC-3 are presented in [Mackworth 1977]. [Mohr and Henderson 1986] present the optimal AC-4 algorithm. The naive approach, AC-1, is given below:
1 PROCEDURE AC-12 NC-1 ;; ensure NC first
3 REPEAT
4 Changed <-- false
5 FOR each arc, AB
6
<-- binary constraint between A and B
7 FOR each value in the domain of A, X
8 IF there is a value in the domain of B, Y
such that (X,Y) SAT
![]()
9 THEN do nothing
10 ELSE
11 REMOVE X from the domain of A
12 Changed <-- true
13 UNTIL NOT Changed
This algorithm cycles through each potential arc
to
determine if for
each value in the domain of the first variable there exists a value in
the domain of the second variable that can satisfy the binary
constraint on that arc. If a value is found for which this is not
true, it is removed from the domain of the first variable. If any
values are removed, the process must be started again from the
beginning, since the removal might cause some arc-inconsistency in an
arc that was already tested.
In Figure 6B-C, the arc-consistency results are shown. For example, in Figure 6B, the domain of B is {0,1,3,4}. When the arc AB is tested with the binary constraint [A > B] and the value 0 for variable A is tested, it is discovered that no values of B remain such that [0 > B]. Thus 0 must be removed from the domain of A. Note that this will impact on the testing of arc CA. When the value 1 is tested for C, no values for A remain in its revised domain {1,2,3} such that [A < 1]. Thus 1 will need to be removed from the domain of C. This demonstrates the necessity of testing all arcs again after any value is removed, since if arc CA is tested before AB, the value of 1 for C is still consistent.
If a is the maximum size of the domains, e is the number of
arcs and n is the number of variables, then there will be at most
na values to be checked. If, in the worst case, one value is deleted
for each iteration in the UNTIL loop, then the loop will be run
na times. The first FOR loop (line 5) can be run a maximum e times
per iteration of the REPEAT loop. The second FOR loop (line 7) can be called a
maximum of a times per iteration of the first FOR loop, and the
IF statement in line 8 may involve a maximum
of a searches through the domain of B per each iteration of the
second FOR loop. Thus, the complexity of the second FOR loop is
O(
), the complexity of the first FOR loop is
O(e
), and the complexity of the entire procedure is
O(ne
). In unconstrained graphs, e can be at most
. However, in computational semantics, typically, e is
proportional to linear n. Thus AC-1 typically has time-complexity
O(
), which is not linear with respect to the number of
variables.
Various improvements to AC-1 have been offered. AC-4
[Mohr and Henderson, 1986] is the optimal algorithm, offering time
complexity of O(e
). In AC-1, whenever a value is removed
from a variable's domain, every arc has to be rechecked. AC-4
recognizes the fact that a value in a variable's domain
``supports'' an identifiable list of values in other variables. For
instance, in Figure 6C, the value 0 in B ``supports'' the value 1 in
A. The constraint [A > B] is enabled by this support. Remove the
value 0 from B's domain and the value 1 in A is no longer supported.
Therefore, if a value is deleted in one variable, only the values in
other variables that are supported by the deleted value must be
checked. Furthermore, the number of ``supports'' for a certain value
can be tracked. Examining Figure 6C again, the two values in B,
0 and 1, both support the value 2 in A. This will be
represented by saying SUPPORTS(B,0) = {(A,2,
),...}
and
SUPPORTS(B,1) = {<A,2>,...}. The fact that the value 2 in A has two
values in B supporting
is represented as
SUPPORT(A,2,
) = 2.
If one of the supporting values in B was removed,
SUPPORT(A,2,
) = 1 would still be supported by one
value. Thus, by recording how many supports each value-constraint
pair has initially, and by subtracting one from that total each time a
supporting value is removed, it can be determined when a
value-constraint pair no longer can be supported. Whenever a
value-constraint pair has no supports, the value must be deleted.
An algorithm for AC-4 follows:
1 PROCEDURE AC-42 NC-1 ;; ensure node consistency first
;; Step I: INITIALIZATION
3 INITIALIZE array SUPPORT[V,X,AB] = 0
4 INITIALIZE SUPPORTS[V,X] = nil
5 INITIALIZE FAIL-LIST = nil
;; Step II: SETTING UP SUPPORT VARIABLES
6 FOR each arc, AB
7 FOR each value X in variable A
8 FOR each value Y in variable B
9 IF ¯ (A,B) SAT
![]()
10 THEN
11 INCREMENT(SUPPORT[A,X,
])
;; increment the number of supporters for X
12 APPEND(SUPPORTS[B,Y],(A,X,
))
;; record that Y supports X
13 IF ¯ (SUPPORT[A,X,
] = 0)
;; X has no supports for this arc
14 THEN
15 APPEND(FAIL-LIST,(A,X))
16 REMOVE X from the domain of A
;; Step III: REMOVING UNSUPPORTED VALUES
17 WHILE FAIL-LIST <> { }
18 PICK first label (A,X) from FAIL-LIST
19 FAIL-LIST = FAIL-LIST - (A,X)
20 LABELS-SUPPORTED <-- SUPPORTS[A,X]
21 FOR each label (B,Y,
) in LABELS-SUPPORTED
;; determine if the supported value has any other supports left
22 DECREMENT(SUPPORT[B,Y,
])
23 IF ¯ ((SUPPORT[B,Y,
] = 0)
AND NOT-MEMBER((B,Y),FAIL-LIST) ;; not pending failure already
AND MEMBER(Y,DOMAIN(B))) ;; not already failed
24 APPEND(FAIL-LIST,(B,Y))
25 REMOVE Y from the domain of B
The INITIALIZATION stage is trivial, and, practically speaking, can be
combined with step II. In step II, if the number of arcs is e, the maximum
number of values in a variable is a, then it is easy to see that
the inner-most IF statement (line 9) is executed
times.
In step III, in
the worst case, the WHILE loop will be executed once for each value.
If the number of variables is n, then there can be at most
an values to be deleted. Finally, there can be at most a
labels to be examined in the inner FOR loop (line 21), so that the total
complexity of step III is O(
n).
Note that n, while proportional to e, is always less than
or equal to it in connected graphs. Therefore, the total complexity
of AC-4 is O(
n +
e) = O(
e). Again, in computational semantics
e is typically proportional to n; thus, thus the time
complexity for AC-4 is typically O(
n).
Because space becomes an issue in AC-4, the space complexity will
also be analyzed. The space complexity is dominated by the SUPPORTS
array. There can be at most one entry in SUPPORTS for every
variable-value pair, or an entries. Each entry can support
every other variable-value pair, again an. Since there can be
an entries, each of can hold up to an bits of information,
the total complexity is O(![]()
). Since e is
equivalent to
, this is the same as O(
e), which
is what is reported in the literature. Again, it should be pointed
out that in computational semantics, instead of every variable-value pair supporting every
other variable-value pair ( an), typically it is more on the line of
ca, where c is a constant. Thus,
implementations of AC-4 for natural language semantics
typically have a space complexity of O(
n), which is
linear with respect to the number of variable.
Note that Figure 6C still contains some value combinations that theoretically could be removed ahead of time. For instance, the partial solution (B,C) = (1,2) is never possible. When B = 1 is assigned, A must be constrained to {2,3} to meet the [A > B] constraint. However, with C = 2, the [A < C] constraint could no longer be satisfied. Path consistency is an attempt to eliminate impossible partial solutions.
Path consistency requires that for any path
(A,B,C,...M), then for any value assignment
A = X and M = Y, there must exist a value assignment for each of
the variables B,C,...,L such that all binary constraints on
adjacent variables are satisfied. In Figure 6C, the path in
question is (B A C). When we assign B = 1 and C = 2, there is no
value X for A left such that (X,B) SAT
AND
(X,C) SAT
.
Path consistency (PC) algorithms can be found in [Tsang, 1993] and [Mohr
and Hendersen, 1986]. The optimal algorithm runs with time complexity
O(![]()
) and space complexity O(![]()
).
Obviously, these complexities are much higher than those for AC-4 and
NC-1.
Fortunately, PC algorithms are not necessary. A dynamic application
of AC algorithms performs the same function. To illustrate, assume
that a pre-processor performs AC-4 on Figure 6A, yielding the CSP in
Figure 6C. Next, submit the CSP to a AC-enabled search algorithm.
Such an algorithm would choose one value for B, say B = 1. This, in
effect, creates an ``artificial island'',
with the domain of B = {1}. Since we
have removed the value 0 from the domain of B, the AC-enabled search
can dynamically eliminate values in A and C which are supported only
by B = 0. This would remove A = 1. After removing A = 1, the
AC-enabled search would also eliminate C = 2. Thus, a dynamic
AC-enabled search algorithm performs the same function as PC
algorithms. The algorithm for an AC-enabled algorithm will be given
below.