# P vs NP Problem

The P versus NP problem, formulated independently by Stephen and Leonid Levin, has been one of the most important scientific questions posed to date. Simply stated, the P vs NP question asks if there exists a polynomial solution to any of the problems shown to be NP-Complete. Cook's Theorem states that any problem that can be solved in polynomial-time by a non-deterministic Turing machine can be reduced (in polynomial-time) to the problem of determining whether a Boolean formula is satisfiable.

Over the past several decades researchers have been trying to prove the P \neq NP conjecture by determining whether or not there is a polynomial solution to any of the known NP-Complete problems. If indeed we were to discover that one of these NP-Complete problems was in P, then by the process of polynomial-reduction we will have managed to solve every NP problem polynomially. Cook states that although a practical algorithm for solving an NP-Complete problem would have devastating consequences for cryptography, it would also have stunning implications of a more positive nature.

Though still unproven, the general consensus is that P \neq NP. Undoubtedly, even if this was true, the consequences of showing that every NP-Hard problem important to science and industry is susceptible to a polynomial-time algorithm is difficult to imagine, since this might yield many of the practical benefits that could be expected in a world in which P \approx NP. Markedly, Constraint Satisfaction research has provided various conditions that have shown to be sufficient to ensure tractability.

# Constraint Satisfaction

Pioneered in the 1970's, the Constraint Satisfaction Problem (CSP) describes a general framework for problems in which values must be assigned to a set of variables subject to specific constraints, and is one of the most prominent research areas in Theoretical Computer Science and Artificial Intelligence, to name only a few.

Widely studied in academia for several decades, research in this field is beginning to successfully penetrate mainstream industry. With the advent of the modern computer contributing to the significant growth of this field, Constraints research is being applied to numerous Operational Research problems, including timetabling, location, scheduling, car sequencing, cutting stock, vehicle routing and rostering. In particular, one type of CSP, Propositional Satisfiability (SAT), has been adopted by the semiconductor industry as a methodology (Formal Verification) for testing the design of Integrated Circuits (Chips).

# Propositional Satisfiability

Given a propositional formula on a set of Boolean variables, SAT asks whether or not there exists an assignment to the set of variables such that the formula evaluates to True. SAT-based algorithms have been embraced by industry because they help ease a real `customer pain', namely the Chip testing bottleneck. As a potential consequence of this `market force' SAT-algorithms have evolved relatively independently from the field of Constraint Satisfaction. Though, in recent years cross-fertilisation has begun between these two research areas.

The Satisfiability Problem in propositional logic is the quintessential NP-Complete problem, and is a particularly important type of CSP. Over the past decade research in this field has improved dramaitcally, giving rise to \SAT-Solvers that can solve instances with thousands and even millions of variables.

# The Path to Satisfaction: Polynomial Algorithms for SAT

#### Abstract

Twenty years ago the bottleneck preventing us from solving many practical problems lay in the limited capacity and performance of computers. However, modern computers are now capable of storing a huge amount of data and processing it at very high-speed. The bottleneck nowadays is in the efficiency and effectiveness of the algorithms manipulating the data. Two fields of Computer Science addressing this issue --- Constraint Satisfaction (CSP) and Propositional Satisfiability (SAT) --- have developed relatively independently over the past half century, it is only in recent years that the mutual benefits of these fields have started to be explored.

One method used to explore the relationship between these fields is to study the mappings between the two core problem domains. After introducing the relevant background and providing a detailed survey of SAT and CSP encodings, my in-depth analysis of these encodings uncovers three categories. Utilising the ideas presented in this thesis, and cross-fertilising these two fields, I provide a framework for current and future SAT and CSP encodings to be developed. As a result of this framework I define a new encoding and I demonstrate that in certain circumstances this encoding maybe preferable to other encodings. This new encoding also opens up the potential for additional CSP to SAT encodings.

After categorising SAT and CSP encodings, the focus of this research shifts to characterising them. Some empirical work on comparing algorithmic performance on instances with different solution-density has been performed, and it has been shown that this measure can be used to choose between using stochastic and branching algorithms. I provide a complete characterisation of SAT to CSP encodings and I prove that some encodings result in problems with higher solution-densities than others. Since it has been shown that solution-density is an important factor in determining the solubility of an instance, my work provides a guide to assist in choosing one encoding over another.

A large amount of theoretical analysis has been published comparing stochastic and branching SAT and CSP algorithmic techniques on SAT and CSP encodings, however, very little research has compared the effect of enforcing local-consistency algorithms in each of these domains. After introducing the main algorithmic techniques from the SAT and Constraint communities, I use the graph-theoretic framework to reconfirm and strengthen the equivalence between the SAT-based and CSP-based proof methods. I provide a complete comparison between local-consistency techniques on each of the SAT to CSP encodings. One result of this work has direct practical implications regarding the choice of encoding; that enforcing local-consistency on one type of encoded problem does more work than on other encoded instances. I also prove that enforcing certain levels of local-consistency on some types of encoded SAT instances does zero work if each clause has distinct literals, which explains the empirical results presented in this thesis.

In addition, I bridge the CSP and SAT techniques further by introducing the concept of Extended-Consistency, thus providing a more complete picture of the Frege, SAT and CSP proof-systems. Extended proof-systems are some of the most powerful systems known, which allows the introduction of auxiliary variables to maintain a constant arity. Relatively little practical work has been done on extended proof-systems, and typically SAT and CSP algorithms that employ additional variables tend to apply them in an `ad-hoc' manner. The motivation of defining Extended-Consistency is to allow synergies between these two extended proof-systems to be cultivated and explored.

I identify two key problematic aspects of empirical studies of encodings. First, typically one type of problem is used as a benchmark to compare encodings. Second, either stochastic or backtracking algorithms are applied to the encoded problems. Clearly both of these choices may bias the results, since an encoding/algorithm may `favour' a particular problem. I address these issues by applying a local-consistency algorithm to a wide variety of unsatisfiable problems. As a result of applying this algorithm to CSP encoded SAT instances my experiments strongly indicate that the choice of encoding has a dramatic effect on reducing the search-space. I demonstrate that enforcing a small local-level of consistency on problems using a certain encoding does not solve any of the SAT benchmarks. This is in stark contrast to problems encoded using another encoding, where I show that enforcing such a low-level of consistency can solve a large number of `hard' SAT benchmark families. In particular, I show that converting SAT instances to CSP and applying local-consistency can not only solve many `hard' SAT instances, but this technique can even compete with state-of-the-art SAT-Solvers.

The results presented in this thesis are part of a research program aimed at bridging the gap between Propositional Satisfiability and Constraint Satisfaction. Hence, the main aim of this thesis is to strengthen this relationship and to capitalise on synergies between these two fields. The broader aim is to develop a better understanding of Computer Science as part of a wider goal, which aims to benefit the scientific and industrial communities by increase the theoretical understanding of the complexity and tractability of natural problems, and improve practical algorithms that can be applying it to pertinent scientific and industrial problems.