SAT Competition 2013
affiliated with the SAT 2013 conference, July 8-12 in Helsinki, Finland
jointly organized by Ulm University, University of Helsinki,
University College Dublin, University of Texas at Austin,

Certified UNSAT


We are accepting 3 different UNSAT certificate types. The first format trace-check (stc.c) by Armin Biere is based on resolution. The second format DRUP-check (drup-check.c) by Marijn Heule and Nathan Wetzler is based on reverse unit propagation (RUP). The third format BDRUP by Marijn Heule is based on a conversion to the DRUP format.

The EDACC verifier which is used in the competition accepts all these formats. All solver outputs should be on stdout, which is redirected in EDACC to file, which is then passed to the verifier.
The output of the solver though should conform to the following rules:

  • Comment lines are to be prepended by "c "
  • Solution line "s SATISFIABLE" or "S UNSATISFIABLE"
  • In case of unsat answer the certification format that will follow in the form
    "o proof [format]" where format can be one of DRUP, BDRUP or TC (for trace check).
  • The proof as specified by each individual format
The solution line can appear anywhere in the output of the solver, though
- the proof type should appear before the proof (which should be no problem to print at the beginning of the output)
- the solution "v ..." in case of SAT should appear after the solution line "s SATISFIABLE" as it is the case in the original specification

Certified UNSAT solver output example


c this is a solver output example
c computing ....
o proof DRUP
   1  2  0
d  1  2 -3 0
   1  0
d  1  2  0
d  1  3  4 0
d  1 -2 -4 0
   2  0
   0
s UNSATISFIABLE

Usage of the EDACC verifier

Compile by running "make" (requires g++)
Usage:
./certifiedSAT <instance> <solveroutput>

Example:
Run your solver and redirect stdout of the solver to a file <solveroutput>. e.g.:
./solver test.cnf > solution.out

verify the ouput with
./certifiedSAT test.cnf solution.out

Formats

Below, a brief description of the two formats based on an example formula. More details about trace-check and DRUP can be found here and here. The spacing in the examples is to improve readability. Consider the following formula in the DIMACS format.

p cnf 4 8
 1  2 -3 0
-1 -2  3 0
 2  3 -4 0
-2 -3  4 0
 1  3  4 0
-1 -3 -4 0
-1  2  4 0
 1 -2 -4 0

A resolution proof for this formula in the trace-check format is:

 1  1  2 -3 0 0
 2 -1 -2  3 0 0
 3  2  3 -4 0 0
 4 -2 -3  4 0 0
 5  1  3  4 0 0
 6 -1 -3 -4 0 0
 7 -1  2  4 0 0
 8  1 -2 -4 0 0
 9  1  2  0 1 3 5 0
10  1  0  9 4 5 8 0
11  2  0  9 3 6 7 0
12  0 10 11 2 4 6 0

The first 8 lines show the 8 input clauses. Input clauses can be recognized by the double 0 at the end of the line. In the first column, the number of each clause is shown. The clauses 9 to 12 are resolvents. The literals of each resolvent are listed between its number and the first 0. The clauses required to construct the resolvent are listed between the first and the second 0.

The same proof in the RUP format is:

1 2 0
1 0
2 0
0

RUP proofs are a sequence of clauses that are redundant with respect to the input formula. The check that a clause C is redundant, all literals l in C are assigned to false followed by unit propagation. In order to verify redundancy, unit propagation should result in a conflict. The redundant clause is added to the formula in case the check is successful. The conflict clauses clauses produced by conflict-driven clause learning (CDCL) solvers have the RUP property. Therefore, a sequence of all conflict clauses (in the order of learning) is a RUP proof for unsatisfiable formulas.

One important disadvantage of checking RUP proofs is the cost to verify a proof. To counter this disadvantage, we propose the DRUP format (delete reverse unit propagation). The DRUP format extends the RUP format by allowing it to express clause elimination information within the proof format.

   1  2  0
d  1  2 -3 0
   1  0
d  1  2  0
d  1  3  4 0
d  1 -2 -4 0
   2  0
   0

Apart from the redundant RUP clauses, a DRUP proof may contain lines with a 'd' prefix. These lines refer to clauses (original or learned) that can be deleted without influencing the proof checking. In the above example, all the deleted clauses are subsumed by added RUP clauses. In practice, however, the clauses that one wants to include in a DRUP proof are the clauses that are removed while reducing the (learned) clause database.

BDRUP

Participants can also use the binary decision-tree reverse unit propagation (BDRUP) format. The BDRUP format was developed to express refutations of lookahead SAT solvers efficiently. The format contains three types of lines: 1) new branch decisions, 2) locally learned unit clauses (mostly failed literals) and binary clauses (mostly hyper binary resolvents), and 3) termination of a branch. A new branch is a line with a b prefix followed by the branch literal. Locally learned clauses are expressed in the DIMACS format (a list of integers ending with a zero). A terminated branch is expressed as a line with only a zero. Proofs in the BDRUP format can easily be converted into DRUP (delete reverse unit propagation) proofs.

Consider the following example BDRUP proof

b 1
b 2
  3 4 0
  5 0
  4 6 0
  0
  3 0
  0
  0

This proof will be converted into the following DRUP proof

  1 2 3 4 0
  1 2 5 0
  1 2 4 6 0
  1 2 0
d 1 2 4 6 0
d 1 2 5 0
d 1 2 3 4 0
  1 3 0
  1 0
d 1 3 0
d 1 2 0
  0

Format comparision

The main advantages and disadvantages of the proof formats are as follows:

Resolution (trace-check):

  • Checking resolution proofs is fast, i.e., the time limit should not be a problem;
  • Resolution proofs can be huge, i.e., may exceed the 100 Gb space restriction;
  • Modifying a SAT solver to emit a resolution proof is hard.
Reverse unit propagation (DRUP-check):
  • Checking RUP proofs can be costly, i.e., may not be completed within the timeout;
  • RUP proofs are typically "small", i.e., the 100 Gb space restriction should not be a problem;
  • Modifying a SAT solver to emit a resolution proof is easy (typically only 10 lines of code).