Overview of the competition

Last update: November, 29 2007.

Please note that some details might be updated before the competition starts. Changes will be clearly identified. Please check that web page regularily. Important update will be announced on SAT Live! Please send your comments to SATCompetition@satlive.org

Competition platform

Hardware

The competition will run on one cluster of 92 nodes provided by CRIL. The nodes are all identical: Bi-Xeon 2.8 GHz with 2GB RAM.

Software

All the nodes are using one flavor of Red-Hat Linux.

Linux master0 2.6.9-22.EL.rootsmp #1 SMP Mon Oct 3 08:59:52 CEST 2005 x86_64 x86_64 x86_64 GNU/Linux

For C and C++ source submissions GCC version 3.4.5 will be used for compilation. No other versions are supported. You will need to 'port' your source code to be compilable under version 3.4.5 of GCC prior to submission.

gcc (GCC) 3.4.5 20051201 (Red Hat 3.4.5-2)

We require that the compilation flag '-static' is used and further recommend '-O3 -fno-strict-aliasing'.

No platform specific options are allowed, such as '-mcpu=i686' or '-march=i686', but we welcome additional options, e.g. '-DNDEBUG', which should precisely be communicated to the organizers or set in the compilation script or Makefile that is included in the sources.

The SAT competition is now using some rules that made CASC sucessful.

Changes in the input format

This year, we will remove holes in variable numbering. As a consequence, the "p cnf nbvar nbclause" line will reflect exactly the number variables in the formula (numbered from 1 to nbvar) and the exact number of clauses.

Note that the order of the variable will not be affected.

Benchmarks submission rules

Benchmarks submitters are invited to provide a detailed description of the benchmarks, including:

complete description of the initial problem
description of the SAT encoding
expected behavior of the solvers on those benchmarks

Solvers submission rules

There are two options to participate in the SAT Competition: by participating in the competition itself or by participating in the system demonstration.

In both cases, a 2 page description of the solvers describing the techniques used, technical issues and the expected behavior of the solver (e.g., does it only solve satisfiable formulas) should be submitted together with the solver.

A paper describing the solver might simultaneously be submitted to the SAT 2007 conference, unless a published description already exists. However, submission of a paper is independent of submission of a solver and should follow those guidelines.

The organizers of the competiton are not allowed to participate to the competition division. They are allowed to participate to the demonstration division.

Competition division

Solvers in the competition division will be tested in identical execution environments, rated, and compared with each other. There will be winners in various categories.

To enter the competition division, the SAT solver must be submitted in source and with a build script (normally Makefile) that allows the organizers to easily build the SAT solver from source into a statically linked binary (flag -static). Solvers written in C or C++ will be compiled using the latest version of GCC using UsualOptimisations. Solvers written in Java will run on the latest JVM (1.6) using the options specified by the authors. For solvers written in another language, or if you need a specific compiler, please Contact The Organisers.

The authors must allow the SAT competition organizers to release the source code of their solver for research purpose on its web site or must provide a URL to that source code.

Rationale: Every material submitted to the SAT competition will be available for research purposes after the competition.

Demonstration division

The demonstration division is intended to provide greater flexibility for exposure of solvers when the authors cannot or wish not to comply with the requirements for the competition division.

Solvers in the demonstration division might not be tested in identical execution environments; they will not be rated or compared with other solvers. There will not be winners. Their test results simply will be recorded and publicized.

To enter the demonstration division, the SAT solver normally should be submitted either as source, with the same guidelines as the competition division, or as a statically linked binary for linux using UsualOptimisations and flag -static.

The authors normally should allow the SAT competition to release the binary of their solver for research purpose on its web site or should provide a URL to that source code. If source is also submitted in the demonstration division, it is not released without the authors' explicit permission.

Entries for which the authors do not wish to release either source or binaries for research purposes may be accepted in the demonstration division under special circumstances, at the discretion of the organizers, if it appears that participation will benefit the research community. One possibility is that tests will be run remotely with communication over the internet, at the authors' site. Please ContactTheOrganisers to make special arrangements.

Technical issues

Randomness and pseudorandomness

Randomness is not permitted. Each solver should produce repeatable results when run on the same input formula with the same conditions of time and space availability. Pseudorandomness is perfectly acceptable if the program controls it by specifying a random seed whose calculation is repeatable under the same conditions as just mentioned.

Recommended Optimisations

For C/C++:

-O3 -NDEBUG -static -fno-strict-aliasing

For Java:

java -server -Xms<SATRAM> -Xmx<SATRAM>

Who are the winners

Award categories

A Competition category is the combination of one Solver category and one Benchmark category.

Solver categories: SAT, UNSAT, SAT+UNSAT

Benchmark categories: Industrial, Hand_crafted, Random.

Series fall within Benchmark categories. Problems fall within Series. "Problems" are synonomous with "Formulas".

OPEN PROBLEMS may be accepted as though they are UNSAT at the discretion of the organizers -- primarily, they should have some kind of theoretical interest, not just be difficult. OPEN PROBLEMS might become a Series.

For each Competition category, we award the solvers with the three highest scores (bronze, silver, gold medals).

Computing scores

Each solver's score within its Competition category will be based on the number of problems solved within the time budgets, the amount of time used, and the number of series in which the solver solved at least one problem. The scoring rationale is discussed in the next section.

Some generalities:

Solvers receive various credits for successes. Credits are computed by dividing purses for various achievements among those solvers that achieved the requirement for the purse. Shares of all purses are computed to sufficient accuracy to break ties correctly. Purses come in three types: solution, speed, and series.

There are no penalties for terminating abnormally, or giving up by answering UNKNOWN. However, an incorrect claim of SAT or UNSAT normally eliminates the solver from the competition and any shares of purses it has won are redistributed.

Tentative scoring plans:

Each problem has a solution purse, which is divided equally among all competition solvers that solve the problem. The standard solution purse is 1000 points. Normally, all problems in a Competition category will have the standard solution purse, but there might be exceptions. If no solver solves a certain problem, its solution purse is not distributed.

Each problem has a speed purse, which is divided unequally among all competition solvers that solve the problem. The speed purse will be a fixed multiple of the solution purse for all problems in the entire competition; it will give a weighting between solving and speed. If no solver solves a certain problem, its speed purse is not distributed.

The formula to divide the speed purse of a problem is the following, where p is problem id and s and i are solver ids and times are in seconds.

speedFactor(p, i) = timeLimit(p) / ( 1 + timeUsed(p, i ) )

where i solved p. It is 0 if i did not solve p.

speedAward(p, s) = speedPurse(p) * speedFactor(p, s) / sum_i ( speedFactor(p, i) )

Thus, the division is pro rata by speedFactor.

Each series has a series purse, which is divided equally among all competition solvers that solve at least one problem in the series. If no solver solves any problem in a certain series, its series purse is not distributed.

Normally, all series in a Competition category will have the same series purse, which will be a fixed multiple of the standard solution purse, for all series in the entire competition, but there might be exceptions. The series purses reward breadth of application.

For SAT 2005, all series containing 5 or more benchmark instances have the same series purse, which is a fixed multiple (serM ) of the standard solution purse. (Recall that scoring is separately applied within each combination of category and specialty, e.g., SAT within RANDOM, or SAT+UNSAT within CRAFTED.) All series containing 4 or fewer benchmarks have the same series purse, which is a fixed multiple (serM / 3) of the standard solution purse. The coefficients and multiples for SAT 2005 were: stdP = 1000.0; spdM = 1.0; serM = 3.0.

Scoring Rationale

Each problem (benchmark) has a "prize" for solving it, or a "purse" as they say in horse racing. The standard purse is 1000 points.

If k solvers solve the problem within the time limit, they split the purse k ways.

This provides a natural way to give more credit for solving "hard" problems without introducing arbitrary weights to measure "hardness" -- the winners simply split the purse among themselves.

A state-of-the-art contributor (SOTAC) solver in the CASC terminology is one that solved some problem that no one else solved. That is an automatic 1000 and is likely to be worth more than being great at "stealing candy from babies", i.e., knocking off all the easy problems. If 2 solvers out of 40 are able to solve a problem, that should be more significant than if 30 / 40 solved it, but for SOTAC there is no difference.

For more details about the scoring scheme, see Allen Van Gelder's SAT 2005 purse poster.