SAT Competition 2011
A competitive event of the SAT 2011 Conference
June 19th - June 22nd 2011, Ann Arbor, MI, USA
Last modification: June 30, 2011.

Results available

The competition results are now available.

What's new this year?

There are several new features in the SAT competition this year:
New Hardware
The competition will run on a new cluster at CRIL, composed of nodes with two quad-core Xeon processors and 32 GB of RAM. The operating system is CentOS 5.4, x86_64 (kernel 2.6.18, glibc 2.5). The competition is run in two stages. During the first stage, solvers will be allocated 7 GB of memory; each run of a sequential solver will be allocated 2 cores, and each run of a parallel solver 4 cores. At any given time, a node will therefore be running either 4 runs of a sequential solver (2 per processor) or 2 runs of a parallel solver (1 per processor); two different solvers will never run concurrently on the same node. For the second stage, on whose results the awards are based, only one sequential solver will run per processor (2 solvers per node) and only one parallel solver per node (1 solver per node), with a memory limit of 15 GB of RAM.
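As a hedged illustration only (not the organizers' actual harness), limits of this kind can be enforced on a POSIX system before launching a solver. The 7 GB figure comes from the description above; the solver path, benchmark name, and CPU limit below are placeholders.

    import resource
    import subprocess

    MEM_LIMIT = 7 * 1024**3   # first-stage memory limit: 7 GB, in bytes
    CPU_LIMIT = 5000          # CPU-seconds; placeholder value

    def apply_limits():
        # Runs in the child process just before the solver is executed.
        resource.setrlimit(resource.RLIMIT_AS, (MEM_LIMIT, MEM_LIMIT))
        resource.setrlimit(resource.RLIMIT_CPU, (CPU_LIMIT, CPU_LIMIT))

    # "./solver" and "benchmark.cnf" are placeholders.
    subprocess.run(["./solver", "benchmark.cnf"], preexec_fn=apply_limits)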
Sequential/Parallel Neutrality
This year, there is no special track dedicated to sequential or parallel solvers. Sequential and parallel solvers compete in a single competition, but with two different rankings. The first ranking is based on wall clock time and promotes solvers that use all available resources to give an answer as quickly as possible. The second ranking is based on CPU time and promotes solvers that use resources as efficiently as possible; this is the ranking used in previous competitions. In the wall-clock-based ranking, the timeout is imposed on wall clock time; in the CPU-based ranking, it is imposed on CPU time. Parallel solvers are expected to perform better in the first ranking, and sequential solvers in the second.
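To make the two measures concrete, here is a minimal sketch (the solver path and benchmark name are placeholders) that records both the wall clock time and the CPU time of a solver run:

    import resource
    import subprocess
    import time

    start = time.monotonic()
    subprocess.run(["./solver", "benchmark.cnf"])   # placeholder command
    wall_time = time.monotonic() - start

    # CPU time (user + system) consumed by terminated child processes.
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    cpu_time = usage.ru_utime + usage.ru_stime

    # A parallel solver keeping 4 cores busy shows cpu_time close to
    # 4 * wall_time; a sequential solver shows cpu_time ~ wall_time.
    print(f"wall clock: {wall_time:.1f}s, CPU: {cpu_time:.1f}s")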
New Award Categories
The competition will award the fastest SAT solvers in terms of both wall clock time and CPU time. The most innovative ("non-CDCL") SAT solver will be awarded a special "Moshe Vardi Prize".
Choose Your Category
Unlike previous competitions, in which all solvers were run on all benchmarks, submitters are now asked, in order to save computational resources, to select the category or categories (application, crafted, random) in which their solver will compete. The most efficient solvers (selected by the jury) will still compete in every category during the second stage.
Minimally Unsatisfiable Subset (MUS) Special Track
Due to the success of MUS techniques in various applications (especially as core engines in MaxSAT solvers), a special track for MUS systems will be organized for the first time.
Data Analysis Track
Since there are many different ways to analyze the results of the competition, the Data Analysis Track will offer anyone the possibility to run their own analysis of the competition and have it published both on the competition web site and as a poster during the SAT conference. This track is an opportunity to experiment with different ranking schemes, as well as to analyze the strengths and weaknesses of both solvers and benchmarks. Contributors will have to submit a program that will be run by the organizers on anonymized results, as sketched below.
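As a purely illustrative sketch of such a program, the following computes a simple solved-count ranking. The file name and column names are our assumptions, not the competition's actual data format.

    import csv
    from collections import Counter

    # Hypothetical input: one row per (solver, benchmark) run, with
    # columns "solver", "answer" and "cpu_time". Rank solvers by number
    # of instances solved, breaking ties by total CPU time.
    solved, cpu = Counter(), Counter()
    with open("anonymized_results.csv") as f:       # placeholder name
        for row in csv.DictReader(f):
            if row["answer"] in ("SAT", "UNSAT"):
                solved[row["solver"]] += 1
                cpu[row["solver"]] += float(row["cpu_time"])

    ranking = sorted(solved, key=lambda s: (-solved[s], cpu[s]))
    for rank, solver in enumerate(ranking, 1):
        print(rank, solver, solved[solver], f"{cpu[solver]:.0f}s")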

Competition tracks

Here is a quick overview of the competition. See the detailed rules for complete details.

Main track

Rules at a glance: Selection of winners is based on a two-stage ranking; for the second stage, the maximum allowed CPU time is increased. Benchmarks are expressed in CNF (DIMACS format) and partitioned into three categories: Applications, Crafted, and Random. In each category, there are three specializations: SAT, UNSAT, and SAT+UNSAT, with one winner for each category and specialization. Anonymized result data will be used in judging the winner of each category. To be eligible to enter the competition, solvers must be made available in their original source code form for research purposes; solvers for which only binaries are made available will be accepted for demonstration only. The board of judges has the final say on eligibility and on the winners of each category.
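For reference, a DIMACS CNF benchmark consists of comment lines, a "p cnf <variables> <clauses>" header, and clauses given as zero-terminated lists of signed integers. A minimal parser sketch (the format is standard; the function name is ours):

    def parse_dimacs(path):
        # Parse a DIMACS CNF file into a list of clauses, where each
        # clause is a list of non-zero integer literals. Example input:
        #   c sample formula
        #   p cnf 3 2
        #   1 -3 0
        #   2 3 -1 0
        clauses, clause = [], []
        with open(path) as f:
            for line in f:
                if line.startswith(("c", "p")):
                    continue  # skip comments and the problem line
                for token in line.split():
                    literal = int(token)
                    if literal == 0:      # 0 terminates the clause
                        clauses.append(clause)
                        clause = []
                    else:
                        clause.append(literal)
        return clauses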

Minisat hack track

This track, originally introduced for the SAT 2009 competition, is organized again. This year the latest version of Minisat (2.2.0) is used as the base solver.

The motivation for this track is twofold. On the one hand, we want to lower the threshold for Master's and PhD students to enter the SAT competition by bringing new and innovative ideas to the original Minisat solver. On the other hand, the aim is to see how far the performance of Minisat can be improved by making only minor changes to the source code. We strongly encourage regular participants in the main track (including non-students) with a "hacked" Minisat solver to also participate in this track.

Our motivation for using Minisat as the base solver: Minisat is a well-known and widely adopted solver that can be seen as a canonical CDCL solver, offering an efficient (and necessary) CDCL basis and easily modifiable code. Indeed, Minisat has often been improved by introducing only minor changes to its code ("hacking it"). Furthermore, looking back at recent progress in CDCL solvers, most of it is based on minor changes (fewer than 10 lines each): phase caching, blocked literals, and the Luby series for rapid restarts.
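As one concrete example, the Luby restart sequence mentioned above (1, 1, 2, 1, 1, 2, 4, 1, 1, 2, ...) fits in a few lines. A sketch:

    def luby(i):
        # i-th term (1-indexed) of the Luby sequence:
        #   t_i = 2^(k-1)               if i == 2^k - 1
        #   t_i = t_(i - 2^(k-1) + 1)   if 2^(k-1) <= i < 2^k - 1
        k = i.bit_length()
        while i != (1 << k) - 1:
            i -= (1 << (k - 1)) - 1
            k = i.bit_length()
        return 1 << (k - 1)

    # A Minisat-style solver would restart run n after roughly
    # base * luby(n) conflicts, e.g. with base = 100.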

The motivation behind competing with minor hacks is that it can help the community detect which types of modifications pay off: to isolate the actual causes of improvement when modifying solvers, it is necessary to perform scientifically valid experiments on the effects of individual changes made to a solver (e.g., individual new techniques introduced to the base solver).

Depending on the number of submissions in this track, the organizers may decide to use a shorter time limit or a smaller set of benchmarks than in the main competition track. Actual awards will not be given in this track, but we hope that the whole community will benefit from analyzing the results and the proposed hacks. However, the best solvers identified in this track will automatically enter the main competition to be compared on the same basis as other solvers, with the possibility of being awarded in the main track.

Many thanks to Niklas Eén and Niklas Sörensson for allowing us to conduct this special track.

Certified Unsat Track

In the certified unsat track, the competing SAT and QBF solvers produce verification output and are tested only on unsatisfiable (for SAT) or invalid (for QBF) benchmarks. We are also looking for verification tools to be submitted.

Because of the experimental nature of this track, there are no prizes and no official ranking system. See the poster shown at SAT 2007 for that year's track. We expect 2011 to be conducted similarly. There are several accepted proof formats, each with a precise specification. These formats are designed to be as neutral as possible with respect to the SAT solver's methodology.

For QBF, there is a new format, called QIR, which encodes a Q-resolution proof in a way that is easier to verify than existing formats and may be more useful for debugging, because it is easier for humans to analyze. QIR is also suitable for SAT, by considering all variables to be existential.

Submitted verification tools do not need to accept an "official" format natively, if we can set up a simple translation script. Similarly, submitted solvers might produce output in a different form, which can be post-processed into an "official" format.

All formats may be found at the ProofChecker web page, but we expect the most popular for SAT to be the RUP format.

The purpose of the RUP proof format is to allow a solver to "retro-fit" some verification output without too much trouble, and to allow for some possibility of derivation rules other than resolution. The short story is that solvers in the vein of grasp, chaff, berkmin, minisat, and descendants can simply output their conflict clauses (the final one being the empty clause).

Also, we believe that look-ahead solvers will be able to output a RUP proof without great difficulty. Note that it is not necessary to actually use all the clauses you output, but if the redundancy is too great, you might hit file size or time limits. Practical experience is needed to see what can be achieved.
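To make the RUP idea concrete, here is a deliberately naive checker sketch: a proof clause is accepted if asserting its negation and running unit propagation over the formula plus the previously accepted clauses yields a conflict. No performance claims are made; a real checker would use watched literals and stream the proof from disk.

    def unit_propagate(clauses, assignment):
        # Naive unit propagation; `assignment` is a set of literals
        # assumed true. Returns True iff a conflict is derived.
        changed = True
        while changed:
            changed = False
            for clause in clauses:
                if any(lit in assignment for lit in clause):
                    continue  # clause already satisfied
                unassigned = [lit for lit in clause if -lit not in assignment]
                if not unassigned:
                    return True  # every literal is false: conflict
                if len(unassigned) == 1:
                    assignment.add(unassigned[0])  # propagate the unit
                    changed = True
        return False

    def check_rup(formula, proof):
        clauses = list(formula)
        for clause in proof:
            if not unit_propagate(clauses, {-lit for lit in clause}):
                return False  # clause does not follow by unit propagation
            clauses.append(clause)
        return bool(proof) and proof[-1] == []  # must end with the empty clause

    # (x1) & (-x1 | x2) & (-x2) is unsatisfiable; the one-clause proof
    # consisting of the empty clause is a valid RUP refutation of it.
    print(check_rup([[1], [-1, 2], [-2]], [[]]))  # True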

Minimally Unsatisfiable Subset Track

Under the MUS track, two different subtracks are organized:

Data Analysis track

The goal of this track is to allow anyone to provide the community with new insights on the competition results, during the SAT conference.

Additional competitions

Two other competitive events are organized as part of the SAT 2011 conference:

Submission Procedure

Both benchmarks and solver submissions will be managed through the SAT competition framework after registration.

Organizers' and judges' submissions

MD5 hash of the organizers' and judges' submissions:

Important dates

Judges

Organizers

For general questions on the competition, please contact the main track organizers at organizers@satcompetition.org.

Main and Minisat Hack tracks:
Minimally Unsatisfiable Subset track:
Certified Unsat track: