Results available
The competition results are now available.
What's new this year?
There are several new features in the SAT competition this year:- New Hardware
- The competition will run on a new cluster at CRIL, composed of nodes with two Bi-Xeon Quad core processors and 32 GB of RAM. The operating system is CentOS 5.4, x86_64 (kernel 2.6.18, glibc 2.5). The competition is run in two stages. During the first stage, solvers will be allocated 7GB of memory. Each instance of a sequential solver will be allocated 2 cores, each instance of a parallel solver will be allocated 4 cores. This means that at one time, a node will be running either 4 runs of a sequential solver (2 per processor), or 2 runs of a parallel solver (1 per processor). Two different solvers will not run concurrently on a node. For the second stage - based on the results of which the best solvers will be awarded - only one sequential solver will be launched on each processor (2 solvers per node) and only one parallel solver will be launched on a node (1 solver per node), with a memory limit of 15GB of RAM.
- Sequential/Parallel Neutrality
- This year, there is no special track dedicated to sequential or parallel solvers. Sequential and parallel solvers are grouped into one single competition, but with two different rankings. The first ranking is based on wall clock time and will promote solvers which use all available resources to give an answer as quickly as possible. The second ranking is based on CPU time and will promote solvers which use resources as efficiently as possible. This latter ranking is the one that was used in the previous competitions. In the wall clock based ranking, timeout will be imposed on the wall clock time. In the CPU based ranking, timeout will be imposed on CPU time. It is expected that parallel solvers will perform better in the first ranking while sequential solvers will perform better in the second ranking.
- New Award Categories
- The competition will award both the fastest SAT solvers in terms of wall-clock time and in terms of CPU time. The most innovative ("non CDCL") SAT solver will be awarded a special "Moshe Vardi Prize".
- Choose Your Category
- Unlike the previous competitions, in which all solvers where run on all benchmarks, in order to save computational resources this time submitters are asked to select in which category/-ies (application, crafted, random) their solver will compete. The most efficient solvers (selected by the jury) will still compete in every category during the second stage.
- Minimally Unsatisfiable Subset (MUS) Special Track
- Due to the success of MUS techniques on various applications (especially as core engines in MAXSAT solvers), a special track for MUS systems will be organized for the first time.
- Data Analysis Track
- Since there are many different ways to analyze the results of the competition, the Data Analysis Track will offer to anyone the possibility to run its own analysis of the competition and have it published both on the competition web site and as a poster during the SAT conference. This track is an opportunity to experiment different ranking schemes, as well as analyze the strengths and weaknesses of both solvers and benchmarks. Contributors will have to submit a program that will be run by the organizers on anonymized results.
Competition tracks
Here is a quick view of the competition. See detailed rules for complete details.
Main track
Rules at a glance: Selection of winners is based on a two-stage ranking. For the second stage, the maximum allowed CPU time is increased. Benchmarks are expressed in CNF (Dimacs format), and partitioned in three categories: Applications, Crafted, and Random. In each category, there are three specializations: SAT, UNSAT and SAT+UNSAT. There will be one winner for each category and specialization. Anonymized result data will be used in judging the winner of each category. To be eligible to enter the competition, solvers must be available in their original source code format for research purposes. Solvers for which only binaries are made available will be accepted only for demonstration. The board of judges has the final say in making decisions on eligibility and on the winners of each category.
Minisat hack track
This track, originally introduced for the SAT 2009 competition, is organized again. This year the latest version of Minisat (2.2.0) is used as the base solver.
The motivation of this track is twofold. On the one hand, we want to lower the threshold for Master's and PhD students to enter the SAT competition by bringing new and innovative ideas to the original Minisat solver. On the other hand, the aim is to see how far the performance of Minisat can be improved by making only minor changes to the source code. We strongly encourage usual participants to the main track (including non-students) with "hacked" Minisat solver to participate in this track.
Our motivation behind using Minisat as the base solver: Minisat is a well-known and widely adopted solver that can be seen as a canonical CDCL solver, offering an efficient (and necessary) CDCL basis and easily modifiable code. Indeed, Minisat has been often improved by introducing only minor changes to its code ("hacking it"). Furthermore, looking back at recent progresses in CDCL solvers, most of them are based on minor changes (<10 lines each): phase caching, blocked literals, and Luby series for rapid restarts.
The motivation behind competing with minor hacks is that it can help the community to detect which types of modifications pay off: in order to isolate the actual causes of improvements when modifying solvers, it is necessary to perform scientifically valid experiments on the effects of individual changes made to a solver (e.g., individual new techniques introduced to the base solver).
Depending on the number of submissions in this track, the organizers may decide to use a shorter time limit or a smaller set of benchmarks than in the main competition track. Actual awards will not be given in this track, but we hope that the whole community will benefit from analyzing the results and the proposed hacks. However, the best solvers identified in this track will automatically enter the main competition to be compared on the same basis as other solvers, with the possibility of being awarded in the main track.
Many thanks to Niklas Eén and Niklas Sörensson for allowing us to conduct this special track.
Certified Unsat Track
In the certified unsat track, the competing SAT and QBF solvers produce some verification output and are only tested on unsatisfiable (for SAT) or invalid (for QBF) benchmarks. We are also looking for verification tools to be submitted.
Because of the experimental nature of this track, there are no prizes and no official ranking system. See the poster shown at SAT 2007 for that year's track. We expect 2011 to be conducted similarly. There are several accepted proof formats, each with precise specifications. These formats are designed to be neutral to the SAT solver's methodology as far as possible.
For QBF, there is a new format, called QIR, which is an encoding for a Q-resolution proof that is easier to verify than existing formats, and might be more useful for debugging because it is easier for humans to analyze it. QIR is also suitable for SAT by considering all variables to be existential.
Submitted verification tools do not need to accept an "official" format natively, if we can set up a simple translation script. Similarly, submitted solvers might produce output in a different form, which can be post-processed into an "official" format.
All formats may be found at the ProofChecker web page, but we expect the most popular for SAT to be the RUP format.
The purpose of the RUP proof format is to allow a solver to "retro-fit" some verification output without too much trouble, and to allow for some possibility of derivation rules other than resolution. The short story is that solvers in the vein of grasp, chaff, berkmin, minisat, and descendants can simply output their conflict clauses (the final one being the empty clause).
Also, we believe that look-ahead solvers will be able to output a RUP proof without great difficulty. Notice that it is not necessary to actually use all the clauses you output, but if the redundancy is too great, you might hit filesize limits or time limits. Practical experience is needed to see what can be achieved.
Minimally Unsatisfiable Subset Track
Under the MUS track, two different subtracks are organized:- Plain MUS Track
Solvers have to identify as quickly as possible a minimal unsatisfiable subset of clauses, that is, a subset of clauses which is unsatisfiable and whose subsets are all satisfiable.
- Group-oriented MUS Track
In this track, clauses are partitioned into groups (for example, clauses which encode one gate of a circuit-level description may form a group). Solvers must identify a minimally unsatisfiable subset of these groups.
Data Analysis track
The goal of this track is to allow anyone to provide the community with new insights on the competition results, during the SAT conference.
- Contributors must submit programs (analyzers) that take as input the anonymized results and generate any data they want. These programs may also read the competition benchmarks. These analyzers will be run by the organizers, in order to make such an analysis as neutral as possible. It is expected that these analyzers will not require significant CPU resources.
- Each analysis is published on the web site of the competition, with the other results. Contributors will also have the opportunity to present their analysis as a poster during the SAT conference. The analyzers and a description of the analysis should be published too, in order to be available to the whole community;
- Organizers can choose to present the most relevant analysis during the presentation at the SAT conference
- Contributors will have the opportunity to test their analyzer on the 2009 results.
- The deadline for submitting an analyzer will be April, 4th.
- Since this is the first time we organize such a track, we will do our best to accomodate the needs of submitters. People interested by this track should contact the organizers to check their requirements.
Additional competitions
Two other competitive events are organized as part of the SAT 2011 conference:Submission Procedure
Both benchmarks and solver submissions will be managed through the SAT competition framework after registration.
- Organizers and judges that want to participate must
publish a MD5 hash number of the archive of their source code
(Competition division) or their binary (demonstration division) or of a precise description of their submission
strictly before February 6, 2011. The version entering the contest must be the same. The goal is
to prove publicly that organizers submitted their solver before having
access to any solver or benchmark submitted to the competition.
- Submission of solvers, benchmarks and their descriptions will be managed through the registration site which is now opened. Contestants must first obtain a login/password on this site. This login will also allow them to access the intermediate competition results. Then, contestants have the opportunity to upload solvers, benchmarks and their description (preferably as PDF files) on the site. Until the submission deadline, they have the opportunity to update their submission.
- In each track, an author may not submit more than 3 different sequential solvers and no more than 3 parallel solvers. Two solvers are different as soon as their sources (or binaries) differ or as soon as the command line arguments used for running the solver or the compilation options used to compile it are different.
- Detailed instructions for preparing benchmark submissions are available here.
- In each category, the final set of benchmarks is selected accordingly to strict rules. When needed, random selection of benchmarks is done accordingly to an initial random seed number composed with each judge random seed. Chance of selection of a benchmark depends on its hardness (in each category, this hardness is evaluated by the respective performances of the top 3 solvers of the 2009 competition). The entire process of selection can be enterily reproduced by following the same rules, with the same initial random number seed. Series of benchmarks (same generator, same BMC problem with different bounds, ...) are proposed by organizers, with the help of submitters. They are validated by judges, based on benchmarks descriptions only (not on benchmarks themselves).
Organizers' and judges' submissions
MD5 hash of the organizers' and judges' submissions:- f72932eb16684ee853cfb1f2afe2ba7f sat4j-mus-v20110206.jar (D. Le Berre, solver binary)
- c497f12e2c0bfdd55cb0e90753f11d75 org.sat4j.core.jar (D. Le Berre, solver binary)
- 2a44b64d102fd127cb22562d07f4274d muser.bz2 (J. Marques Silva, solver binary)
- a12e32960624463bb1934039fff35cbf solver1.tex (O. Roussel, description of a solver)
- ccf3680ef9f3e2909f0a9b87d026e6b3 solver2.tex (O. Roussel, description of a solver)
- 645f4c8dda7608af929225b163078de1 solver1-minisathack.pdf (A. Sabharwal, description of a solver)
- 7def65d9245316d8f25505970498d88d solver2-maintrack.pdf (A. Sabharwal et al, description of a solver)
Important dates
- Organizers' solvers submitted and available by February 6, 2011
- Registration and submission opens on February 1, 2011
- Benchmark submission deadline: February 13, 2011
- Solver submission deadline: March 2, 2011 (main track and minisat hack track), March 20, 2011 (MUS tracks)
- Data analysis submission deadline: April 23, 2011
- Certified Unsat Track submission deadline: May 15, 2011 (extended)
- First stage results by the end of April 2011
- Final results disclosed at the SAT 2011 conference
Judges
- Uwe Egly, TU Vienna
- Alexander Nadel, Intel Haifa
- Ashish Sabharwal, IBM Watson Research Center
- Moshe Vardi, Rice University
Organizers
For general questions on the competition, please contact the main track organizers at organizers@satcompetition.org .
Main and Minisat Hack tracks:- Matti Järvisalo, University of Helsinki
- Daniel Le Berre, Université d'Artois
- Olivier Roussel, Université d'Artois
- Matti Järvisalo, University of Helsinki
- Daniel Le Berre, Université d'Artois
- Joao Marques-Silva, University College Dublin
- Olivier Roussel, Université d'Artois
- Allen Van Gelder, University of California at Santa Cruz