Next: Benchmark Submission
Up: The rules
Previous: Checking results and outputs:
From a practical point of view, the competition ran in several
steps, going from March to May 2002. The initial step was exclusively
for authors: a machine was opened over the web to allow them to
compile/test their sources code in ``realistic environment''. After
that, the competition began:
- compliance testing:
- solvers: each of them was compiled and tested on a few
benchmarks to check that the solvers conformed input/output
requirements of SAT-Ex framework. During that step, some bugs (in
the usual sense) were detected and reported to authors. But note that
it was not the aim of that step. Some fixed version were accepted.
- benchmarks: at the same time, new submitted benchmarks
were shuffled (literal renaming, clauses reordering). Comment line
were also removed. Some benchmarks were discarded because of
incorrect syntactical format.
- first round: all solvers ran on all ``correct''
benchmarks during 40 minutes (see section 2.4 for
the computer description). We first ran all the solvers on industrial
benchmarks, then handmade benchmarks and finally randomly generated
ones (this last ones were run for 20 minutes only, on faster machines).
In this step, the launching heuristic was applied,
and, according to it, each complete solver was launched on each
applicable benchmark one time. Randomized solvers (incomplete or not)
were launched 3 times on each applicable benchmarks, on industrial
and hand-made benchmarks only. To take these 3 executions into
account, the median CPU-time was taken and the instance was solved if
at least one execution solved it (that means that a randomized solver
can solve a particular instance and be charged of the maximum
cpu-time, if only one of the three launches has succeeded).
- second round: the top five solvers ran on a part of the
remaining unsolved instances during 6 hours. If a solver returned
an incorrect result (typically, UNSAT instead of SAT) in the first
round, then it was not qualified for this stage (even
``unofficially'').
Next: Benchmark Submission
Up: The rules
Previous: Checking results and outputs:
LE BERRE Daniel
2002-09-16