diff --git a/benchmarks.shtml b/benchmarks.shtml
index db7a59f..3c34913 100644
--- a/benchmarks.shtml
+++ b/benchmarks.shtml
@@ -50,19 +50,169 @@ Released for free under a Creative Commons Attribution 3.0 License

Benchmarks

+

Availability

-  The latest release of the SMT-LIB benchmark library is hosted on Zenodo under the SMT-LIB community.
+  The SMT-LIB initiative provides a large collection of SMT-LIB benchmarks.
+  These benchmarks are used for research on SMT solving and by the SMT
+  Competition.
+

+

+  The latest release of the SMT-LIB benchmark library is hosted on Zenodo
+  under the SMT-LIB community. All previous releases from 2023 onward are
+  also available there.

-
-  StarExec
-
-  Historical releases of the benchmark library, from 2013 to 2023, are available on StarExec.
+
+  Some benchmarks that are not compliant with the SMT-LIB standard, such as
+  those that use theories that are not yet specified, are available on
+  GitHub.

-
-  Submitting Benchmarks
-
-  For details on how to submit benchmarks to SMT-LIB, please refer to the README file in
-  the benchmark-submission
-  SMT-LIB GitHub repository.
+

+  Benchmark Catalog
+
+  To make the benchmark collection more accessible, we provide an SQLite
+  database that collects benchmark metadata, including past competition
+  results. This database and its documentation can also be found on
+  Zenodo.

+

+  StarExec
+
+  Historical releases of the benchmark library, from 2013 to 2023, were
+  available on StarExec.

+

+  Submitting Benchmarks

+

+  The SMT-LIB benchmark collection would not exist without the countless
+  benchmarks submitted by users and developers of SMT solvers.
+  We welcome all kinds of benchmarks, from individual benchmarks that
+  test semantic edge cases to large benchmark sets generated from
+  verification problems.

+

+  For details on how to submit benchmarks to SMT-LIB, please refer to the
+  README file in the benchmark-submission repository of the SMT-LIB
+  GitHub organization.

+

+  Archive Structure
+
+  On Zenodo, benchmarks are grouped in two collections: non-incremental
+  and incremental benchmarks. Incremental benchmarks contain more than
+  one call to check-sat. After extracting a benchmark archive, the
+  benchmarks are stored in folders with the structure

+
+[non-]incremental/<logic>/<set-name>/.../<name>.smt2
+

+  Non-incremental benchmarks are always separated from the incremental
+  benchmarks, even if a set of benchmarks mostly consists of incremental
+  benchmarks.
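+
+  For example, a hypothetical benchmark could be stored at
+
+non-incremental/QF_UFLIA/20161231-Sledgehammer-Samplescientist/goedel/problem1.smt2
+
+  where all names are illustrative, borrowed from the fantasy header
+  example below.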

+

+  The benchmarks are separated according to their logic.

+

+  The <set-name> is either
+  <date>-<application>-<submitter>,
+  <date>-<application>, or
+  <date>-<submitter>. The
+  <date> is written as YYYYMMDD.
+  The application or submitter strings are typically chosen by the
+  person submitting the benchmarks.
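+
+  Using the fictional tool and submitter from the fantasy header example
+  below, the three forms might look like
+
+20161231-Sledgehammer-Samplescientist
+20161231-Sledgehammer
+20161231-Samplescientist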

+

+  Benchmarks can be nested within a deeper directory structure below
+  the set directory. This nesting should represent the structure
+  of the benchmarks and is application dependent.

+

+  Benchmarks are always valid SMT-LIB2 files. They contain at least
+  one check-sat command and end with an exit command.
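+
+  A minimal sketch of a well-formed non-incremental benchmark body; the
+  declaration and assertion are invented for illustration, and the
+  metadata header is omitted:
+
+(set-logic QF_LIA)
+(declare-const x Int)
+(assert (> (+ x 1) x))  ; an arbitrary example assertion
+(check-sat)             ; at least one check-sat command
+(exit)                  ; every benchmark ends with exit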

+

+  Benchmarks contain one set-info :status command for each
+  check-sat command. In non-incremental benchmarks, the
+  set-info :status command is part of the header. In
+  incremental benchmarks, each set-info :status command is
+  placed on the line just above the corresponding
+  check-sat command.
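+
+  A minimal sketch of this placement in an incremental benchmark; the
+  logic, declarations, and assertions are invented for illustration, and
+  header fields other than :status are omitted:
+
+(set-logic QF_LIA)
+(declare-const x Int)
+(push 1)
+(assert (> x 0))
+(set-info :status sat)    ; status for the first check-sat
+(check-sat)
+(pop 1)
+(assert (and (< x 0) (> x 0)))
+(set-info :status unsat)  ; status for the second check-sat
+(check-sat)
+(exit)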

+

+  Benchmark Header

+

+  All benchmarks contain a metadata header.
+  For new benchmarks, the header must have the following structure; for
+  older benchmarks, some fields may be missing.

+

+(set-info :smt-lib-version <version>)
+(set-logic <logic>)
+(set-info :source |<description>|)
+(set-info :license "https://creativecommons.org/licenses/by/4.0/")
+(set-info :category <category>)
+(set-info :status <status>)
+      
+

+  where:

+

+

+  Most benchmarks are distributed under the Creative Commons
+  Attribution 4.0 International License, but contributors can specify
+  their own license in the benchmark itself using the
+  (set-info :license "license string") command.

+

+  The <description> field is used to provide general
+  information about the benchmark. The first lines are of the form
+  <field>: <data>, where the
+  <field> values are:

+  Generated by: the name(s) of those who generated the benchmark;
+  Generated on: the generation date, in the format YYYY-MM-DD;
+  Generator: the tool that generated the benchmark (if any);
+  Application: the general goal;
+  Target solver: the solvers that were initially used to check the benchmarks;
+  Publications: references to related publications;
+  Time limit: the typical time limit in seconds imposed by the intended
+  application of the benchmark.

+

+  Fields without meaningful values can be omitted. For example, a
+  manually written benchmark will usually not have a
+  Generator field.

+

+  After these structured lines, the <description>
+  field can contain additional information as free text.
+  We encourage benchmark contributors to use this free text to describe
+  characteristics of the benchmark that might be of interest to solver
+  authors: for example, observed performance discrepancies between solver
+  configurations, or structures in the benchmark that are particularly
+  difficult.

+

+  A fantasy example of a typical header:

+

+(set-info :smt-lib-version 2.6)
+(set-logic QF_UFLIA)
+(set-info :source |
+Generated by: Sabrina Samplescientist
+Generated on: 2016-12-31
+Generator: Sledgehammer
+Application: Isabelle proof of Goedel theorem
+Target solver: CVC4
+Time limit: 2.0
+Benchmarks generated by the proof search tool Sledgehammer for Isabelle/HOL.
+We observed that the `app` operator introduced by Sledgehammer to eliminate
+higher-order function application is especially difficult for SMT solvers.
+|)
+(set-info :license "https://creativecommons.org/licenses/by/4.0/")
+(set-info :category "industrial")
+(set-info :status unsat)
+