
Domains

The public website currently exposes a focused subset of benchmark tasks, the public `test` subset, rather than the entire internal task tree. Use this page to see which scientific areas that subset covers; it does not describe every task present internally.

Public Domain Snapshot

The breakdown below is rendered from generated website data.
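As a rough illustration of how such a breakdown could be derived, the sketch below tallies tasks per domain and keeps only the public `test` split. The record layout (`id`, `domain`, `split`), the task IDs, and the `domain_breakdown` helper are all hypothetical; the actual format of the generated website data is not described on this page.

```python
from collections import Counter

# Hypothetical task records. The field names and task IDs are assumptions
# made for illustration, not the real generated website data.
tasks = [
    {"id": "task-a", "domain": "numerical PDE solving", "split": "test"},
    {"id": "task-b", "domain": "computational biophysics", "split": "test"},
    {"id": "task-c", "domain": "statistical physics", "split": "test"},
    {"id": "task-d", "domain": "statistical physics", "split": "internal"},
]


def domain_breakdown(records, split="test"):
    """Count tasks per domain, keeping only records in the given split."""
    return Counter(r["domain"] for r in records if r["split"] == split)


breakdown = domain_breakdown(tasks)
for domain, count in sorted(breakdown.items()):
    print(f"{domain}: {count}")
```

Filtering by split before counting is what keeps internal-only tasks out of the public summary in this sketch.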

Why Domain Coverage Matters

ASI-Bench is intended to evaluate scientific workflows rather than a single narrow task family. Domain coverage matters because strong agent performance in one area, such as numerical PDE solving, does not automatically transfer to others, such as computational biophysics or statistical physics.

Even in the current public slice, tasks already differ in:

  • scientific framing
  • data format and expected outputs
  • runtime and package requirements
  • evaluation methodology
  • amount of domain knowledge needed to make progress

Current Public Policy

The domain summary shown here reflects the public `test` subset only. It is therefore intentionally narrower than the full internal repository.

As the benchmark matures, this page can evolve into:

  • a richer domain map
  • a domain comparison chart
  • a landing page for domain-specific task browsing