
| Android | |
|---|---|
| DroixBench — a collection of 24 reproducible crashes in open-source Android apps | |

| C/C++ | |
|---|---|
| Bugs-C++ (pronounced Bugsy) — a dockerized benchmark and infrastructure for C/C++ defects, providing easy-to-use interfaces similar to Defects4J | |
| C-Pack-IPAs — a C90 program benchmark of introductory programming assignments (IPAs) that contains semantically correct, semantically incorrect, and syntactically incorrect programs, plus a test suite for each IPA | |
| Codeflaws — 3902 bugs from Codeforces programming competitions for evaluating program repair tools across different defect classes | |
| DBGBench — 291 (in)correct patches from real software professionals for 27 real bugs in C for the qualitative evaluation of automated repair techniques | |
| ITSP — a parallel corpus of 661 buggy-repaired program pairs submitted by CS-1 students for 74 unique assignments spread across 10 course weeks | |
| IntroClass — an automated program repair benchmark consisting of 998 defects in small student-written programming assignments | |
| ManyBugs — an automated program repair benchmark consisting of 185 defects from large, popular open-source projects | |
| TutorCode — 1,239 buggy C++ code samples with human tutor guidance and solution descriptions, accessible via API | |

| Java | |
|---|---|
| Bears — an extensible Java bug benchmark for automatic program repair studies | |
| Bugs.jar — a large-scale, diverse dataset of bugs for Java program repair | |
| Defects4J — a database of existing faults to enable controlled testing studies for Java | |
| GitBug-Java — a reproducible benchmark of recent Java bugs (199 bugs from 2023) | |
| Vul4J — a dataset of reproducible Java vulnerabilities | |
| growingBugs — a benchmark for Java defects providing easy-to-use interfaces similar to Defects4J | |

| JavaScript | |
|---|---|
| BugsJS — a benchmark of 453 real, manually validated JavaScript bugs from 10 popular JavaScript server-side programs | |
| FixJS — a dataset of bug-fixing JavaScript commits | |

| Multilingual | |
|---|---|
| BugSwarm — a dataset of thousands of real software bugs and their fixes | |
| Defexts — a curated dataset of reproducible real-world bugs for modern JVM languages (Kotlin, Groovy, Scala) | |
| Minecraft — a benchmark with C/C++, Java, and Python programs constructed via automated mining of software bug fixes with precise code context | |
| QuixBugs — a parallel corpus of 40 programs in both Python and Java, each with a bug on one line | |

| PHP | |
|---|---|
| BugsPHP — a dataset for automated program repair in PHP | |

| Python | |
|---|---|
| BugsInPy — a database of existing bugs in Python programs to enable controlled testing and debugging studies | |
| Refactory — a dataset of 1783 buggy and 2442 correct student submissions for 5 Python programming assignments | |