Android

DroixBench — a collection of 24 reproducible crashes in open-source Android apps

C/C++

Bugs-C++ (pronounced Bugsy) — a dockerized benchmark/infrastructure for C/C++ defects providing easy-to-use interfaces similar to Defects4J
C-Pack-IPAs — a C90 program benchmark of introductory programming assignments (IPAs) that contains semantically correct, semantically incorrect, and syntactically incorrect programs and a test suite for each IPA
Codeflaws — 3902 bugs from Codeforces programming competitions for evaluating program repair tools across different defect classes
DBGBench — 291 (in)correct patches from real software professionals for 27 real bugs in C for the qualitative evaluation of automated repair techniques
ITSP — a parallel corpus of 661 buggy-repaired program pairs submitted by CS-1 students for 74 unique assignments spread across 10 course weeks
IntroClass — automated program repair benchmark that consists of 998 defects in small student-written programming assignments
ManyBugs — automated program repair benchmark that consists of 185 defects from large popular open-source projects
TutorCode — 1,239 buggy C++ programs with human tutor guidance and solution descriptions, accessible via API

Java

Bears — an extensible Java bug benchmark for automatic program repair studies
Bugs.jar — a large-scale, diverse dataset of bugs for Java program repair
Defects4J — a database of existing faults to enable controlled testing studies for Java
GitBug-Java — a reproducible benchmark of recent Java bugs (199 bugs from 2023)
Vul4J — a dataset of reproducible Java vulnerabilities
growingBugs — a benchmark for Java defects providing easy-to-use interfaces similar to Defects4J

JavaScript

BugsJS — a benchmark of 453 real, manually validated JavaScript bugs from 10 popular JavaScript server-side programs
FixJS — a dataset of bug-fixing JavaScript commits

Multilingual

BugSwarm — a dataset of thousands of real software bugs and their fixes
Defexts — a curated dataset of reproducible real-world bugs for modern JVM languages (Kotlin, Groovy, Scala)
Minecraft — a benchmark with C/C++, Java, and Python programs constructed via automated mining of software bug fixes with precise code context
QuixBugs — a parallel corpus of 40 programs in both Python and Java, each with a bug on one line

PHP

BugsPHP — a dataset for automated program repair in PHP

Python

BugsInPy — a database of existing bugs in Python programs to enable controlled testing and debugging studies
Refactory — a dataset of 1783 buggy and 2442 correct student submissions for 5 Python programming assignments