It would be nice to reproduce the SWE-Bench + Claude result in the standard setting (no tests available, one submission).