AI-powered model auditing agent with multi-agent debate for robust evaluation of machine learning models.
## Installation

This repository has been tested extensively with Python 3.10.15. A typical install via uv takes less than a minute.
Install and run with uv:

```bash
uv sync
uv run python main.py --model resnet50 --dataset CIFAR10 --weights path/to/weights.pth
```

Or with pip:

```bash
pip install -e .
python main.py --model resnet50 --dataset CIFAR10 --weights path/to/weights.pth
```

The medical imaging models require the optional `medical` extra:

```bash
uv sync --extra medical  # or: pip install -e ".[medical]"
```

## Usage

```bash
python main.py --model resnet50 --dataset CIFAR10 --weights models/model.pth
```

Medical imaging examples:

```bash
# ISIC skin lesion classification
python main.py --model siim-isic --dataset isic --weights models/isic/model.pth
# HAM10000 dataset
python main.py --model deepderm --dataset ham10000 --weights models/ham10000.pth
```

## Demo

We provide a small toy model trained on CIFAR10 so the Auditor can be tested out of the box. All that is needed is a valid Anthropic API key (see the 'Environment Variables' section below):
```bash
python main.py --model resnet18 --dataset CIFAR10 --weights examples/cifar10/cifar10.pth
```

Expected runtime varies with user response speed and subset size, but a full run should take less than 10 minutes.
## Options

- `--subset N`: Use `N` samples for faster evaluation
- `--no-debate`: Disable multi-agent debate
- `--single-agent`: Use a single agent instead of multi-agent debate
- `--device`: Specify the device (`cpu`, `cuda`, `mps`)
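These flags can be combined. For example, a quick single-agent audit of the toy CIFAR10 model on CPU (the 100-sample subset is just illustrative):

```bash
# Quick single-agent audit on CPU, using a small sample subset for speed
python main.py --model resnet18 --dataset CIFAR10 \
    --weights examples/cifar10/cifar10.pth \
    --subset 100 --single-agent --device cpu
```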
## Environment Variables

Set your API keys:

```bash
export ANTHROPIC_API_KEY="your-key"
export OPENAI_API_KEY="your-key"  # only needed if using non-Anthropic models
```

## Repository Structure

- `main.py` - Interactive model auditor with multi-agent debate
- `testbench.py` - Automated evaluation script
- `utils/agent.py` - Multi-agent conversation system
- `architectures/` - Custom model architectures
- `prompts/` - System prompts for the different evaluation phases
- `models/` - Pre-trained model weights
- `results/` - Evaluation results and conversation logs
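Putting it all together, a minimal end-to-end session with the bundled demo model might look like this (drop the `uv run` prefix if you installed via pip; the subset size is illustrative):

```bash
# Set the API key, then audit the bundled CIFAR10 toy model on a small subset
export ANTHROPIC_API_KEY="your-key"
uv run python main.py --model resnet18 --dataset CIFAR10 \
    --weights examples/cifar10/cifar10.pth --subset 50
```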