8:00 - 9:30 | Breakfast and networking | |
9:30 - 10:00 | Opening Remarks and Workshop Overview | |
10:00 - 10:45 | Keynote: How can proof scaling make AI safer (TBA) | AbstractTo be announced |
10:45 - 11:15 | Break | |
11:15 - 12:00 | Keynote: Opportunities in guaranteed safe AI (TBA) | AbstractTo be announced |
12:00 - 1:30 | Lunch and networking | |
1:30 - 2:00 | FVAPPS (Ronak Mehta and Quinn Dougherty) | Abstract We introduce the Formally Verified Automated Programming Progress Standards, or FV-APPS, a benchmark for writing programs and proving their correctness. Previously, APPS provided a benchmark and dataset for programming puzzles to be completed in python and checked against unit tests, of the kind seen in technical assessments in the software engineering industry. We generalize the unit tests to Lean 4 theorems, given without proof (Lean's "sorry" keyword), and challenge the machine learning and program synthesis communities to solve both each general purpose programming problem and it's associated correctness specifications. It is available for play at https://huggingface.co/datasets/quinn-dougherty/fvapps |
2:00 - 2:30 | TBD | AbstractTo be determined |
2:30 - 3:00 | Break and Networking | |
3:00 - 3:30 | Pantograph (Leni Aniva) | Abstracthttps://arxiv.org/abs/2410.1642 |
3:30 - 4:00 | Proofs in the Wild (Mike Dodds) | AbstractAbstract to be added |
4:00 - 4:30 | Break and networking | |
4:30 - 5:00 | Guarantees-based Mechanistic Interpretability (Jason Gross) | Abstracthttps://arxiv.org/abs/2406.11779 |
5:00 - 5:30 | Day one wrap-up, day two orientation | |
5:30 - 8:00 | Dinner/drinks/social | |