Neural Concept Verifier

Scaling Prover-Verifier Games to high-dimensional inputs via concept encodings for verifiable, interpretable AI.

Prover-Verifier Games (PVGs) offer a path toward formally verifiable classifiers, but they have not been applied to complex, high-dimensional inputs like images. Concept-based models handle such inputs interpretably but typically rely on low-capacity linear predictors. Neural Concept Verifier (NCV) bridges both worlds.

NCV combines PVGs with minimally supervised concept discovery. A concept encoder first translates raw high-dimensional inputs into structured concept representations. A prover then selects a subset of those concepts, and a nonlinear verifier classifies using only the selected subset. This gives verifiability at the concept level, not just the pixel level, while still supporting expressive predictors.
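The encoder, prover, and verifier stages described above can be sketched as a single forward pass. This is a minimal illustration in plain Python, not the authors' implementation: the function names (`concept_encoder`, `prover`, `verifier`, `ncv_forward`), the random weights, the layer sizes, and the top-k selection rule are all assumptions made for the example. In NCV the prover and verifier are learned, and that training is omitted here.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix; a stand-in for learned parameters."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def concept_encoder(x, w_enc):
    """Hypothetical encoder: maps a raw input to concept activations in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-z)) for z in matvec(w_enc, x)]


def prover(concepts, k):
    """Hypothetical prover: picks the k most active concepts.

    Assumption: a learned prover would replace this top-k heuristic.
    """
    ranked = sorted(range(len(concepts)), key=lambda i: concepts[i], reverse=True)
    return sorted(ranked[:k])


def apply_mask(concepts, selected):
    """Zero out every concept the prover did not select."""
    keep = set(selected)
    return [c if i in keep else 0.0 for i, c in enumerate(concepts)]


def verifier(masked, w_hidden, w_out):
    """Nonlinear verifier: one tanh hidden layer followed by a softmax."""
    hidden = [math.tanh(h) for h in matvec(w_hidden, masked)]
    logits = matvec(w_out, hidden)
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ncv_forward(x, params, k):
    """Encoder -> prover -> masked concepts -> verifier."""
    concepts = concept_encoder(x, params["enc"])
    selected = prover(concepts, k)
    masked = apply_mask(concepts, selected)
    probs = verifier(masked, params["hidden"], params["out"])
    return selected, masked, probs


rng = random.Random(0)
params = {
    "enc": make_matrix(8, 16, rng),    # 16 raw features -> 8 concepts
    "hidden": make_matrix(6, 8, rng),  # 8 concepts -> 6 hidden units
    "out": make_matrix(3, 6, rng),     # 6 hidden units -> 3 classes
}
x = [rng.uniform(-1.0, 1.0) for _ in range(16)]
selected, masked, probs = ncv_forward(x, params, k=3)
print("selected concepts:", selected)
print("class probabilities:", probs)
```

The point of the sketch is the information bottleneck: the verifier only ever sees the masked concept vector, so its prediction can be traced back to the few concepts the prover chose.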

Experiments show that NCV outperforms classic concept-based models and pixel-based PVG baselines on high-dimensional, logically complex datasets, and that it mitigates shortcut learning.

Paper: