A realization-sensitive benchmark architecture for candidate conscious systems. MORI refuses the easy inference from behavioral fluency to underlying substrate, separating what a system does from what it is. The evaluation harness is in active development; what follows is a preview of what's coming.
Functional-modal competence, dynamical integration, and realization sensitivity are scored separately, with non-compensatory floors — behavioral fluency cannot rescue a system whose realization evidence falls short.
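As a rough illustration of what a non-compensatory floor means in practice, the sketch below checks each axis against its own minimum instead of averaging. The axis names come from the text above; the `[0, 1]` score range, the floor values, and the function names are assumptions made for this example, not MORI's published thresholds or API.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AxisScores:
    """One score per axis, assumed here to lie in [0, 1]."""
    functional_modal: float
    dynamical_integration: float
    realization_sensitivity: float


# Hypothetical floors, for illustration only; not MORI's locked thresholds.
FLOORS = AxisScores(0.6, 0.6, 0.6)


def failed_floors(scores: AxisScores, floors: AxisScores = FLOORS) -> list[str]:
    """Return the name of every axis that falls below its own floor."""
    return [
        f.name
        for f in fields(scores)
        if getattr(scores, f.name) < getattr(floors, f.name)
    ]


def clears_all_floors(scores: AxisScores, floors: AxisScores = FLOORS) -> bool:
    """Non-compensatory rule: every axis must clear its floor on its own."""
    return not failed_floors(scores, floors)


# A behaviorally fluent system with weak realization evidence.
fluent_but_unrealized = AxisScores(0.98, 0.90, 0.20)

# A compensatory (averaging) rule would let the strong axes mask the weak one.
mean_score = sum(getattr(fluent_but_unrealized, f.name)
                 for f in fields(fluent_but_unrealized)) / 3

print(clears_all_floors(fluent_but_unrealized))  # floor check
print(failed_floors(fluent_but_unrealized))      # which axis failed
```

In this sketch the mean of the example scores sits above 0.6, so an averaging rule would pass the system, while the per-axis floor check rejects it on realization sensitivity alone.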
A Goodhart-sensitive evaluation principle: a system that produces target content only under adversarial framing has supplied evidence against a substrate claim, not for it.
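One way to picture this principle is as a rule over paired probe outcomes. The sketch below is a minimal reading under stated assumptions: the function name, the boolean inputs, and the `"against"` label are hypothetical, and the rule deliberately stays silent on every case the text above does not cover.

```python
from typing import Optional


def goodhart_reading(produced_under_neutral: bool,
                     produced_under_adversarial: bool) -> Optional[str]:
    """Apply the stated rule to one pair of probe outcomes.

    Returns "against" when target content appears only under adversarial
    framing. Returns None otherwise, because the principle as stated makes
    no claim about those cases (an assumption of this sketch).
    """
    if produced_under_adversarial and not produced_under_neutral:
        return "against"
    return None


print(goodhart_reading(produced_under_neutral=False,
                       produced_under_adversarial=True))
```

The point of the design is the direction of the evidence: output elicited only by pressure is never scored in the system's favor.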
A positive substrate claim requires agreement across behavioral consistency, representation probing, and feature-level inspection. Behavior alone is never enough.
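The agreement requirement can be sketched as a conjunction over the three evidence lines named above. The `Finding` labels and the function name are assumptions for illustration; how each line of evidence would actually be assessed is not specified here.

```python
from enum import Enum


class Finding(Enum):
    """Hypothetical per-method outcome labels."""
    SUPPORTS = "supports"
    CONTRADICTS = "contradicts"
    INCONCLUSIVE = "inconclusive"


def positive_substrate_claim(behavioral_consistency: Finding,
                             representation_probing: Finding,
                             feature_inspection: Finding) -> bool:
    """A positive claim needs all three evidence lines to agree in support."""
    evidence = (behavioral_consistency, representation_probing, feature_inspection)
    return all(finding is Finding.SUPPORTS for finding in evidence)


# Supportive behavior alone does not license the claim.
print(positive_substrate_claim(Finding.SUPPORTS,
                               Finding.INCONCLUSIVE,
                               Finding.INCONCLUSIVE))
```

Treating an inconclusive result the same as a contradicting one for the purpose of a positive claim is a choice made in this sketch; it keeps any single supportive method from carrying the conclusion.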
Built to distinguish systems that exhibit a behavior from systems that realize the structure behind it.
Locked thresholds, documented amendment trails, and explicit falsifiability criteria — conclusions are constrained by commitments made before the data.
Probes are constructed to catch target-language production that does not survive realization-sensitive scrutiny.
Grounded in two companion papers — a theory of modal consciousness and the benchmark architecture that operationalizes it.
MORI is in active development. The methodology is taking shape now, and the evaluation harness follows. Watch this space.
Visit moribenchmark.org