Common critiques of the Brain-Score platform

"Testing on the same data over and over again can help modelers game the system."
"Why should we trust the Brain-Score rankings on these hand-picked benchmarks?"
"We need to test models on specific hypotheses, not observational data."
"Why should models only hit specific benchmarks? I don't care about these benchmarks!"

These and other critiques of the Brain-Score platform are common in our community. We hear you!


We take pride in developing and openly sharing an integrative benchmarking platform that allows the community to operationalize and act on this and other feedback. In 2022, we organized the first Brain-Score competition for model submissions, evaluating the submitted models on our existing benchmarks. In 2024, we are turning the tables: in the spirit of an adversarial collaboration, we invite experimentalists and the community at large to turn their legitimate concerns into concrete benchmarks that challenge, and hopefully expose, the explanatory gaps between our current models of primate vision and the biological brain.


2024 Brain-Score Competition


Our ability to recognize objects in the world relies on visual processing along the ventral visual stream, a set of hierarchically-organized cortical areas. Recently, machine-learning researchers and neuroscientists have produced computational models that have achieved moderate success at explaining primate object recognition behavior and the neural representations that support it. This second edition of the Brain-Score Competition aims to find benchmarks on which the predictions of our current top models break down.


In 2022, the first Brain-Score Competition led to new and improved models of primate vision that predicted existing benchmarks reasonably well. In the Brain-Score Competition 2024, we close the loop on testing model predictions by rewarding the benchmarks that show where models are least aligned with the primate visual ventral stream. The competition is open to the scientific community, and we provide infrastructure to evaluate a variety of models on new behavioral and neural benchmarks in a standardized and unified manner. In addition, we will incentivize benchmark submissions by giving participants visibility and awarding monetary prizes (details TBA) to the winning benchmarks.


Submissions are open until June 30, 2024. For regular updates on the competition, please follow Brain-Score on Twitter and join our community. Good luck!


Overview

Participants should submit their benchmarks through the Brain-Score platform. Brain-Score currently accepts any benchmark that tests model predictions of core object recognition behavior or of neural responses (spike rates) in ventral visual stream areas. We encourage all submissions of behavioral and neural benchmarks. To facilitate benchmark submission, we provide helper code and tutorials. You can also use existing datasets or metrics and re-combine them in novel ways. Note that when using the Brain-Score platform, no knowledge of the models' internals is required -- you can treat each model like another primate subject.
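For instance, a new benchmark can re-combine an already-packaged dataset with an already-registered metric. Below is a minimal sketch using the brainscore_vision loader functions; the dataset and metric identifiers are illustrative placeholders, so substitute the ones relevant to your submission.

    # Minimal sketch, assuming the brainscore_vision package.
    # The identifiers below are illustrative; check the tutorials for registered names.
    from brainscore_vision import load_dataset, load_metric

    assembly = load_dataset('MajajHong2015.public')  # an existing packaged neural recording assembly
    metric = load_metric('pls')                      # an existing similarity metric (identifier assumed)

    # A benchmark pairs stimuli and recordings like these with a metric and a ceiling,
    # so existing pieces can be re-used rather than re-collected.
    print(assembly.dims)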

Benchmarks have to use models in their current "operating regime". This means there are three behavioral tasks (imagenet-labeling, probabilities, odd-one-out) and four brain regions to record from (V1, V2, V4, IT). Until April, we will accept suggestions for other behavioral tasks, provided you work with us to make sure the models can engage in them. Stimuli should be static images (i.e., no videos this round), and benchmarks should not test temporal dynamics (you can, however, still specify a single "time_bin" for the models to record from). The sketch below illustrates this interface.
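To make the operating regime concrete, here is a minimal sketch of how a benchmark interacts with a model through the BrainModel interface; the import path, model identifier, and stimulus set identifier are assumptions that may differ across library versions.

    # Minimal sketch, assuming brainscore_vision and its BrainModel interface.
    from brainscore_vision import load_model, load_stimulus_set
    from brainscore_vision.model_interface import BrainModel

    model = load_model('alexnet')              # treated like another subject
    stimuli = load_stimulus_set('hvm-public')  # placeholder stimulus set identifier

    # Behavioral task: one of imagenet-labeling, probabilities, odd-one-out.
    model.start_task(BrainModel.Task.label, 'imagenet')
    behavioral_responses = model.look_at(stimuli)

    # Neural recording: one of V1, V2, V4, IT, at a single time_bin (in ms).
    model.start_recording('IT', time_bins=[(70, 170)])
    neural_responses = model.look_at(stimuli)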

Figure: Base vs. brain models (adapted from Schrimpf et al., 2020).

Competition Tracks

Behavioral Track

This track will reward the benchmarks that showcase model shortcomings in predicting behavior (human or non-human primate). The winning submissions will be the behavioral benchmarks with the lowest model scores, averaged over all models, i.e., as close as possible to 0. Benchmarks can target any behavioral task that the models engage in (labeling, class probabilities, odd-one-out; see the model interface). There will be multiple prizes, details TBA.

Neural Track

This track will reward the benchmarks that showcase model shortcomings in predicting neural activity across the primate visual ventral stream. The winning submissions will be the neural benchmarks with the lowest model scores, averaged over all models, i.e., as close as possible to 0. Benchmarks can target any region(s) in the visual ventral stream: V1, V2, V4, IT (see the model interface). Brain recordings can come from human or non-human primates. There will be multiple prizes, details TBA.
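For both tracks, the ranking criterion is a benchmark's mean score across the evaluated models. The sketch below illustrates that evaluation, assuming brainscore_vision; the benchmark identifier is an existing public one standing in for a submission, and the model identifiers are an illustrative subset of the competition models.

    # Rough sketch, assuming brainscore_vision; Score accessors may differ across versions.
    from brainscore_vision import load_benchmark, load_model

    benchmark = load_benchmark('MajajHong2015public.IT-pls')  # stand-in for a submitted benchmark
    model_identifiers = ['alexnet', 'resnet-50-pytorch']       # illustrative subset of competition models

    scores = []
    for identifier in model_identifiers:
        model = load_model(identifier)
        score = benchmark(model)      # benchmarks are callable on a BrainModel
        scores.append(float(score))   # Score is xarray-like; the cast assumes a scalar aggregate

    mean_score = sum(scores) / len(scores)
    print(f'mean model score: {mean_score:.3f}')  # winning benchmarks minimize this value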

Models

The Brain-Score Competition 2024 will evaluate submitted benchmarks against roughly two dozen models.

Workshop

We will organize a workshop co-located with CCN at MIT in Cambridge, MA in early August 2024. The workshop will bring together the community interested in building and testing models of brain processing.


We will invite selected participants in the Brain-Score competition to present their benchmarking work during the workshop.

Tutorial

To enter the competition, create an account on the Brain-Score website and submit a benchmark. You can submit a benchmark by uploading a zip file on the website. For new datasets, you can either host the data yourself or reach out to us and we will give you access to our S3 storage. Please check our overview tutorials as well as our full-length tutorial for detailed information about the submission process.
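As a rough orientation (the exact layout is covered in the tutorials and may differ in detail), a benchmark submission is a small plugin package whose __init__.py registers the new benchmark; all names below ('mylab2024', MyBenchmark) are placeholders.

    # Sketch of a benchmark plugin, assuming the brainscore_vision plugin layout.
    #
    # mylab2024/
    #     __init__.py       registers the benchmark (shown below)
    #     benchmark.py      defines MyBenchmark: stimuli, data, metric, ceiling
    #     test.py           tests run on submission
    #     requirements.txt  extra dependencies, if any

    # __init__.py
    from brainscore_vision import benchmark_registry
    from .benchmark import MyBenchmark

    benchmark_registry['mylab2024-behavioral_error'] = lambda: MyBenchmark()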

We have tried to make the tutorial as clear and easy to follow as possible for anyone with minimal Python knowledge. If you run into any issues, feel free to contact us!

Organizers

Martin Schrimpf

EPFL

Kohitij Kar

York University

Contact

We recommend that participants join our Slack Workspace and follow Brain-Score on Twitter for any questions, updates, benchmark support, and other assistance.

FAQ

How do I submit a benchmark?
We have a full-length tutorial that walks users through the submission process.
Why did you choose these models to score on the benchmark submissions?
The competition models are a combination of the top-10 models on existing Brain-Score benchmarks and 10 additional models that we feel are of interest to the community. Please join the workshop and the Slack channel if you have input on these choices!
Who will get prizes?
See Tracks for the prize breakdown.
Can I submit a benchmark developed by a third-party?
If the benchmark (i.e., the combination of dataset and metric) is not on brain-score.org, you can submit it as a new benchmark. This includes uploading existing data or implementing an existing metric. Please make sure you have the rights to use the data, do not upload personally identifying information, and cite the data collectors or metric developers!
Aren't there obvious ways to cheat?
Yes. We are relying on the goodwill of submitters. We will manually check submissions, especially potentially winning ones, to verify that each benchmark faithfully attempts to test the alignment of models to the primate visual ventral stream.
What about benchmarks outside of vision?
This competition focuses on benchmarking models of the primate visual system. We hope to extend to other domains in the future; in the meantime, check out, e.g., Brain-Score Language.
Are the models public?
Yes. You can access all models via the Brain-Score library's load_model function.
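For example (a minimal snippet; 'alexnet' is one of the public model identifiers and is meant only as an illustration):

    from brainscore_vision import load_model
    model = load_model('alexnet')  # returns a BrainModel that benchmarks can score directly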
Can I also submit models?
Brain-Score always accepts model submissions; see here and here for tutorials. Note, though, that this competition is aimed at benchmarks, and there are no prizes for models in this edition.