Sample stimuli

[Images: sample 0 through sample 9]

How to use

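from brainscore_vision import load_benchmark

# Load the benchmark by its registered identifier.
benchmark = load_benchmark("Maniquet2024-tasks_consistency")
# my_model is a placeholder for the model you want to evaluate; it is not defined here.
score = benchmark(my_model)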

Model scores

Rank  Model  Score
1            .781
2            .774
3            .750
4            .748
5            .747
6            .746
7            .746
8            .745
9            .742
10           .741
11           .739
12           .738
13           .737
14           .735
15           .734
16           .734
17           .732
18           .731
19           .729
20           .727
21           .727
22           .724
23           .718
24           .717
25           .717
26           .716
27           .716
28           .715
29           .715
30           .713
31           .712
32           .712
33           .710
34           .707
35           .705
36           .705
37           .703
38           .703
39           .701
40           .701
41           .701
42           .699
43           .698
44           .697
45           .695
46           .692
47           .692
48           .691
49           .690
50           .689
51           .688
52           .688
53           .688
54           .688
55           .687
56           .686
57           .686
58           .686
59           .685
60           .685
61           .685
62           .684
63           .684
64           .683
65           .681
66           .681
67           .681
68           .680
69           .680
70           .680
71           .680
72           .679
73           .679
74           .678
75           .678
76           .676
77           .676
78           .676
79           .676
80           .674
81           .674
82           .672
83           .672
84           .671
85           .669
86           .669
87           .669
88           .668
89           .668
90           .667
91           .667
92           .667
93           .667
94           .667
95           .667
96           .667
97           .667
98           .666
99           .666
100          .666
101          .666
102          .665
103          .663
104          .661
105          .660
106          .660
107          .659
108          .659
109          .658
110          .657
111          .656
112          .656
113          .656
114          .656
115          .656
116          .656
117          .654
118          .653
119          .653
120          .651
121          .650
122          .649
123          .649
124          .649
125          .648
126          .648
127          .648
128          .648
129          .647
130          .647
131          .647
132          .647
133          .646
134          .646
135          .646
136          .646
137          .644
138          .644
139          .644
140          .642
141          .640
142          .640
143          .639
144          .638
145          .638
146          .638
147          .635
148          .632
149          .627
150          .625
151          .625
152          .624
153          .624
154          .618
155          .617
156          .617
157          .616
158          .616
159          .615
160          .613
161          .610
162          .609
163          .607
164          .606
165          .604
166          .596
167          .578
168          .578
169          .576
170          .569
171          .568
172          .565
173          .565
174          .561
175          .553
176          .550
177          .550
178          .549
179          .545
180          .541
181          .541
182          .535
183          .534
184          .531
185          .531
186          .528
187          .525
188          .525
189          .521
190          .521
191          .520
192          .507
193          .507
194          .502
195          .499
196          .498
197          .498
198          .494
199          .484
200          .484
201          .484
202          .482
203          .482
204          .479
205          .478
206          .478
207          .476
208          .470
209          .470
210          .470
211          .462
212          .462
213          .461
214          .450
215          .448
216          .437
217          .416
218          .412
219          .407
220          .395
221          .395
222          .395
223          .395
224          .383
225          .380
226          .379
227          .367
228          .366
229          .363
230          .358
231          .349
232          .349
233          .344
234          .343
235          .341
236          .326
237          .325
238          .323
239          .323
240          .296
241          .266
242          .237
243          .210
244          .204
245          .067
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264

Benchmark bibtex

@article {Maniquet2024.04.02.587669,
	author = {Maniquet, Tim and Op de Beeck, Hans and Costantino, Andrea Ivan},
	title = {Recurrent issues with deep neural network models of visual recognition},
	elocation-id = {2024.04.02.587669},
	year = {2024},
	doi = {10.1101/2024.04.02.587669},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2024/04/10/2024.04.02.587669},
	eprint = {https://www.biorxiv.org/content/early/2024/04/10/2024.04.02.587669.full.pdf},
	journal = {bioRxiv}
}

Ceiling

1.00.

Note that scores are relative to this ceiling.
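
As a rough illustration of what "relative to this ceiling" can mean, the sketch below assumes the simplest convention: a raw score divided by the ceiling. The function name ceiling_normalize is hypothetical and is not part of brainscore_vision, and the benchmark's own normalization may differ. With a ceiling of 1.00, division leaves a score unchanged.

```python
def ceiling_normalize(raw_score: float, ceiling: float) -> float:
    """Express a raw score as a fraction of the ceiling.

    Hypothetical helper: assumes normalization is plain division,
    which may not match the benchmark's actual implementation.
    """
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    return raw_score / ceiling


ceiling = 1.00      # ceiling stated for this benchmark
top_score = 0.781   # rank-1 score from the leaderboard

# Dividing by a ceiling of 1.00 returns the score unchanged.
print(ceiling_normalize(top_score, ceiling))
```

Under this assumption, a benchmark with a lower ceiling (say 0.80) would scale a raw score of 0.50 up to 0.625, whereas here the listed scores can be read directly.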

Data: Maniquet2024

Metric: tasks_consistency