Sample stimuli

[Ten sample stimulus images: sample 0 through sample 9]

How to use

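from brainscore_vision import load_benchmark

# Load the benchmark by its identifier.
benchmark = load_benchmark("Geirhos2021stylized-error_consistency")
# my_model is a placeholder for the model you want to evaluate.
score = benchmark(my_model)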

Model scores

Rank  Score
1     .871
2     .859
3     .851
4     .830
5     .787
6     .758
7     .751
8     .750
9     .746
10    .737
11    .734
12    .723
13    .708
14    .706
15    .705
16    .705
17    .704
18    .702
19    .697
20    .697
21    .693
22    .680
23    .680
24    .678
25    .678
26    .666
27    .654
28    .653
29    .646
30    .646
31    .646
32    .646
33    .628
34    .627
35    .623
36    .603
37    .600
38    .590
39    .588
40    .583
41    .579
42    .570
43    .555
44    .538
45    .516
46    .515
47    .511
48    .494
49    .488
50    .467
51    .463
52    .457
53    .456
54    .454
55    .450
56    .449
57    .443
58    .443
59    .439
60    .438
61    .438
62    .433
63    .430
64    .429
65    .427
66    .418
67    .417
68    .417
69    .417
70    .411
71    .409
72    .405
73    .405
74    .395
75    .392
76    .391
77    .389
78    .381
79    .370
80    .369
81    .365
82    .362
83    .358
84    .358
85    .351
86    .346
87    .342
88    .340
89    .324
90    .322
91    .321
92    .320
93    .319
94    .319
95    .317
96    .315
97    .315
98    .315
99    .315
100   .314
101   .314
102   .313
103   .311
104   .307
105   .305
106   .303
107   .302
108   .301
109   .300
110   .299
111   .299
112   .298
113   .297
114   .295
115   .295
116   .293
117   .292
118   .290
119   .289
120   .288
121   .281
122   .281
123   .278
124   .274
125   .274
126   .273
127   .272
128   .271
129   .270
130   .267
131   .266
132   .263
133   .263
134   .262
135   .262
136   .258
137   .255
138   .254
139   .254
140   .254
141   .254
142   .252
143   .251
144   .251
145   .243
146   .236
147   .233
148   .233
149   .232
150   .230
151   .229
152   .229
153   .224
154   .223
155   .219
156   .218
157   .216
158   .213
159   .208
160   .202
161   .202
162   .201
163   .199
164   .198
165   .198
166   .198
167   .198
168   .197
169   .195
170   .195
171   .194
172   .192
173   .192
174   .191
175   .187
176   .185
177   .185
178   .184
179   .182
180   .181
181   .181
182   .176
183   .166
184   .166
185   .163
186   .160
187   .159
188   .159
189   .157
190   .155
191   .154
192   .146
193   .145
194   .141
195   .139
196   .139
197   .139
198   .139
199   .139
200   .139
201   .139
202   .139
203   .139
204   .139
205   .139
206   .139
207   .134
208   .133
209   .132
210   .131
211   .131
212   .129
213   .127
214   .127
215   .123
216   .121
217   .121
218   .120
219   .119
220   .119
221   .116
222   .114
223   .113
224   .111
225   .110
226   .110
227   .107
228   .107
229   .106
230   .106
231   .103
232   .103
233   .102
234   .100
235   .098
236   .097
237   .097
238   .097
239   .097
240   .096
241   .094
242   .094
243   .093
244   .093
245   .093
246   .093
247   .093
248   .092
249   .090
250   .089
251   .088
252   .088
253   .087
254   .087
255   .086
256   .084
257   .084
258   .084
259   .080
260   .079
261   .078
262   .078
263   .078
264   .075
265   .075
266   .074
267   .070
268   .067
269   .066
270   .065
271   .064
272   .064
273   .064
274   .062
275   .061
276   .060
277   .060
278   .057
279   .057
280   .049
281   .047
282   .046
283   .045
284   .045
285   .044
286   .044
287   .044
288   .040
289   .040
290   .040
291   .038
292   .037
293   .028
294   .026
295   .026
296   .025
297   .016
298   .016
299   .015
300   .015
301   .015
302   .014
303   .012
304   .012
305   .008
306   .006
307   .005
308-456  (no score listed)

Benchmark bibtex

@article{geirhos2021partial,
  title={Partial success in closing the gap between human and machine vision},
  author={Geirhos, Robert and Narayanappa, Kantharaju and Mitzkus, Benjamin and Thieringer, Tizian and Bethge, Matthias and Wichmann, Felix A and Brendel, Wieland},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021},
  url={https://openreview.net/forum?id=QkljT4mrfs}
}

Ceiling

0.50

Note that scores are relative to this ceiling.
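
To make the ceiling note concrete, here is a minimal sketch of one common reading of "relative to this ceiling": the raw metric value divided by the ceiling. The page does not spell out the formula, so the division and the helper names `ceiled_score` and `raw_from_ceiled` are assumptions for illustration, not Brain-Score functions.

```python
CEILING = 0.50  # ceiling reported for this benchmark


def ceiled_score(raw_score: float, ceiling: float = CEILING) -> float:
    """Express a raw score as a fraction of the ceiling (assumed: simple division)."""
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    return raw_score / ceiling


def raw_from_ceiled(ceiled: float, ceiling: float = CEILING) -> float:
    """Invert the assumed normalization to recover the raw score."""
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    return ceiled * ceiling
```

Under this assumption, a leaderboard score of .871 would correspond to a raw value of roughly 0.44 before normalization.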

Data: Geirhos2021stylized

Metric: error_consistency
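
For readers unfamiliar with the metric, error consistency is usually defined as Cohen's kappa computed over trial-by-trial correctness: it asks whether two observers (for example a model and a human) make errors on the same stimuli more often than their accuracies alone would predict. The sketch below follows that textbook definition. It is not taken from the Brain-Score implementation, which may differ in details such as how trials are grouped or averaged across subjects, and the function name is hypothetical.

```python
from typing import Sequence


def error_consistency(correct_a: Sequence[bool], correct_b: Sequence[bool]) -> float:
    """Cohen's kappa over per-trial correctness of two observers.

    correct_a[i] and correct_b[i] say whether each observer answered
    trial i correctly. Returns NaN when kappa is undefined.
    """
    n = len(correct_a)
    if n == 0 or n != len(correct_b):
        raise ValueError("inputs must be non-empty and of equal length")

    # Accuracy of each observer.
    p_a = sum(bool(x) for x in correct_a) / n
    p_b = sum(bool(x) for x in correct_b) / n

    # Observed consistency: both correct or both wrong on the same trial.
    c_obs = sum(bool(a) == bool(b) for a, b in zip(correct_a, correct_b)) / n

    # Consistency expected by chance from the two accuracies alone.
    c_exp = p_a * p_b + (1 - p_a) * (1 - p_b)

    if c_exp == 1.0:
        return float("nan")  # both observers are always right or always wrong
    return (c_obs - c_exp) / (1 - c_exp)
```

A value of 0 means the two observers agree no more than chance given their accuracies, and 1 means they are right and wrong on exactly the same trials.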