Sample stimuli

[Sample stimulus images: sample 0 through sample 9]

How to use

from brainscore_vision import load_benchmark

# my_model is a placeholder for your own Brain-Score-compatible model
benchmark = load_benchmark("Geirhos2021stylized-error_consistency")
score = benchmark(my_model)

Model scores

Rank     Model    Score
1                 .871
2                 .859
3                 .851
4                 .830
5                 .787
6                 .758
7                 .751
8                 .750
9                 .746
10                .737
11                .734
12                .723
13                .708
14                .706
15                .705
16                .705
17                .704
18                .702
19                .697
20                .697
21                .693
22                .680
23                .680
24                .678
25                .678
26                .666
27                .654
28                .653
29                .646
30                .646
31                .646
32                .646
33                .628
34                .627
35                .623
36                .603
37                .600
38                .590
39                .588
40                .583
41                .579
42                .570
43                .555
44                .538
45                .516
46                .515
47                .511
48                .494
49                .488
50                .467
51                .463
52                .457
53                .456
54                .454
55                .450
56                .449
57                .443
58                .443
59                .439
60                .438
61                .438
62                .433
63                .430
64                .429
65                .427
66                .418
67                .417
68                .417
69                .417
70                .411
71                .409
72                .405
73                .405
74                .395
75                .392
76                .391
77                .389
78                .381
79                .370
80                .369
81                .365
82                .362
83                .358
84                .358
85                .351
86                .346
87                .342
88                .340
89                .324
90                .322
91                .321
92                .320
93                .319
94                .319
95                .317
96                .315
97                .315
98                .315
99                .315
100               .314
101               .314
102               .313
103               .311
104               .307
105               .305
106               .303
107               .302
108               .301
109               .300
110               .299
111               .299
112               .298
113               .297
114               .295
115               .295
116               .293
117               .292
118               .290
119               .289
120               .288
121               .281
122               .281
123               .278
124               .274
125               .274
126               .273
127               .272
128               .271
129               .270
130               .267
131               .266
132               .263
133               .263
134               .262
135               .262
136               .258
137               .255
138               .254
139               .254
140               .254
141               .254
142               .252
143               .251
144               .251
145               .243
146               .236
147               .233
148               .233
149               .232
150               .230
151               .229
152               .229
153               .224
154               .223
155               .219
156               .218
157               .216
158               .213
159               .208
160               .202
161               .202
162               .201
163               .199
164               .198
165               .198
166               .198
167               .198
168               .197
169               .195
170               .195
171               .194
172               .192
173               .192
174               .191
175               .187
176               .185
177               .185
178               .184
179               .182
180               .181
181               .181
182               .176
183               .166
184               .166
185               .163
186               .160
187               .159
188               .159
189               .157
190               .155
191               .154
192               .146
193               .145
194               .141
195               .139
196               .139
197               .139
198               .139
199               .139
200               .139
201               .139
202               .139
203               .139
204               .139
205               .139
206               .139
207               .139
208               .134
209               .133
210               .132
211               .131
212               .131
213               .129
214               .127
215               .127
216               .123
217               .121
218               .121
219               .120
220               .119
221               .119
222               .116
223               .114
224               .113
225               .111
226               .110
227               .110
228               .107
229               .107
230               .106
231               .106
232               .103
233               .103
234               .102
235               .100
236               .098
237               .097
238               .097
239               .097
240               .097
241               .096
242               .094
243               .094
244               .093
245               .093
246               .093
247               .093
248               .093
249               .092
250               .090
251               .089
252               .088
253               .088
254               .087
255               .087
256               .086
257               .084
258               .084
259               .084
260               .080
261               .079
262               .078
263               .078
264               .078
265               .075
266               .075
267               .074
268               .070
269               .067
270               .066
271               .065
272               .064
273               .064
274               .064
275               .062
276               .061
277               .060
278               .060
279               .057
280               .057
281               .049
282               .047
283               .046
284               .045
285               .045
286               .044
287               .044
288               .044
289               .040
290               .040
291               .040
292               .038
293               .037
294               .028
295               .026
296               .026
297               .025
298               .016
299               .016
300               .015
301               .015
302               .015
303               .014
304               .012
305               .012
306               .008
307               .006
308               .005
309-460           (no score listed)

Benchmark bibtex

@article{geirhos2021partial,
  title={Partial success in closing the gap between human and machine vision},
  author={Geirhos, Robert and Narayanappa, Kantharaju and Mitzkus, Benjamin and Thieringer, Tizian and Bethge, Matthias and Wichmann, Felix A and Brendel, Wieland},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021},
  url={https://openreview.net/forum?id=QkljT4mrfs}
}

Ceiling

0.50

Note that scores are relative to this ceiling.
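
As a rough illustration of what "relative to this ceiling" can mean, the sketch below assumes the simplest convention: the reported score is the raw score divided by the ceiling. That convention is an assumption here, not something this page states, and the function names are hypothetical.

```python
CEILING = 0.50  # ceiling value listed on this page


def ceiling_normalize(raw_score: float, ceiling: float = CEILING) -> float:
    """Express a raw score as a fraction of the ceiling.

    Assumption: normalization is plain division by the ceiling.
    """
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    return raw_score / ceiling


def to_raw(normalized_score: float, ceiling: float = CEILING) -> float:
    """Invert ceiling_normalize under the same assumption."""
    return normalized_score * ceiling


# A hypothetical raw score of 0.25 against a ceiling of 0.50
print(ceiling_normalize(0.25))
```

Under this convention a normalized score of 1.0 would mean the model reaches the ceiling.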

Data: Geirhos2021stylized

Metric: error_consistency
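
For orientation, error consistency is commonly defined as Cohen's kappa computed over per-trial correct/incorrect outcomes of two observers (for example, a model and a human). The sketch below follows that textbook definition; it is not taken from the benchmark's source code, which may aggregate over conditions and subjects differently, and the function name is hypothetical.

```python
from typing import Sequence


def error_consistency(correct_a: Sequence[bool], correct_b: Sequence[bool]) -> float:
    """Cohen's kappa over trial-wise correctness of two observers.

    Illustrative sketch only; not the benchmark's own implementation.
    """
    if len(correct_a) != len(correct_b) or len(correct_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(correct_a)
    # Accuracy of each observer
    p_a = sum(bool(a) for a in correct_a) / n
    p_b = sum(bool(b) for b in correct_b) / n
    # Observed consistency: both correct or both wrong on the same trial
    c_obs = sum(bool(a) == bool(b) for a, b in zip(correct_a, correct_b)) / n
    # Consistency expected by chance from the two accuracies alone
    c_exp = p_a * p_b + (1 - p_a) * (1 - p_b)
    if c_exp == 1.0:
        return float("nan")  # kappa is undefined in this case
    return (c_obs - c_exp) / (1 - c_exp)


model = [True, True, False, False]
human = [True, False, True, False]
print(error_consistency(model, human))
```

A value of 0 means the two observers agree on which trials are errors no more than their accuracies alone would predict; 1 means they err on exactly the same trials.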