Sample stimuli

[Thumbnails of 10 sample stimuli: sample 0 through sample 9]

How to use

from brainscore_vision import load_benchmark

# Load the benchmark by its identifier, then score a model on it.
# `my_model` must implement the Brain-Score model interface.
benchmark = load_benchmark("Rajalingham2018-i2n")
score = benchmark(my_model)
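
A minimal end-to-end sketch follows. Hedged: `load_model` is part of brainscore_vision, but "alexnet" is only an example identifier and must exist in your installed model registry.

from brainscore_vision import load_benchmark, load_model

model = load_model("alexnet")  # example identifier; any registered model works
benchmark = load_benchmark("Rajalingham2018-i2n")
score = benchmark(model)
print(score)  # ceiling-normalized score with an attached error estimate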

Model scores

Rank  Score
  1  .664
  2  .652
  3  .646
  4  .625
  5  .623
  6  .617
  7  .608
  8  .607
  9  .607
 10  .606
 11  .600
 12  .600
 13  .598
 14  .596
 15  .594
 16  .593
 17  .590
 18  .589
 19  .588
 20  .585
 21  .585
 22  .584
 23  .584
 24  .584
 25  .582
 26  .581
 27  .579
 28  .578
 29  .578
 30  .578
 31  .577
 32  .576
 33  .575
 34  .574
 35  .573
 36  .573
 37  .573
 38  .573
 39  .570
 40  .570
 41  .568
 42  .567
 43  .566
 44  .564
 45  .564
 46  .564
 47  .563
 48  .563
 49  .562
 50  .561
 51  .561
 52  .561
 53  .560
 54  .560
 55  .560
 56  .560
 57  .558
 58  .558
 59  .558
 60  .555
 61  .555
 62  .555
 63  .555
 64  .554
 65  .554
 66  .554
 67  .552
 68  .551
 69  .549
 70  .549
 71  .549
 72  .549
 73  .549
 74  .549
 75  .546
 76  .546
 77  .546
 78  .545
 79  .545
 80  .545
 81  .545
 82  .543
 83  .543
 84  .542
 85  .541
 86  .541
 87  .541
 88  .540
 89  .540
 90  .539
 91  .538
 92  .537
 93  .537
 94  .537
 95  .537
 96  .537
 97  .536
 98  .536
 99  .536
100  .535
101  .535
102  .534
103  .534
104  .534
105  .534
106  .534
107  .533
108  .533
109  .532
110  .532
111  .531
112  .531
113  .530
114  .528
115  .528
116  .528
117  .528
118  .528
119  .528
120  .527
121  .527
122  .527
123  .526
124  .526
125  .526
126  .524
127  .524
128  .524
129  .523
130  .523
131  .523
132  .523
133  .522
134  .522
135  .521
136  .521
137  .521
138  .521
139  .520
140  .520
141  .520
142  .520
143  .519
144  .518
145  .518
146  .517
147  .517
148  .516
149  .515
150  .515
151  .515
152  .515
153  .515
154  .515
155  .514
156  .513
157  .513
158  .513
159  .512
160  .512
161  .512
162  .512
163  .511
164  .511
165  .511
166  .511
167  .511
168  .510
169  .510
170  .509
171  .509
172  .508
173  .508
174  .507
175  .507
176  .506
177  .505
178  .505
179  .504
180  .503
181  .503
182  .503
183  .503
184  .503
185  .502
186  .502
187  .502
188  .500
189  .500
190  .500
191  .500
192  .499
193  .499
194  .499
195  .499
196  .499
197  .498
198  .498
199  .497
200  .496
201  .496
202  .495
203  .494
204  .494
205  .493
206  .493
207  .492
208  .491
209  .490
210  .488
211  .488
212  .488
213  .488
214  .487
215  .485
216  .484
217  .481
218  .481
219  .480
220  .480
221  .479
222  .479
223  .478
224  .478
225  .478
226  .477
227  .477
228  .477
229  .476
230  .475
231  .475
232  .475
233  .474
234  .474
235  .474
236  .474
237  .473
238  .472
239  .472
240  .471
241  .470
242  .470
243  .469
244  .466
245  .465
246  .464
247  .462
248  .461
249  .461
250  .458
251  .458
252  .456
253  .456
254  .454
255  .454
256  .452
257  .451
258  .451
259  .449
260  .449
261  .448
262  .448
263  .448
264  .447
265  .447
266  .447
267  .446
268  .446
269  .445
270  .445
271  .445
272  .444
273  .443
274  .443
275  .441
276  .440
277  .438
278  .438
279  .437
280  .437
281  .437
282  .435
283  .434
284  .433
285  .433
286  .430
287  .428
288  .428
289  .427
290  .425
291  .425
292  .424
293  .424
294  .419
295  .415
296  .413
297  .413
298  .410
299  .410
300  .410
301  .408
302  .407
303  .406
304  .405
305  .403
306  .401
307  .396
308  .395
309  .392
310  .386
311  .383
312  .381
313  .376
314  .375
315  .373
316  .372
317  .371
318  .370
319  .370
320  .370
321  .370
322  .367
323  .366
324  .365
325  .363
326  .362
327  .360
328  .360
329  .358
330  .356
331  .354
332  .351
333  .348
334  .348
335  .346
336  .344
337  .341
338  .341
339  .335
340  .334
341  .333
342  .333
343  .333
344  .332
345  .330
346  .330
347  .324
348  .324
349  .322
350  .320
351  .315
352  .311
353  .310
354  .307
355  .307
356  .306
357  .305
358  .292
359  .292
360  .291
361  .286
362  .286
363  .285
364  .284
365  .283
366  .279
367  .276
368  .276
369  .270
370  .270
371  .267
372  .265
373  .263
374  .261
375  .256
376  .256
377  .256
378  .256
379  .256
380  .256
381  .256
382  .256
383  .256
384  .255
385  .254
386  .251
387  .250
388  .245
389  .244
390  .243
391  .243
392  .242
393  .234
394  .231
395  .226
396  .225
397  .220
398  .219
399  .216
400  .211
401  .211
402  .209
403  .209
404  .208
405  .200
406  .187
407  .186
408  .185
409  .177
410  .167
411  .165
412  .161
413  .160
414  .157
415  .157
416  .156
417  .150
418  .148
419  .144
420  .137
421  .131
422  .129
423  .127
424  .119
425  .116
426  .114
427  .113
428  .112
429  .108
430  .108
431  .108
432  .107
433  .104
434  .104
435  .103
436  .103
437  .102
438  .101
439  .098
440  .096
441  .095
442  .092
443  .090
444  .084
445  .084
446  .083
447  .083
448  .083
449  .082
450  .078
451  .076
452  .075
453  .071
454  .067
455  .065
456  .065
457  .061
458  .060
459  .060
460  .057
461  .054
462  .054
463  .049
464  .047
465  .046
466  .045
467  .041
468  .040
469  .032
470  .030
471  .027
472  .023
473  .020
474  .020
475  .014
476  .014
477  .012
478  .011
479  .011
480  .010
481  .009
482  .009
483  .009
484  .004
485  .000
486  .000
487  .000
488  .000
489  .000
490  .000
491  .000
492  .000
493  .000
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509

Benchmark bibtex

@article{Rajalingham240614,
    author = {Rajalingham, Rishi and Issa, Elias B. and Bashivan, Pouya and Kar, Kohitij and Schmidt, Kailyn and DiCarlo, James J.},
    title = {Large-scale, high-resolution comparison of the core visual object recognition behavior of humans, monkeys, and state-of-the-art deep artificial neural networks},
    elocation-id = {240614},
    year = {2018},
    doi = {10.1101/240614},
    publisher = {Cold Spring Harbor Laboratory},
    abstract = {Primates{\textemdash}including humans{\textemdash}can typically recognize objects in visual images at a glance even in the face of naturally occurring identity-preserving image transformations (e.g. changes in viewpoint). A primary neuroscience goal is to uncover neuron-level mechanistic models that quantitatively explain this behavior by predicting primate performance for each and every image. Here, we applied this stringent behavioral prediction test to the leading mechanistic models of primate vision (specifically, deep, convolutional, artificial neural networks; ANNs) by directly comparing their behavioral signatures against those of humans and rhesus macaque monkeys. Using high-throughput data collection systems for human and monkey psychophysics, we collected over one million behavioral trials for 2400 images over 276 binary object discrimination tasks. Consistent with previous work, we observed that state-of-the-art deep, feed-forward convolutional ANNs trained for visual categorization (termed DCNNIC models) accurately predicted primate patterns of object-level confusion. However, when we examined behavioral performance for individual images within each object discrimination task, we found that all tested DCNNIC models were significantly non-predictive of primate performance, and that this prediction failure was not accounted for by simple image attributes, nor rescued by simple model modifications. These results show that current DCNNIC models cannot account for the image-level behavioral patterns of primates, and that new ANN models are needed to more precisely capture the neural mechanisms underlying primate object vision. To this end, large-scale, high-resolution primate behavioral benchmarks{\textemdash}such as those obtained here{\textemdash}could serve as direct guides for discovering such models. SIGNIFICANCE STATEMENT Recently, specific feed-forward deep convolutional artificial neural networks (ANNs) models have dramatically advanced our quantitative understanding of the neural mechanisms underlying primate core object recognition. In this work, we tested the limits of those ANNs by systematically comparing the behavioral responses of these models with the behavioral responses of humans and monkeys, at the resolution of individual images. Using these high-resolution metrics, we found that all tested ANN models significantly diverged from primate behavior. Going forward, these high-resolution, large-scale primate behavioral benchmarks could serve as direct guides for discovering better ANN models of the primate visual system.},
    URL = {https://www.biorxiv.org/content/early/2018/02/12/240614},
    eprint = {https://www.biorxiv.org/content/early/2018/02/12/240614.full.pdf},
    journal = {bioRxiv}
}

Ceiling

0.48.

Note that scores are relative to this ceiling.
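
As a worked example (assuming the usual Brain-Score convention of dividing a model's raw metric value by the ceiling; the numbers here are hypothetical):

raw_i2n = 0.32                 # hypothetical raw model-primate consistency
ceiling = 0.48                 # this benchmark's ceiling
reported = raw_i2n / ceiling   # ≈ .667, on the same scale as the scores above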

Data: Rajalingham2018

240-stimulus match-to-sample task
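
On each trial, a sample image is shown and the subject then chooses between the target object and a distractor object. One simple way to read such a binary choice off a classification model is sketched below; this is an illustrative assumption, not necessarily the exact transformation Brain-Score applies:

import numpy as np

def two_afc_choice_prob(logits, target_idx, distractor_idx):
    """P(choose target): softmax restricted to the two candidate objects."""
    pair = np.array([logits[target_idx], logits[distractor_idx]], dtype=float)
    exps = np.exp(pair - pair.max())  # subtract max for numerical stability
    return exps[0] / exps.sum()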

Metric: i2n
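
i2n is the image-level, normalized behavioral consistency from the paper above: per-image sensitivities (d') are compared between model and primates via a reliability-corrected correlation. Below is a simplified sketch under stated assumptions (per-image hit and false-alarm rates, split-half reliabilities supplied by the caller); the actual metric additionally normalizes each image's d' for object-level difficulty, so this is not the exact Brain-Score implementation:

import numpy as np
from scipy.stats import norm

def dprime(hit_rate, false_alarm_rate, clip=(0.01, 0.99)):
    """Per-image sensitivity: d' = z(hit rate) - z(false-alarm rate)."""
    hr = np.clip(hit_rate, *clip)          # clip to keep z-scores finite
    fa = np.clip(false_alarm_rate, *clip)
    return norm.ppf(hr) - norm.ppf(fa)

def i2n_consistency(model_dprimes, primate_dprimes,
                    model_reliability, primate_reliability):
    """Reliability-corrected Pearson correlation of per-image d' vectors."""
    r = np.corrcoef(model_dprimes, primate_dprimes)[0, 1]
    return r / np.sqrt(model_reliability * primate_reliability)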