Sample stimuli

[Images: ten sample stimuli, sample 0 through sample 9]

How to use

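from brainscore_vision import load_benchmark

# Load the benchmark by its identifier.
benchmark = load_benchmark("Rajalingham2018-i2n")
# my_model is a placeholder for a Brain-Score-compatible model that you supply.
score = benchmark(my_model)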

Model scores

Rank     Score
1        .664
2        .652
3        .646
4        .625
5        .624
6        .623
7        .617
8        .608
9        .607
10       .607
11       .606
12       .600
13       .600
14       .596
15       .594
16       .593
17       .590
18       .588
19       .586
20       .585
21       .585
22       .584
23       .584
24       .584
25       .582
26       .581
27       .579
28       .579
29       .578
30       .578
31       .578
32       .578
33       .577
34       .576
35       .575
36       .574
37       .573
38       .573
39       .573
40       .573
41       .572
42       .570
43       .570
44       .568
45       .566
46       .564
47       .564
48       .564
49       .563
50       .563
51       .562
52       .561
53       .561
54       .561
55       .560
56       .560
57       .560
58       .560
59       .558
60       .558
61       .555
62       .555
63       .555
64       .555
65       .555
66       .554
67       .554
68       .554
69       .552
70       .551
71       .549
72       .549
73       .549
74       .549
75       .549
76       .549
77       .546
78       .546
79       .546
80       .545
81       .545
82       .545
83       .545
84       .543
85       .543
86       .542
87       .541
88       .541
89       .541
90       .540
91       .540
92       .539
93       .538
94       .537
95       .537
96       .537
97       .537
98       .536
99       .536
100      .536
101      .535
102      .535
103      .534
104      .534
105      .534
106      .534
107      .534
108      .533
109      .533
110      .532
111      .532
112      .531
113      .530
114      .528
115      .528
116      .528
117      .528
118      .528
119      .528
120      .527
121      .527
122      .527
123      .526
124      .526
125      .526
126      .524
127      .524
128      .524
129      .523
130      .523
131      .523
132      .523
133      .522
134      .522
135      .521
136      .521
137      .521
138      .521
139      .521
140      .520
141      .520
142      .520
143      .520
144      .519
145      .518
146      .518
147      .517
148      .517
149      .516
150      .515
151      .515
152      .515
153      .515
154      .515
155      .514
156      .513
157      .513
158      .513
159      .512
160      .512
161      .512
162      .512
163      .511
164      .511
165      .511
166      .511
167      .511
168      .510
169      .509
170      .509
171      .508
172      .508
173      .507
174      .507
175      .506
176      .505
177      .505
178      .504
179      .503
180      .503
181      .503
182      .503
183      .503
184      .502
185      .502
186      .502
187      .500
188      .500
189      .500
190      .500
191      .499
192      .499
193      .499
194      .499
195      .499
196      .498
197      .498
198      .497
199      .496
200      .496
201      .495
202      .494
203      .494
204      .493
205      .493
206      .492
207      .491
208      .490
209      .488
210      .488
211      .488
212      .488
213      .487
214      .487
215      .485
216      .484
217      .481
218      .481
219      .480
220      .480
221      .479
222      .479
223      .478
224      .478
225      .478
226      .477
227      .477
228      .477
229      .476
230      .475
231      .475
232      .475
233      .474
234      .474
235      .474
236      .474
237      .473
238      .472
239      .472
240      .471
241      .470
242      .470
243      .469
244      .466
245      .465
246      .464
247      .462
248      .461
249      .461
250      .458
251      .458
252      .456
253      .456
254      .454
255      .454
256      .452
257      .451
258      .451
259      .450
260      .449
261      .449
262      .448
263      .448
264      .448
265      .448
266      .447
267      .447
268      .447
269      .446
270      .446
271      .445
272      .445
273      .445
274      .444
275      .443
276      .443
277      .441
278      .440
279      .438
280      .438
281      .437
282      .437
283      .437
284      .435
285      .435
286      .434
287      .433
288      .433
289      .430
290      .428
291      .428
292      .427
293      .426
294      .425
295      .425
296      .424
297      .424
298      .419
299      .415
300      .413
301      .413
302      .410
303      .410
304      .410
305      .408
306      .407
307      .406
308      .405
309      .403
310      .401
311      .396
312      .395
313      .392
314      .386
315      .383
316      .381
317      .376
318      .375
319      .373
320      .372
321      .371
322      .370
323      .370
324      .370
325      .370
326      .367
327      .366
328      .365
329      .363
330      .362
331      .360
332      .360
333      .358
334      .356
335      .354
336      .351
337      .348
338      .348
339      .346
340      .344
341      .341
342      .341
343      .335
344      .334
345      .333
346      .333
347      .333
348      .332
349      .330
350      .324
351      .324
352      .322
353      .320
354      .315
355      .311
356      .310
357      .307
358      .307
359      .306
360      .305
361      .292
362      .292
363      .291
364      .286
365      .286
366      .285
367      .284
368      .283
369      .279
370      .276
371      .276
372      .270
373      .270
374      .267
375      .265
376      .263
377      .261
378      .256
379      .256
380      .256
381      .256
382      .256
383      .256
384      .256
385      .256
386      .256
387      .255
388      .254
389      .251
390      .250
391      .245
392      .244
393      .243
394      .243
395      .242
396      .234
397      .231
398      .226
399      .225
400      .220
401      .219
402      .216
403      .211
404      .211
405      .209
406      .209
407      .208
408      .200
409      .187
410      .186
411      .177
412      .167
413      .165
414      .161
415      .160
416      .157
417      .157
418      .156
419      .150
420      .148
421      .144
422      .137
423      .131
424      .129
425      .127
426      .119
427      .116
428      .114
429      .113
430      .112
431      .108
432      .108
433      .108
434      .107
435      .104
436      .104
437      .103
438      .103
439      .102
440      .101
441      .098
442      .096
443      .095
444      .092
445      .090
446      .084
447      .084
448      .083
449      .083
450      .082
451      .078
452      .076
453      .075
454      .071
455      .067
456      .065
457      .065
458      .061
459      .060
460      .060
461      .057
462      .054
463      .054
464      .049
465      .047
466      .046
467      .045
468      .041
469      .040
470      .032
471      .030
472      .027
473      .023
474      .020
475      .020
476      .014
477      .014
478      .012
479      .011
480      .011
481      .010
482      .009
483      .009
484      .009
485      .004
486      .000
487      .000
488      .000
489      .000
490      .000
491      .000
492      .000
493      .000
494      .000
495-511  (no score listed)

Benchmark bibtex

@article {Rajalingham240614,
                author = {Rajalingham, Rishi and Issa, Elias B. and Bashivan, Pouya and Kar, Kohitij and Schmidt, Kailyn and DiCarlo, James J.},
                title = {Large-scale, high-resolution comparison of the core visual object recognition behavior of humans, monkeys, and state-of-the-art deep artificial neural networks},
                elocation-id = {240614},
                year = {2018},
                doi = {10.1101/240614},
                publisher = {Cold Spring Harbor Laboratory},
                abstract = {Primates{\textemdash}including humans{\textemdash}can typically recognize objects in visual images at a glance even in the face of naturally occurring identity-preserving image transformations (e.g. changes in viewpoint). A primary neuroscience goal is to uncover neuron-level mechanistic models that quantitatively explain this behavior by predicting primate performance for each and every image. Here, we applied this stringent behavioral prediction test to the leading mechanistic models of primate vision (specifically, deep, convolutional, artificial neural networks; ANNs) by directly comparing their behavioral signatures against those of humans and rhesus macaque monkeys. Using high-throughput data collection systems for human and monkey psychophysics, we collected over one million behavioral trials for 2400 images over 276 binary object discrimination tasks. Consistent with previous work, we observed that state-of-the-art deep, feed-forward convolutional ANNs trained for visual categorization (termed DCNNIC models) accurately predicted primate patterns of object-level confusion. However, when we examined behavioral performance for individual images within each object discrimination task, we found that all tested DCNNIC models were significantly non-predictive of primate performance, and that this prediction failure was not accounted for by simple image attributes, nor rescued by simple model modifications. These results show that current DCNNIC models cannot account for the image-level behavioral patterns of primates, and that new ANN models are needed to more precisely capture the neural mechanisms underlying primate object vision. To this end, large-scale, high-resolution primate behavioral benchmarks{\textemdash}such as those obtained here{\textemdash}could serve as direct guides for discovering such models. SIGNIFICANCE STATEMENT Recently, specific feed-forward deep convolutional artificial neural networks (ANNs) models have dramatically advanced our quantitative understanding of the neural mechanisms underlying primate core object recognition. In this work, we tested the limits of those ANNs by systematically comparing the behavioral responses of these models with the behavioral responses of humans and monkeys, at the resolution of individual images. Using these high-resolution metrics, we found that all tested ANN models significantly diverged from primate behavior. Going forward, these high-resolution, large-scale primate behavioral benchmarks could serve as direct guides for discovering better ANN models of the primate visual system.},
                URL = {https://www.biorxiv.org/content/early/2018/02/12/240614},
                eprint = {https://www.biorxiv.org/content/early/2018/02/12/240614.full.pdf},
                journal = {bioRxiv}
            }

Ceiling

0.48

Note that the model scores listed on this page are reported relative to this ceiling.
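
As a rough illustration of what a score "relative to this ceiling" can mean, the sketch below assumes the listed score is the raw metric value divided by the ceiling. That ratio is an assumption made here for illustration, not a statement of how the benchmark computes it, and the function names are hypothetical.

```python
CEILING = 0.48  # ceiling value stated on this page


def ceiling_normalize(raw_score: float, ceiling: float = CEILING) -> float:
    """Express a raw score as a fraction of the ceiling (assumed: plain ratio)."""
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    return raw_score / ceiling


def to_raw(normalized_score: float, ceiling: float = CEILING) -> float:
    """Invert the assumed ratio to recover the raw score."""
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    return normalized_score * ceiling


if __name__ == "__main__":
    # The top listed score on this page is .664.
    print(round(to_raw(0.664), 3))
```

Under this assumption, a listed score of .664 would correspond to a raw value of about 0.319, and a listed score of 1.0 would mean the model matches the ceiling.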

Data: Rajalingham2018

240 stimuli, match-to-sample task

Metric: i2n
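
For orientation, the cited paper builds its image-level signatures from a sensitivity index (d') per image and distractor object, then normalizes by subtracting the mean over images of the same object. The sketch below shows only those two steps under simplifying assumptions: hit and false-alarm rates are taken as given, the clipping value is hypothetical, and none of the names come from the Brain-Score code base. It is not the benchmark's implementation.

```python
from statistics import NormalDist, mean


def d_prime(hit_rate: float, false_alarm_rate: float, clip: float = 0.01) -> float:
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Rates are clipped away from 0 and 1 so the inverse normal CDF stays finite;
    the clip value is an illustrative choice.
    """
    z = NormalDist().inv_cdf
    hr = min(max(hit_rate, clip), 1 - clip)
    far = min(max(false_alarm_rate, clip), 1 - clip)
    return z(hr) - z(far)


def normalize_within_object(dprimes: list[float]) -> list[float]:
    """Subtract the mean d' across images that share an object and distractor."""
    m = mean(dprimes)
    return [d - m for d in dprimes]


if __name__ == "__main__":
    # Three hypothetical images of one object against one distractor object.
    rates = [(0.95, 0.10), (0.80, 0.10), (0.60, 0.10)]
    raw = [d_prime(hr, far) for hr, far in rates]
    print(normalize_within_object(raw))
```

The normalized values sum to zero within each object, so what remains is how much easier or harder each image is than its object's average. Comparing such signatures between a model and primates is a separate step that this sketch does not cover.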