Scores on benchmarks
Model rank shown below is with respect to all public models.

| Score | Benchmark | Version | Reference | Rank | Sub-benchmarks |
|---|---|---|---|---|---|
| .240 | average_language | | | 22 | 5 benchmarks |
| .193 | neural_language | | | 24 | 4 benchmarks |
| .580 | Pereira2018-linear | | | 22 | 2 benchmarks |
| .593 | Pereira2018.243sentences-linear | v1 | | 23 | |
| .568 | Pereira2018.384sentences-linear | v1 | | 22 | |
| .286 | behavior_language | | | 11 | 1 benchmark |
| .286 | Futrell2018-pearsonr | v1 | [reference] | 11 | |
| .408 | engineering_language | | | 15 | 30 benchmarks |
| .408 | SyntaxGym | | [reference] | 15 | 30 benchmarks |
| .786 | syntaxgym-center_embed | v1 | [reference] | 15 | 1 benchmark |
| .607 | syntaxgym-center_embed_mod | v1 | [reference] | 15 | |
| .711 | syntaxgym-npi_orc_any | v1 | [reference] | 13 | |
| .158 | syntaxgym-npi_orc_ever | v1 | [reference] | 14 | |
| .921 | syntaxgym-npi_src_any | v1 | [reference] | 11 | |
| .500 | syntaxgym-npi_src_ever | v1 | [reference] | 13 | |
| .526 | syntaxgym-number_orc | v1 | [reference] | 12 | |
| .737 | syntaxgym-number_prep | v1 | [reference] | 11 | |
| .684 | syntaxgym-number_src | v1 | [reference] | 13 | |
| .368 | syntaxgym-reflexive_orc_fem | v1 | [reference] | 7 | |
| .632 | syntaxgym-reflexive_orc_masc | v1 | [reference] | 11 | |
| .632 | syntaxgym-reflexive_prep_fem | v1 | [reference] | 2 | |
| .632 | syntaxgym-reflexive_prep_masc | v1 | [reference] | 10 | |
| .211 | syntaxgym-reflexive_src_fem | v1 | [reference] | 10 | |
| .474 | syntaxgym-reflexive_src_masc | v1 | [reference] | 12 | |
| .913 | syntaxgym-subordination | v1 | [reference] | 9 | 3 benchmarks |
| .870 | syntaxgym-subordination_orc-orc | v1 | [reference] | 13 | |
| .957 | syntaxgym-subordination_pp-pp | v1 | [reference] | 10 | |
| .913 | syntaxgym-subordination_src-src | v1 | [reference] | 12 | |
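The grouped rows above can be sanity-checked with a little arithmetic. The sketch below assumes that a row marked "N benchmarks" is a parent of the N similarly named rows that follow it, and it tests whether the parent score matches the unweighted mean of those children. Both the parent/child grouping and the use of a plain mean are assumptions made for illustration; the page itself does not state how aggregate scores are computed.

```python
from statistics import mean

# Scores copied from the table above.
# Assumption: each parent row aggregates the child rows listed with it here.
groups = {
    "Pereira2018-linear": (0.580, [0.593, 0.568]),
    "syntaxgym-subordination": (0.913, [0.870, 0.957, 0.913]),
}


def max_gap(groups):
    """Largest absolute gap between a parent score and the mean of its children."""
    return max(abs(parent - mean(children)) for parent, children in groups.values())


for name, (parent, children) in groups.items():
    print(f"{name}: reported {parent:.3f}, mean of children {mean(children):.4f}")

# Reported scores have three decimals, so a gap below 0.001 is within rounding.
print("consistent with a plain mean:", max_gap(groups) < 0.001)
```

For these two groups the reported parent score agrees with the mean of the listed children to within rounding. That is an observation about the numbers shown here, not a description of the Brain-Score aggregation procedure.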
How to use
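from brainscore_language import load_model

# Load the registered model by its Brain-Score identifier.
model = load_model("lm1b")
# Configure a behavioral task and/or a neural recording, then present text.
model.start_behavioral_task(...)
model.start_neural_recording(...)
model.digest_text(...)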
Brain Encoding Response Generator (BERG)
With BERG, you can generate neural responses to text sentences of your choice using any Brain-Score language model.
For more information on how to use BERG, see the documentation and tutorial.
Benchmarks bibtex
@inproceedings{futrell2018natural,
title={The Natural Stories Corpus},
author={Futrell, Richard and Gibson, Edward and Tily, Harry J. and Blank, Idan and Vishnevetsky, Anastasia and Piantadosi, Steven T. and Fedorenko, Evelina},
booktitle={International Conference on Language Resources and Evaluation (LREC)},
url={http://www.lrec-conf.org/proceedings/lrec2018/pdf/337.pdf},
year={2018}
}
@inproceedings{gauthier-etal-2020-syntaxgym,
title = "{S}yntax{G}ym: An Online Platform for Targeted Evaluation of Language Models",
author = "Gauthier, Jon and Hu, Jennifer and Wilcox, Ethan and Qian, Peng and Levy, Roger",
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations",
month = jul,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.acl-demos.10",
pages = "70--76",
abstract = "Targeted syntactic evaluations have yielded insights into the generalizations learned by neural network language models. However, this line of research requires an uncommon confluence of skills: both the theoretical knowledge needed to design controlled psycholinguistic experiments, and the technical proficiency needed to train and deploy large-scale language models. We present SyntaxGym, an online platform designed to make targeted evaluations accessible to both experts in NLP and linguistics, reproducible across computing environments, and standardized following the norms of psycholinguistic experimental design. This paper releases two tools of independent value for the computational linguistics community: 1. A website, syntaxgym.org, which centralizes the process of targeted syntactic evaluation and provides easy tools for analysis and visualization; 2. Two command-line tools, {`}syntaxgym{`} and {`}lm-zoo{`}, which allow any user to reproduce targeted syntactic evaluations and general language model inference on their own machine.",
}