work on figures so that they fit better into the size requirements used by Linguistic Vanguard
RTF output path for the journal
fixed date
Files without an explicit license or source notice (e.g. data and config files, short helper scripts) are licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/. Files with an explicit source notice are copyright their respective notices and are subject to the licenses the copyright holder places upon them.
ggplot2
reshape2
effects
xtable
lme4
plyr
If you only wish to generate the figures and documentation from Linguistic Vanguard, then you only need Python, R and the listed R packages. Additional Python packages, OpenSesame and RStudio are only necessary if you wish to run the experiment or otherwise do further exploration.
Please note that in our previous work, we used "old" lme4. The optimizers and convergence checks in the newer versions are generally much better, yet more sensitive to issues of scaling. In our older work, the variables were generally only centered; here, they are both scaled and centered (i.e., they are $z$-transformed), but the model fits are nearly identical. (Individual estimates may differ in some of the less significant digits.)
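As a rough illustration of the difference between centering only and the $z$-transform mentioned above, here is a minimal Python sketch. The function names and the sample values are hypothetical and are not taken from regression.R; the sketch only assumes the usual definition (subtract the mean, divide by the sample standard deviation), which is also what R's scale() does by default.

```python
from statistics import mean, stdev


def center(values):
    """Subtract the mean so that the result has mean 0."""
    m = mean(values)
    return [v - m for v in values]


def z_transform(values):
    """Center and scale: mean 0 and sample standard deviation 1."""
    m = mean(values)
    s = stdev(values)  # sample SD (n - 1 denominator)
    return [(v - m) / s for v in values]


# Hypothetical predictor values, for illustration only.
raw = [200.0, 400.0, 600.0, 800.0]
centered = center(raw)
scaled = z_transform(raw)
```

Centering leaves the predictor on its original scale, whereas the $z$-transform puts all predictors on a comparable scale, which is the property the newer optimizers are sensitive to.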
python load_data.py
python pickle2csv.py result.pickle
R --vanilla < regression.R
individual_report.Rmd
with a recent version of RStudio
python pickle2csv.py sample01a.pickle
R --vanilla < regression.R
The file load_data.py generates stimuli by extracting the nouns and verbs from the stimuli provided by Alexander Dröge (standarddeutsch_items.csv, citation to come).
Because the source stimuli consist of items with a clear semantic directionality (both in terms of individual semantic features and world knowledge), the subjects, objects and verbs are mixed in a random way to produce a new set of items.
Moreover, the original stimuli have been expanded so that each "subject" and "object" is present in accusative singular, nominative singular and plural (which is always case-ambiguous in German), and these extra variants are used in the generation of new stimuli.
For each new item, two NPs are chosen at random, one each from the "subject" and the "object" pools. Morphological case for each NP is also chosen at random, thus allowing for items where morphological case is not a reliable cue, either due to ambiguity or ungrammaticality. The verb is chosen similarly, with a weak constraint that the verb always agrees in number with the NP taken from the subject pool, even if that NP is now accusative. It is thus possible that a sentence has an accusative object that agrees in number with the verb and a nominative object that doesn't.
The list of stimuli is permuted for order and then serialized to disk via pickling. The pickle serves as input for the experiment.
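The generation procedure described above can be sketched roughly as follows. This is not the code in load_data.py: the pools, the dictionary keys and the function names are all hypothetical, and the example nouns and verbs are invented placeholders rather than items from standarddeutsch_items.csv. The sketch only illustrates the random choice of NPs and case forms, the weak agreement constraint on the verb, the permutation of the list, and a pickle round trip.

```python
import pickle
import random

# Hypothetical pools; the real nouns and verbs come from the source stimuli.
# Each NP is available as nominative singular, accusative singular and plural.
SUBJECT_POOL = [
    {"nom_sg": "der Lehrer", "acc_sg": "den Lehrer", "pl": "die Lehrer"},
    {"nom_sg": "der Arzt", "acc_sg": "den Arzt", "pl": "die Ärzte"},
]
OBJECT_POOL = [
    {"nom_sg": "der Hund", "acc_sg": "den Hund", "pl": "die Hunde"},
    {"nom_sg": "der Ball", "acc_sg": "den Ball", "pl": "die Bälle"},
]
VERBS = [
    {"sg": "sieht", "pl": "sehen"},
    {"sg": "sucht", "pl": "suchen"},
]
FORMS = ("nom_sg", "acc_sg", "pl")


def make_item(rng):
    """Build one item from randomly chosen NPs, case forms and verb."""
    subj = rng.choice(SUBJECT_POOL)
    obj = rng.choice(OBJECT_POOL)
    subj_form = rng.choice(FORMS)  # case is random, so it may be unreliable
    obj_form = rng.choice(FORMS)
    verb = rng.choice(VERBS)
    # Weak constraint: the verb agrees in number with the NP drawn from
    # the subject pool, even if that NP surfaces as accusative.
    number = "pl" if subj_form == "pl" else "sg"
    return {
        "np1": subj[subj_form],
        "np1_form": subj_form,
        "np2": obj[obj_form],
        "np2_form": obj_form,
        "verb": verb[number],
        "verb_number": number,
    }


def make_stimuli(n_items, seed=None):
    """Generate n_items items and permute their order."""
    rng = random.Random(seed)
    items = [make_item(rng) for _ in range(n_items)]
    rng.shuffle(items)
    return items


stimuli = make_stimuli(8, seed=42)
# In-memory round trip; writing to disk would use pickle.dump(stimuli, f)
# with a file opened in binary mode.
restored = pickle.loads(pickle.dumps(stimuli))
```

Passing a seed makes the generated list reproducible, which can be convenient when the same stimulus list needs to be regenerated; whether load_data.py fixes a seed is not stated here.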
opensesame
convert detailed output pickle to minimal csv file
models and plots