Incremental Reading Language Modelling

This repository contains the data and code needed to compute word surprisal values for the L1 and L2 stimuli used in the experiment. Ultimately, these surprisal values will be used to predict the reading times of L1 and L2 speakers of English.
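For reference, the surprisal of a word is conventionally its negative log probability given the preceding context. The snippet below is a minimal sketch of that definition, not the code in this repository: the function names are hypothetical, the probabilities are made up for illustration, and base-2 logarithms (bits) are an assumption, since the scripts here may report a different base.

```python
import math


def surprisal_bits(prob: float) -> float:
    """Return the surprisal, in bits, of a word with conditional probability `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def sentence_surprisals(cond_probs):
    """Map per-word conditional probabilities P(w_t | w_1..w_{t-1}) to surprisals."""
    return [surprisal_bits(p) for p in cond_probs]


# Hypothetical conditional probabilities for a three-word sentence.
example = sentence_surprisals([0.5, 0.25, 0.125])
print(example)  # [1.0, 2.0, 3.0]
```

Less probable words receive higher surprisal, which is the quantity later related to reading times.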

Data

The language models are trained on preprocessed versions of the WikiText-2 dataset introduced by Merity et al. The dataset can be downloaded from https://huggingface.co/datasets/wikitext.
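One way to fetch the dataset programmatically is through the Hugging Face `datasets` library. This is a sketch under stated assumptions: the helper names are hypothetical, the choice of the `wikitext-2-raw-v1` configuration is a guess (a tokenised `wikitext-2-v1` configuration also exists), and the blank-line filter is only an illustration, not the preprocessing used in this repository.

```python
def load_wikitext2(config: str = "wikitext-2-raw-v1"):
    """Download WikiText-2 from the Hugging Face Hub (requires network access)."""
    # Imported lazily so the module can be loaded without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("wikitext", config)


def nonempty_lines(lines):
    """Strip surrounding whitespace and drop empty lines (illustrative only)."""
    return [line.strip() for line in lines if line.strip()]
```

With the library installed, `nonempty_lines(load_wikitext2()["train"]["text"])` would give the training split as a list of non-empty text lines.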

Language Models

N-gram

KenLM
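As an illustration of how per-word surprisal might be read off a trained KenLM model, the sketch below uses the `kenlm` Python module, whose `full_scores` method yields a base-10 log probability for each token. The function names and the model path are hypothetical, the conversion to bits is an assumption about the desired unit, and the scripts in this repository may work differently.

```python
import math

LOG2_OF_10 = math.log2(10.0)


def log10_to_surprisal_bits(log10_prob: float) -> float:
    """Convert a base-10 log probability (as reported by KenLM) to surprisal in bits."""
    return -log10_prob * LOG2_OF_10


def kenlm_word_surprisals(model_path: str, sentence: str):
    """Return (word, surprisal in bits) pairs for a whitespace-tokenised sentence."""
    # Imported lazily so the conversion helper works without kenlm installed.
    import kenlm

    model = kenlm.Model(model_path)
    words = sentence.split()
    # full_scores yields (log10 prob, n-gram length, is_oov) per token, plus one
    # final entry for the end-of-sentence marker, which zip() drops here.
    scores = model.full_scores(sentence, bos=True, eos=True)
    return [(word, log10_to_surprisal_bits(lp)) for word, (lp, _n, _oov) in zip(words, scores)]
```

A call such as `kenlm_word_surprisals("model.arpa", "the dog barked")` assumes a model file at that hypothetical path and a sentence tokenised the same way as the training data.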

PCFG

Roark incremental parser

RNNG

Recurrent neural network grammars
