Dictionary Parsing and Simulation for the Project of Language Optimality and Evolution

This project aims to provide evidence against the claim of Piantadosi et al. (2012) by showing that the pattern of homophony (i.e., the reuse of a single phonological form for different words) can be replicated by a random process. The random simulation takes into account only the co-occurrence information of adjacent syllables in words.
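As a rough illustration of what a syllable-level bigram simulation can look like, the sketch below counts adjacent-syllable co-occurrences (with word-boundary markers) and then samples new forms one syllable at a time. It is not the code in this repository: the names `train_bigrams` and `generate_word`, the boundary markers, the `max_syllables` cap, and the toy lexicon are all assumptions made for the example.

```python
import random
from collections import defaultdict

# Hypothetical word-boundary markers (not taken from the repository).
START, END = "<s>", "</s>"

def train_bigrams(words):
    """Count how often each syllable follows another, including word boundaries."""
    counts = defaultdict(lambda: defaultdict(int))
    for syllables in words:
        seq = [START, *syllables, END]
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def generate_word(counts, rng, max_syllables=10):
    """Sample syllables one at a time until the end marker is drawn."""
    word, prev = [], START
    while len(word) < max_syllables:
        successors = counts[prev]
        nxt = rng.choices(list(successors), weights=list(successors.values()))[0]
        if nxt == END:
            break
        word.append(nxt)
        prev = nxt
    return tuple(word)

# Toy lexicon: each word is a tuple of syllables (illustrative data only).
toy_lexicon = [("K AE T",), ("K AE", "T AH L"), ("B AE", "T AH L"), ("B AE T",)]
counts = train_bigrams(toy_lexicon)
rng = random.Random(0)  # fixed seed so the run is repeatable
sample = [generate_word(counts, rng) for _ in range(20)]
```

Under this setup, homophony arises whenever the sampler happens to produce the same syllable sequence more than once, which is the kind of pattern the simulation compares against real English.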

This repository contains Python scripts that perform the following tasks. Detailed code comments will be added soon.

syllabification.py: Parses the pronunciations (i.e., phonological forms) of words into syllables, following the principle of maximizing the onset consonant cluster of each syllable.
simulate_homophony_by_syllables.py: Simulates the pattern of homophony in English via a randomized bigram model constructed from syllables.
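The onset-maximization principle can be sketched as follows, assuming ARPAbet phones in which vowels carry a trailing stress digit (as in the CMU dictionary). This is a minimal illustration, not the logic of `syllabification.py`: the function names and the small `LEGAL_ONSETS` set are hypothetical, and a real run would use the manually checked onset list instead.

```python
# Hypothetical, deliberately tiny set of legal onsets (the empty onset included).
LEGAL_ONSETS = {
    (), ("K",), ("S",), ("T",), ("R",), ("L",), ("N",),
    ("T", "R"), ("S", "T"), ("K", "R"), ("S", "T", "R"),
}

def is_vowel(phone):
    """In ARPAbet as used by CMUdict, vowels end in a stress digit (0, 1, or 2)."""
    return phone[-1].isdigit()

def syllabify(phones, legal_onsets=LEGAL_ONSETS):
    """Split a list of phones into syllables, giving each syllable the longest legal onset."""
    nuclei = [i for i, p in enumerate(phones) if is_vowel(p)]
    if not nuclei:
        return [list(phones)]  # no vowel: return the form unsplit
    syllables, start = [], 0
    for cur, nxt in zip(nuclei, nuclei[1:]):
        cluster = phones[cur + 1:nxt]  # consonants between two nuclei
        split = len(cluster)           # fallback: whole cluster stays in the coda
        for k in range(len(cluster) + 1):
            # Try the longest suffix first; the first legal one is the maximal onset.
            if tuple(cluster[k:]) in legal_onsets:
                split = k
                break
        boundary = cur + 1 + split
        syllables.append(list(phones[start:boundary]))
        start = boundary
    syllables.append(list(phones[start:]))
    return syllables

extra = syllabify("EH1 K S T R AH0".split())
atlas = syllabify("AE1 T L AH0 S".split())
```

In the first example the cluster `K S T R` is split as `K` + `S T R` because `S T R` is the longest suffix found in the onset set; in the second, `T L` is not a legal onset, so only `L` moves to the following syllable.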

The data folder contains the results of parsing the CMU Pronouncing Dictionary and of the random word generation.

cmudict_onset: A short list of legitimate onsets, manually checked by Aletheia Cui and Ava Irani.
cmudict_syllabified: The syllabified phonological forms.
cmudict_nostress.txt: Same as cmudict_syllabified, except that stress markers are removed.
cmudict_clustered.txt: Words clustered by their phonological forms.
cmudict_stems.txt: Same as cmudict_nostress.txt, except that all inflected forms of words are removed.
generated_words.txt: The list of synthetic phonological forms randomly generated in the simulation.
homophony_count.txt: Counts of pronunciation types and tokens, which are close to the actual pattern of English.
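To make the relationship between these files concrete, the sketch below strips stress digits, clusters words by the resulting phonological form, and tallies types and tokens. It rests on assumptions: the helper names are hypothetical, the entries are toy data, "tokens" is read as the number of words and "types" as the number of distinct forms, and clustering is done on stress-free forms. The actual files may be organized differently.

```python
import re
from collections import defaultdict

def strip_stress(pron):
    """Remove ARPAbet stress digits, e.g. 'K AE1 T' becomes 'K AE T'."""
    return re.sub(r"\d", "", pron)

def cluster_by_form(entries):
    """Group words that share the same stress-free phonological form."""
    clusters = defaultdict(list)
    for word, pron in entries:
        clusters[strip_stress(pron)].append(word)
    return dict(clusters)

def homophony_counts(clusters):
    """Tally words (tokens), distinct forms (types), and forms shared by 2+ words."""
    return {
        "tokens": sum(len(words) for words in clusters.values()),
        "types": len(clusters),
        "homophonous_types": sum(1 for words in clusters.values() if len(words) > 1),
    }

# Toy entries for illustration only.
entries = [
    ("TWO", "T UW1"), ("TOO", "T UW1"), ("TO", "T UW1"),
    ("KNIGHT", "N AY1 T"), ("NIGHT", "N AY1 T"),
    ("CAT", "K AE1 T"),
]
clusters = cluster_by_form(entries)
stats = homophony_counts(clusters)
```

The same tally can be applied to the generated forms and to the real dictionary, so that the two homophony patterns can be compared side by side.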
