This is the repository for our Linearity final project at Olin College.
By analyzing the next-letter pairings of a large body of reference text and performing singular value decomposition (SVD) on this data, we were able to identify the best-match language of an arbitrary text.
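As a rough illustration of what "next-letter pairings" means here, the sketch below counts how often each letter is immediately followed by each other letter. It is not the code in adjacencyMatrix.py: the function name `adjacency_matrix`, the restriction to the 26 unaccented letters a-z, and the choice to skip pairs that span spaces or punctuation are all assumptions made for this example.

```python
import string

# Assumption: only the 26 unaccented lowercase letters are tracked.
ALPHABET = string.ascii_lowercase


def adjacency_matrix(text):
    """Return a 26x26 list of lists where entry [i][j] counts how often
    letter j immediately follows letter i in `text` (hypothetical helper)."""
    index = {ch: i for i, ch in enumerate(ALPHABET)}
    counts = [[0] * len(ALPHABET) for _ in ALPHABET]
    lowered = text.lower()
    # Walk over every pair of neighbouring characters.
    for first, second in zip(lowered, lowered[1:]):
        # Skip pairs that include spaces, digits, punctuation, or accented letters.
        if first in index and second in index:
            counts[index[first]][index[second]] += 1
    return counts


matrix = adjacency_matrix("the theme")
t, h = ALPHABET.index("t"), ALPHABET.index("h")
print(matrix[t][h])  # number of times "h" directly follows "t"
```

Each language tends to produce a different pattern of counts in such a matrix, which is what makes it a useful input for the decomposition step.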
Points of interest:
Report.pdf - our final report
Identification Results.pdf - the identification results from running a collection of sample text through our programs
LinearitySVDshort.mlx - MATLAB code used to guess the language of each text
LinearitySVDFinal.mlx - Same as LinearitySVDshort.mlx, but also makes graphs of the reference data
SVDProjectTesting.mlx - Implementing SVD in MATLAB the longer way
adjacencyMatrix.py - Used to generate letter adjacency data from text files
Graphs - Plots of the first & second singular vectors for the reference data
Reference Texts - Reference text in each language
Unknown Texts - Sample files whose language we identified
README.md - You are here
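For readers who want a feel for the matching step without opening the MATLAB notebooks, here is a minimal Python sketch of one way an SVD-based comparison could work. It is an assumption, not a description of the notebooks or the report: it approximates only the first right singular vector of each adjacency matrix by power iteration, and picks the reference language whose vector is most closely aligned with that of the unknown text. The function names and the tiny 3x3 matrices are made up for illustration.

```python
import math


def first_singular_vector(matrix, iterations=200):
    """Approximate the dominant right singular vector of `matrix`
    by power iteration on A^T A (hypothetical helper)."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols  # start from a uniform unit vector
    for _ in range(iterations):
        # u = A v
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        # w = A^T u
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            return v  # all-zero matrix: no meaningful direction
        v = [x / norm for x in w]
    return v


def best_match(unknown, references):
    """Return the name of the reference matrix whose first singular vector
    has the largest absolute dot product with that of `unknown`."""
    target = first_singular_vector(unknown)

    def score(name):
        ref = first_singular_vector(references[name])
        return abs(sum(a * b for a, b in zip(target, ref)))

    return max(references, key=score)


# Toy 3x3 "adjacency" matrices, invented for this example only.
references = {
    "language_a": [[0, 9, 1], [8, 0, 1], [1, 1, 0]],
    "language_b": [[0, 1, 9], [1, 0, 1], [8, 1, 0]],
}
unknown = [[0, 5, 1], [4, 0, 0], [1, 0, 0]]
print(best_match(unknown, references))
```

In the toy data, the unknown matrix shares its dominant letter pairing with language_a, so that is the reference it lines up with. A real comparison would use full 26x26 matrices and might use more than one singular vector.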
~ Jane Sieving & Sabrina Pereira, Spring 2018 ~