CodecoolBase/babel-day-task

Babel Day #01

Welcome to the very first Babel Day, a day on which we speak many programming languages at once.

[Image: babel tower]

Why do we do this

The IT industry changes continuously, and the popularity of programming languages is a rollercoaster. Today JavaScript is popular, tomorrow it'll be C#, next week it'll probably be Rust, and then JavaScript again. If you want to survive in this jungle, you have to be flexible. This event aims to teach you that.

Your task

In this repository you can find two text files taken from this repository. These files contain the frequencies of words used in subtitles available on opensubtitles.org. Your task is to process these files and extract some interesting data.

Task 1

Write a program that reports the following for each of the two files:

  • longest word
  • shortest word
  • average word length
  • number of words (count)

Sample output (fake data):

===pl_full.txt===
longest word: cholerniedługiesłowo
shortest word: jajo
average length: 3
word count:  333

===en_full.txt===
longest word: areallylongword
shortest word: s
average length: 8
word count:  123456
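
The four statistics above could be computed roughly as in the following Python sketch. It is only one possible approach, not a reference solution. It assumes each line of a file holds a word followed by its frequency, separated by whitespace. The file layout is not described here, so adjust the parsing if it differs. The helper names `read_words`, `summarize`, and `report` are made up for this illustration.

```python
from pathlib import Path


def read_words(path):
    """Return the first whitespace-separated token of every non-empty line.

    Assumption: each line looks like "word frequency".
    """
    words = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.split()
            if parts:
                words.append(parts[0])
    return words


def summarize(words):
    """Compute the four statistics requested in Task 1."""
    if not words:
        raise ValueError("no words to summarize")
    return {
        # On ties, max/min return the first word with that length.
        "longest word": max(words, key=len),
        "shortest word": min(words, key=len),
        "average length": sum(len(word) for word in words) / len(words),
        "word count": len(words),
    }


def report(path):
    """Print the statistics for one file in the layout of the sample output."""
    stats = summarize(read_words(path))
    print(f"==={Path(path).name}===")
    print(f"longest word: {stats['longest word']}")
    print(f"shortest word: {stats['shortest word']}")
    print(f"average length: {stats['average length']:.2f}")
    print(f"word count: {stats['word count']}")
```

Calling `report("pl_full.txt")` and `report("en_full.txt")` would print one block per file. The average is printed with two decimals here, whereas the sample output shows a whole number, so round it if you prefer that.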

Task 2

A more challenging task: find the words that exist in both files and count them. Hint: pay attention to optimization. The given files are quite big, so computing the common part in the most straightforward way (comparing every word in one file against every word in the other) might take far too long.

Sample output (fake data):

common words: ["Codecool", "babel", "day"]
count: 3
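
One common way to avoid the slow pairwise comparison is to load each file into a hash set and intersect the two sets. This takes time roughly proportional to the total number of words rather than to the product of the two word counts. The Python sketch below illustrates the idea. It assumes each line starts with the word, followed by whitespace and a frequency, and the names `load_word_set` and `common_words` are hypothetical.

```python
def load_word_set(path):
    """Read the first token of every non-empty line into a set.

    Assumption: each line looks like "word frequency".
    """
    words = set()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.split()
            if parts:
                words.add(parts[0])
    return words


def common_words(first, second):
    """Return the sorted list of words present in both collections."""
    # Set intersection uses hashing, so each membership check is O(1) on average.
    return sorted(set(first) & set(second))


def report_common(first_path, second_path):
    """Print the common words and their count in the sample-output layout."""
    shared = common_words(load_word_set(first_path), load_word_set(second_path))
    print(f"common words: {shared}")
    print(f"count: {len(shared)}")
```

Sorting the result is optional. It is done here only so that the output is stable between runs.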

Hint

One hour is a really short time to implement something in a language that you don't know yet, so split responsibilities within your team wisely.

Optional Task

Display all words that fall below the 20th percentile.
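
The optional task does not say which quantity the percentile refers to. The Python sketch below takes one possible reading: the percentile is computed over the word frequencies, and a word qualifies when its frequency is strictly below the 20th percentile value. It assumes lines of the form "word frequency" with an integer frequency. `parse_frequencies` and `below_percentile` are hypothetical helper names, and `statistics.quantiles` requires Python 3.8 or newer.

```python
import statistics


def parse_frequencies(lines):
    """Parse "word frequency" lines into a dict (assumed file format)."""
    frequencies = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        try:
            frequencies[parts[0]] = int(parts[-1])
        except ValueError:
            continue  # skip lines whose last token is not an integer
    return frequencies


def below_percentile(frequencies, percent=20):
    """Return the words whose frequency is strictly below the given percentile."""
    if not 1 <= percent <= 99:
        raise ValueError("percent must be between 1 and 99")
    if len(frequencies) < 2:
        return []  # a percentile needs at least two data points
    # n=100 yields 99 cut points; index percent - 1 is the requested percentile.
    cut_points = statistics.quantiles(
        frequencies.values(), n=100, method="inclusive"
    )
    threshold = cut_points[percent - 1]
    return sorted(word for word, freq in frequencies.items() if freq < threshold)
```

For a file, passing an open file handle to `parse_frequencies` works because it iterates line by line. If many words share the lowest frequency, the threshold can equal that frequency and the strict comparison then returns nothing. In that case switching to `<=` may be the more useful reading.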
