This repository parses conversation data from Google Hangouts and gives
diagnostics on the number of messages in conversations. Two scripts are
currently supported: parser.py and visualize.py. The parsing script parses
raw JSON data from Google Takeout and creates pickled summary files for each
conversation. The visualization script creates a histogram of messages over
time using the pickled conversation summaries.
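
As a rough illustration of the pipeline described above, the sketch below groups message timestamps by conversation, round-trips one summary through pickle, and counts messages per month, which is the data a histogram over time would plot. The `messages` list, the `summarize` and `monthly_counts` helpers, and the summary layout are assumptions made for illustration; they are not the actual `Hangouts.json` structure or the format that `parser.py` writes.

```python
import pickle
from collections import Counter
from datetime import datetime, timezone

# Hypothetical, simplified stand-in for parsed messages:
# (conversation id, Unix timestamp in seconds).
# The real Hangouts.json layout is more involved.
messages = [
    ("conv_a", 1500000000),
    ("conv_a", 1500086400),
    ("conv_a", 1502700000),
    ("conv_b", 1500000000),
]


def summarize(messages):
    """Group message timestamps by conversation id."""
    summaries = {}
    for conv_id, ts in messages:
        summaries.setdefault(conv_id, []).append(ts)
    return summaries


def monthly_counts(timestamps):
    """Count messages per calendar month (UTC)."""
    months = (
        datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m")
        for ts in timestamps
    )
    return Counter(months)


summaries = summarize(messages)

# Round-trip one summary through pickle in memory, standing in for
# writing a summary file and reading it back later.
restored = pickle.loads(pickle.dumps(summaries["conv_a"]))

counts = monthly_counts(restored)
for month in sorted(counts):
    print(month, counts[month])
```

Keeping the summary step separate from the counting step mirrors the split between the two scripts: the raw data only needs to be parsed once, and the saved summaries can be revisited without re-reading the full export.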
- Clone this repository
- Download your Hangouts data
  - Navigate to Google Takeout
  - Choose "Select None" and manually select Hangouts
  - Download the data in zip format and move the `Hangouts.json` file into the `raw` folder in this repository
- Install dependencies via pip: `pip install -r requirements.txt`
  - No dependencies are required for the `parser.py` script, but `visualize.py` requires the dependencies
- Run the parser: `python parser.py`
  - Note: if you did not place your Hangouts data at `raw/Hangouts.json`, you can specify the path to the `.json` file as an argument to the `parser.py` script via the `-f` flag
- Run the visualization: `python visualize.py -f output/<conversation_id>.pkl`
  - The `<conversation_id>` can be found in the output of the `parser.py` script

This code is freely available under the GNU General Public License (GPL).
All of the data processing in these scripts happens locally on your computer. The data you provide to the script is NOT uploaded to an external server. Feel free to examine the code if you are concerned.
This repository was inspired by MasterScrat/Chatistics. Chatistics can parse Facebook Messenger and Telegram data, but not Hangouts group messages. I originally intended to contribute Hangouts group message support to that repository, but my design drifted far from its existing design, so I created a new project. Shoutout to MasterScrat for great work and thanks for the inspiration!