This repo contains the materials for IB 516 Analytical Workflows, co-developed by Mark Novak and Ben Dalziel at Oregon State University.
The repository's most relevant folders are:
- course_info - grading rubrics and the course syllabus;
- classes - sub-folders for each course topic;
- readings - pdfs of the required and suggested readings.
Have you proposed a modeling chapter for your dissertation but need support getting things up and running? Are you sitting on a data set ready for analysis and visualization but don't know how or where to begin? Maybe you're far along in some series of analyses and feel "lost in the trees."
This course will help you meet these challenges by giving you practice in developing and implementing efficient, reproducible workflows for your projects. Every project should (and can) be modular and fully automated, and hence reproducible, portable, and easily modified. Rerunning a model under a different set of parameters should (and can) be as simple as a few keystrokes. Regenerating all analyses, figures and tables after finding a typo in your code or dataset should (and can) be painless.
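To make the idea of a modular, automated project concrete, here is a minimal sketch in Python (the course itself mostly uses R, and the same structure carries over). Everything in it is hypothetical and chosen only for illustration: the `Params` fields, the function names, and the discrete-time logistic growth model standing in for "your model". The point is the shape: parameters live in one place, each step is its own function, and a single call reruns everything.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Params:
    """All tunable inputs live in one place (hypothetical names and defaults)."""
    r: float = 0.5      # intrinsic growth rate
    K: float = 100.0    # carrying capacity
    n0: float = 10.0    # initial population size
    steps: int = 50     # number of time steps to simulate


def simulate(p: Params) -> list[float]:
    """Step 1: run a discrete-time logistic growth model (illustrative only)."""
    n = [p.n0]
    for _ in range(p.steps):
        n.append(n[-1] + p.r * n[-1] * (1 - n[-1] / p.K))
    return n


def summarize(trajectory: list[float]) -> dict[str, float]:
    """Step 2: reduce raw output to the numbers a table or figure would need."""
    return {"final": trajectory[-1], "peak": max(trajectory)}


def run_pipeline(p: Params) -> dict[str, float]:
    """Run every step in order, so one call regenerates all results."""
    return summarize(simulate(p))


if __name__ == "__main__":
    baseline = Params()
    print(run_pipeline(baseline))
    # Rerunning under a different parameter set is a one-line change.
    print(run_pipeline(replace(baseline, r=0.8)))
```

Because no step depends on hand-edited intermediate files, fixing a typo in `simulate` and calling `run_pipeline` again updates every downstream summary.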
Efficient workflows start at project conception and end only if the project idea is itself a dead end. Thus, in this course, we'll practice (1) refining and articulating project ideas and goals, (2) creating modular and automated analyses, and (3) using best practices in coding and project management. We'll learn how to use Git, GitHub, LaTeX, and Markdown. I will mostly use R within RStudio, but users of other programming languages and text editors are welcome and encouraged. You will need either (1) a large dataset or (2) a dynamical model or simulation. For either, you'll need to have a goal and a well-developed vision for achieving it. (We won't be learning statistics or mathematics.) The use of other people's data or published models is acceptable, but working on your own thesis or other projects is strongly encouraged.
If you find a broken link or typo, please create an Issue to let me know where it is! You can also create an Issue to leave feedback, pose questions, or suggest new or alternative materials (e.g., new publications) to include. You can also write to me at mark.novak@oregonstate.edu.
For Winter 2025, we are meeting on Mondays & Wednesdays 10-11:50am in Kidder Hall 236.
Click on a topic to see the day's to-dos (required before-class reading and set-up).
| Wk | Day | Date | Topic |
|---|---|---|---|
| 1 | Mon | Jan 6 | No class (Mark at ASN) |
| | Wed | Jan 8 | Course overview & Philosophy |
| 2 | Mon | Jan 13 | Structuring projects |
| | Wed | Jan 15 | Git w/ GitHub - Part 1 |
| 3 | Mon | Jan 20 | No class (Martin Luther King Jr. Day) |
| | Wed | Jan 22 | Git w/ GitHub - Part 1 continued |
| 4 | Mon | Jan 27 | Implementation & Troubleshooting (Mark AWOL) |
| | Wed | Jan 29 | Project proposals |
| 5 | Mon | Feb 3 | Coding best practices |
| | Wed | Feb 5 | Git w/ GitHub - Part 2 |
| 6 | Mon | Feb 10 | Implementation & Troubleshooting |
| | Wed | Feb 12 | Large Language Models |
| 7 | Mon | Feb 17 | Typesetting with Markdown |
| | Wed | Feb 19 | Faster computing |
| 8 | Mon | Feb 24 | High Performance Computing |
| | Wed | Feb 26 | Implementation & Troubleshooting |
| 9 | Mon | Mar 3 | Typesetting with LaTeX |
| | Wed | Mar 5 | Implementation & Troubleshooting (Mark serving on NSF panel) |
| 10 | Mon | Mar 10 | Project presentations |
| | Wed | Mar 12 | Project presentations & Wrap-up |