WIP: Signal Distribution and Forecast Accuracy Analysis Report #33
Conversation
@tlarrue would love your initial feedback when you have a minute. I picked up what you started and sought to abstract out as much as possible (e.g. I want to be able to run this for an arbitrary number of model versions). I also added a threadpool to speed up the forecast pulling, and unit tests.
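A threadpool for the forecast pulls might look like this minimal sketch; `fetch_forecast` is a stand-in for the real WattTime api call, and the region names are just examples:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_forecast(region: str) -> dict:
    # Stand-in for the network-bound api call in the real code.
    return {"region": region, "forecast": [1.0, 2.0, 3.0]}

def pull_forecasts(regions: list[str], max_workers: int = 8) -> list[dict]:
    # The pulls are I/O-bound, so threads give a real speedup despite the GIL.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_forecast, regions))

results = pull_forecasts(["CAISO_NORTH", "PJM_DC"])
```

`pool.map` preserves input order, so results line up with the requested regions regardless of which pull finishes first.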
I didn't try to run it, but I think this is looking great so far. The unit tests are a great call. In general, we're going to have to do a lot of testing here, including manual testing, to make sure the plots are robust to different datasets.

As far as the abstractions: I am wondering why we're trying to do this for an arbitrary number of model versions? What would be the reason for 3+ model versions being shown here? I'm actually thinking we might want to keep it even stricter for the public and do some sort of lookup of the previous model version (if it exists).

Also just noting that there is more work to be done in the rank_compare functions, especially if we want to make the logic transparent and easy to follow for partners. Notably, on top of the time zones, the functions currently don't "re-optimize" within a charging window (they just actively charge according to the first forecast). We probably don't want to encourage this!

Lastly, on repo structure: there is an analysis folder. Do you want to place this material there and split this code into different files, so it's super clear to partners how we're calculating these stats and how they can replicate them?
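To illustrate the re-optimization point: a hedged sketch contrasting charging on the first forecast's ranking versus re-ranking the remaining window as each new forecast arrives (the function names and greedy rule are illustrative, not the actual rank_compare logic):

```python
def charge_hours_first_forecast(forecast, hours_needed):
    """Rank hours by the initial forecast once and never revisit."""
    ranked = sorted(range(len(forecast)), key=forecast.__getitem__)
    return set(ranked[:hours_needed])

def charge_hours_reoptimized(forecast_updates, hours_needed):
    """At each hour, re-rank the rest of the window using the newest forecast."""
    chosen = set()
    n = len(forecast_updates)
    for hour in range(n):
        still_needed = hours_needed - len(chosen)
        if still_needed == 0:
            break
        latest = forecast_updates[hour]  # forecast issued at this hour
        # Charge now only if this hour is among the cheapest remaining hours.
        ranked = sorted(range(hour, n), key=lambda h: latest[h])
        if hour in ranked[:still_needed]:
            chosen.add(hour)
    return chosen
```

With a forecast that never changes, the two strategies agree; they diverge only when later forecast updates re-order the remaining hours, which is exactly the case the review comment is worried about.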
```diff
@@ -1,2 +1,3 @@
 from watttime.api import *
 from watttime.tcy import TCYCalculator
```
This puts TCYCalculator in this file's namespace, permitting `watttime.TCYCalculator`. One disadvantage to doing this is that it broadens the surface area of imports any time there is a watttime namespace import. What is the advantage to re-namespacing it here?
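For anyone following along, the mechanics of such a re-export can be simulated with stdlib module objects (the package name `pkg` and this layout are hypothetical; the real package is `watttime`):

```python
import sys
import types

# Build a tiny fake package to show what a re-export in __init__.py does.
tcy = types.ModuleType("pkg.tcy")

class TCYCalculator:  # stand-in for the class defined in the sub-module
    pass

tcy.TCYCalculator = TCYCalculator

pkg = types.ModuleType("pkg")
pkg.tcy = tcy
pkg.TCYCalculator = tcy.TCYCalculator  # the re-export line in __init__.py
sys.modules["pkg"] = pkg
sys.modules["pkg.tcy"] = tcy

# Both spellings now resolve to the same object:
from pkg import TCYCalculator as reexported
```

This is the "broadened surface area" the comment describes: every name re-exported in `__init__.py` becomes part of the top-level namespace for all downstream importers.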
```python
from watttime.api import *
from watttime.tcy import TCYCalculator
from watttime.report import ModelAnalysis
```
My suspicion is that this is a workaround to load this into the Jupyter notebook. If the notebook said `from watttime import ModelAnalysis`, then this would just be a convenience feature, but instead it says `from watttime import report`. Is this a workaround? If so, let's discuss a more direct approach; if not, what is the advantage to this re-namespacing?
```python
from watttime import api

# hacky way to allow running this script locally
sys.path.append(str(Path(__file__).parents[1].resolve()))
```
Which import fails if we exclude this? `from watttime import api` looks like an absolute reference that should work. One potential issue is a namespace clash between our package and this sub-package.
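For context on why a hack like this is needed at all: when a script inside the package directory is run directly, Python puts that script's directory (not the repo root) at the front of sys.path, so the absolute name watttime is not importable. The paths below are hypothetical; this just mirrors what the diff's hack does:

```python
import sys
from pathlib import Path

# Running `python watttime/report.py` puts `.../watttime` on sys.path,
# not the repo root, so `import watttime` fails unless the package is
# installed (e.g. via `pip install -e .`) or the root is appended manually.
script_dir = Path("repo/watttime")  # hypothetical checkout layout
repo_root = script_dir.parent

# Equivalent of the hack in the diff:
sys.path.append(str(repo_root.resolve()))
```

An editable install (`pip install -e .` from the repo root) is the usual way to make the absolute import work without any sys.path manipulation.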
```python
@dataclass
class ModelAnalysis:
    ba: str
```
region?
```python
ba: str
model_date: str
signal_type: str
eval_start: str
```
nit-pick: The api module allows both str and datetime.

watttime-python-client/watttime/api.py, line 185 in 6907a27:

```python
start: Union[str, datetime],
```

watttime-python-client/watttime/api.py, line 59 in 6907a27:

```python
def _parse_dates(
```
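Supporting both types typically comes down to a small normalizer up front; a hedged sketch of the idea (not the actual `_parse_dates` implementation):

```python
from datetime import datetime
from typing import Union

def to_datetime(value: Union[str, datetime]) -> datetime:
    """Accept either an ISO-8601 string or a datetime; return a datetime."""
    if isinstance(value, datetime):
        return value
    return datetime.fromisoformat(value)
```

With a helper like this, `eval_start` and friends could accept either type without widening the rest of the code.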
| """ | ||
|
|
||
| # Create subplots | ||
| unique_bas = set([j.ba for j in jobs]) |
unique_regions?
```python
line=dict(width=2),
showlegend=(i == 1),
),
row=i,
```
Rename i to ndx? row_ndx? region_ndx? region_index?
```python
y_min = y_max = 0

# Iterate through each BA and create a bar plot
for i, ba_abbrev in enumerate(unique_bas, start=1):
```
This naming is inconsistent with our api:

watttime-python-client/watttime/api.py, line 343 in 6907a27:

```python
- region: The abbreviation of the region.
```

Our implied pattern is to always use region unless we are explicitly using region_full_name. As such, I suggest we drop the use of both ba and abbrev throughout and use region and its variants (i.e. region, this_region, region_, etc.) for the sake of avoiding name clashes.
```python
Parameters:
df (pd.DataFrame): DataFrame containing the rank correlation data.
    Columns: ['abbrev', 'name', '24hr', '48hr', '72hr'].
```
region and region_full_name would be more consistent
```python
    eval_days=self.eval_days, forecast_sample_days=FORECAST_SAMPLE_DAYS
)

self.jobs = list(
```
I suggest not using job(s) here and instead aligning ModelAnalysis with this variable. The natural alignment for the current draft would be to name this after what the ModelAnalysis dataset is: self.analysis_data, self.analyses, etc.
I would reserve the use of the word job for (i) uses of a parallelization package (ii) deeper down in this package (see comment there)
```python
for i, ba_abbrev in enumerate(unique_bas, start=1):
    ba_abbrev = ba_abbrev.upper()
    _jobs = [j for j in jobs if j.ba == ba_abbrev]
```
Once you rename jobs to something more like data or analysis_data, then you do not need this underscore. Then 'job' is an okay variable name here if you document it well, i.e. a plotting job on each of the analysis datasets.
```python
@dataclass
class ModelAnalysis:
```
This object and its name are okay as a first pass, but I think there is room to be highly specific about what our object(s) are meant to do. This object is doing these things:
- Lazily loading raw api data
- Serving as a convenience wrapper around api data
- Merging historical and forecast data to ease analysis
- It is not meant to implement analysis (the name may encourage this later!)

Maybe it would be better to separate a lazy-loader object and a ForecastDataComparer. Then the lazy-loader object can be used for historical-only analysis as well as to power the ForecastDataComparer, which itself does not have to be concerned with the api?

As an easier iteration to keep this PR in motion, I would also support a simple renaming to prevent the interpretation of this as an object meant to implement analyses.
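For concreteness, the proposed split might look roughly like this (all names hypothetical, with static data standing in for api calls):

```python
from dataclasses import dataclass
from functools import cached_property

@dataclass
class RegionDataLoader:
    """Lazily pulls raw api data for one region (static stand-in data here)."""
    region: str

    @cached_property
    def historical(self) -> list[float]:
        # The real version would hit the WattTime api once, on first access.
        return [100.0, 90.0, 110.0]

    @cached_property
    def forecast(self) -> list[float]:
        return [95.0, 92.0, 108.0]

@dataclass
class ForecastDataComparer:
    """Merges historical and forecast series; never touches the api itself."""
    loader: RegionDataLoader

    def errors(self) -> list[float]:
        return [f - h for h, f in zip(self.loader.historical, self.loader.forecast)]

comparer = ForecastDataComparer(RegionDataLoader("CAISO_NORTH"))
```

The loader can then serve historical-only analyses on its own, while the comparer stays a pure data-merging object, which matches the separation the comment describes.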
This PR introduces a new module, report.py, which pulls together some long-standing internal tools that WattTime has used to partially assess the quality of a new MOER model and its constituent forecast model. The goal of this PR is to allow any user to run these tests against our model(s) to increase transparency about the accuracy and trends of our models.

We are using papermill to run a templated Jupyter notebook and save the output as HTML with the "input" cells removed. Eventually we would like to find a place to host these HTML outputs (e.g. GitHub Pages, the watttime.org website?). Whenever a new model is released, it may become part of our model release plan to generate a new HTML analysis report comparing the new model with its predecessor.

The primary entry point for this HTML generation will be a script (report.py).

Additional plots or statistical tests may be added in the future depending on user requests.

TODO in the PR:

Not in this PR:
- A ModelAnalysisFactory to abstract the handling of jobs throughout
- Making the WattTimeForecast.get_historical_forecast method use a threadpool executor to speed up pulls (may not be desired)
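The generation workflow described above can be sketched as the commands the script might assemble (papermill's `-p` flag and nbconvert's `--no-input` are real options; the file names and the `model_date` parameter are hypothetical):

```python
def report_commands(model_date: str) -> list[list[str]]:
    """Build the two-step pipeline: execute the template, then render HTML."""
    executed = f"report_{model_date}.ipynb"
    return [
        # papermill injects parameters and runs the templated notebook.
        ["papermill", "report_template.ipynb", executed,
         "-p", "model_date", model_date],
        # nbconvert renders HTML with the input cells stripped.
        ["jupyter", "nbconvert", "--to", "html", "--no-input", executed],
    ]

cmds = report_commands("2024-03-01")
```

Keeping the pipeline as plain commands also makes it easy for partners to replicate the report outside the script.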