-
Notifications
You must be signed in to change notification settings - Fork 187
[BUG] Fixes #301 and some other issues #354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
* in pygam.GAM._modelmat added selection of features, edge_knots and dtypes by term so only the necessary features are checked and features for factor terms can be set to 0 or 1 if another term is being checked * in pygam.GAM._flatten_mesh added two lines that set by feature to 1 if not None. This fixes issue of partial dependence of tensor terms being 0 everywhere if by feature is present * in pygam.GAM.partial_dependence removed check_X as right after it is checked in self._modelmat * updated docstring of pygam.GAM._linear_predictor * removed line that does nothing in terms.SplineTerm.build_column
|
looks good to me |
kdergachev
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, but I see ruff complains about single quotes in pygam.py. Will fix in the next commit and there should be no more problems.
|
Seems like me deleting an empty line in gen_imgs.py broke something in tests for windows. I have no idea why that happened. Is that the reason it was left there in the first place? |
@kdergachev The windows tests are flaky. |
| modelmat : sparse matrix of len n_samples | ||
| containing model matrix of the spline basis for selected features | ||
| """ | ||
| # take features, dtypes and edge_knots based on supplied term values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK i like this.
the point is that we only check the features that we need
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, conveniently, check_X already allowed for that. Changing arguments was enough to fix the original issue.
Tests case where a factor term is added as a by feature to another term and its categories do not contain 1 (partial_dependence sets it to 1 so that it does not affect final result, but if check_X checks it an exeption will be thrown).
|
In the last commit added test for the same issue, but if it is caused by a factor term that does not have 1 in its domain that is at the same time a by feature in another term. Not entirely sure it is necessary, but that is what I meant in the first comment to this PR. |
This should fix the problem in #301 and some others that #302 might not, e.g. I believe it might break due to factor feature being set to 1.0 if it is a by feature in another term. Along the way found that by feature in tensor term is not set to 1 anywhere so partial dependence returns all 0. As _meshgrid seems to be written specifically for partial_dependence and generate_X_grid made it set by feature to 1 if not None.