Thanks for the great package! Quick question about functionality that exists in mgcv that I can't seem to replicate in pygam.
mgcv example code
This comes directly from ?mgcv::gam.models
library(mgcv)
## Factor `by' variable example (with a spurious covariate x0)
## simulate data...
dat <- gamSim(4)
## fit model...
b <- gam(y ~ fac+s(x2,by=fac)+s(x0),data=dat)
plot(b,pages=1)
The above mgcv code generates a spline fit for each level of fac.
Example in pygam
When I set up a quick toy example in pygam, it's not clear to me how to replicate mgcv's ability to fit a separate spline for each level of a factor variable.
from pygam import LinearGAM, s, intercept
from pygam.datasets import toy_interaction
X, y = toy_interaction(return_X_y=True)
## Make the second column a factor
X[:25000,1] = 0
X[25000:,1] = 1
gam = LinearGAM(s(0, n_splines=4, by=1)).fit(X, y)
mm = gam.terms.build_columns(X)
print(mm.todense()[0,:])
# [[0. 0. 0. 0. 1.]]
I would like the by argument to be treated as a factor, as mgcv does when by is given an R factor column. In that case I would expect the model matrix above to have 9 columns instead of 5: 4 spline columns for each of the 2 factor levels, plus the intercept.
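To make the expected layout concrete, here is a minimal NumPy-only sketch of the column arithmetic. It does not use pygam at all: `factor_by_design` is a hypothetical helper, and the cubic polynomial basis is only a stand-in for the 4 spline columns, so the values differ from what pygam would produce. Only the shape and the zero pattern are the point.

```python
import numpy as np


def factor_by_design(basis, factor):
    """Hypothetical helper: one copy of `basis` per factor level,
    zeroed for rows outside that level, followed by an intercept column."""
    levels = np.unique(factor)
    # Multiply the basis by a 0/1 indicator for each level.
    blocks = [basis * (factor == level)[:, None] for level in levels]
    intercept = np.ones((basis.shape[0], 1))
    return np.hstack(blocks + [intercept])


rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=10)
fac = np.array([0] * 5 + [1] * 5)  # two-level factor, like column 1 above

# Placeholder 4-column basis (1, x, x^2, x^3); NOT pygam's B-spline basis.
basis = np.vander(x, N=4, increasing=True)

mm = factor_by_design(basis, fac)
print(mm.shape)  # (10, 9): 4 columns per level for 2 levels, plus the intercept
```

Within pygam, something similar might be achievable by appending one 0/1 indicator column per level to X and summing one s(0, n_splines=4, by=...) term per indicator column, but that is an assumption on my part rather than a documented way of handling factors.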
Is there a way to use a categorical by variable in pygam?