Factor `by' variable example similar to mgcv? #238

@jmuhlenkamp

Thanks for the great package! Quick question about functionality that exists in mgcv that I can't seem to replicate in pygam.

mgcv example code

This comes directly from ?mgcv::gam.models

library(mgcv)

## Factor `by' variable example (with a spurious covariate x0)
## simulate data...

dat <- gamSim(4)

## fit model...
b <- gam(y ~ fac+s(x2,by=fac)+s(x0),data=dat)
plot(b,pages=1)

The above mgcv code fits a separate smooth of x2 for each level of fac.

Example in pygam

When I set up a quick toy example in pygam, it's not clear to me how I could replicate the spline by factor variable capability that exists in mgcv.

from pygam import LinearGAM, s
from pygam.datasets import toy_interaction

X, y = toy_interaction(return_X_y=True)

# Make the second column a two-level (0/1) factor
X[:25000, 1] = 0
X[25000:, 1] = 1

# Spline on column 0, with column 1 passed as the by variable
gam = LinearGAM(s(0, n_splines=4, by=1)).fit(X, y)

# Inspect the first row of the model matrix
mm = gam.terms.build_columns(X)
print(mm.todense()[0, :])
# [[0. 0. 0. 0. 1.]]

I would like the by argument to be treated as a factor, as mgcv does when it is given an R factor column. In that case I would expect the model matrix above to have 9 columns (4 spline columns for each of the 2 levels of the factor, plus 1 intercept column) instead of 5.
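To make the expected layout concrete, here is a minimal NumPy-only sketch of the two model matrices. It does not call pygam or mgcv. The helper `expand_basis_by_factor` is a hypothetical name, and the 4-column polynomial basis is only a stand-in for the real spline basis. The sketch shows the column layout I have in mind, not mgcv's internal parameterization.

```python
import numpy as np

def expand_basis_by_factor(basis, factor):
    """Repeat a basis once per factor level, zeroing rows outside that level.

    Hypothetical helper for illustration; not part of pygam or mgcv.
    """
    basis = np.asarray(basis, dtype=float)
    factor = np.asarray(factor)
    levels = np.unique(factor)
    # One block of columns per level: basis values where the row belongs
    # to that level, zeros everywhere else.
    blocks = [basis * (factor == level)[:, None] for level in levels]
    intercept = np.ones((basis.shape[0], 1))
    return np.hstack(blocks + [intercept])

x = np.linspace(0.1, 1.0, 6)
# Stand-in 4-column basis (powers of x), assumed here in place of B-splines.
basis = np.column_stack([x ** p for p in range(1, 5)])
fac = np.array([0, 0, 0, 1, 1, 1])

# Numeric by: the basis is multiplied by the by column, so rows with
# fac == 0 are all zeros apart from the intercept (the 5-column case).
numeric_by = np.hstack([basis * fac[:, None], np.ones((6, 1))])

# Factor by: one copy of the basis per level (the 9-column case).
factor_by = expand_basis_by_factor(basis, fac)

print(numeric_by.shape)
print(factor_by.shape)
```

With a numeric by column, every row where the factor is 0 loses its spline columns entirely, which is what the printed row above shows. With the factor expansion, each level gets its own block of 4 columns.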

Is there a way to use a categorical by variable in pygam?
