Interactions between latent process variables #45
Hi Charles,

First of all, thanks so much for putting together this package and the accompanying papers. I've found ctsem to be extremely useful, and your work has really helped me begin to wrap my head around it.

I'm working on a model where I want to model the change in X as a function of Y and Y^2. In other words, I'm assuming the effect of Y on dX is non-linear. I suppose this is a special case of the more general issue of modeling interactions between latent variables (e.g., where dX and dY are both predicted by X, Y, and XY), hence the title of my post. So the following question addresses the more general case of modeling interactions between variables, but I'd also like to know if there's a simpler solution for the special case above.

As I understand it, there are a few ways this could be tackled. The first would be to add a third latent variable, XY, and model the change in XY as a function of X and Y. If my understanding is correct, dXY should be approximately Y * dX + X * dY. It doesn't seem like there's a way, however, for dX and dY to be fed into the drift matrix. In fact, this seems circular, as dX and dY are computed by the drift matrix in the first place. At least this is my understanding.

The other way would be not to include a third variable but instead to assume that the drift parameters representing the cross effects are state-dependent. E.g. something along the lines of:

eta = [X Y]
drift = [a b; c d]

where a and d are the autoregressive effects, and b and c are the cross effects, with b and c being state-dependent such that:

b = b1 + Y * b2

In this case, b1 and c1 are the main effects and b2 and c2 are the interaction terms.

I think I understand how to set this up in the package via the ctModel function. But I'm trying to code this up from scratch in Stan, to make sure I understand how it works. So, assuming the state-dependent parameterisation is the way to go here, my question is: how does this change the computation of the likelihood? Does the analytic solution presented in your 2018 paper still apply? If so, how does it account for the state-dependent drift parameters? If the analytic solution no longer applies, is there an easier way to implement this type of interaction model?

Thanks very much in advance for your advice!

Cheers,
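For reference, the circularity in the third-variable approach can be made explicit with the Itô product rule. This is only a sketch under assumptions that are not in the post: constant drift parameters a, b, c, d, and a diffusion covariance whose off-diagonal element is written q_XY here (a label chosen for illustration).

$$
\begin{aligned}
\mathrm{d}X &= (aX + bY)\,\mathrm{d}t + \text{noise}, \qquad
\mathrm{d}Y = (cX + dY)\,\mathrm{d}t + \text{noise}, \\
\mathrm{d}(XY) &= Y\,\mathrm{d}X + X\,\mathrm{d}Y + \mathrm{d}[X,Y], \qquad
\mathrm{d}[X,Y] = q_{XY}\,\mathrm{d}t, \\
\text{drift of } XY &= Y(aX + bY) + X(cX + dY) + q_{XY}
 = (a + d)\,XY + b\,Y^2 + c\,X^2 + q_{XY}.
\end{aligned}
$$

Under these assumptions the drift of XY depends on X^2 and Y^2, which are not themselves states of the system, so adding XY as a third process does not give a closed linear system.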
Replies: 2 comments 1 reply
Yes, I don't think the third variable approach is the way to go, though I have vague memories that it has been done this way in SEM before. I think you have the right idea re the drift matrix specification. The main complication is needing to list any parameters that are part of the nonlinear equations in the PARS argument, though the examples should make this clear.

Re the likelihood, this gets more complicated. The 2018 paper describes a linear Kalman filter, which will perform poorly for nonlinear cases. A paper I have on Rasch models with ctsem ( https://psycnet.apa.org/record/2019-22131-001 ) describes an approach based on sampling the latent states: once you condition on the latent states, the likelihood is just as normal. The downside is that it often just doesn't sample well, because complex dependencies between the states give a posterior geometry that is difficult to navigate, at least as far as Stan is concerned.

The default approach in ctsem at present uses a first order extended Kalman filter, where the integration from one state to the next (to compute predictions / uncertainty) relies on the 'exact' (i.e. matrix exponential based) approach described in most of my papers, but applies this to the Jacobian of the state. This Jacobian is precomputed analytically in most cases, though for certain limited cases there is a finite difference loop within ctsem, as this avoids recompiling the model. This works well for 'simpler' kinds of non-linearity (where a linearisation around the point estimate is roughly accurate for some length of time) but is definitely not perfect; for serious non-linearity you'll need special purpose SDE solvers.

It's kind of complicated in general, and the Stan code is pretty horrendous after building this up over time and trying to avoid too many wasted calculations. Unless you really need to do this, I'd kind of suggest you're better off understanding the model/s via simulation etc. and not stressing about how they are actually fit ;)
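In the spirit of the simulation suggestion, here is a minimal Python sketch of the state-dependent drift model. It is not how ctsem fits the model: it uses a plain Euler-Maruyama scheme, independent noise on X and Y, and made-up parameter values. The form c = c1 + c2 * X is an assumption (the thread only gives the equation for b), and the function names `drift`, `jacobian`, `jacobian_fd` and `simulate` are hypothetical. The Jacobian functions only illustrate the quantity an extended Kalman filter would linearise around, computed analytically and by finite differences.

```python
import math
import random


def drift(state, p):
    """Deterministic part of d(eta)/dt with state-dependent cross effects."""
    x, y = state
    b = p["b1"] + p["b2"] * y  # effect of Y on dX depends on Y (gives a Y^2 term)
    c = p["c1"] + p["c2"] * x  # assumed analogue for c, not stated in the thread
    return [p["a"] * x + b * y, c * x + p["d"] * y]


def jacobian(state, p):
    """Analytic Jacobian of drift() with respect to (X, Y)."""
    x, y = state
    return [
        [p["a"], p["b1"] + 2.0 * p["b2"] * y],
        [p["c1"] + 2.0 * p["c2"] * x, p["d"]],
    ]


def jacobian_fd(state, p, h=1e-6):
    """Central finite-difference Jacobian, as a check on the analytic one."""
    jac = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        up, down = list(state), list(state)
        up[j] += h
        down[j] -= h
        f_up, f_down = drift(up, p), drift(down, p)
        for i in range(2):
            jac[i][j] = (f_up[i] - f_down[i]) / (2.0 * h)
    return jac


def simulate(state0, p, sigma, t_end, dt=0.001, seed=1):
    """Euler-Maruyama simulation; sigma holds independent noise SDs (assumed)."""
    rng = random.Random(seed)
    state = list(state0)
    path = [tuple(state)]
    for _ in range(int(round(t_end / dt))):
        f = drift(state, p)
        state = [
            state[i] + f[i] * dt + sigma[i] * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            for i in range(2)
        ]
        path.append(tuple(state))
    return path


# Illustrative values only: Y affects dX through both Y and Y^2.
params = {"a": -1.0, "b1": 0.3, "b2": -0.2, "c1": 0.1, "c2": 0.0, "d": -0.8}
path = simulate([1.0, 1.0], params, sigma=[0.2, 0.2], t_end=5.0)
print("final state:", path[-1])
print("Jacobian at start:", jacobian([1.0, 1.0], params))
```

Setting b2 and c2 to zero recovers the linear model, which makes it easy to compare trajectories with and without the interaction terms and to see how quickly a linearisation around the current state stops being accurate.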
Yeah, at present sub-populations can only be handled easily using fixed effects / covariates. If you want a random effects approach, a limited number of sub-populations for a limited number of random effects would be possible (though a bit complicated) to specify, but it will expand the dimension of the system matrices, and things will really slow down if you need many. Extensions are always welcome if you've got a neat idea for how to handle it ;)