Hi,
I’m fitting a recurrent SLDS (rSLDS) to simultaneous ensemble spiking from two brain areas. The dataset is large (200+ neurons, ~90 min per session), so even after binning the (neurons × time) sample count is substantial.
I’m looking for best practices on hyperparameter selection in this model, specifically the latent dimensionality (D) and the number of discrete states (K). I know that the ELBO and/or explained variance can be used as performance measures for choosing these hyperparameters, but I have two questions:
- Should I run a grid search over (K, D), or is there a more efficient approach you recommend?
- Any recommended practices for scaling this (e.g., coarse-to-fine search, or other tricks)?
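To make the coarse-to-fine idea concrete, here is a minimal sketch of what I have in mind. It is not tied to any particular library: `score(K, D)` is a hypothetical callable that would fit one model and return a held-out metric where higher is better (e.g. a held-out ELBO), and `coarse_step` and `radius` are illustrative parameters I made up for this example.

```python
from itertools import product


def coarse_to_fine(score, k_values, d_values, coarse_step=2, radius=1):
    """Coarse grid pass followed by a local refinement around the coarse best.

    `score(K, D)` is a hypothetical callable (an assumption of this sketch)
    that returns a held-out metric, higher is better.
    `k_values` and `d_values` are sorted lists of candidate values.
    """
    scores = {}

    def evaluate(pairs):
        # Fit each (K, D) at most once; results are cached in `scores`.
        for k, d in pairs:
            if (k, d) not in scores:
                scores[(k, d)] = score(k, d)

    # Coarse pass: every `coarse_step`-th candidate along each axis.
    evaluate(product(k_values[::coarse_step], d_values[::coarse_step]))
    best_k, best_d = max(scores, key=scores.get)

    # Fine pass: all candidates within `radius` index steps of the coarse best.
    ki, di = k_values.index(best_k), d_values.index(best_d)
    k_near = k_values[max(0, ki - radius): ki + radius + 1]
    d_near = d_values[max(0, di - radius): di + radius + 1]
    evaluate(product(k_near, d_near))

    best_k, best_d = max(scores, key=scores.get)
    return best_k, best_d, scores


# Toy stand-in for a real model fit: a smooth surface peaked at K=4, D=7.
def toy_score(k, d):
    return -((k - 4) ** 2) - ((d - 7) ** 2)


best_k, best_d, scores = coarse_to_fine(
    toy_score, k_values=list(range(1, 9)), d_values=list(range(2, 13))
)
```

This only helps if the metric varies reasonably smoothly over (K, D); with a noisy or multi-modal surface the refinement could settle on the wrong neighbourhood, which is part of why I'm asking.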
I’ve read Glaser et al., NeurIPS 2020, and Nair et al., Cell 2023, but they don’t detail the optimization strategy. I'd appreciate it if you could provide any suggestions or pointers.
Also, for preprocessing the data, is it recommended to bin spikes and smooth them with a Gaussian kernel, or to use raw binned spike counts without smoothing? I understand this choice would affect the appropriate observation distribution in the model, but I'd like to know whether one option is generally preferred.
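For reference, the two preprocessing options I'm comparing look roughly like this for a single neuron. This is a minimal NumPy sketch; the spike times, bin size, and kernel width are made-up values, and `bin_spikes` and `gaussian_smooth` are helper names I introduced for illustration.

```python
import numpy as np


def bin_spikes(spike_times, t_stop, bin_size):
    """Count spikes in consecutive bins of width `bin_size` (seconds) over [0, t_stop]."""
    n_bins = int(round(t_stop / bin_size))
    edges = np.linspace(0.0, n_bins * bin_size, n_bins + 1)
    counts, _ = np.histogram(spike_times, bins=edges)
    return counts


def gaussian_smooth(counts, sigma_bins):
    """Convolve counts with a unit-area Gaussian kernel whose width is given in bins.

    The kernel is truncated at 4 sigma. Edges are zero-padded, so bins near
    the start and end of the recording lose some mass.
    """
    half = int(np.ceil(4 * sigma_bins))
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma_bins) ** 2)
    kernel /= kernel.sum()
    return np.convolve(counts, kernel, mode="same")


# Toy spike train for one neuron (hypothetical values, in seconds).
spike_times = np.array([0.012, 0.030, 0.031, 0.120, 0.125, 0.410])

counts = bin_spikes(spike_times, t_stop=0.5, bin_size=0.05)  # option 1: integer counts
smoothed = gaussian_smooth(counts, sigma_bins=1.0)           # option 2: real-valued rates
```

The first output is integer-valued, so it would pair with a count likelihood such as Poisson, while the second is real-valued and would pair with a Gaussian observation model.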