WIP: Prototype of minimal codebase #2

EiffL · 2025-11-25T12:56:17Z

This pull request introduces the initial implementation of the SHINE pipeline, providing a complete configuration-driven workflow for Bayesian shear inference using JAX, NumPyro, and GalSim. The changes include new configuration models, data loading and synthetic generation, scene/model construction, inference engines, and a main entrypoint script. These updates establish a modular, extensible foundation for both synthetic and (future) real data analysis.

Configuration and Workflow Foundation:

Added a comprehensive configuration system using Pydantic models in shine/config.py, supporting flexible priors, image, PSF, galaxy, and inference options, with a YAML-based config handler.
Created an example YAML configuration file (configs/test_run.yaml) specifying all parameters for a test run, including galaxy, PSF, image, and inference settings.

Data Handling and Synthetic Generation:

Implemented shine/data.py with a DataLoader class that generates synthetic galaxy image data using GalSim when no real data is provided, including noise and PSF handling.

Model and Inference Engines:

Added shine/scene.py featuring a SceneBuilder class that constructs a NumPyro model for the scene, supporting flexible priors and differentiable rendering with JAX-GalSim.
Introduced shine/inference.py with HMC (NUTS) and MAP inference engines, supporting ArviZ output for posterior analysis and MAP parameter extraction.

Entrypoint and Workflow Integration:

Added shine/main.py, a CLI script to run the full pipeline: configuration loading, data generation, model building, inference execution (HMC or MAP), and result saving (including residuals and reduced chi-squared for MAP).
Updated the design diagram in DESIGN.md to reflect the new modular workflow and clarify component relationships.

CentofantiEze

Emma and I read through all the modules carefully. The implementation is well structured and seems functional. We left a few comments and questions, especially on the parts that might need further refinement. We haven't tried to run the code yet.

Main comments

MAP: the main concern is how the MAP is handled. In this code the MAP appears to be a standalone inference option (we can run either MCMC or MAP). However, we want the MAP to be an optional extra step performed before running the chains, to tune their starting points.
Ellipticities: the intrinsic ellipticity variables are missing.
Modularity: PSF modelling and galaxy morphology should be encapsulated in separate submodules/subpackages.
All the galaxies are being generated at the centre of the postage stamp (overlapped).
It seems that the synthetic observations are individual postage stamps but the scene modelling creates a single field.

CentofantiEze · 2025-11-28T13:31:53Z

configs/test_run.yaml

+  type: Gaussian
+  sigma: 0.1
+
+gal:


Are these parameters for the data generation or for the generative forward model?

Missing ellipticity prior information.

CentofantiEze · 2025-11-28T13:40:32Z

shine/config.py

+    type: str = "Exponential"  # Changed default from Sersic to Exponential
+    n: Optional[Union[float, DistributionConfig]] = None  # Make optional for Exponential
+    flux: Union[float, DistributionConfig]
+    half_light_radius: Union[float, DistributionConfig] = Field(..., alias="half_light_radius")


What does Field do? If it is for adding an alias to half_light_radius, shouldn't the alias be hlr? And why is Field only used here?

CentofantiEze · 2025-11-28T14:01:32Z

shine/data.py

+    def generate_synthetic(config: ShineConfig) -> Observation:
+        import galsim
+
+        # 1. Define PSF


Eventually this will be a single line:

psf = psf_utils.get_psf(config.psf)

and all the code should live in the psf_utils subpackage/module.

The same applies to the following lines.
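One way to get to that single line is a small registry keyed by the PSF type string. The sketch below is only an illustration: `register_psf`, `PSFConfig`, and the builder are hypothetical names (only `get_psf(config.psf)` comes from the comment above), and the builder returns a plain dict so the example runs without GalSim. A real builder would presumably return something like `galsim.Gaussian(sigma=cfg.sigma)`.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class PSFConfig:
    """Hypothetical stand-in for the `psf` section of the config."""
    type: str
    sigma: float


# Maps a lower-cased type name to a function that builds the PSF object.
_PSF_BUILDERS: Dict[str, Callable[[PSFConfig], Any]] = {}


def register_psf(name: str):
    """Decorator that registers a builder under a case-insensitive name."""
    def decorator(builder: Callable[[PSFConfig], Any]):
        _PSF_BUILDERS[name.lower()] = builder
        return builder
    return decorator


@register_psf("gaussian")
def _build_gaussian(cfg: PSFConfig) -> Any:
    # Placeholder object; a real builder might return galsim.Gaussian(sigma=cfg.sigma).
    return {"profile": "gaussian", "sigma": cfg.sigma}


def get_psf(cfg: PSFConfig) -> Any:
    """Look up the builder for cfg.type and fail loudly on unknown types."""
    key = cfg.type.lower()
    if key not in _PSF_BUILDERS:
        known = ", ".join(sorted(_PSF_BUILDERS))
        raise ValueError(f"Unknown PSF type {cfg.type!r}; known types: {known}")
    return _PSF_BUILDERS[key](cfg)


psf = get_psf(PSFConfig(type="Gaussian", sigma=0.1))
```

The same lookup could serve galaxy profiles. Lower-casing the key means a config value such as "Exponential" and a check written against "exponential" cannot silently disagree.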

CentofantiEze · 2025-11-28T14:04:52Z

shine/data.py

+        g1 = get_mean(config.gal.shear.g1)
+        g2 = get_mean(config.gal.shear.g2)
+        shear = galsim.Shear(g1=g1, g2=g2)
+


Missing intrinsic ellipticity.
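For reference, in the reduced-shear convention (ellipticity defined as (a - b)/(a + b), with |g| < 1), the observed ellipticity combines the intrinsic ellipticity e_int and the shear g as e = (e_int + g) / (1 + conj(g) * e_int) in complex notation. The sketch below assumes that convention; `apply_shear` is a hypothetical helper, not code from this PR. In GalSim the equivalent would presumably be applying the intrinsic shape and then the lensing shear to the galaxy profile.

```python
def apply_shear(e_int: complex, g: complex) -> complex:
    """Observed ellipticity from intrinsic ellipticity and reduced shear.

    Both arguments are complex numbers (component 1 + 1j * component 2).
    Assumes the (a - b)/(a + b) ellipticity definition and |g| < 1.
    """
    e_int = complex(e_int)
    g = complex(g)
    if abs(g) >= 1.0:
        raise ValueError("this formula assumes |g| < 1")
    return (e_int + g) / (1.0 + g.conjugate() * e_int)


# Example: a mildly elliptical galaxy under a weak shear.
e_obs = apply_shear(0.3 + 0.1j, 0.02 - 0.01j)
```

With zero shear the function returns the intrinsic ellipticity unchanged, and a round galaxy (e_int = 0) takes on exactly the shear, which is why leaving the intrinsic term out makes every simulated galaxy share the same shape.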

CentofantiEze · 2025-11-28T14:06:33Z

shine/data.py

+        g2 = get_mean(config.gal.shear.g2)
+        shear = galsim.Shear(g1=g1, g2=g2)
+
+        # Create Galaxy Object - Use Exponential (Sersic n=1)


In principle, if we follow the previous logic, it should check the galaxy type:

if config.gal.type == "exponential": ...

CentofantiEze · 2025-11-28T15:06:15Z

shine/scene.py

+            sigma = self.config.image.noise.sigma
+            numpyro.sample("obs", dist.Normal(model_image, sigma), obs=observed_data)
+
+        return model


This doesn't return a log-likelihood function but the forward model function.
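To make the distinction concrete: the `obs` site above, `dist.Normal(model_image, sigma)` conditioned on `observed_data`, implies an independent Gaussian likelihood per pixel. The returned `model` is the generative program, and the log-likelihood is what the inference machinery derives from it. A plain-Python sketch of that quantity follows; `gaussian_log_likelihood` is a hypothetical name and flat lists stand in for images.

```python
import math
from typing import Sequence


def gaussian_log_likelihood(
    observed: Sequence[float], model: Sequence[float], sigma: float
) -> float:
    """Sum of independent Normal(model_i, sigma) log-densities at observed_i."""
    if len(observed) != len(model):
        raise ValueError("observed and model must have the same number of pixels")
    # Per-pixel normalisation term: -0.5 * log(2 * pi * sigma^2).
    log_norm = -0.5 * math.log(2.0 * math.pi * sigma**2)
    return sum(
        log_norm - 0.5 * ((d - m) / sigma) ** 2 for d, m in zip(observed, model)
    )


# A perfect fit leaves only the normalisation terms.
perfect = gaussian_log_likelihood([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], sigma=1.0)
# Residuals of exactly one sigma per pixel lower it by 0.5 per pixel.
one_sigma = gaussian_log_likelihood([2.0, 3.0, 4.0], [1.0, 2.0, 3.0], sigma=1.0)
```

The data-dependent part is -0.5 times the chi-squared of the residuals, which ties this quantity to the reduced chi-squared mentioned in the PR description.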

CentofantiEze · 2025-11-28T15:09:31Z

shine/inference.py

+        if extra_args is None:
+            extra_args = {}
+
+        kernel = NUTS(self.model, dense_mass=self.dense_mass)


Do NUTS and MCMC deal with initial samples?

CentofantiEze · 2025-11-28T15:11:08Z

shine/inference.py

+from typing import Dict, Any
+import arviz as az
+
+class HMCInference:


This should be named simply Inference, since it could implement various inference methods (HMC, MCMC, etc.).

CentofantiEze · 2025-11-28T15:15:42Z

shine/main.py

+        # Print summary
+        print(results.posterior)
+
+    elif args.mode == "map":


We are using MAP estimation to get the initial samples closer to the solution, so it should be run before the inference.

MAP shouldn't be an inference mode option. It should be an optional step to run prior to inference (mcmc) to improve the starting point of the chains.
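The control flow being asked for can be sketched with a toy problem. This is not the SHINE code: it uses a one-parameter Gaussian model, gradient ascent for the MAP, and a random-walk Metropolis sampler in place of NUTS, all in plain Python. The names `find_map`, `metropolis`, `run_inference`, and the `map_init` flag are hypothetical. In NumPyro, the analogous hook would presumably be the `init_strategy` argument of `NUTS`, for example with `numpyro.infer.init_to_value`.

```python
import math
import random
from typing import List

DATA = [0.9, 1.1, 1.3, 0.7, 1.0]  # toy observations of an unknown mean
NOISE_SIGMA = 0.5                 # known noise level
PRIOR_SIGMA = 10.0                # wide Gaussian prior centred on zero


def log_posterior(mu: float) -> float:
    """Unnormalised log-posterior of the mean."""
    log_prior = -0.5 * (mu / PRIOR_SIGMA) ** 2
    log_like = sum(-0.5 * ((y - mu) / NOISE_SIGMA) ** 2 for y in DATA)
    return log_prior + log_like


def grad_log_posterior(mu: float) -> float:
    """Analytic gradient of log_posterior with respect to mu."""
    return -mu / PRIOR_SIGMA**2 + sum(y - mu for y in DATA) / NOISE_SIGMA**2


def find_map(start: float, step: float = 0.01, n_steps: int = 500) -> float:
    """Plain gradient ascent on the log-posterior."""
    mu = start
    for _ in range(n_steps):
        mu += step * grad_log_posterior(mu)
    return mu


def metropolis(init: float, n_samples: int, proposal_sigma: float = 0.3,
               seed: int = 0) -> List[float]:
    """Random-walk Metropolis chain whose first sample is `init`."""
    rng = random.Random(seed)
    current, current_lp = init, log_posterior(init)
    chain = [current]
    for _ in range(n_samples - 1):
        proposal = current + rng.gauss(0.0, proposal_sigma)
        proposal_lp = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


def run_inference(map_init: bool, default_start: float = 5.0,
                  n_samples: int = 2000) -> List[float]:
    """Sampling always runs; the MAP fit only changes where the chain starts."""
    start = find_map(default_start) if map_init else default_start
    return metropolis(start, n_samples)


chain = run_inference(map_init=True)
```

Here there is a single inference path, and the MAP fit is a switchable preprocessing step rather than a separate `mode` branch in the entrypoint.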

CentofantiEze · 2025-11-28T15:24:08Z

shine/main.py

+        # We want 'model_image'.
+        # We didn't expose 'model_image' as a deterministic site in the builder.
+        # Let's modify the builder to expose it, OR we can just inspect the 'obs' distribution mean in the trace.
+


The lines below might not be useful given the comments above (MAP shouldn't be an independent inference method).

EiffL added 2 commits November 24, 2025 23:56

adding minimum example

dfb54b2

fix

ee9c0db

CentofantiEze reviewed Nov 28, 2025

3 participants