Merged
100 commits
cedc4ac
Refactor: Complete rewrite to modular SAM2 architecture
preshanth Sep 30, 2025
a61f3cf
Creating a script to generate and validate and then report the loss w…
preshanth Sep 30, 2025
f2cc22a
Updated to add pynvml
preshanth Sep 30, 2025
8230585
Updating generate data in the cli
preshanth Sep 30, 2025
725ca61
fixing the config inputs
preshanth Sep 30, 2025
b4effae
Black formatting
preshanth Sep 30, 2025
99fdc91
Doing batched operations
preshanth Sep 30, 2025
10f4241
Updating data generation to not have stretch
preshanth Sep 30, 2025
3a46698
Adding more RFI in the synthetic configs
preshanth Sep 30, 2025
f1bbb6f
Updating to allows patching only when less than image size
preshanth Sep 30, 2025
74bbbcc
Moving mask to python int from numpy
preshanth Oct 1, 2025
c4cf1a6
validate gpu
preshanth Oct 1, 2025
e5ad2c7
Changing profiling to be not greedy
preshanth Oct 1, 2025
05e1fca
Update to clean up more memory leaks
preshanth Oct 1, 2025
772cea5
Updating to ensure that synthetic RFI training uses the actual data
preshanth Oct 1, 2025
8bca72b
Turning of always profile
preshanth Oct 1, 2025
721bda0
Missed flipping the dict flag
preshanth Oct 1, 2025
28e2207
Update to fix memory leaks in validation
preshanth Oct 2, 2025
9a31740
Updated to catch all byte conversions before writing json
preshanth Oct 2, 2025
689087f
Updating to reduce batch size to avoid apache overflow
preshanth Oct 2, 2025
0f10fd0
wrong keyword removed
preshanth Oct 2, 2025
43cd827
going to numpy for training and HF for upload
preshanth Oct 2, 2025
df13e45
Updating for training using numpy datasets
preshanth Oct 2, 2025
c35b6ca
Fixing the training bottlenecks arising from memory
preshanth Oct 3, 2025
c0e6c1b
Updating to flush after synthetic generator
preshanth Oct 3, 2025
cdbc401
Turning off mad flagging
preshanth Oct 3, 2025
77b7c31
Starting parallel training
preshanth Oct 4, 2025
4a4b0af
Moving training params to config
preshanth Oct 4, 2025
6b18bc5
Updating the log nomalization such that the scales are now preserved.…
preshanth Oct 4, 2025
7a7d3ad
Updating to move out legacy code and older dcos
preshanth Oct 6, 2025
433a59e
Updating to introduce an H100 config
preshanth Oct 7, 2025
3543d19
Updating to include timestamps everywhere
preshanth Oct 7, 2025
f0ceff2
Updating for better logging
preshanth Oct 7, 2025
e9f6502
Updating to try a newer loading memory in batches and cycling through…
preshanth Oct 7, 2025
e7ad339
Updating to not be too greedy in RAM usage
preshanth Oct 7, 2025
e291491
Marking numpy arrays as read only
preshanth Oct 7, 2025
166d4e3
Moving numpy to pytorch multiprocessing. This should prevent copies w…
preshanth Oct 7, 2025
bf7fd53
BREAKING CHANGE: All datasets now use .pt format instead of .npz fo…
preshanth Oct 8, 2025
b96d8a7
Fixing to not call samrfi cli
preshanth Oct 8, 2025
bfe7224
Replacing lambda call with pool.apply_async
preshanth Oct 8, 2025
9a3d5d9
Updating to generate all datasets at once
preshanth Oct 8, 2025
d498df4
Updating to generated batches and in parallel
preshanth Oct 8, 2025
f470064
Back to streaming multiprocessing based data loading
preshanth Oct 8, 2025
bc86f85
Updated generation to have multiprocessing
preshanth Oct 8, 2025
49469da
Fixed process pool generation
preshanth Oct 8, 2025
be9605f
Fixing print variable names
preshanth Oct 8, 2025
d2d17ed
separating validation from training
preshanth Oct 8, 2025
3b6337e
Adding seprate flags for skipping training and validation
preshanth Oct 8, 2025
e0d0a64
Add SAM-RFI vs CASA flagging comparison with image metrics
preshanth Oct 16, 2025
792679c
sam2 refactor + cleanup
Kitchi Dec 6, 2025
67f92ae
Implement GPU-accelerated transform pipeline for 10-100x speedup
Kitchi Dec 9, 2025
a7edb95
Fix GPU transforms to match CPU implementation exactly
Kitchi Dec 9, 2025
3202486
Fix GPU augmentation to match CPU implementation exactly (physics-pre…
Kitchi Dec 12, 2025
6c1fefa
Updating to clean up configs and validation testing
preshanth Dec 12, 2025
d50180f
Add RAM caching with GPU transforms
preshanth Dec 13, 2025
094c2a3
Add raw training data config
preshanth Dec 13, 2025
1a1765c
Fix RAMCachedDataset output format for SAMDataset compatibility
preshanth Dec 13, 2025
0f5f6b1
SAM2 unpinning the memory
preshanth Dec 13, 2025
9504c6c
Missing brace
preshanth Dec 13, 2025
91ea432
Fixing to float32 to reduce ram footprint
preshanth Dec 13, 2025
c8c1570
Add best model checkpointing and randomized RFI generation
preshanth Dec 13, 2025
c63b510
Fix best model save path attribute
preshanth Dec 13, 2025
91c5db3
Updating configs to ensure that we use 1 polarization. The 4 pol is t…
preshanth Dec 13, 2025
0853c20
Altering to run with random ranges
preshanth Dec 13, 2025
c21a712
Fixing broken masks averaging
preshanth Dec 13, 2025
30dbca0
This is a major change with a lot of breaking but working changes.
preshanth Dec 14, 2025
c89ca92
Introducing resume after stopping should we need it
preshanth Dec 14, 2025
de4d9fe
Fixing the config loading and making sure all params can be accessed …
preshanth Dec 24, 2025
51e3c12
Updating for formatting and for validation
preshanth Dec 24, 2025
8e47b31
Updating to fix validation
preshanth Dec 24, 2025
8bf0e57
Updating to validate in place and not make random ms copies
preshanth Dec 24, 2025
5b1d746
Updating to fix polarizations
preshanth Dec 24, 2025
97147c2
Update to remove a param
preshanth Dec 24, 2025
def7213
Fixing shape mismatches and needless prediction operations
preshanth Dec 24, 2025
73a867e
Changing compute order
preshanth Dec 24, 2025
c3c3bea
Removing a squeeze
preshanth Dec 24, 2025
768f11a
Debug statements
preshanth Dec 24, 2025
0b9d552
Fixing mask sizes
preshanth Dec 24, 2025
6445ef0
Updating predictor
preshanth Dec 25, 2025
d7da606
Cuda fix with spawn rather than fork
preshanth Dec 25, 2025
a3ed154
Updating configs to increase amount of RFI in the data
preshanth Dec 26, 2025
6307d1f
Fixing flag template errors and adding unit tests
preshanth Dec 26, 2025
5494a65
Updating to add unit tests and integration tests
preshanth Dec 26, 2025
0e081ee
Add arbitrary array size support, adaptive thresholding, and validati…
preshanth Dec 29, 2025
ce01f1c
Updating to check in some useful scripts and the updated metric.py
preshanth Dec 30, 2025
9597015
Add CI and fix code quality issues
preshanth Dec 30, 2025
58bb022
Pin linter versions to ensure consistency between pre-commit and CI
preshanth Dec 30, 2025
0317b5e
Updating to make casa optional and to skip CI without CASA.
preshanth Dec 30, 2025
d3eeb69
Removing casa import via msloader
preshanth Dec 30, 2025
7ecb778
Trying to fix the isort issue
preshanth Dec 30, 2025
6d8e1f5
Updating to remove isort check upstream. Leaving it in pre-commit.
preshanth Dec 30, 2025
58eb522
Removing package not used and moving gpu only packages to [gpu]
preshanth Dec 30, 2025
3230364
Making sure torch cpu is there for tests
preshanth Dec 30, 2025
0fb0679
Fixing other ModelCache test locations for transformer dependence
preshanth Dec 30, 2025
a003aa3
Including transfomers in CI
preshanth Dec 30, 2025
1bff774
Fixing casatools implicit deps
preshanth Dec 30, 2025
ec3b41a
Updating readme with current state
preshanth Dec 30, 2025
811f336
HF integration for the SAM2 models for push and pull.
preshanth Dec 30, 2025
4b19c13
Updates to author list for SAM2
preshanth Dec 30, 2025
00f6a28
Cleanup
preshanth Dec 30, 2025
2 changes: 1 addition & 1 deletion .gitattributes
Original file line number Diff line number Diff line change
@@ -1 +1 @@
-*.ipynb -linguist-detectable
+*.ipynb -linguist-detectable
81 changes: 81 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,81 @@
name: CI

on:
  push:
    branches:
      - sam2_refactor
      - main
  pull_request:
    branches:
      - sam2_refactor
      - main

jobs:
  test:
    name: Test Python ${{ matrix.python-version }}
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.10", "3.11", "3.12"]

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
          cache: 'pip'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install "pandas>=2.2.0" "numpy>=1.26.0" --only-binary :all:  # quoted: an unquoted >= is a shell redirect
          pip install -e .[ci]

      - name: Run unit tests
        run: |
          pytest tests/unit -v -m "not requires_casa"

      - name: Run integration tests
        run: |
          pytest tests/integration -v -m "not requires_casa"

  code-quality:
    name: Code Quality
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: 'pip'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install black==24.1.1 ruff==0.1.15 isort==5.13.2

      - name: Check formatting with Black
        run: |
          black --check --line-length=100 src/ tests/

      # isort check disabled - pre-commit handles this locally
      # - name: Check import sorting with isort
      #   run: |
      #     isort --check-only --profile black --line-length=100 src/ tests/

      - name: Auto-fix with Ruff
        run: |
          ruff check --fix src/ tests/

      - name: Report remaining Ruff issues (warning only)
        continue-on-error: true
        run: |
          ruff check src/ tests/
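
A note on the `Install dependencies` step of the `test` job: `run:` scripts on Linux runners are executed by bash, where an unquoted `>` starts an output redirection. A requirement such as `pandas>=2.2.0` is therefore only passed to pip intact when it is quoted. The sketch below uses Python's standard `shlex` module in punctuation mode to illustrate how the two spellings tokenize. It approximates shell lexing and is not a full bash parser; the helper name `shell_tokens` is made up for this illustration.

```python
import shlex


def shell_tokens(command: str) -> list[str]:
    """Split a command roughly the way a POSIX shell would, keeping
    operators such as '>' as separate tokens."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    return list(lexer)


unquoted = shell_tokens("pip install pandas>=2.2.0 numpy>=1.26.0")
quoted = shell_tokens('pip install "pandas>=2.2.0" "numpy>=1.26.0"')

print(unquoted)  # '>' appears as its own token, which a shell reads as a redirect
print(quoted)    # each requirement stays a single argument
```

In the unquoted form the shell would run `pip install pandas numpy` with no version constraint and write pip's output to files named `=2.2.0` and `=1.26.0`.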
21 changes: 21 additions & 0 deletions .gitignore
@@ -160,3 +160,24 @@ cython_debug/
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

# SAM-RFI specific ignores
# Large downloaded/generated files
models/ # Auto-downloaded SAM2 weights from HuggingFace
datasets/ # Generated training/validation datasets
tmp/ # Temporary files from training/testing
validation_results/ # GPU validation outputs

# Model files
*.pth # PyTorch model checkpoints
*.safetensors # HuggingFace model weights
*.pt # PyTorch weights

# Dataset files
*.npz # Numpy dataset files (can be large)

# CASA logs
casa-*.log

# Archive (keep structure but ignore if regenerated)
archive/
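
One thing worth checking in the entries above: Git treats `#` as a comment marker only at the start of a line, so text that follows a pattern on the same line (as in `models/ # ...`) is read as part of the pattern, and the intended path is not ignored. A minimal sketch of a checker that moves such trailing comments onto their own line follows. The function name `split_inline_comment` is hypothetical, and the ` #` heuristic assumes no ignored path legitimately contains a space followed by `#`.

```python
def split_inline_comment(line: str) -> list[str]:
    """Return [line] unchanged, or [comment, pattern] when a pattern is
    followed by ' # comment' (which Git would read as part of the pattern)."""
    stripped = line.rstrip("\n")
    # Blank lines and whole-line comments are already valid.
    if not stripped.strip() or stripped.startswith("#"):
        return [stripped]
    pattern, sep, comment = stripped.partition(" #")
    if not sep:
        return [stripped]
    return ["# " + comment.strip(), pattern.rstrip()]


sample = [
    "# Model files",
    "*.pth # PyTorch model checkpoints",
    "casa-*.log",
]
fixed = [out for line in sample for out in split_inline_comment(line)]
print("\n".join(fixed))
```

Running a check like this over `.gitignore` (or simply `git check-ignore -v <path>` on a file that should be ignored) would show whether the patterns behave as intended.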
48 changes: 48 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,48 @@
# Pre-commit hooks for SAM-RFI
# Install: pip install pre-commit && pre-commit install
# Run manually: pre-commit run --all-files

repos:
  # Black - Python code formatter
  - repo: https://github.com/psf/black
    rev: 24.1.1
    hooks:
      - id: black
        language_version: python3.12
        args: [--line-length=100]

  # isort - Import sorting
  - repo: https://github.com/pycqa/isort
    rev: 5.13.2
    hooks:
      - id: isort
        args: [--profile=black, --line-length=100]

  # Ruff - Fast Python linter
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.1.15
    hooks:
      - id: ruff
        args: [--fix, --exit-non-zero-on-fix]

  # Basic file checks
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files
        args: [--maxkb=5000]
      - id: check-json
      - id: check-toml
      - id: mixed-line-ending

  # MyPy - Static type checking (optional, can be slow)
  # Uncomment if you want type checking on commit
  # - repo: https://github.com/pre-commit/mirrors-mypy
  #   rev: v1.8.0
  #   hooks:
  #     - id: mypy
  #       additional_dependencies: [types-pyyaml, types-tqdm]
  #       args: [--ignore-missing-imports]
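
The `rev` values here are meant to match the linter versions installed in the CI workflow (black 24.1.1, isort 5.13.2, ruff 0.1.15). A small consistency check can help catch drift between the two files. The sketch below is an assumption about how one might do that and is not part of this PR: it extracts `name==version` pins from a pip command and `repo`/`rev` pairs from pre-commit config text with regular expressions (a YAML parser would be more robust), then reports mismatches. The function names and the mapping from the `ruff-pre-commit` repo to the `ruff` package are illustrative.

```python
import re


def pins_from_pip(command: str) -> dict[str, str]:
    """Extract {package: version} from 'pkg==x.y.z' tokens."""
    return dict(re.findall(r"([A-Za-z0-9_.-]+)==([0-9][0-9A-Za-z.]*)", command))


def pins_from_precommit(text: str) -> dict[str, str]:
    """Extract {tool: version} from 'repo:' lines followed by 'rev:' lines."""
    pins = {}
    pattern = re.compile(r"repo:\s*\S+/([\w-]+)\s*\n\s*rev:\s*v?(\S+)")
    for name, rev in pattern.findall(text):
        # Mirror repos such as 'ruff-pre-commit' map to the tool name 'ruff'.
        pins[name.removesuffix("-pre-commit")] = rev
    return pins


def mismatches(ci: dict[str, str], hooks: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Tools pinned in both places whose versions differ."""
    return {t: (ci[t], hooks[t]) for t in ci.keys() & hooks.keys() if ci[t] != hooks[t]}


ci_pins = pins_from_pip("pip install black==24.1.1 ruff==0.1.15 isort==5.13.2")
hook_pins = pins_from_precommit(
    "- repo: https://github.com/psf/black\n"
    "  rev: 24.1.1\n"
    "- repo: https://github.com/astral-sh/ruff-pre-commit\n"
    "  rev: v0.1.15\n"
    "- repo: https://github.com/pycqa/isort\n"
    "  rev: 5.13.2\n"
)
print(mismatches(ci_pins, hook_pins))  # empty dict when the pins agree
```

In practice the two strings would be read from `.github/workflows/ci.yml` and `.pre-commit-config.yaml`, and a non-empty result could fail a test.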
2 changes: 1 addition & 1 deletion .readthedocs.yaml
@@ -32,4 +32,4 @@ sphinx:
 # See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
 python:
   install:
-    - requirements: docs/requirements.txt
+    - requirements: docs/requirements.txt
2 changes: 1 addition & 1 deletion LICENSE
@@ -18,4 +18,4 @@ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-SOFTWARE.
+SOFTWARE.