@compbiolover compbiolover commented Dec 23, 2025

Description

WIP: Uses a fixed-bin histogram to bin points and decide whether the y max cap needs to be raised (calculate_dynamic_y_cap).

We first check whether the y max of the data is less than or equal to the default cap; if so, no capping is applied. Otherwise, we check whether the number of points above the default cap exceeds the threshold (default 5). If it does not, the default cap value (40) is used. If it does, we generate a new cap dynamically from the fixed histogram bins. The bins have width 10, and the number of bins is however many are needed to span from the default cap of 40 to the hard cap of 200 (16 with the defaults). Starting from the lowest bin, we accumulate points as we walk up the bins until we reach a bin where the number of points exceeds the threshold, and we cap right before that bin.

The hard cap (default 200) always applies: even if the data distribution calls for a higher cap, the data is capped there. As a result, the number of points at the top of the plot can often exceed the threshold of 5.

The hard cap and maxCappedPoints are sent from the client to Rust. The default cap comes from the MANHATTAN_LOG_QVALUE_CUTOFF constant in manhattan.js.

We introduce the jitter and the additional golden/yellow jitter box because, even when no hard cap is applied, there are still lines of samples, and right now this is with only thousands of samples. Long term, I imagine the GWAS will certainly need the jitter when operating on millions of dots.

Rust will need to be recompiled.
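The capping steps described above can be sketched in Rust roughly as follows. This is an illustration only, not the code in this PR: the function name `dynamic_y_cap_sketch`, its signature, and the `Option<f64>` return type are hypothetical. The description leaves the exact stopping rule of the bin walk open to interpretation; the sketch assumes the cap is placed at the lowest bin edge above which at most `max_capped_points` points remain, and falls back to the hard cap if no such edge exists.

```rust
/// Hypothetical sketch of the dynamic y-cap logic (not the PR's actual code).
/// Returns `None` when no capping is needed, otherwise `Some(cap)`.
fn dynamic_y_cap_sketch(
    ys: &[f64],
    default_cap: f64,         // e.g. 40.0 (MANHATTAN_LOG_QVALUE_CUTOFF)
    hard_cap: f64,            // e.g. 200.0
    bin_width: f64,           // e.g. 10.0
    max_capped_points: usize, // e.g. 5
) -> Option<f64> {
    // Step 1: if nothing exceeds the default cap, no capping is applied.
    let y_max = ys.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    if y_max <= default_cap {
        return None;
    }

    // Step 2: few enough points above the default cap, so keep the default.
    let above: Vec<f64> = ys.iter().copied().filter(|&y| y > default_cap).collect();
    if above.len() <= max_capped_points {
        return Some(default_cap);
    }

    // Step 3: fixed-width bins spanning default_cap..hard_cap
    // (16 bins for 40..200 with width 10).
    let n_bins = ((hard_cap - default_cap) / bin_width).ceil() as usize;
    let mut counts = vec![0usize; n_bins];
    for &y in &above {
        // Points at or above the hard cap fall outside every bin.
        if y < hard_cap {
            let idx = ((y - default_cap) / bin_width) as usize;
            counts[idx.min(n_bins - 1)] += 1;
        }
    }

    // Walk up from the lowest bin, accumulating the points passed so far.
    // ASSUMPTION: stop at the first upper bin edge that leaves at most
    // `max_capped_points` points above it.
    let mut remaining = above.len();
    for (i, &count) in counts.iter().enumerate() {
        remaining -= count;
        let upper_edge = default_cap + bin_width * (i as f64 + 1.0);
        if remaining <= max_capped_points {
            return Some(upper_edge.min(hard_cap));
        }
    }

    // The hard cap wins even if more points than the threshold sit above it.
    Some(hard_cap)
}
```

Under this reading, ten points above the default cap at 45, 45, 45, 55, 55, 55, 95, 95, 120, 120 (defaults 40, 200, 10, 5) give `Some(60.0)`: after the first two bins only four points remain above 60. Ten points at 500 give `Some(200.0)`, which is the case where more than five points end up at the top of the plot.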

Closes

GRIN2 roadmap number 4

To test

Recompile Rust. Go to ASH and run with all lesion data types checked. Also test with no filters on all ASHOP samples. You should see capping applied for both. Finally, test with tdbtest; for this you should see no capping applied.

Checklist

Check each task that has been performed or verified to be not applicable.

  • Tests: Added and/or passed unit and integration tests, or N/A
  • Todos: Commented or documented, or N/A
  • Notable Changes: updated release.txt, prefixed a commit message with "fix:" or "feat:", added to an internal tracking document, or N/A
  • Rust: Checked to see whether Rust needs to be re-compiled because of this PR, or N/A

@compbiolover compbiolover requested a review from xzhou82 December 23, 2025 21:15