Copilot AI commented Dec 24, 2025

Remove-me-section

Sequence loader was a 2.4k-line monolith; refactored into smaller files for readability.

Generic request

  • PR name follows the pattern #1234 – issue name
  • branch name does not contain '#'
  • base branch (master or release/xx) is correct
  • PR is linked with the issue
  • task status changed to "Code review"
  • code follows product standards
  • regression tests updated

For release/xx branch

  • backmerge to master (or newer release/xx) branch is created

Optional

  • unit-tests written
  • documentation updated

Backmerge request

  • PR name follows the pattern Backmerge: #1234 – issue name
  • PR is linked with the issue
  • base branch (master or release/xx) is correct
  • code contains only backmerge changes

Bump request

  • PR name follows the pattern Bump version to ...
  • add brackets [ ] for 'skip ci' and put it into the body
  • milestone is linked to PR
  • all tickets are closed inside the relevant milestone

The original sequence_loader.cpp (2,239 lines according to the linked issue) exceeded the 2,000-line limit, which made it hard to maintain. This PR splits the implementation into focused translation units while preserving behavior.

  • Refactor structure
    • Keep core BaseMolecule sequence parsing in sequence_loader.cpp (~390 lines).
    • Move KetDocument-specific logic to sequence_loader_ket.cpp.
    • Isolate HELM parsing logic into sequence_loader_helm.cpp.
  • Consistency fix
    • Initialize unexpected_eod with a concrete message in the HELM loader.
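
As a rough illustration of the split listed above (not the actual Indigo code), the sketch below keeps one class declaration and defines its member functions in separate sections, each standing in for its own .cpp file. `MiniLoader`, its methods, the toy parsing rules, and the file names in the banner comments are all hypothetical; only the idea of a shared declaration with per-format definitions and an `unexpected_eod` message that is initialized with concrete text comes from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// ---- mini_loader.h (hypothetical): one declaration shared by every unit ----
class MiniLoader
{
public:
    explicit MiniLoader(std::string text) : _text(std::move(text)) {}

    std::string loadFasta() const;             // defined in mini_loader.cpp
    std::vector<std::string> loadHelm() const; // defined in mini_loader_helm.cpp

private:
    std::string _text;
};

// ---- mini_loader.cpp (hypothetical): core sequence parsing ----
// Toy FASTA reader: skips header lines starting with '>' and joins the rest.
std::string MiniLoader::loadFasta() const
{
    std::string sequence;
    std::istringstream lines(_text);
    std::string line;
    while (std::getline(lines, line))
    {
        if (line.empty() || line[0] == '>')
            continue;
        sequence += line;
    }
    return sequence;
}

// ---- mini_loader_helm.cpp (hypothetical): HELM-only logic ----
// Toy HELM reader: returns the '.'-separated monomers between '{' and '}'.
std::vector<std::string> MiniLoader::loadHelm() const
{
    // The error text is initialized with a concrete message, not left empty.
    static const std::string unexpected_eod = "Unexpected end of data";

    const std::size_t open = _text.find('{');
    const std::size_t close = _text.find('}');
    if (open == std::string::npos || close == std::string::npos || close < open)
        throw std::runtime_error(unexpected_eod);

    std::vector<std::string> monomers;
    std::istringstream body(_text.substr(open + 1, close - open - 1));
    std::string monomer;
    while (std::getline(body, monomer, '.'))
        monomers.push_back(monomer);
    return monomers;
}

// ---- caller code: unaffected by which file defines each method ----
void demo()
{
    MiniLoader fasta(">seq1\nACGT\nTTGA\n");
    assert(fasta.loadFasta() == "ACGTTTGA");

    MiniLoader helm("PEPTIDE1{A.C.G}$$$$");
    assert(helm.loadHelm().size() == 3);
}
```

Because callers depend only on the class declaration, moving definitions between files changes the build layout but not the public interface, which is why the usage can stay unchanged.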

Example (usage unchanged):

Scanner scanner(input);
MonomerTemplateLibrary lib;
SequenceLoader loader(scanner, lib);
loader.loadFasta(molecule, SequenceLoader::SeqType::DNASeq);

Original prompt

This section details the original issue you should resolve.

<issue_title>sequence_loader.cpp has 2,239 lines of code, which is greater than 2,000 authorized. Split it into smaller files</issue_title>
<issue_description>Problem:
core/indigo-core/molecule/src/sequence_loader.cpp has 2,239 lines of code, which is greater than 2,000 authorized. Split it into smaller files.

Why is this an issue?
A source file that grows too much tends to aggregate too many responsibilities and inevitably becomes harder to understand and therefore to maintain. Above a specific threshold, it is strongly advised to refactor it into smaller pieces of code which focus on well defined tasks. Those smaller files will not only be easier to understand but also probably easier to test.

Problem locations:
core/indigo-core/molecule/src/sequence_loader.cpp</issue_description>

Comments on the Issue (you are @copilot in this section)


Copilot AI and others added 2 commits December 24, 2025 13:57
Co-authored-by: AlexeyGirin <26869421+AlexeyGirin@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Refactor sequence_loader.cpp into smaller files" to "Split SequenceLoader implementation into focused translation units" Dec 24, 2025
Copilot AI requested a review from AlexeyGirin December 24, 2025 14:02
@AlexeyGirin AlexeyGirin marked this pull request as ready for review January 3, 2026 12:04
Development

Successfully merging this pull request may close these issues.

sequence_loader.cpp has 2,239 lines of code, which is greater than 2,000 authorized. Split it into smaller files