Conversation

@jzbahrai
Collaborator

Summary | Résumé

TODO: 1-3 sentence description of the change you're proposing.

Related Issues | Cartes liées

Test instructions | Instructions pour tester la modification

TODO: Fill in test instructions for the reviewer.

Release Instructions | Instructions pour le déploiement

None.

Reviewer checklist | Liste de vérification du réviseur

  • This PR does not break existing functionality.
  • This PR does not violate GCNotify's privacy policies.
  • This PR does not raise new security concerns. Refer to our GC Notify Risk Register document on our Google drive.
  • This PR does not significantly alter performance.
  • Additional required documentation resulting from these changes is covered (such as the README, setup instructions, a related ADR or the technical documentation).

⚠ If boxes cannot be checked off before merging the PR, they should be moved to the "Release Instructions" section with the steps required to verify them before release. For example, changes to Celery code may require tests on staging to verify that performance has not been affected.

Copilot AI review requested due to automatic review settings December 30, 2025 22:07
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces parallel processing for CSV recipient files to improve performance when handling large SMS datasets. The implementation uses ThreadPoolExecutor to process rows concurrently for SMS files with 1000+ rows, while maintaining sequential processing for smaller datasets and non-SMS templates.

Key Changes:

  • Added parallel processing with ThreadPoolExecutor for large SMS datasets (≥1000 rows)
  • Refactored get_rows() to support both sequential and parallel processing paths
  • Added comprehensive test coverage for parallel processing scenarios including error detection, row ordering, and international numbers
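
To illustrate the approach summarized above, here is a minimal, self-contained sketch of the pattern: chunk the rows and fan the chunks out to a ThreadPoolExecutor once the file is large enough. Only the 1000-row threshold comes from this PR's description; the function name, chunk size, and worker count below are illustrative assumptions, not the PR's actual code.

from concurrent.futures import ThreadPoolExecutor

PARALLEL_THRESHOLD = 1000  # rows; below this, fall back to sequential processing

def process_rows(rows, process_row, chunk_size=250, max_workers=4):
    """Process rows in order, switching to threads only for large inputs."""
    if len(rows) < PARALLEL_THRESHOLD:
        return [process_row(row) for row in rows]

    # Split rows into contiguous chunks so each worker handles one slice.
    chunks = [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields chunk results in submission order, so row order is preserved.
        chunk_results = executor.map(lambda chunk: [process_row(r) for r in chunk], chunks)
    return [row for chunk in chunk_results for row in chunk]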

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File: notifications_utils/recipients.py
Implements the parallel processing logic with the new methods _process_single_row and _get_rows_parallel and the worker function _process_row_chunk.

File: tests/test_recipient_csv.py
Adds 9 new test cases covering parallel processing behavior, error handling, and edge cases.

@jzbahrai jzbahrai requested a review from Copilot December 31, 2025 16:15
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

total_rows = min(len(rows_data), self.max_rows)

# Determine optimal chunk size and worker count
chunk_size, max_workers = control_chunk_and_worker_size(data_size=total_rows, chunk_size=None, max_workers=None)

Copilot AI Dec 31, 2025

The control_chunk_and_worker_size function is imported but never defined in this repository. This will cause a runtime error when parallel processing is triggered. Ensure this function is implemented in notifications_utils/parallel.py or imported from the correct module.
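
For reference, a sizing helper along these lines would satisfy the call site quoted above. This is a hypothetical sketch inferred only from the call signature; it is not the missing implementation:

import os

def control_chunk_and_worker_size(data_size, chunk_size=None, max_workers=None):
    """Pick a chunk size and worker count for parallel row processing (illustrative defaults)."""
    if max_workers is None:
        # Bound the thread count by available CPUs, with a little headroom for I/O waits.
        max_workers = min(32, (os.cpu_count() or 1) + 4)
    if chunk_size is None:
        # Spread rows roughly evenly across workers, with a floor of 100 rows per chunk.
        chunk_size = max(100, -(-data_size // max_workers))  # ceiling division
    return chunk_size, max_workers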

error_row = Row(
OrderedDict(),
index=idx,
error_fn=lambda k, v: None,

Copilot AI Dec 31, 2025

The lambda function lambda k, v: None is duplicated in three places (lines 302, 319, and error handling). Consider extracting this into a named function or class constant to improve maintainability and clarify its purpose as a no-op error function.
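
One way to apply that suggestion (the name below is illustrative, not from the PR):

def _no_op_error(key, value):
    """Shared no-op error_fn for rows whose errors are attached elsewhere."""
    return None

Each of the three call sites would then pass error_fn=_no_op_error instead of repeating the lambda.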

self.placeholders_as_column_keys,
self.template_type,
self.international_sms,
self.template.content if self.template else None,

Copilot AI Dec 31, 2025

Passing template.content alone is insufficient for message length validation. The validate_sms_message_length function expects template_content_length to be the actual content string (not its length), but this naming suggests confusion about what's being passed. Review the validation logic in _process_row_chunk (lines 813-817, 839-843) to ensure it correctly validates message length with just the template content string.

Comment on lines +867 to +868
if has_message_too_long:
row_obj.message_too_long = True

Copilot AI Dec 31, 2025

The message_too_long flag is being set directly on the Row object after creation, which bypasses the Row's constructor and error handling logic. This differs from sequential processing where the Row's internal error function handles this validation. Consider whether this approach maintains consistency with the Row class's design.

Comment on lines 837 to 843
# Check if this column has a message too long error
formatted_key = Columns.make_key(column_name)
if template_content and formatted_key in placeholders_keys and template_type == "sms":
try:
validate_sms_message_length(column_value or "", template_content)
except ValueError:
has_message_too_long = True

Copilot AI Dec 31, 2025

This message length validation logic is duplicated within the same function (also appears in lines 813-817 within chunk_error_fn). The duplication occurs because validation is needed both for error detection in chunk_error_fn and for setting the has_message_too_long flag. Consider refactoring to avoid this duplication.
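
A small helper could remove that duplication. The sketch below reuses validate_sms_message_length exactly as it is called in the diff above and assumes it is already imported in recipients.py; the helper name is illustrative:

def _exceeds_sms_length(column_value, template_content):
    """Return True when this placeholder value would push the message over the SMS length limit."""
    try:
        validate_sms_message_length(column_value or "", template_content)
    except ValueError:
        return True
    return False

Both chunk_error_fn and the has_message_too_long check could then call this one helper instead of duplicating the try/except.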

except Exception as e:
# Log error if Flask app context is available
try:
current_app.logger.error(f"Error processing chunk starting at {chunk_start}: {e}")

Copilot AI Dec 31, 2025

The current_app from Flask is used but never imported in this file. Add from flask import current_app to the imports, or verify if Flask context is guaranteed to be available during parallel processing.
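
If Flask logging is kept, the context check can also be made explicit rather than relying on a bare try/except. A sketch, assuming Flask is already a dependency of this module (the helper name is illustrative):

from flask import current_app, has_app_context

def _log_chunk_error(chunk_start, exc):
    """Log a chunk-processing failure only when a Flask application context is available."""
    if has_app_context():
        current_app.logger.error(f"Error processing chunk starting at {chunk_start}: {exc}")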
