Contributor

@ljones140 ljones140 commented Dec 16, 2025

Part of #664

Replace uploadStream() with upload() for documents <100MB. Fall back to uploadStream() for larger files (>=100MB).

Problem:
When multiple instances upload the same blob with uploadStream(), they race while committing blocks, and Azure Storage responds with 'invalid block list' errors.

Solution:

  • Documents <100MB: Use upload() for an atomic single-operation upload (no blocks and no commit phase, so the block-list race cannot occur)
  • Documents >=100MB: Use uploadStream() with the original maxConcurrency=5 (still carries the multi-instance race risk, but uses less memory than buffering the entire file)

Note: The maxConcurrency parameter only affects chunk parallelism within a single upload operation; it does not prevent races between multiple instances. For large files where data integrity is critical, consider implementing Azure blob leases (pessimistic locking), as documented by Azure.
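
For reviewers, the size-based routing described above can be sketched roughly as follows. This is an illustration, not the code in this PR: `BlobUploader`, `chooseUploadPath`, `readAll`, `uploadDocument`, and the constant names are hypothetical, the 100MB limit is assumed to mean 100 * 1024 * 1024 bytes, and the 8MB stream buffer size is an assumed value. The interface only mirrors the two calls named in this description.

```typescript
import { Readable } from "node:stream";

// Hypothetical minimal interface covering only the two calls discussed here.
// The real Azure client exposes more options than this sketch uses.
interface BlobUploader {
  upload(body: Buffer, contentLength: number): Promise<unknown>;
  uploadStream(stream: Readable, bufferSize?: number, maxConcurrency?: number): Promise<unknown>;
}

type UploadPath = "upload" | "uploadStream";

const SINGLE_UPLOAD_LIMIT_BYTES = 100 * 1024 * 1024; // assumed meaning of "100MB"
const STREAM_BUFFER_SIZE_BYTES = 8 * 1024 * 1024; // assumed value, not from this PR
const STREAM_MAX_CONCURRENCY = 5; // the original maxConcurrency

// Documents below the limit take the single-operation path; the rest are streamed.
function chooseUploadPath(sizeInBytes: number): UploadPath {
  return sizeInBytes < SINGLE_UPLOAD_LIMIT_BYTES ? "upload" : "uploadStream";
}

// Collect a stream into one Buffer. Only used for documents below the limit.
async function readAll(source: Readable): Promise<Buffer> {
  const chunks: Buffer[] = [];
  for await (const chunk of source) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}

// Route one document to upload() or uploadStream() based on its known size.
async function uploadDocument(
  client: BlobUploader,
  source: Readable,
  sizeInBytes: number
): Promise<UploadPath> {
  const path = chooseUploadPath(sizeInBytes);
  if (path === "upload") {
    const body = await readAll(source);
    await client.upload(body, body.byteLength);
  } else {
    // Large documents are streamed, so the whole file is never held in memory.
    await client.uploadStream(source, STREAM_BUFFER_SIZE_BYTES, STREAM_MAX_CONCURRENCY);
  }
  return path;
}
```

The sketch assumes the caller already knows the document size (for example from metadata), so the large-file path never has to buffer the content just to measure it.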

Collaborator

@elrayle elrayle left a comment

Good addition to start hardening the reliability of uploads.

4 participants