Contributor

@Fikitti Fikitti commented Dec 11, 2025

Implemented a new experimental way to import USFM files (new button in the importers webview) that reuses the rebuild logic from the IDML and other importers I implemented. It still needs to be fully tested, but this first iteration should work with most USFM files. Important: users cannot merge cells, because merging will break it.

Summary by CodeRabbit

Release Notes

New Features

  • Added USFM (Unified Standard Format Markers) support for importing and exporting biblical texts with round-trip capability (experimental)
  • Introduced Biblica Bible Swapper importer plugin
  • Added file save functionality from webview interface
  • USFM formats now included in rebuild/export format options

Tests

  • Enhanced test configuration for improved reliability



Note

Experimental USFM round-trip support

  • Adds usfm-experimental importer (parser, inline mapper, cell aligner, exporter) enabling verse-only target imports and precise round-trip rebuild using structureMetadata.lineMappings
  • Registers the new USFM plugin in the importer registry alongside the original USFM importer
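
The round-trip idea in the note above can be sketched roughly as follows. This is an illustration only, not the PR's code: the `LineMapping` shape, its field names, and `rebuildUsfm` are hypothetical. Only the general approach (keep per-line mappings at import time, then merge translated text back into the original lines at export time) comes from the description.

```typescript
// Hypothetical shape: one entry per verse line of the original USFM file.
interface LineMapping {
    lineIndex: number; // index of the line in the original file
    cellId: string; // id of the cell created from that line
    marker: string; // leading marker text to preserve, e.g. "\\v 1"
}

// Replace only the mapped lines; every other line is copied through untouched.
function rebuildUsfm(
    originalLines: string[],
    mappings: LineMapping[],
    translations: Map<string, string>
): string[] {
    const output = [...originalLines];
    for (const { lineIndex, cellId, marker } of mappings) {
        const translated = translations.get(cellId);
        // Cells without a translation keep the original line.
        if (translated === undefined || translated.trim() === "") continue;
        output[lineIndex] = `${marker} ${translated.trim()}`;
    }
    return output;
}

const original = [
    "\\id GEN",
    "\\c 1",
    "\\p",
    "\\v 1 In the beginning...",
    "\\v 2 And the earth...",
];
const mappings: LineMapping[] = [
    { lineIndex: 3, cellId: "GEN 1:1", marker: "\\v 1" },
    { lineIndex: 4, cellId: "GEN 1:2", marker: "\\v 2" },
];
const rebuilt = rebuildUsfm(
    original,
    mappings,
    new Map([["GEN 1:1", "Au commencement..."]])
);
```

Structural lines such as `\id`, `\c`, and `\p` are never rewritten in this sketch, which is what lets the output keep the original file's layout.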

Rebuild Export integration

  • Detects USFM notebooks and exports via exportCodexContentAsUsfmRoundtrip with progress, error handling, and timestamped filenames
  • Updates Export UI to list USFM in Rebuild Export formats
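
For reference, a timestamped export filename could be built along these lines. The naming scheme and the `timestampedFileName` helper are assumptions for illustration; the PR's actual format is not shown in this summary.

```typescript
// Hypothetical helper: builds "<base>_<timestamp>.<ext>".
function timestampedFileName(
    baseName: string,
    extension: string,
    now: Date = new Date()
): string {
    // ISO 8601 in UTC, with ":" and "." replaced so the name is valid on Windows too.
    const stamp = now.toISOString().replace(/[:.]/g, "-");
    return `${baseName}_${stamp}.${extension}`;
}

const exampleName = timestampedFileName(
    "GEN",
    "usfm",
    new Date(Date.UTC(2025, 11, 11, 9, 5, 3))
);
```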

Webview and provider enhancements

  • Implements saveFile message and handleSaveFile in NewSourceUploaderProvider to save base64 payloads via VS Code save dialog
  • Wraps finalizeAudioImport notebook write in async handler

Other

  • Registers biblica-swapper importer
  • Increases timeout in an integration test to stabilize file ops

Written by Cursor Bugbot for commit 96ad76c. This will update automatically on new commits. Configure here.

Hope this works!

coderabbitai bot commented Dec 12, 2025

Walkthrough

A new USFM (Unified Standard Format Markers) import and export system is introduced, comprising webview-based importer components (form, parser, validators, cell aligner, inline marker converter) and a server-side round-trip exporter. Export routing is extended to classify and process USFM files. A file save handler is added to the NewSourceUploader provider.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| USFM Export Handler | src/exportHandler/exportHandler.ts | Added exportCodexContentAsUsfmRoundtrip() to orchestrate USFM round-trip export with progress UI, file validation, original USFM loading, and timestamped output. Enhanced exportCodexContentAsRebuild() to detect and classify USFM files by importer type, marker, or file extension, routing them to the new exporter. Function declared twice in file (duplication). |
| UI & Test Updates | src/projectManager/projectExportView.ts, src/test/suite/integration/project-healing.test.ts | Added USFM to export format messaging. Modified test callback from arrow function to traditional function expression with explicit 10000ms timeout to enable this context. |
| File Save Handler | src/providers/NewSourceUploader/NewSourceUploaderProvider.ts | Added handleSaveFile() private method to process base64 data, prompt save dialog, write file, and notify webview. Imported SaveFileMessage type. Wired "saveFile" command in message router. Handler appears duplicated in file. |
| Importer Registry | webviews/codex-webviews/src/NewSourceUploader/importers/registry.tsx | Added imports and entries for USFM Experimental and Biblica Bible Swapper importer plugins. Both plugins inserted twice in importerPlugins array (duplication). |
| USFM Parser & Utilities | webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmParser.ts, webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmInlineMapper.ts, webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmExporter.ts | usfmParser.ts: Parses USFM files line-by-line into chapters and verses, extracts book metadata, maintains line mappings for round-trip. usfmInlineMapper.ts: Bidirectional USFM-to-HTML inline marker conversion with footnote handling and fallback regex path. usfmExporter.ts: Reconstructs USFM by merging translated Codex content using line mappings or cell IDs, preserving markers and structure. |
| USFM Importer Form & Logic | webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/UsfmImporterForm.tsx, webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.ts, webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmCellAligner.ts | UsfmImporterForm.tsx: React component for multi-file USFM selection, validation, parsing with progress tracking, optional content alignment for translation imports, and error handling. index.ts: Validation logic (file extension, size, USFM markers), parsing flow producing source and codex notebooks with metadata, and importer plugin definition with export delegation. usfmCellAligner.ts: Multi-tier cell alignment strategy (by label, ID, verse reference) matching imported content to target cells with confidence scoring and unmatched preservation. |
| Plugin Metadata | webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.tsx | Exports usfmExperimentalImporterPlugin constant with id, name, description, icon, supported extensions, component reference, and enabled flag. |

Sequence Diagrams

sequenceDiagram
    participant User as User/Webview
    participant Form as UsfmImporterForm
    participant Validator as Validator
    participant Parser as USFM Parser
    participant Aligner as Cell Aligner
    participant Notebook as Notebook Builder
    
    User->>Form: Select USFM files
    Form->>Validator: validateFile()
    Validator->>Validator: Check extension & markers
    Validator-->>Form: FileValidationResult
    
    rect rgb(220, 245, 220)
    Note over Parser,Notebook: Parse & Build Notebooks
    Form->>Parser: parseFile()
    Parser->>Parser: Read & parse USFM<br/>Extract metadata
    Parser-->>Form: ParsedUsfmDocument
    Form->>Notebook: Create source notebook
    Form->>Notebook: Create codex notebook
    Notebook-->>Form: NotebookPair
    end
    
    alt Translation Import
    rect rgb(245, 230, 220)
    Note over Form,Aligner: Align Content (if intent=target)
    Form->>Aligner: usfmCellAligner()
    Aligner->>Aligner: Multi-tier matching<br/>(label, ID, verse)
    Aligner-->>Form: AlignedCell[]
    end
    Form->>User: Show alignment preview
    User->>Form: Confirm alignment
    end
    
    Form->>Form: handleImportCompletion()
    Form-->>User: Success notification
sequenceDiagram
    participant Handler as exportHandler
    participant USFM as USFM Exporter
    participant FileSystem as Filesystem
    participant Progress as VS Code Progress
    
    Handler->>Progress: Show progress UI
    
    rect rgb(220, 245, 220)
    Note over Handler,FileSystem: Process Each USFM File
    loop For each file in filesToExport
        Handler->>Handler: Read Codex notebook
        Handler->>Handler: Determine source USFM<br/>(from metadata/fallback)
        Handler->>USFM: Load original USFM
        Handler->>Handler: Extract cell content & mappings
        Handler->>USFM: exportUsfmRoundtrip()
        USFM->>USFM: Build line mappings<br/>Extract translations
        USFM->>USFM: Merge translations<br/>Preserve structure
        USFM-->>Handler: Updated USFM content
        Handler->>FileSystem: Write timestamped file
        Handler->>Progress: Update per-file progress
    end
    end
    
    Handler-->>Handler: Error handling & logging
    Progress-->>Handler: Complete

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Areas requiring extra attention:

  • Duplicate function/method declarations: exportCodexContentAsUsfmRoundtrip() and handleSaveFile() are declared twice in their respective files; verify these are not accidental copy-paste errors and understand the intended structure.
  • Importer registry duplication: USFM Experimental and Biblica Bible Swapper entries are inserted twice in the importerPlugins array; confirm whether this is intentional or a bug that needs consolidation.
  • Round-trip export logic: The exportUsfmRoundtrip() function performs complex line-by-line merging with multiple fallback paths (cellId-based, verse reference mapping, HTML-to-USFM conversion); verify correctness of verse handling and break-tag logic.
  • Cell alignment strategy: The multi-tier matching approach in usfmCellAligner.ts uses confidence scoring; ensure the matching precedence and fallback behavior align with expected use cases.
  • Integration points: Verify that metadata preservation (original file data, line mappings, corpus marker) flows correctly from parser → notebook → exporter, and that backward-compatibility paths work as intended.
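
To make the matching-precedence concern concrete, a tiered aligner in the order the summary describes (label, then ID, then verse reference) might look roughly like this. All type names, the confidence values, and `alignCells` itself are hypothetical; the real usfmCellAligner.ts may score and fall back differently.

```typescript
type Method = "label" | "id" | "verse" | "unmatched";

interface Keyed { id: string; label?: string; verseRef?: string }
interface TargetCell extends Keyed {}
interface ImportedItem extends Keyed { content: string }
interface AlignedResult {
    imported: ImportedItem;
    target?: TargetCell;
    method: Method;
    confidence: number;
}
interface Tier { method: Method; confidence: number; key: (c: Keyed) => string | undefined }

// Tiers are tried in order; the confidence numbers are illustrative only.
const TIERS: Tier[] = [
    { method: "label", confidence: 1.0, key: (c) => c.label },
    { method: "id", confidence: 0.9, key: (c) => c.id },
    { method: "verse", confidence: 0.8, key: (c) => c.verseRef },
];

function alignCells(targets: TargetCell[], imported: ImportedItem[]): AlignedResult[] {
    // Build one lookup index per tier; the first target with a given key wins.
    const indexes = TIERS.map((tier) => {
        const index = new Map<string, TargetCell>();
        for (const cell of targets) {
            const k = tier.key(cell);
            if (k !== undefined && !index.has(k)) index.set(k, cell);
        }
        return { tier, index };
    });

    return imported.map((item): AlignedResult => {
        for (const { tier, index } of indexes) {
            const k = tier.key(item);
            const target = k === undefined ? undefined : index.get(k);
            if (target !== undefined) {
                return { imported: item, target, method: tier.method, confidence: tier.confidence };
            }
        }
        // Unmatched content is kept rather than dropped.
        return { imported: item, method: "unmatched", confidence: 0 };
    });
}

const aligned = alignCells(
    [
        { id: "GEN 1:1", label: "1", verseRef: "GEN 1:1" },
        { id: "GEN 1:2", verseRef: "GEN 1:2" },
    ],
    [
        { id: "x", label: "1", content: "a" },
        { id: "GEN 1:2", content: "b" },
        { id: "y", verseRef: "GEN 1:2", content: "c" },
        { id: "z", content: "d" },
    ]
);
```

Note that this sketch lets two imported items resolve to the same target cell; whether that is acceptable is one of the precedence questions worth checking in the actual implementation.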

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'USFM New Experimental Importer' accurately reflects the main change: introduction of a new experimental USFM importer system with multiple supporting components. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 87.50% which is sufficient. The required threshold is 80.00%. |
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch martin/usfm-new

Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/exportHandler/exportHandler.ts (1)

1447-1450: Update the supported types list in warning message.

The warning message lists "Supported types: DOCX, IDML, Biblica, PDF" but this is now outdated. USFM, OBS, and TMS are also supported (and PDF is commented out).

                 vscode.window.showWarningMessage(
-                    `The following files were skipped (unsupported or coming soon):\n${unsupportedList}\n\nSupported types: DOCX, IDML, Biblica, PDF`,
+                    `The following files were skipped (unsupported or coming soon):\n${unsupportedList}\n\nSupported types: DOCX, IDML, Biblica, OBS, TMS, USFM`,
                     { modal: false }
                 );
🧹 Nitpick comments (13)
src/test/suite/integration/project-healing.test.ts (1)

363-365: Type this for Mocha context to avoid noImplicitThis surprises

If your test TS config enables noImplicitThis, consider:

-test("Healing preserves .project directory structure via merge", async function () {
+test("Healing preserves .project directory structure via merge", async function (this: Mocha.Context) {
-    this.timeout(10000); // Increase timeout for file operations
+    this.timeout(10_000); // Increase timeout for file operations
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.tsx (1)

5-13: supportedExtensions uppercase entries are dead (extension matching lowercases)

Given getImporterByExtension() lowercases the filename extension, "SFM" / "USFM" will never be matched:

-    supportedExtensions: ["usfm", "sfm", "SFM", "USFM"],
+    supportedExtensions: ["usfm", "sfm"],
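
A small sketch of why the uppercase entries can never match once the lookup lowercases the extension. `findByExtension` and `PluginLike` are hypothetical stand-ins written for this comment; the real getImporterByExtension() may be implemented differently.

```typescript
// Hypothetical stand-in for the registry lookup.
interface PluginLike {
    id: string;
    supportedExtensions: string[];
}

function findByExtension(plugins: PluginLike[], fileName: string): PluginLike | undefined {
    const dot = fileName.lastIndexOf(".");
    if (dot < 0) return undefined;
    // The extension is lowercased before comparison, so only lowercase entries can match.
    const ext = fileName.slice(dot + 1).toLowerCase();
    return plugins.find((p) => p.supportedExtensions.indexOf(ext) !== -1);
}

const lowercaseOnly: PluginLike[] = [
    { id: "usfm-experimental", supportedExtensions: ["usfm", "sfm"] },
];
const uppercaseOnly: PluginLike[] = [
    { id: "usfm-experimental", supportedExtensions: ["USFM", "SFM"] },
];

const matchUpperFile = findByExtension(lowercaseOnly, "MAT.SFM");
const matchDeadEntry = findByExtension(uppercaseOnly, "MAT.SFM");
```

Here `matchUpperFile` resolves to the plugin even though the file name is uppercase, while `matchDeadEntry` is `undefined` because the uppercase entries are never compared against an uppercase string.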
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/UsfmImporterForm.tsx (2)

42-42: Consider adding proper type annotation for targetCells.

The targetCells state is typed as any[]. Consider using a more specific type from the plugin types (e.g., the cell type from NotebookPair) for better type safety.


177-186: Consider removing arbitrary delay before completion.

The 2-second setTimeout delay before calling handleImportCompletion appears to be for UI feedback, but it could cause issues if the user navigates away or if an error occurs during this window. Consider using a more controlled approach or removing the delay entirely.

-                setTimeout(async () => {
-                    try {
-                        // For multi-file imports, pass all notebook pairs for batch import
-                        const notebooks =
-                            notebookPairs.length === 1 ? notebookPairs[0] : notebookPairs;
-                        await handleImportCompletion(notebooks, props);
-                    } catch (err) {
-                        setError(err instanceof Error ? err.message : "Failed to complete import");
-                    }
-                }, 2000);
+                try {
+                    // For multi-file imports, pass all notebook pairs for batch import
+                    const notebooks =
+                        notebookPairs.length === 1 ? notebookPairs[0] : notebookPairs;
+                    await handleImportCompletion(notebooks, props);
+                } catch (err) {
+                    setError(err instanceof Error ? err.message : "Failed to complete import");
+                }
src/providers/NewSourceUploader/NewSourceUploaderProvider.ts (1)

1365-1371: Handle edge case in file extension extraction.

If fileName has no extension, fileName.split('.').pop() returns the entire filename, not an empty string or '*'. This could create an unexpected filter entry.

             const saveUri = await vscode.window.showSaveDialog({
                 defaultUri,
                 saveLabel: 'Save',
                 filters: mime
                     ? {
                         'All Files': ['*'],
-                        [mime]: [fileName.split('.').pop() || '*']
+                        [mime]: [path.extname(fileName).slice(1) || '*']
                     }
                     : undefined
             });
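
The edge case is easy to reproduce in isolation. The sketch below compares the current expression with a dependency-free equivalent of the suggested path.extname(...).slice(1) approach; `extensionViaSplit` and `extensionOrWildcard` are names made up for this example.

```typescript
// Current behaviour: for a name without a dot, pop() returns the whole name.
function extensionViaSplit(fileName: string): string {
    return fileName.split(".").pop() || "*";
}

// Alternative: only treat text after the last dot as an extension, and fall back
// to "*" when there is no dot, a leading dot only (dotfile), or a trailing dot.
function extensionOrWildcard(fileName: string): string {
    const dot = fileName.lastIndexOf(".");
    const hasExtension = dot > 0 && dot < fileName.length - 1;
    return hasExtension ? fileName.slice(dot + 1) : "*";
}

const splitResult = extensionViaSplit("README");
const safeResult = extensionOrWildcard("README");
```

With a file named `README`, the split-based version yields `"README"` as the filter entry, while the alternative yields `"*"`.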
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmInlineMapper.ts (2)

10-81: Refactor to eliminate code duplication.

The convertUsfmInlineMarkersInText helper function (lines 10-81) is nearly identical to the inline marker processing logic in convertUsfmInlineMarkersToHtml (lines 159-228). Consider extracting the shared parsing logic into a single reusable function to improve maintainability.


144-156: Footnote replacement logic is sound but could reuse regex.

The reverse-order replacement correctly preserves string positions. However, footnoteRegex2 (line 147) duplicates the pattern from footnoteRegex (line 92). Consider reusing the same regex with lastIndex reset, or extract to a constant.
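
One way to share a single regex constant, as a rough sketch. The pattern below is an assumption made for illustration (the PR's actual footnote regex is not reproduced here), and `findFootnotes` / `replaceFootnotes` are hypothetical helpers.

```typescript
// Assumed pattern: a footnote runs from "\f " to the closing "\f*".
const FOOTNOTE_REGEX = /\\f\s.*?\\f\*/g;

interface FootnoteMatch {
    start: number;
    end: number;
    text: string;
}

function findFootnotes(input: string): FootnoteMatch[] {
    // A global regex keeps state in lastIndex, so reset it before each reuse.
    FOOTNOTE_REGEX.lastIndex = 0;
    const matches: FootnoteMatch[] = [];
    let m: RegExpExecArray | null;
    while ((m = FOOTNOTE_REGEX.exec(input)) !== null) {
        matches.push({ start: m.index, end: m.index + m[0].length, text: m[0] });
    }
    return matches;
}

function replaceFootnotes(
    input: string,
    render: (usfm: string, footnoteNumber: number) => string
): string {
    // Walk the matches from last to first so earlier offsets stay valid.
    return findFootnotes(input).reduceRight(
        (out, match, i) =>
            out.slice(0, match.start) + render(match.text, i + 1) + out.slice(match.end),
        input
    );
}

const verse = "In the beginning\\f + \\ft note one\\f* God created\\f + \\ft note two\\f* heaven.";
const rendered = replaceFootnotes(verse, (_usfm, n) => `<sup>${n}</sup>`);
```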

webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmParser.ts (2)

155-159: Type cast bypasses type checking.

The as any cast on the metadata object (line 159) bypasses TypeScript's type checking. If createProcessedCell has a defined metadata type, consider aligning the cellMetadata object with that type or updating the type definition.


461-462: TODO: Footnote extraction not implemented.

The footnoteCount is hardcoded to 0 with a TODO comment. While footnotes are converted to HTML inline (via convertUsfmInlineMarkersToHtml), the count and extraction for metadata purposes is not implemented. Consider tracking this as a follow-up task.

Would you like me to open an issue to track implementing footnote extraction and counting?

webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmExporter.ts (1)

101-106: Extensive debug logging may impact performance.

The console.log statements (lines 101-106 and others throughout) are helpful for debugging but could impact performance with large USFM files. Consider conditionally enabling these logs based on a debug flag or using a proper logging framework with log levels.
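
A minimal sketch of flag-gated logging, assuming a hypothetical `createDebugLogger` helper and a "[USFM Export]" prefix chosen for this example; how the flag would be sourced (setting, environment, build constant) is left open.

```typescript
type Logger = (...args: unknown[]) => void;

// Returns a no-op when disabled, so call sites do not need their own checks.
function createDebugLogger(enabled: boolean, sink: Logger = console.log): Logger {
    if (!enabled) {
        return () => {};
    }
    return (...args: unknown[]) => sink("[USFM Export]", ...args);
}

// Usage: capture output in an array instead of the console to show the effect.
const captured: unknown[][] = [];
const quiet = createDebugLogger(false, (...args) => captured.push(args));
const verbose = createDebugLogger(true, (...args) => captured.push(args));

quiet("processing line", 1);
verbose("processing line", 2);
```

Arguments are still evaluated at the call site even when logging is disabled, so expensive values (for example a large JSON.stringify) would need to be guarded separately.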

webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.ts (2)

21-22: Minor redundancy in extension list.

The SUPPORTED_EXTENSIONS array includes both lowercase and uppercase variants ('sfm', 'SFM', 'usfm', 'USFM'), but validateFileExtension already performs case-insensitive matching. This redundancy is harmless but could be simplified.

-const SUPPORTED_EXTENSIONS = ['usfm', 'sfm', 'SFM', 'USFM'];
+const SUPPORTED_EXTENSIONS = ['usfm', 'sfm'];

99-99: Potential ID collision with Date.now().

Using Date.now() for IDs could cause collisions if multiple files are imported within the same millisecond. Consider using a UUID or combining with a random component for uniqueness.

-                id: `usfm-experimental-source-${Date.now()}`,
+                id: `usfm-experimental-source-${Date.now()}-${Math.random().toString(36).substring(2, 9)}`,
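
As an alternative to the random suffix suggested above, a per-session counter makes IDs distinct even when the clock value repeats. This is a different technique from the one in the diff, shown only as a sketch; `createIdFactory` is a hypothetical helper, and uniqueness holds only among IDs produced by the same factory instance.

```typescript
// The clock is injectable so the same-millisecond case can be demonstrated.
function createIdFactory(prefix: string, now: () => number = Date.now): () => string {
    let counter = 0;
    return () => `${prefix}-${now()}-${(counter++).toString(36)}`;
}

// Simulate two imports that land in the same millisecond.
const frozenClock = () => 1765411200000;
const nextSourceId = createIdFactory("usfm-experimental-source", frozenClock);
const firstId = nextSourceId();
const secondId = nextSourceId();
```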
src/exportHandler/exportHandler.ts (1)

997-1008: Consider using a more specific type instead of any.

The cellData is typed as any, which loses type safety. Consider defining a proper interface for the cell data structure being built.

-                    const codexCells = codexNotebook.cells.map(cell => {
-                        const cellData: any = {
+                    interface ExportCell {
+                        kind: number;
+                        value: string;
+                        metadata: any;
+                        id?: string;
+                    }
+                    const codexCells = codexNotebook.cells.map(cell => {
+                        const cellData: ExportCell = {
                             kind: cell.kind,
                             value: cell.value,
                             metadata: cell.metadata,
                         };
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 971a926 and 4b8093e.

📒 Files selected for processing (12)
  • src/exportHandler/exportHandler.ts (3 hunks)
  • src/projectManager/projectExportView.ts (1 hunks)
  • src/providers/NewSourceUploader/NewSourceUploaderProvider.ts (3 hunks)
  • src/test/suite/integration/project-healing.test.ts (1 hunks)
  • webviews/codex-webviews/src/NewSourceUploader/importers/registry.tsx (4 hunks)
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/UsfmImporterForm.tsx (1 hunks)
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.ts (1 hunks)
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.tsx (1 hunks)
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmCellAligner.ts (1 hunks)
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmExporter.ts (1 hunks)
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmInlineMapper.ts (1 hunks)
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmParser.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (7)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/audio-file-organization.mdc)

**/*.{ts,tsx}: When programmatically adding audio files, use zero-padding of 3 digits for chapter and verse numbers (e.g., chapter 1 becomes '001', verse 25 becomes '025')
Audio file paths must be validated before conversion to webview URIs, and audio files must be restricted to the workspace .project/attachments/ directory for security
Audio file paths should be converted to webview-compatible URIs using webview.asWebviewUri() for frontend integration
Directory scanning for audio files should look for files matching the pattern {BOOK}{CCC}{VVV}.* in the .project/attachments/{BOOK}/ directory
Audio buttons should only appear in the webview when valid audio files are found and successfully loaded for a cell
Audio elements should be created on-demand to minimize memory usage, with only one audio file playing at a time per cell

Files:

  • src/projectManager/projectExportView.ts
  • src/test/suite/integration/project-healing.test.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/UsfmImporterForm.tsx
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.tsx
  • src/providers/NewSourceUploader/NewSourceUploaderProvider.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmExporter.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmParser.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/registry.tsx
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmInlineMapper.ts
  • src/exportHandler/exportHandler.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmCellAligner.ts
**/*.{js,ts,html,htm}

📄 CodeRabbit inference engine (.cursor/rules/audio-recording-permissions.mdc)

Always check browser support for navigator.mediaDevices and getUserMedia API before attempting to access microphone

Files:

  • src/projectManager/projectExportView.ts
  • src/test/suite/integration/project-healing.test.ts
  • src/providers/NewSourceUploader/NewSourceUploaderProvider.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmExporter.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmParser.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmInlineMapper.ts
  • src/exportHandler/exportHandler.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmCellAligner.ts
**/*.{js,ts}

📄 CodeRabbit inference engine (.cursor/rules/audio-recording-permissions.mdc)

**/*.{js,ts}: Properly clean up media streams by calling track.stop() on all tracks after recording is complete to release microphone access
Use MediaRecorder API with event handlers (ondataavailable, onstart, onstop) to manage recording state and collect audio data
Revoke object URLs created with URL.createObjectURL() using URL.revokeObjectURL() to prevent memory leaks when audio data is no longer needed

Files:

  • src/projectManager/projectExportView.ts
  • src/test/suite/integration/project-healing.test.ts
  • src/providers/NewSourceUploader/NewSourceUploaderProvider.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmExporter.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmParser.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmInlineMapper.ts
  • src/exportHandler/exportHandler.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmCellAligner.ts
**/*.{js,ts,jsx,tsx}

📄 CodeRabbit inference engine (.cursor/rules/migrating-webviews-to-shadcn.mdc)

Use relative paths instead of import aliases for ShadCN component imports (e.g., ../components/ui/button rather than aliased paths)

Files:

  • src/projectManager/projectExportView.ts
  • src/test/suite/integration/project-healing.test.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/UsfmImporterForm.tsx
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.tsx
  • src/providers/NewSourceUploader/NewSourceUploaderProvider.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmExporter.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmParser.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/registry.tsx
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmInlineMapper.ts
  • src/exportHandler/exportHandler.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmCellAligner.ts
**/*.{tsx,ts}?(@(component|page))

📄 CodeRabbit inference engine (.cursor/rules/shadcn-cell-editor.mdc)

Use "use client" directive at the top of React components in a Vite + React + TypeScript VSCode webview environment

Files:

  • src/projectManager/projectExportView.ts
  • src/test/suite/integration/project-healing.test.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/UsfmImporterForm.tsx
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.tsx
  • src/providers/NewSourceUploader/NewSourceUploaderProvider.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmExporter.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmParser.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/registry.tsx
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmInlineMapper.ts
  • src/exportHandler/exportHandler.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmCellAligner.ts
**/*.{tsx,ts}

📄 CodeRabbit inference engine (.cursor/rules/shadcn-cell-editor.mdc)

**/*.{tsx,ts}: Import ShadCN UI components from '@/components/ui/' path alias for Button, Card, Tabs, Textarea, Progress, Separator, and Tooltip components
Import icons from 'lucide-react' library for UI icons
Define component prop interfaces using TypeScript interface syntax with descriptive prop names
Use React.useState hook for state management with proper type annotations where necessary
Use async/await for asynchronous operations like API calls and file operations
Structure TabsContent components with proper border and padding classes for consistent styling
Use aria-label attributes on icon buttons for accessibility
Use conditional rendering with ternary operators and helper functions (like getTranscriptionAreaContent()) for complex UI states

Files:

  • src/projectManager/projectExportView.ts
  • src/test/suite/integration/project-healing.test.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/UsfmImporterForm.tsx
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.tsx
  • src/providers/NewSourceUploader/NewSourceUploaderProvider.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmExporter.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmParser.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/registry.tsx
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmInlineMapper.ts
  • src/exportHandler/exportHandler.ts
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmCellAligner.ts
**/*.{jsx,tsx}

📄 CodeRabbit inference engine (.cursor/rules/migrating-webviews-to-shadcn.mdc)

**/*.{jsx,tsx}: Migrate VSCodeButton to Button from ../components/ui/button with appearance mapping: appearance="icon" → variant="outline", appearance="secondary" → variant="secondary", appearance="primary" or no appearance → variant="default"
Remove VSCode-specific props like appearance when migrating to ShadCN components, and use the variant prop instead
Migrate VSCodeBadge to Badge from ../components/ui/badge
Migrate VSCodeCard to Card, CardContent, CardHeader etc. from ../components/ui/card
Use cn() utility from ../lib/utils for conditional className merging in ShadCN components
Preserve accessibility attributes (aria-labels, titles, etc.) when migrating from VSCode to ShadCN components

Files:

  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/UsfmImporterForm.tsx
  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.tsx
  • webviews/codex-webviews/src/NewSourceUploader/importers/registry.tsx
🧠 Learnings (4)
📚 Learning: 2025-12-12T00:01:22.734Z
Learnt from: CR
Repo: genesis-ai-dev/codex-editor PR: 0
File: .cursor/rules/migrating-webviews-to-shadcn.mdc:0-0
Timestamp: 2025-12-12T00:01:22.734Z
Learning: Applies to **/*.{jsx,tsx} : Migrate `VSCodeCard` to `Card, CardContent, CardHeader` etc. from `../components/ui/card`

Applied to files:

  • webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/UsfmImporterForm.tsx
📚 Learning: 2025-12-12T00:01:35.322Z
Learnt from: CR
Repo: genesis-ai-dev/codex-editor PR: 0
File: .cursor/rules/types.mdc:0-0
Timestamp: 2025-12-12T00:01:35.322Z
Learning: Applies to types/index.d.ts : When passing messages between webviews and providers, update the correct type in index.d.ts

Applied to files:

  • src/providers/NewSourceUploader/NewSourceUploaderProvider.ts
📚 Learning: 2025-12-12T00:01:31.825Z
Learnt from: CR
Repo: genesis-ai-dev/codex-editor PR: 0
File: .cursor/rules/shadcn-cell-editor.mdc:0-0
Timestamp: 2025-12-12T00:01:31.825Z
Learning: Applies to **/*.{tsx,ts} : Import icons from 'lucide-react' library for UI icons

Applied to files:

  • webviews/codex-webviews/src/NewSourceUploader/importers/registry.tsx
📚 Learning: 2025-12-12T00:01:31.825Z
Learnt from: CR
Repo: genesis-ai-dev/codex-editor PR: 0
File: .cursor/rules/shadcn-cell-editor.mdc:0-0
Timestamp: 2025-12-12T00:01:31.825Z
Learning: Applies to **/*.{tsx,ts} : Import ShadCN UI components from '@/components/ui/' path alias for Button, Card, Tabs, Textarea, Progress, Separator, and Tooltip components

Applied to files:

  • webviews/codex-webviews/src/NewSourceUploader/importers/registry.tsx
🧬 Code graph analysis (8)
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/UsfmImporterForm.tsx (4)
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.ts (3)
  • usfmExperimentalImporter (174-192)
  • validateFile (27-65)
  • parseFile (73-172)
webviews/codex-webviews/src/NewSourceUploader/types/plugin.ts (4)
  • ImporterComponentProps (204-251)
  • AlignedCell (35-42)
  • ImportedContent (23-30)
  • CellAligner (47-51)
webviews/codex-webviews/src/NewSourceUploader/types/common.ts (2)
  • ImportProgress (34-38)
  • NotebookPair (29-32)
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmCellAligner.ts (1)
  • usfmCellAligner (14-210)
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.tsx (1)
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/UsfmImporterForm.tsx (1)
  • UsfmImporterForm (30-408)
src/providers/NewSourceUploader/NewSourceUploaderProvider.ts (2)
webviews/codex-webviews/src/NewSourceUploader/types/plugin.ts (1)
  • SaveFileMessage (353-358)
src/test/manual-structure-test.js (2)
  • vscode (6-6)
  • workspaceFolder (12-12)
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.ts (4)
webviews/codex-webviews/src/NewSourceUploader/types/common.ts (5)
  • FileValidationResult (49-56)
  • ProgressCallback (81-81)
  • ImportResult (40-47)
  • ProcessedNotebook (17-27)
  • NotebookPair (29-32)
webviews/codex-webviews/src/NewSourceUploader/utils/workflowHelpers.ts (2)
  • validateFileExtension (78-84)
  • createProgress (12-20)
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmParser.ts (1)
  • parseUsfmFile (42-466)
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmExporter.ts (1)
  • exportUsfmRoundtrip (26-392)
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmExporter.ts (1)
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmInlineMapper.ts (1)
  • htmlInlineToUsfm (232-444)
webviews/codex-webviews/src/NewSourceUploader/importers/registry.tsx (1)
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.tsx (1)
  • usfmExperimentalImporterPlugin (5-13)
src/exportHandler/exportHandler.ts (2)
src/test/manual-structure-test.js (2)
  • vscode (6-6)
  • originalFileUri (57-57)
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmExporter.ts (1)
  • exportUsfmRoundtrip (26-392)
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmCellAligner.ts (2)
webviews/codex-webviews/src/NewSourceUploader/types/plugin.ts (3)
  • CellAligner (47-51)
  • ImportedContent (23-30)
  • AlignedCell (35-42)
src/utils/editMapUtils.ts (1)
  • cellLabel (48-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Node 20 • macos-latest
  • GitHub Check: Node 20 • windows-latest
  • GitHub Check: Node 20 • ubuntu-latest
🔇 Additional comments (17)
src/projectManager/projectExportView.ts (1)

337-345: UI update for Rebuild Export formats looks consistent

Nice touch adding USFM to the “original format” description and the visible tag list.

webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/UsfmImporterForm.tsx (2)

48-73: LGTM!

The handleFileSelect callback is well-implemented with proper error handling for file previews and sensible limits on preview size.


247-407: LGTM!

The JSX rendering uses ShadCN components correctly with proper accessibility attributes (labels, aria associations). The UI provides good feedback during processing and error states.

src/providers/NewSourceUploader/NewSourceUploaderProvider.ts (1)

320-326: LGTM!

The new saveFile command handler follows the established pattern in the message router, with proper type casting and delegation to the handler method.

webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmInlineMapper.ts (2)

337-438: Regex fallback handles nested structures iteratively with recursion.

The regex fallback unwraps nested tags progressively, looping up to a fixed limit (20 iterations). However, the recursive calls to htmlInlineToUsfm inside that loop (lines 394, 402, 428) could recurse deeply if the content is malformed. The maxIterations limit on the outer loop mitigates this, but consider adding a depth parameter as a safeguard against deeply nested malicious input.
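
A minimal sketch of the depth parameter suggested above, shown on a toy converter rather than the real htmlInlineToUsfm. The names convertInlineSketch, findMatchingEmClose, and MAX_INLINE_DEPTH, the cap of 10, and the <em> to \em mapping are all assumptions made for illustration; they are not taken from the PR.

```typescript
// Hypothetical cap; a real limit would be tuned to the nesting USFM content needs.
const MAX_INLINE_DEPTH = 10;

// Find the index of the </em> that closes an <em> whose content starts at `from`.
// Returns -1 when the tags are unbalanced.
function findMatchingEmClose(html: string, from: number): number {
    const tagRe = /<\/?em>/g;
    tagRe.lastIndex = from;
    let open = 1;
    let m: RegExpExecArray | null;
    while ((m = tagRe.exec(html)) !== null) {
        open += m[0] === "<em>" ? 1 : -1;
        if (open === 0) return m.index;
    }
    return -1;
}

// Toy stand-in for a recursive HTML-to-USFM inline converter (handles <em> only).
// Siblings are handled by the loop; only nesting consumes recursion depth.
function convertInlineSketch(html: string, depth: number = 0): string {
    if (depth >= MAX_INLINE_DEPTH) {
        // Depth cap reached: drop remaining tags instead of descending further.
        return html.replace(/<\/?em>/g, "");
    }
    let out = "";
    let pos = 0;
    for (;;) {
        const start = html.indexOf("<em>", pos);
        if (start === -1) break;
        const innerStart = start + "<em>".length;
        const close = findMatchingEmClose(html, innerStart);
        if (close === -1) break; // unbalanced input: fall through to tag stripping
        out += html.slice(pos, start);
        out += "\\em " + convertInlineSketch(html.slice(innerStart, close), depth + 1) + "\\em*";
        pos = close + "</em>".length;
    }
    // Whatever is left has no well-formed <em> pair; strip stray tags.
    return out + html.slice(pos).replace(/<\/?em>/g, "");
}
```

With this shape, recursion depth is bounded by the cap regardless of how deeply the input is nested, and the outer iteration limit no longer has to double as a recursion guard.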


232-335: LGTM!

The DOMParser-based approach for HTML→USFM conversion is well-implemented with proper handling of footnotes, inline markers, and nested structures. The fallback to regex ensures compatibility in Node.js contexts.

webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmParser.ts (3)

12-32: LGTM!

The ParsedUsfmDocument interface is well-structured with all necessary fields for round-trip export support. The lineMappings type provides good structure for tracking source line to cell relationships.


42-74: LGTM!

The function setup properly initializes all tracking state for parsing, including multi-line verse handling with break tags. The versesOnly parameter enables target import mode that skips non-verse content.


166-432: LGTM!

The main parsing loop correctly handles the USFM format including:

  • Multi-line verses with break tags (\li1, \q1, \b)
  • Header content assigned to chapter 1
  • Continuation lines without markers
  • The versesOnly mode for target imports

The logic preserves structure metadata for round-trip export.

webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmExporter.ts (3)

10-34: LGTM!

The LineMapping interface is consistent with the parser's output, and the function signature with overloaded parameter handling provides good backward compatibility for existing importers.


230-351: Complex multi-line verse handling.

The logic for mapping <br>-separated translation parts back to USFM break lines is necessarily complex to support round-trip export. The implementation correctly:

  • Preserves original break markers (\li1, \q1, \b)
  • Uses metadata to track break tag order
  • Handles cases where translation has more/fewer parts than original

This is a critical piece for maintaining USFM structure during round-trip.


389-391: LGTM!

The export function correctly reconstructs USFM content by preserving markers and structure while substituting translated text. The summary logging provides useful metrics for debugging import/export cycles.

webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/index.ts (3)

27-65: LGTM!

The validation function has appropriate checks for extension, file size, and basic USFM marker presence. Error handling is in place for file read failures.


73-172: LGTM!

The parsing logic correctly reads the file, delegates to the USFM parser, and constructs properly structured notebook pairs with metadata for round-trip export support. Style cells are preserved in the codex notebook while text cells are cleared for translation.


174-192: LGTM!

The plugin definition properly implements the ImporterPlugin interface with round-trip export support. The dynamic import for the exporter and the fallback path for imports without lineMappings are well-handled.

src/exportHandler/exportHandler.ts (2)

856-1054: LGTM overall - robust USFM round-trip export implementation.

The function handles multiple fallback paths for locating the original USFM file, supports both lineMappings-based and fallback export modes, and includes comprehensive debug logging. The error handling per-file allows partial success when exporting multiple files.


1409-1423: LGTM!

The USFM export block follows the established pattern for other format exports in exportCodexContentAsRebuild, with proper error handling and progress reporting.

Comment on lines +871 to +872
const exportFolder = vscode.Uri.file(userSelectedPath);

⚠️ Potential issue | 🟠 Major

Missing directory creation for export folder.

Unlike other export functions in this file (e.g., exportCodexContentAsIdmlRoundtrip, exportCodexContentAsDocxRoundtrip), this function does not call vscode.workspace.fs.createDirectory(exportFolder). This could cause the export to fail if the directory doesn't exist.

     const exportFolder = vscode.Uri.file(userSelectedPath);
+    await vscode.workspace.fs.createDirectory(exportFolder);
🤖 Prompt for AI Agents
In src/exportHandler/exportHandler.ts around lines 871 to 872, the exportFolder
Uri is created but the directory is not ensured to exist; add a call to await
vscode.workspace.fs.createDirectory(exportFolder) (using the same pattern as
exportCodexContentAsIdmlRoundtrip/exportCodexContentAsDocxRoundtrip) immediately
after constructing exportFolder so the folder is created before any write
operations, and propagate or handle any errors as the surrounding function does
for other export flows.

Comment on lines +1277 to +1289
} else if (
corpusMarker === 'usfm' ||
importerType === 'usfm-experimental' ||
importerType === 'usfm' ||
// Also check for NT/OT corpus markers with USFM file extensions (Bible books imported as USFM)
((corpusMarker === 'NT' || corpusMarker === 'OT') &&
originalFileName &&
(originalFileName.endsWith('.usfm') || originalFileName.endsWith('.sfm') || originalFileName.endsWith('.USFM') || originalFileName.endsWith('.SFM'))) ||
(originalFileName && (originalFileName.endsWith('.usfm') || originalFileName.endsWith('.sfm') || originalFileName.endsWith('.USFM') || originalFileName.endsWith('.SFM')))
) {
// USFM files use the USFM round-trip exporter
filesByType['usfm'] = filesByType['usfm'] || [];
filesByType['usfm'].push(filePath);

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's check the directory structure and find importer files
fd -t f "importer" webviews/codex-webviews/src/NewSourceUploader/ --type f

Repository: genesis-ai-dev/codex-editor

Length of output: 1666


🏁 Script executed:

# Search for .usfm/.sfm references in importers
rg -n "\.usfm|\.sfm" webviews/codex-webviews/src/NewSourceUploader/ -t ts

Repository: genesis-ai-dev/codex-editor

Length of output: 2302


🏁 Script executed:

# Get the broader context around lines 1277-1289 in exportHandler.ts
cat -n src/exportHandler/exportHandler.ts | sed -n '1260,1310p'

Repository: genesis-ai-dev/codex-editor

Length of output: 3876


🏁 Script executed:

# Check how corpusMarker and importerType are set in metadata
rg -n "corpusMarker\s*=" src/exportHandler/ --type ts -B 2 -A 2

Repository: genesis-ai-dev/codex-editor

Length of output: 10483


🏁 Script executed:

# Check Paratext importer to see how it handles USFM files and metadata
head -100 webviews/codex-webviews/src/NewSourceUploader/importers/paratext/index.ts

Repository: genesis-ai-dev/codex-editor

Length of output: 203


🏁 Script executed:

# Look for where file metadata is created/stored
rg -n "importerType|corpusMarker" src/exportHandler/exportHandler.ts --type ts -B 1 -A 1 | head -80

Repository: genesis-ai-dev/codex-editor

Length of output: 5654


🏁 Script executed:

# Find Paratext importer files
find webviews/codex-webviews/src/NewSourceUploader/importers/paratext -type f -name "*.ts" -o -name "*.tsx"

Repository: genesis-ai-dev/codex-editor

Length of output: 310


🏁 Script executed:

# Search for where corpusMarker is assigned (set, not just read)
rg -n "corpusMarker\s*:" --type ts -B 2 -A 2 | head -150

Repository: genesis-ai-dev/codex-editor

Length of output: 13353


🏁 Script executed:

# Check for importerType assignment in metadata
rg -n "importerType\s*:" --type ts -B 2 -A 2 | head -150

Repository: genesis-ai-dev/codex-editor

Length of output: 15770


🏁 Script executed:

# Check Paratext importer file
cat webviews/codex-webviews/src/NewSourceUploader/importers/paratext/index.tsx

Repository: genesis-ai-dev/codex-editor

Length of output: 861


🏁 Script executed:

# Get the full Paratext parser to see metadata assignments
cat webviews/codex-webviews/src/NewSourceUploader/importers/paratext/parser.ts | head -300

Repository: genesis-ai-dev/codex-editor

Length of output: 11910


🏁 Script executed:

# Search for corpusMarker and importerType assignments in paratext parser
rg -n "corpusMarker|importerType" webviews/codex-webviews/src/NewSourceUploader/importers/paratext/parser.ts -B 2 -A 2

Repository: genesis-ai-dev/codex-editor

Length of output: 53


🏁 Script executed:

# Search for the paratextImporter function and its return/metadata setup
rg -n "paratextImporter|notebookPair|createNotebookPair" webviews/codex-webviews/src/NewSourceUploader/importers/paratext/parser.ts -B 2 -A 5

Repository: genesis-ai-dev/codex-editor

Length of output: 2849


🏁 Script executed:

# Get the end of the parser file where the main export function likely is
tail -200 webviews/codex-webviews/src/NewSourceUploader/importers/paratext/parser.ts

Repository: genesis-ai-dev/codex-editor

Length of output: 9081


🏁 Script executed:

# Find and examine createNotebookPair function
rg -n "export.*createNotebookPair|function createNotebookPair" webviews/codex-webviews/src/NewSourceUploader/importers/common/usfmUtils.ts -A 50 | head -100

Repository: genesis-ai-dev/codex-editor

Length of output: 1725


🏁 Script executed:

# Check what originalFileName is set to in Paratext importer
rg -n "originalFileName" webviews/codex-webviews/src/NewSourceUploader/importers/paratext/parser.ts -B 2 -A 2

Repository: genesis-ai-dev/codex-editor

Length of output: 53


🏁 Script executed:

# Also check usfmUtils to see what getCorpusMarkerForBook returns
rg -n "getCorpusMarkerForBook" webviews/codex-webviews/src/NewSourceUploader/importers/common/usfmUtils.ts -A 15 | head -50

Repository: genesis-ai-dev/codex-editor

Length of output: 1372


🏁 Script executed:

# Find getCorpusMarkerForBook in corpusUtils
rg -n "export.*getCorpusMarkerForBook|function getCorpusMarkerForBook" webviews/codex-webviews/src/utils/corpusUtils.ts -A 20

Repository: genesis-ai-dev/codex-editor

Length of output: 159


🏁 Script executed:

# Also check what originalFileName is set to in createNotebookPair - it's baseName
# Let's verify baseName for Paratext is the book code by looking at the Paratext parser call again
cat webviews/codex-webviews/src/NewSourceUploader/importers/paratext/parser.ts | sed -n '437,465p'

Repository: genesis-ai-dev/codex-editor

Length of output: 1613


🏁 Script executed:

# Search for getCorpusMarkerForBook across the entire codebase
rg -n "getCorpusMarkerForBook" webviews/ src/ --type ts -A 10 | head -100

Repository: genesis-ai-dev/codex-editor

Length of output: 10410


🏁 Script executed:

# Find and examine getCorpusMarkerForBook in sharedUtils
cat sharedUtils/corpusUtils.ts | head -200

Repository: genesis-ai-dev/codex-editor

Length of output: 3202


🏁 Script executed:

# Check if there's a Paratext exporter in exportHandler
rg -n "paratext|Paratext" src/exportHandler/exportHandler.ts -B 2 -A 2

Repository: genesis-ai-dev/codex-editor

Length of output: 1544


🏁 Script executed:

# Check what file types are actually supported in rebuild export
rg -n "filesByType\[" src/exportHandler/exportHandler.ts | grep -A 1 -B 1 "1277\|1289\|1290"

Repository: genesis-ai-dev/codex-editor

Length of output: 365


🏁 Script executed:

# Check USFM importer to see what originalFileName it sets
rg -n "originalFileName|createNotebookPair" webviews/codex-webviews/src/NewSourceUploader/importers/usfm/index.ts -B 2 -A 2 | head -60

Repository: genesis-ai-dev/codex-editor

Length of output: 547


🏁 Script executed:

# Also check if there are any other places where files get .usfm extension in originalFileName
rg -n "originalFileName.*usfm|\.usfm.*originalFileName" webviews/ src/ --type ts

Repository: genesis-ai-dev/codex-editor

Length of output: 980


🏁 Script executed:

# Check all places where originalFileName is set or modified
rg -n "originalFileName\s*=" webviews/codex-webviews/src/NewSourceUploader/importers/ --type ts -B 2 -A 2 | head -150

Repository: genesis-ai-dev/codex-editor

Length of output: 1010


Line 1285 serves as a valid fallback but relies on filename extension as its sole criterion.

The standalone originalFileName check at line 1285 is intentionally broad to catch edge cases where files lack corpusMarker or importerType metadata. However, it assumes that any file with a .usfm/.sfm extension in originalFileName should route to the USFM exporter, regardless of which importer created it.

While this works for USFM-imported files (which strip extensions from originalFileName in createNotebookPair), and Paratext files won't match since they use book codes without extensions, the condition could theoretically misroute files from importers that preserve extensions in originalFileName (e.g., Biblica). If such an importer ever produces a file with .usfm in its originalFileName, it would incorrectly route to the USFM exporter instead of its proper handler.

Consider whether this fallback should require additional validation (e.g., checking importerType is not set to a non-USFM value) to prevent unintended file routing.

🤖 Prompt for AI Agents
In src/exportHandler/exportHandler.ts around lines 1277 to 1289, the fallback
condition at line 1285 routes any file whose originalFileName ends with
.usfm/.sfm to the USFM exporter even when importerType might indicate a
different importer; narrow this fallback by adding an extra check so it only
applies when importerType is absent/empty or explicitly one of the USFM
importers (e.g., 'usfm' or 'usfm-experimental') — alternatively, implement a
negative check that excludes known non-USFM importers (blacklist) before routing
to the USFM exporter; update the conditional accordingly and add a short comment
explaining the reason for the stricter fallback.
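
A sketch of the stricter fallback as a standalone predicate. This assumes the metadata fields are optional strings; RoutingMetadata, USFM_IMPORTER_TYPES, hasUsfmExtension, and shouldRouteToUsfmExporter are hypothetical names, not functions in exportHandler.ts. Note two deliberate differences from the condition in the diff: the extension check is case-insensitive (so it also accepts mixed case such as `.Usfm`), and the NT/OT clause is folded into the general extension fallback.

```typescript
// Hypothetical shape of the notebook metadata consulted when routing an export.
interface RoutingMetadata {
    corpusMarker?: string;
    importerType?: string;
    originalFileName?: string;
}

// Importer types assumed to produce notebooks the USFM round-trip exporter understands.
const USFM_IMPORTER_TYPES = new Set<string>(["usfm", "usfm-experimental"]);

function hasUsfmExtension(fileName: string | undefined): boolean {
    return fileName !== undefined && /\.(usfm|sfm)$/i.test(fileName);
}

function shouldRouteToUsfmExporter(meta: RoutingMetadata): boolean {
    if (meta.corpusMarker === "usfm") return true;
    if (meta.importerType && USFM_IMPORTER_TYPES.has(meta.importerType)) return true;
    // Extension-based fallback applies only when no importer has claimed the file,
    // so a non-USFM importer that keeps ".usfm" in originalFileName is not misrouted.
    return !meta.importerType && hasUsfmExtension(meta.originalFileName);
}
```

Pulling the condition into a named predicate also makes it unit-testable without touching the rest of the export flow.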

Comment on lines +1356 to +1360
// Show save dialog
const workspaceFolder = vscode.workspace.workspaceFolders?.[0];
const defaultUri = workspaceFolder
? vscode.Uri.joinPath(workspaceFolder.uri, fileName)
: undefined;

⚠️ Potential issue | 🟡 Minor

Sanitize fileName to prevent path traversal in default URI.

The fileName from the webview message is used directly in constructing defaultUri. While the VS Code save dialog ultimately controls where the file is saved, a malicious or malformed fileName containing path components (e.g., ../../../etc/passwd) could set an unexpected default location, potentially confusing users or exposing directory structure.

+            // Sanitize fileName to prevent path traversal
+            const sanitizedFileName = path.basename(fileName);
+
             // Show save dialog
             const workspaceFolder = vscode.workspace.workspaceFolders?.[0];
             const defaultUri = workspaceFolder
-                ? vscode.Uri.joinPath(workspaceFolder.uri, fileName)
+                ? vscode.Uri.joinPath(workspaceFolder.uri, sanitizedFileName)
                 : undefined;

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/providers/NewSourceUploader/NewSourceUploaderProvider.ts around lines
1356 to 1360, the fileName from the webview is used directly to build defaultUri
which allows path traversal; sanitize the incoming fileName before joining it to
the workspace path by extracting only a safe basename (strip any path separators
and parent references like "../" or backslashes), reject or replace unsafe
characters, enforce a reasonable max length, and fallback to a fixed safe
defaultName if validation fails; then pass that sanitized name to
vscode.Uri.joinPath so the save dialog default cannot point outside the intended
workspace.
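
The fuller sanitization described in the prompt above could look roughly like the following. This is a sketch under stated assumptions: sanitizeSuggestedFileName, FALLBACK_FILE_NAME ("export.usfm"), and the 255-character cap are illustrative choices, not values from the PR. It splits on both separator styles by hand because path.basename on POSIX does not treat backslashes as separators.

```typescript
// Hypothetical defaults for illustration.
const FALLBACK_FILE_NAME = "export.usfm";
const MAX_FILE_NAME_LENGTH = 255;

function sanitizeSuggestedFileName(rawName: string): string {
    // Keep only the last path segment, treating both "/" and "\" as separators.
    const segments = rawName.split(/[\\/]/);
    const base = (segments[segments.length - 1] ?? "").trim();
    // Drop control characters and characters rejected by common filesystems.
    const cleaned = base.replace(/[\u0000-\u001f<>:"|?*]/g, "");
    // Empty names and pure dot references are not usable as a default file name.
    if (cleaned === "" || cleaned === "." || cleaned === "..") {
        return FALLBACK_FILE_NAME;
    }
    // Plain truncation; a long name may lose its extension.
    return cleaned.slice(0, MAX_FILE_NAME_LENGTH);
}
```

The sanitized result would then be what gets passed to vscode.Uri.joinPath when building defaultUri.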

Comment on lines +19 to 21
import { usfmImporterPlugin } from "./usfm/index.tsx"; // Original USFM importer
import { usfmExperimentalImporterPlugin } from "./usfm/experimental/index.tsx"; // Experimental round-trip importer (standalone with headers in chapter 1)
import { ebibleDownloadImporterPlugin } from "./ebibleCorpus/index.tsx";

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# First, let's find and examine the registry.tsx file
fd -name "registry.tsx" webviews/codex-webviews/src/NewSourceUploader/importers

# Then examine its contents
cat webviews/codex-webviews/src/NewSourceUploader/importers/registry.tsx

Repository: genesis-ai-dev/codex-editor

Length of output: 301


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find the registry.tsx file
find webviews/codex-webviews/src/NewSourceUploader/importers -name "registry.tsx" -type f

# Read the file
cat webviews/codex-webviews/src/NewSourceUploader/importers/registry.tsx

Repository: genesis-ai-dev/codex-editor

Length of output: 9256


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for plugin ID definitions across all importer plugins
rg 'id:\s*["\']' webviews/codex-webviews/src/NewSourceUploader/importers/ --type ts --type tsx -A 0 | head -50

Repository: genesis-ai-dev/codex-editor

Length of output: 139


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for plugin ID definitions across all importer plugins
rg 'id:' webviews/codex-webviews/src/NewSourceUploader/importers/ --type ts --type tsx -B 1 -A 1

Repository: genesis-ai-dev/codex-editor

Length of output: 98


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for plugin ID definitions across all importer plugins
rg 'id:' webviews/codex-webviews/src/NewSourceUploader/importers/ -B 1 -A 1 | head -100

Repository: genesis-ai-dev/codex-editor

Length of output: 9136


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find all index.tsx files in importer directories and extract their IDs
find webviews/codex-webviews/src/NewSourceUploader/importers -name "index.tsx" -type f | while read file; do
  echo "=== $file ==="
  rg 'id:\s*"[^"]*"' "$file" | head -1
done

Repository: genesis-ai-dev/codex-editor

Length of output: 347


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Extract all plugin IDs from index.tsx files
echo "Extracting all plugin IDs..."
find webviews/codex-webviews/src/NewSourceUploader/importers -name "index.tsx" -type f | sort | while read file; do
  id=$(rg 'id:\s*"[^"]*"' "$file" | head -1 | sed 's/.*id:\s*"\([^"]*\)".*/\1/')
  if [ -n "$id" ]; then
    echo "$id|$file"
  fi
done | sort > /tmp/plugin_ids.txt

echo "=== All Plugin IDs ==="
cat /tmp/plugin_ids.txt

echo ""
echo "=== Checking for Duplicates ==="
cut -d'|' -f1 /tmp/plugin_ids.txt | sort | uniq -d | while read dup_id; do
  echo "DUPLICATE: $dup_id"
  grep "^$dup_id|" /tmp/plugin_ids.txt
done

Repository: genesis-ai-dev/codex-editor

Length of output: 99


🏁 Script executed:

#!/bin/bash

# List all index.tsx files
echo "=== Found index.tsx files ==="
find webviews/codex-webviews/src/NewSourceUploader/importers -name "index.tsx" -type f | sort

echo ""
echo "=== Extracting plugin IDs ==="

# Extract IDs with proper error handling
for file in $(find webviews/codex-webviews/src/NewSourceUploader/importers -name "index.tsx" -type f | sort); do
  result=$(rg 'id:\s*"[^"]*"' "$file" 2>/dev/null || true)
  if [ -n "$result" ]; then
    id=$(echo "$result" | head -1 | sed 's/.*id:\s*"\([^"]*\)".*/\1/')
    echo "$id | $file"
  else
    echo "NO_ID_FOUND | $file"
  fi
done

Repository: genesis-ai-dev/codex-editor

Length of output: 3254


🏁 Script executed:

#!/bin/bash

echo "=== Checking biblica-swapper plugin ==="
find webviews/codex-webviews/src/NewSourceUploader/importers -path "*biblica*" -name "*.tsx" | head -10

echo ""
echo "=== Contents of biblica/index.tsx ==="
cat webviews/codex-webviews/src/NewSourceUploader/importers/biblica/index.tsx | head -40

echo ""
echo "=== Contents of indesign/index.tsx ==="
cat webviews/codex-webviews/src/NewSourceUploader/importers/indesign/index.tsx | head -40

echo ""
echo "=== Contents of pdf/index.tsx ==="
cat webviews/codex-webviews/src/NewSourceUploader/importers/pdf/index.tsx | head -40

Repository: genesis-ai-dev/codex-editor

Length of output: 2305


🏁 Script executed:

#!/bin/bash

echo "=== Searching for biblicaSwapperImporterPlugin ==="
rg 'biblicaSwapperImporterPlugin' webviews/codex-webviews/src/NewSourceUploader/importers/ -l

echo ""
echo "=== Checking if biblica-swapper directory exists ==="
ls -la webviews/codex-webviews/src/NewSourceUploader/importers/ | grep biblica

echo ""
echo "=== Contents of audio2/index.tsx ==="
cat webviews/codex-webviews/src/NewSourceUploader/importers/audio2/index.tsx | head -50

Repository: genesis-ai-dev/codex-editor

Length of output: 1139


🏁 Script executed:

#!/bin/bash

echo "=== Searching for biblica-swapper anywhere in codebase ==="
find webviews/codex-webviews -type d -name "*biblica*"

echo ""
echo "=== Checking registry.tsx line 30 (biblicaSwapperImporterPlugin import) ==="
sed -n '25,35p' webviews/codex-webviews/src/NewSourceUploader/importers/registry.tsx

echo ""
echo "=== Checking all audio directories ==="
find webviews/codex-webviews/src/NewSourceUploader/importers -type d -name "audio*"

echo ""
echo "=== Checking contents of audio directory ==="
ls -la webviews/codex-webviews/src/NewSourceUploader/importers/audio/

echo ""
echo "=== Checking contents of audio2 directory ==="
ls -la webviews/codex-webviews/src/NewSourceUploader/importers/audio2/

Repository: genesis-ai-dev/codex-editor

Length of output: 1655


🏁 Script executed:

#!/bin/bash

echo "=== Contents of audio/index.tsx ==="
cat webviews/codex-webviews/src/NewSourceUploader/importers/audio/index.tsx

echo ""
echo "=== Checking if both audio plugins are imported in registry ==="
rg 'audioImporterPlugin|audio2' webviews/codex-webviews/src/NewSourceUploader/importers/registry.tsx

Repository: genesis-ai-dev/codex-editor

Length of output: 1143


Remove broken import and fix duplicate plugin IDs

The registry has two critical issues:

  1. Missing module: biblicaSwapperImporterPlugin imported from ./biblica-swapper/index.tsx does not exist. Remove this import and its registry entry (lines 30, 158-163), or create the missing module.

  2. Duplicate plugin ID: Both audio/index.tsx and audio2/index.tsx export plugins with id: "audio", which breaks getImporterById() lookups. Either remove the unused audio2 directory or rename its plugin ID to something unique.

🤖 Prompt for AI Agents
In webviews/codex-webviews/src/NewSourceUploader/importers/registry.tsx lines
19-21 (and also review lines ~30 and 158-163), remove the broken import and
registry entry for biblicaSwapperImporterPlugin (import from
./biblica-swapper/index.tsx does not exist) by deleting its import and any place
it is added to the registry, or alternatively create the missing
./biblica-swapper/index.tsx module that exports the plugin; also fix the
duplicate plugin ID collision between audio/index.tsx and audio2/index.tsx by
either removing the unused audio2 importer or changing its exported plugin id to
a unique value (and update any references) so getImporterById() lookups are
unambiguous.
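
To catch ID collisions like the audio/audio2 one before they reach getImporterById(), a small check over the registered plugins is one option. The sketch below is illustrative only: PluginLike and findDuplicatePluginIds are hypothetical names, and the real ImporterPlugin type has more fields than shown.

```typescript
// Minimal stand-in for the fields of an importer plugin that matter here.
interface PluginLike {
    id: string;
    name: string;
}

// Return every plugin id that appears more than once, in first-seen order.
function findDuplicatePluginIds(plugins: readonly PluginLike[]): string[] {
    const counts = new Map<string, number>();
    for (const plugin of plugins) {
        counts.set(plugin.id, (counts.get(plugin.id) ?? 0) + 1);
    }
    const duplicates: string[] = [];
    counts.forEach((count, id) => {
        if (count > 1) duplicates.push(id);
    });
    return duplicates;
}
```

Running a check like this in a unit test, or once at registry construction in development builds, would surface the collision as a failure rather than as an ambiguous lookup.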

Comment on lines +74 to +191
// Track which target cells have been matched
const matchedTargetCells = new Set<any>();

// Process each imported content item
// Only match verses to existing target cells - don't create new cells
for (const importedItem of importedContent) {
if (!importedItem.content.trim()) {
continue; // Skip empty content
}

const importedId = importedItem.id;
let matchedCell: any | null = null;
let alignmentMethod: AlignedCell['alignmentMethod'] = 'custom';
let confidence = 0.0;

// Strategy 1: PRIORITIZE cellLabel matching (most reliable for verse matching)
// Check both importedItem.cellLabel and importedItem.metadata?.cellLabel
const cellLabel = importedItem.cellLabel || (importedItem as any).metadata?.cellLabel;
if (cellLabel) {
const labelStr = String(cellLabel).trim();
const normalizedLabel = labelStr.toUpperCase();

if (targetCellsByLabel.has(labelStr)) {
matchedCell = targetCellsByLabel.get(labelStr);
alignmentMethod = 'custom';
confidence = 0.95; // High confidence for label matching
labelMatches++;
} else if (targetCellsByLabel.has(normalizedLabel)) {
matchedCell = targetCellsByLabel.get(normalizedLabel);
alignmentMethod = 'custom';
confidence = 0.95; // High confidence for label matching
labelMatches++;
}
}

// Strategy 2: Try exact ID match (fallback)
// Try both original case and uppercase
if (!matchedCell && importedId) {
const normalizedId = String(importedId).trim().toUpperCase();
const originalId = String(importedId).trim();

if (targetCellsById.has(originalId)) {
matchedCell = targetCellsById.get(originalId);
alignmentMethod = 'exact-id';
confidence = 1.0;
exactMatches++;
} else if (targetCellsById.has(normalizedId)) {
matchedCell = targetCellsById.get(normalizedId);
alignmentMethod = 'exact-id';
confidence = 1.0;
exactMatches++;
}
}

// Strategy 3: Try verse reference matching (for verses) - last resort
// First try with book code for precise matching, then fallback to chapter:verse
if (!matchedCell && importedId) {
// Match pattern: book code (2+ chars), space(s), chapter number, colon, verse number
const verseMatch = String(importedId).match(/^([A-Z0-9]{2,})\s+(\d+):(\d+[a-z]?)$/i);
if (verseMatch) {
const [, bookCode, chapter, verse] = verseMatch;
const normalizedBookCode = bookCode.toUpperCase();
// Try matching with normalized book code first (more precise)
const verseRefWithBook = `${normalizedBookCode} ${chapter}:${verse}`;
if (targetVersesByRef.has(verseRefWithBook)) {
matchedCell = targetVersesByRef.get(verseRefWithBook);
alignmentMethod = 'custom';
confidence = 0.9; // High confidence for book-specific verse matching
verseMatches++;
} else {
// Fallback to chapter:verse matching (in case book codes differ slightly)
const verseRef = `${chapter}:${verse}`;
if (targetVersesByRef.has(verseRef)) {
matchedCell = targetVersesByRef.get(verseRef);
alignmentMethod = 'custom';
confidence = 0.85; // Medium-high confidence for verse matching
verseMatches++;
}
}
}
}

// Only add aligned cell if we found a match
// Skip unmatched verses - don't create new cells for them
if (matchedCell) {
matchedTargetCells.add(matchedCell);
alignedCells.push({
notebookCell: matchedCell,
importedContent: importedItem,
alignmentMethod,
confidence,
});
} else {
// No match found - skip this verse (don't create new cells)
// Log for debugging but don't add to alignedCells
console.warn(`[USFM Aligner] No match found for verse: ${importedId || 'unknown'}`);
unmatched++;
}
}

// IMPORTANT: Preserve all existing target cells that weren't matched
// This ensures preface cells (chapter 0), headers, and other non-verse cells are kept
for (const targetCell of targetCells) {
if (!matchedTargetCells.has(targetCell)) {
// This cell wasn't matched - preserve it with its original content
alignedCells.push({
notebookCell: targetCell,
importedContent: {
id: (targetCell.metadata?.id || targetCell.id) || '',
content: targetCell.value || targetCell.content || '',
cellLabel: targetCell.metadata?.cellLabel,
metadata: targetCell.metadata || {},
},
alignmentMethod: 'custom', // Preserved existing cell
confidence: 1.0,
});
}
}

⚠️ Potential issue | 🟠 Major

Return aligned cells in targetCells order (current logic can reorder everything)

Right now you push() matched cells in importedContent order and then append all preserved target cells, which can reorder the final aligned list. If downstream applies updates sequentially or UI previews are order-dependent, this will be painful.

Suggested shape: collect matches into a Map<targetCell, alignedPayload>, then iterate targetCells once and emit either the matched alignment or the preserved cell (keeps stable order and guarantees one entry per target cell).

🤖 Prompt for AI Agents
In
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmCellAligner.ts
around lines 74-191, the current logic pushes matched alignedCells in
importedContent order and then appends preserved target cells, which reorders
entries; change this to build a Map keyed by targetCell -> alignedPayload when
matching (instead of pushing directly), track unmatched count separately, then
after processing importedContent iterate the original targetCells array in order
and push either the mapped matched payload or a preserved payload for that
targetCell (ensuring exactly one output per target cell and stable ordering);
remove the final loop that filters unmatched targetCells into alignedCells and
instead use the single ordered iteration over targetCells to emit results.
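
The two-pass shape suggested above, sketched with simplified types. Everything here is an assumption for illustration: SketchCell, SketchImported, SketchAligned, and alignInTargetOrder are hypothetical, only exact-ID matching is shown (the label and verse-reference strategies are omitted), and "first match wins" for a repeated target is a choice of this sketch rather than something the PR specifies.

```typescript
interface SketchCell { id: string; value: string; }
interface SketchImported { id: string; content: string; }
interface SketchAligned {
    notebookCell: SketchCell;
    importedContent: SketchImported;
    alignmentMethod: "exact-id" | "custom";
    confidence: number;
}

function alignInTargetOrder(
    targetCells: readonly SketchCell[],
    importedContent: readonly SketchImported[]
): SketchAligned[] {
    const targetsById = new Map<string, SketchCell>();
    for (const cell of targetCells) targetsById.set(cell.id, cell);

    // Pass 1: record matches keyed by target cell; nothing is emitted yet.
    const matches = new Map<SketchCell, SketchAligned>();
    for (const item of importedContent) {
        if (!item.content.trim()) continue; // skip empty imported content
        const cell = targetsById.get(item.id);
        if (cell && !matches.has(cell)) {
            matches.set(cell, {
                notebookCell: cell,
                importedContent: item,
                alignmentMethod: "exact-id",
                confidence: 1.0,
            });
        }
    }

    // Pass 2: walk targetCells once, emitting the match or the preserved cell.
    return targetCells.map((cell) =>
        matches.get(cell) ?? {
            notebookCell: cell,
            importedContent: { id: cell.id, content: cell.value },
            alignmentMethod: "custom",
            confidence: 1.0,
        }
    );
}
```

Because the output is produced by a single pass over targetCells, it always has exactly one entry per target cell, in the target's own order, whatever order the imported items arrive in.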

Comment on lines +179 to +189
alignedCells.push({
notebookCell: targetCell,
importedContent: {
id: (targetCell.metadata?.id || targetCell.id) || '',
content: targetCell.value || targetCell.content || '',
cellLabel: targetCell.metadata?.cellLabel,
metadata: targetCell.metadata || {},
},
alignmentMethod: 'custom', // Preserved existing cell
confidence: 1.0,
});

⚠️ Potential issue | 🟡 Minor

Use ?? (not ||) when preserving existing cell content

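targetCell.value || targetCell.content || '' treats an empty-string value as missing and falls through to targetCell.content, so a legitimately empty cell is replaced by whatever content holds. Safer: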

-                    content: targetCell.value || targetCell.content || '',
+                    content: (targetCell.value ?? targetCell.content ?? ''),
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
             alignedCells.push({
                 notebookCell: targetCell,
                 importedContent: {
                     id: (targetCell.metadata?.id || targetCell.id) || '',
-                    content: targetCell.value || targetCell.content || '',
+                    content: (targetCell.value ?? targetCell.content ?? ''),
                     cellLabel: targetCell.metadata?.cellLabel,
                     metadata: targetCell.metadata || {},
                 },
                 alignmentMethod: 'custom', // Preserved existing cell
                 confidence: 1.0,
             });
🤖 Prompt for AI Agents
In
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmCellAligner.ts
around lines 179 to 189, the code uses the logical OR operator to choose content
(targetCell.value || targetCell.content || ''), which will treat legitimate
empty strings as falsy and replace them; change those fallbacks to use the
nullish coalescing operator (??) so that only null/undefined are replaced (e.g.,
targetCell.value ?? targetCell.content ?? ''), and apply the same change to any
other fields here that should allow empty strings (such as id if intended),
ensuring metadata remains unchanged.
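
To make the difference concrete, here is a small sketch comparing the two fallbacks. The `PreservedCell` shape and the two helper names are hypothetical and exist only for illustration.

```typescript
// Hypothetical cell shape for illustration only.
interface PreservedCell {
    value?: string | null;
    content?: string | null;
}

// Logical OR skips every falsy value, including the empty string.
const contentWithOr = (cell: PreservedCell): string =>
    cell.value || cell.content || "";

// Nullish coalescing skips only null and undefined.
const contentWithNullish = (cell: PreservedCell): string =>
    cell.value ?? cell.content ?? "";

// A cell whose value was deliberately emptied.
const clearedCell: PreservedCell = { value: "", content: "old text" };

console.log(contentWithOr(clearedCell));      // "old text"
console.log(contentWithNullish(clearedCell)); // ""
```

The two helpers agree whenever value is null, undefined, or non-empty; they differ only for the empty-string case the comment is concerned with.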

Comment on lines +381 to +387
} else {
// No translation found - keep original
updatedLines.push(mapping.originalLine || lines[i]);
if (mapping.cellId && cellTranslations.has(mapping.cellId)) {
console.warn(`[USFM Export] No translation found for cellId: ${mapping.cellId}`);
}
}

⚠️ Potential issue | 🟡 Minor

Warning condition appears to be unreachable.

The condition on line 384 checks cellTranslations.has(mapping.cellId) in the "no translation found" else branch. However, if cellTranslations.has(mapping.cellId) is true, the translation should have been retrieved on line 172. This warning will never trigger as written.

         } else {
             // No translation found - keep original
             updatedLines.push(mapping.originalLine || lines[i]);
-            if (mapping.cellId && cellTranslations.has(mapping.cellId)) {
-                console.warn(`[USFM Export] No translation found for cellId: ${mapping.cellId}`);
+            if (mapping.cellId) {
+                // Log that we expected a translation but didn't find one
+                console.log(`[USFM Export] No translation found for cellId: ${mapping.cellId}`);
             }
         }
📝 Committable suggestion


Suggested change
         } else {
             // No translation found - keep original
             updatedLines.push(mapping.originalLine || lines[i]);
-            if (mapping.cellId && cellTranslations.has(mapping.cellId)) {
-                console.warn(`[USFM Export] No translation found for cellId: ${mapping.cellId}`);
+            if (mapping.cellId) {
+                // Log that we expected a translation but didn't find one
+                console.log(`[USFM Export] No translation found for cellId: ${mapping.cellId}`);
             }
         }
🤖 Prompt for AI Agents
In
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/usfmExporter.ts
around lines 381 to 387, the warning checks cellTranslations.has(mapping.cellId)
inside the "no translation found" else branch which is unreachable; change the
condition to warn when a mapping.cellId exists but the cellTranslations map does
NOT contain it (i.e., invert the check to !cellTranslations.has(mapping.cellId))
or remove the redundant warning entirely—ensure the warning only fires when
there is a cellId and its translation is missing.
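
A minimal sketch of the inverted check, assuming a much simpler rebuild loop than the real exporter: the `LineMapping` shape and the `rebuildLines` function are hypothetical, and a found translation replaces the whole line here, which the actual usfmExporter.ts may not do.

```typescript
// Hypothetical, simplified mapping shape.
interface LineMapping {
    originalLine: string;
    cellId?: string;
}

function rebuildLines(
    lines: string[],
    mappings: LineMapping[],
    cellTranslations: Map<string, string>
): { updatedLines: string[]; missing: string[] } {
    const updatedLines: string[] = [];
    const missing: string[] = [];

    mappings.forEach((mapping, i) => {
        const translation = mapping.cellId
            ? cellTranslations.get(mapping.cellId)
            : undefined;

        if (translation !== undefined) {
            // Simplification: the translation replaces the whole line.
            updatedLines.push(translation);
        } else {
            // No translation found - keep original
            updatedLines.push(mapping.originalLine || lines[i] || "");
            // Warn only when a cellId exists and the map does NOT contain it.
            if (mapping.cellId && !cellTranslations.has(mapping.cellId)) {
                missing.push(mapping.cellId);
                console.warn(`[USFM Export] No translation found for cellId: ${mapping.cellId}`);
            }
        }
    });

    return { updatedLines, missing };
}
```

Lines without a cellId (structural markers, for example) pass through silently, so the warning fires only for cells that were expected to carry a translation.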

@@ -0,0 +1,409 @@
import React, { useState, useCallback } from "react";

🛠️ Refactor suggestion | 🟠 Major

Missing "use client" directive.

Per coding guidelines for TSX files in a Vite + React + TypeScript VSCode webview environment, add the "use client" directive at the top of the file.

+"use client";
+
 import React, { useState, useCallback } from "react";
📝 Committable suggestion


Suggested change
+"use client";
+
 import React, { useState, useCallback } from "react";
🤖 Prompt for AI Agents
In
webviews/codex-webviews/src/NewSourceUploader/importers/usfm/experimental/UsfmImporterForm.tsx
around line 1, the file is missing the required "use client" directive for TSX
files in our Vite + React + TypeScript webview environment; add a single line
containing "use client" (exactly that string with quotes) at the very top of the
file before any imports to enable client-side semantics.

console.log(`[USFM Export] LineMappings with cellId: ${lineMappings.filter(m => m.cellId && m.cellId !== '').length}`);
} else {
// First param is codexCells (backward compatibility - old experimental exporter)
cells = lineMappingsOrCells as Array<{ kind: number; value: string; metadata: any; }>;

Empty lineMappings array causes codexCells to be ignored

When lineMappings is an empty array [], the export fails to use the provided codexCells translations. In exportHandler.ts, the check if (lineMappings) passes for an empty array (truthy in JavaScript), calling exportUsfmRoundtrip with three arguments. However, in exportUsfmRoundtrip, the condition lineMappingsOrCells.length > 0 fails for an empty array, causing the function to incorrectly treat the empty array as codexCells and ignore the actual third parameter. This results in no translations being applied to the exported file.
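
One possible way to avoid the ambiguity, sketched under assumptions: the `resolveExportArgs` helper and the simplified `LineMapping` and `CodexCell` shapes below are hypothetical and are not the actual exportUsfmRoundtrip signature. The idea is to decide which call form is in use from the presence of the trailing codexCells argument, not from the length of the first array.

```typescript
// Hypothetical, simplified shapes for illustration.
interface LineMapping {
    originalLine: string;
    cellId?: string;
}
interface CodexCell {
    kind: number;
    value: string;
    metadata: { id?: string };
}

function resolveExportArgs(
    lineMappingsOrCells: LineMapping[] | CodexCell[],
    codexCells?: CodexCell[]
): { lineMappings: LineMapping[]; cells: CodexCell[] } {
    if (codexCells !== undefined) {
        // New call form: the first argument is lineMappings, even when empty.
        return { lineMappings: lineMappingsOrCells as LineMapping[], cells: codexCells };
    }
    // Old call form (backward compatibility): the first argument is codexCells.
    return { lineMappings: [], cells: lineMappingsOrCells as CodexCell[] };
}
```

With this check, an empty lineMappings array no longer shadows the cells that were passed explicitly; the caller could also skip the round-trip path when lineMappings is empty, which is a separate decision.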


const cellChapter = seenFirstChapter ? currentChapter : 1;

// Handle verse markers specially - collect multi-line verses
if (marker === 'v' || marker.startsWith('v')) {

Marker check incorrectly matches \va and \vp as verses

The condition marker === 'v' || marker.startsWith('v') incorrectly matches USFM markers \va (alternate verse number) and \vp (published verse character) as regular verse markers. These markers have different formats and semantics than \v. When these markers appear on their own lines, they would be incorrectly parsed as verses, potentially corrupting the cell structure and breaking round-trip export for USFM files using these markers.
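
A short sketch of a stricter check, assuming the marker is extracted from the start of each line. Both `markerOf` and `isVerseMarker` are hypothetical helpers, not functions from the parser under review.

```typescript
// Extract the leading USFM marker name (letters plus optional level digits).
function markerOf(line: string): string | null {
    const match = /^\\([A-Za-z]+\d*)/.exec(line.trim());
    return match ? match[1] : null;
}

// Exact match only: \va and \vp are distinct markers, not verses.
function isVerseMarker(marker: string): boolean {
    return marker === "v";
}

for (const line of ["\\v 1 In the beginning", "\\va 3\\va*", "\\vp 1b\\vp*", "\\li1 item"]) {
    const marker = markerOf(line);
    console.log(marker, marker !== null && isVerseMarker(marker));
}
```

Dropping the startsWith('v') branch is enough on its own; the extraction helper is included only so the example runs end to end.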


currentVerse = {
verseNumber,
verseText: verseText ? [verseText] : [],
breakTags: [''],

Array misalignment when verse line has no text

When a verse line has no text (e.g., \v 1 followed by \li1 text), the verseText and breakTags arrays become misaligned. The initialization sets verseText: [] when text is empty, but breakTags: [''] always starts with one element. When continuation lines are added, both arrays receive a push(), but they remain off by one. In finishCurrentVerse(), the loop iterates over verseText.length, so the continuation line's text at index 0 incorrectly pairs with the empty break tag at index 0 instead of the actual break tag at index 1. This causes missing <br> tags in HTML output and incorrect round-trip export for USFM files with poetry or list structures.
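
One way to rule out this class of bug is to store each break tag together with its text, so the two can never drift apart. The sketch below is an assumption-laden illustration: `VerseSegment`, `startVerse`, `addContinuation`, and `renderVerseHtml` are hypothetical names, and the `data-tag` attribute on `<br>` is an invented output format, not necessarily what the importer emits.

```typescript
// Each segment carries its own break tag, so there is a single array to keep in sync.
interface VerseSegment {
    breakTag: string; // '' for text that sits on the \v line itself
    text: string;
}
interface VerseAccumulator {
    verseNumber: string;
    segments: VerseSegment[];
}

function startVerse(verseNumber: string, verseText: string): VerseAccumulator {
    return {
        verseNumber,
        // No placeholder entry when the \v line has no text.
        segments: verseText ? [{ breakTag: "", text: verseText }] : [],
    };
}

function addContinuation(verse: VerseAccumulator, breakTag: string, text: string): void {
    verse.segments.push({ breakTag, text });
}

function renderVerseHtml(verse: VerseAccumulator): string {
    return verse.segments
        .map((s) => (s.breakTag ? `<br data-tag="${s.breakTag}">${s.text}` : s.text))
        .join("");
}

// "\v 1" followed by "\li1 text": the continuation keeps its own break tag.
const verse = startVerse("1", "");
addContinuation(verse, "li1", "text");
console.log(renderVerseHtml(verse)); // <br data-tag="li1">text
```

A smaller fix that keeps the two parallel arrays would be to initialize breakTags to an empty array whenever verseText starts empty, so both arrays always have the same length.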

