Description
❌ This issue is not open for contribution. Visit Contributing guidelines to learn about the contributing process and how to find suitable issues.
Overview
This is the FOUNDATION issue for the constants migration project. It establishes the infrastructure and pattern that Issues #2-5 will follow. This issue must be completed before the other migration issues can proceed.
Context
Currently, le_utils/constants/file_formats.py uses the legacy approach:
- Loads resources/formatlookup.json at runtime with pkgutil.get_data()
- Manual Python constants (MP4 = "mp4", PDF = "pdf", etc.) must be kept in sync
- Manual _FORMATLOOKUP dict and getformat() helper function
- No JavaScript export available
- Tests verify Python/JSON sync
This issue migrates it to the modern spec + code generation approach used by 8 other modules.
Scope
This issue will:
- Enhance generate_from_specs.py to support namedtuple-based constants (the key infrastructure work)
- Create spec/constants-file_formats.json following the new format
- Generate Python and JavaScript files via make build
- Update tests to verify against the spec
- Delete resources/formatlookup.json
- Document the spec format for subsequent tasks
Current Structure
File: le_utils/resources/formatlookup.json (only has 20 formats)
{
"mp4": {"mimetype": "video/mp4"},
"webm": {"mimetype": "video/webm"},
"vtt": {"mimetype": ".vtt"},
"pdf": {"mimetype": "application/pdf"},
...
}
Python module (file_formats.py) currently has 40+ manual constants including:
- Formats in JSON: MP4, WEBM, VTT, PDF, EPUB, MP3, JPG, JPEG, PNG, GIF, JSON, SVG, GRAPHIE, PERSEUS, H5P, ZIM, HTML5 (zip), BLOOMPUB, BLOOMD, HTML5_ARTICLE (kpub)
- Formats NOT in JSON (these need to be added to spec): AVI, MOV, MPG, WMV, MKV, FLV, OGV, M4V, SRT, TTML, SAMI, SCC, DFXP
- Namedtuple: class Format(namedtuple("Format", ["id", "mimetype"])): pass
- LIST, choices tuple, helper function getformat()
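The gap between the module and the JSON file can be listed mechanically. The following is a minimal sketch, assuming format ids are the uppercase string constants of the module and that names ending in _MIMETYPE should be skipped; `format_ids`, `missing_from_json`, and the stand-in module are hypothetical names for illustration, not part of le_utils.

```python
import types


def format_ids(module):
    """Collect values of uppercase string constants, skipping *_MIMETYPE names."""
    return {
        value
        for name, value in vars(module).items()
        if name.isupper()
        and isinstance(value, str)
        and not name.endswith("_MIMETYPE")
    }


def missing_from_json(module, lookup):
    """Return format ids defined in the module but absent from the JSON lookup."""
    return sorted(format_ids(module) - set(lookup))


# Hypothetical, abbreviated stand-in for le_utils.constants.file_formats.
legacy = types.ModuleType("legacy_file_formats")
legacy.MP4 = "mp4"
legacy.MP4_MIMETYPE = "video/mp4"
legacy.AVI = "avi"
legacy.SRT = "srt"

# Abbreviated stand-in for the contents of resources/formatlookup.json.
lookup = {"mp4": {"mimetype": "video/mp4"}}

print(missing_from_json(legacy, lookup))
```

Run against the real module and JSON file, a check like this should report the formats listed above as NOT in JSON.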
Target Spec Format
Create spec/constants-file_formats.json with ALL formats including those currently missing from JSON:
{
"namedtuple": {
"name": "Format",
"fields": ["id", "mimetype"]
},
"constants": {
"mp4": {"mimetype": "video/mp4"},
"webm": {"mimetype": "video/webm"},
"avi": {"mimetype": "video/x-msvideo"},
"mov": {"mimetype": "video/quicktime"},
"mpg": {"mimetype": "video/mpeg"},
"wmv": {"mimetype": "video/x-ms-wmv"},
"mkv": {"mimetype": "video/x-matroska"},
"flv": {"mimetype": "video/x-flv"},
"ogv": {"mimetype": "video/ogg"},
"m4v": {"mimetype": "video/x-m4v"},
"vtt": {"mimetype": "text/vtt"},
"srt": {"mimetype": "application/x-subrip"},
"ttml": {"mimetype": "application/ttml+xml"},
"sami": {"mimetype": "application/x-sami"},
"scc": {"mimetype": "text/x-scc"},
"dfxp": {"mimetype": "application/ttaf+xml"},
"mp3": {"mimetype": "audio/mpeg"},
"pdf": {"mimetype": "application/pdf"},
"epub": {"mimetype": "application/epub+zip"},
"jpg": {"mimetype": "image/jpeg"},
"jpeg": {"mimetype": "image/jpeg"},
"png": {"mimetype": "image/png"},
"gif": {"mimetype": "image/gif"},
"json": {"mimetype": "application/json"},
"svg": {"mimetype": "image/svg+xml"},
"graphie": {"mimetype": "application/graphie"},
"perseus": {"mimetype": "application/perseus+zip"},
"h5p": {"mimetype": "application/h5p+zip"},
"zim": {"mimetype": "application/zim"},
"zip": {"mimetype": "application/zip"},
"bloompub": {"mimetype": "application/bloompub+zip"},
"bloomd": {"mimetype": "application/bloompub+zip"},
"kpub": {"mimetype": "application/kpub+zip"}
}
}
How to determine mimetypes for missing formats:
- Check MDN Web Docs for standard mimetypes: https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types
- For video/subtitle formats, use common IANA registered types or conventional x- prefixes
- For custom formats (graphie, perseus, kpub, etc.), use the application/{format} or application/{format}+zip pattern
- When uncertain, search "mime type for {extension}" or check existing file type databases
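As a first pass before consulting MDN or IANA, Python's standard mimetypes module can suggest candidates. This is only a sketch: `suggest_mimetypes` is a hypothetical helper, the guesses depend on the platform and Python version, and custom formats such as graphie will normally have no entry, so every value still needs a manual check against the spec above.

```python
import mimetypes


def suggest_mimetypes(extensions):
    """Map each extension to the stdlib's guess, or None when it has no entry."""
    suggestions = {}
    for ext in extensions:
        guessed, _encoding = mimetypes.guess_type("file." + ext)
        suggestions[ext] = guessed
    return suggestions


candidates = suggest_mimetypes(["pdf", "mp4", "graphie"])
for ext, mimetype in sorted(candidates.items()):
    print(ext, "->", mimetype or "no guess; choose manually")
```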
Generation Script Enhancement
Update scripts/generate_from_specs.py to handle the namedtuple format:
- Modify read_constants_specs() to detect and handle namedtuple format:
  - Check if spec has namedtuple key
  - If yes, extract namedtuple definition and constants
  - If no, use existing simple constant handling
- Update write_python_file() to support namedtuples:
  - Add from collections import namedtuple import when needed
  - Generate namedtuple class definition
  - Generate {MODULE}LIST with namedtuple instances
  - Generate uppercase constants from keys (e.g., MP4 = "mp4")
  - Generate _MIMETYPE constants (e.g., MP4_MIMETYPE = "video/mp4") for each format
  - Generate choices tuple with custom display names (from spec or title-cased)
  - Generate lookup dict: _{MODULE}LOOKUP = {item.id: item for item in {MODULE}LIST}
  - Generate helper function (e.g., getformat())
- Update write_js_file() to export rich namedtuple data with PascalCase:
  - Export constant name → id mapping (default export, e.g., MP4: "mp4")
  - Export FormatsList - full namedtuple data as array
  - Export FormatsMap - Map for efficient lookups
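The detection and Python rendering steps above might look roughly like the sketch below. It does not reflect the real internals of scripts/generate_from_specs.py: `is_namedtuple_spec`, `render_python`, and the `prefix` and `helper` parameters are hypothetical, and display names are simply title-cased rather than read from the spec.

```python
import json


def is_namedtuple_spec(spec):
    """A spec uses the new format when it carries a top-level "namedtuple" key."""
    return isinstance(spec, dict) and "namedtuple" in spec


def render_python(spec, prefix="FORMAT", helper="getformat"):
    """Render Python module source for a namedtuple spec (illustrative only)."""
    name = spec["namedtuple"]["name"]
    fields = spec["namedtuple"]["fields"]
    constants = spec["constants"]
    q = json.dumps  # emits double-quoted literals that are also valid Python
    out = [
        "from collections import namedtuple",
        "",
        "class {}(namedtuple({}, {})):".format(name, q(name), q(fields)),
        "    pass",
        "",
    ]
    # Uppercase id constants, e.g. MP4 = "mp4"
    for key in constants:
        out.append("{} = {}".format(key.upper(), q(key)))
    # One constant per extra field, e.g. MP4_MIMETYPE = "video/mp4"
    for key, values in constants.items():
        for field in fields:
            if field != "id":
                out.append(
                    "{}_{} = {}".format(key.upper(), field.upper(), q(values[field]))
                )
    out.append("choices = (")
    for key in constants:
        out.append("    ({}, {}),".format(key.upper(), q(key.title())))
    out.append(")")
    out.append("{}LIST = [".format(prefix))
    for key, values in constants.items():
        args = ", ".join(
            "{}={}".format(f, q(key if f == "id" else values[f])) for f in fields
        )
        out.append("    {}({}),".format(name, args))
    out.append("]")
    out.append("_{0}LOOKUP = {{item.id: item for item in {0}LIST}}".format(prefix))
    out.append("def {}(id, default=None):".format(helper))
    out.append("    return _{}LOOKUP.get(id) or None".format(prefix))
    return "\n".join(out) + "\n"


spec = {
    "namedtuple": {"name": "Format", "fields": ["id", "mimetype"]},
    "constants": {
        "mp4": {"mimetype": "video/mp4"},
        "pdf": {"mimetype": "application/pdf"},
    },
}

if is_namedtuple_spec(spec):
    source = render_python(spec)
    namespace = {}
    exec(source, namespace)  # load the generated source to sanity-check it
    print(namespace["getformat"]("pdf"))
```

Executing the rendered source, as the last lines do, is a cheap way to confirm the generator emits importable Python before the file is written to le_utils/constants/.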
Generated Output Example
Python (le_utils/constants/file_formats.py):
# -*- coding: utf-8 -*-
# Generated by scripts/generate_from_specs.py
from __future__ import unicode_literals
from collections import namedtuple
# FileFormats
class Format(namedtuple("Format", ["id", "mimetype"])):
    pass
# Format constants
MP4 = "mp4"
WEBM = "webm"
AVI = "avi"
PDF = "pdf"
# ... (all formats)
# Mimetype constants
MP4_MIMETYPE = "video/mp4"
WEBM_MIMETYPE = "video/webm"
AVI_MIMETYPE = "video/x-msvideo"
PDF_MIMETYPE = "application/pdf"
# ...
choices = (
    (MP4, "Mp4"),
    (WEBM, "Webm"),
    (AVI, "Avi"),
    (PDF, "Pdf"),
    # ...
)
FORMATLIST = [
    Format(id="mp4", mimetype="video/mp4"),
    Format(id="webm", mimetype="video/webm"),
    Format(id="avi", mimetype="video/x-msvideo"),
    # ...
]
_FORMATLOOKUP = {f.id: f for f in FORMATLIST}
def getformat(id, default=None):
    """
    Try to lookup a file format object for its `id` in internal representation.
    Returns None if lookup by internal representation fails.
    """
    return _FORMATLOOKUP.get(id) or None
JavaScript (js/FileFormats.js):
// Generated by scripts/generate_from_specs.py
// Format constants
export default {
  MP4: "mp4",
  WEBM: "webm",
  AVI: "avi",
  PDF: "pdf",
  // ...
};
// Full format data with mimetypes
export const FormatsList = [
  { id: "mp4", mimetype: "video/mp4" },
  { id: "webm", mimetype: "video/webm" },
  { id: "avi", mimetype: "video/x-msvideo" },
  { id: "pdf", mimetype: "application/pdf" },
  // ...
];
// Lookup Map
export const FormatsMap = new Map(
  FormatsList.map(format => [format.id, format])
);
This way JavaScript code can:
- Use constants: import FileFormats from './FileFormats'; if (ext === FileFormats.MP4) ...
- Access full data: import { FormatsList } from './FileFormats';
- Look up by id: import { FormatsMap } from './FileFormats'; const format = FormatsMap.get('pdf');
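On the generator side, producing this JavaScript file from the spec could be sketched in Python as follows. `render_js` and its `list_name` and `map_name` parameters are hypothetical and only illustrate the three exports; the actual write_js_file() may be structured differently.

```python
import json


def render_js(spec, list_name="FormatsList", map_name="FormatsMap"):
    """Render an ES module for a namedtuple spec (illustrative sketch only)."""
    fields = spec["namedtuple"]["fields"]
    constants = spec["constants"]
    lines = ["// Generated by scripts/generate_from_specs.py", "export default {"]
    # Default export: constant name -> id
    for key in constants:
        lines.append("  {}: {},".format(key.upper(), json.dumps(key)))
    lines.append("};")
    lines.append("")
    # Named export: one object per format, carrying every namedtuple field
    lines.append("export const {} = [".format(list_name))
    for key, values in constants.items():
        pairs = ", ".join(
            "{}: {}".format(f, json.dumps(key if f == "id" else values[f]))
            for f in fields
        )
        lines.append("  {{ {} }},".format(pairs))
    lines.append("];")
    lines.append("")
    # Named export: Map keyed by id for lookups
    lines.append("export const {} = new Map(".format(map_name))
    lines.append("  {}.map(item => [item.id, item])".format(list_name))
    lines.append(");")
    return "\n".join(lines) + "\n"


spec = {
    "namedtuple": {"name": "Format", "fields": ["id", "mimetype"]},
    "constants": {
        "mp4": {"mimetype": "video/mp4"},
        "pdf": {"mimetype": "application/pdf"},
    },
}
js_source = render_js(spec)
print(js_source)
```

Using json.dumps for the values keeps string escaping consistent between the Python and JavaScript outputs.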
Testing Updates
File: tests/test_formats.py
Update to test against spec instead of old JSON:
import os
import json
spec_path = os.path.join(os.path.dirname(__file__), "..", "spec", "constants-file_formats.json")
with open(spec_path) as f:
    spec = json.load(f)
formatlookup = spec["constants"]
# Verify all constants in Python module match spec
# Verify FORMATLIST namedtuples match spec data
# Test getformat() helper
# Verify _MIMETYPE constants
How to Run Tests
# Run file formats tests
pytest tests/test_formats.py -v
# Run all tests to ensure nothing broke
pytest tests/ -v
Acceptance Criteria
- scripts/generate_from_specs.py enhanced to support namedtuple specs
- spec/constants-file_formats.json created with ALL formats (including AVI, MOV, SRT, etc. currently missing)
- Mimetypes determined for all missing formats (using MDN/IANA resources)
- make build successfully generates Python and JavaScript files
- Generated le_utils/constants/file_formats.py has:
  - Namedtuple class definition
  - Uppercase format constants for ALL formats
  - _MIMETYPE constants for each format
  - choices tuple
  - FORMATLIST with namedtuple instances
  - _FORMATLOOKUP dict
  - getformat() helper function
- Generated js/FileFormats.js has:
  - Default export with constant name mappings
  - FormatsList export (PascalCase) with full data
  - FormatsMap export (PascalCase) as Map
- tests/test_formats.py updated to test against spec
- All tests pass: pytest tests/ -v
- resources/formatlookup.json deleted
- Auto-generated comment in code
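The spec-based checks outlined under Testing Updates could be gathered into a single helper, sketched here under stated assumptions: `spec_mismatches` is a hypothetical function, and a SimpleNamespace stands in for the generated le_utils.constants.file_formats module so the example runs on its own.

```python
from collections import namedtuple
from types import SimpleNamespace


def spec_mismatches(module, spec, list_name="FORMATLIST", helper="getformat"):
    """Return human-readable mismatches between a generated module and its spec."""
    problems = []
    constants = spec["constants"]
    for key, values in constants.items():
        name = key.upper()
        # Uppercase id constant, e.g. MP4 == "mp4"
        if getattr(module, name, None) != key:
            problems.append("{} != {!r}".format(name, key))
        # Matching mimetype constant, e.g. MP4_MIMETYPE == "video/mp4"
        if getattr(module, name + "_MIMETYPE", None) != values["mimetype"]:
            problems.append("{}_MIMETYPE mismatch".format(name))
        # Helper lookup returns a namedtuple carrying the spec's mimetype
        item = getattr(module, helper)(key)
        if item is None or item.mimetype != values["mimetype"]:
            problems.append("{}({!r}) mismatch".format(helper, key))
    # The list must contain exactly the ids in the spec
    listed = {item.id for item in getattr(module, list_name)}
    if listed != set(constants):
        problems.append("{} ids differ from spec".format(list_name))
    return problems


Format = namedtuple("Format", ["id", "mimetype"])
spec = {
    "namedtuple": {"name": "Format", "fields": ["id", "mimetype"]},
    "constants": {
        "mp4": {"mimetype": "video/mp4"},
        "pdf": {"mimetype": "application/pdf"},
    },
}
items = [Format("mp4", "video/mp4"), Format("pdf", "application/pdf")]
lookup = {item.id: item for item in items}

# Hypothetical stand-in for the generated module.
fake_module = SimpleNamespace(
    MP4="mp4",
    MP4_MIMETYPE="video/mp4",
    PDF="pdf",
    PDF_MIMETYPE="application/pdf",
    FORMATLIST=items,
    getformat=lookup.get,
)

print(spec_mismatches(fake_module, spec))
```

In tests/test_formats.py, a test could load the spec as shown under Testing Updates, import the real module, and assert that the returned list is empty.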
Disclosure
🤖 This issue was written by Claude Code, under supervision, review and final edits by @rtibbles 🤖
