Closed
Overview
This task updates `embed_content` in `RecommendationsAdapter` so that, for each node, it collects all file URLs from which textual content will be extracted.
Description and outcomes
- Update the `embed_content` in `RecommendationsAdapter`
  - The `embed_content` accepts a list of nodes (`ContentNode`) as a parameter.
  - For each node, find all file URLs to be extracted
  - Use the `kind` and `preset` fields in the `ContentNode` and `File` models respectively to determine which file URLs to extract.
  - Currently, all studio files are stored in this bucket
- Finding file URLs
  - Audio files (`mp3`)
    - Return the corresponding URL(s)
  - Video files (`mp4`, `web`)
    - Return the corresponding subtitle URL(s) if they exist, else return corresponding URL(s) for the actual video files
  - HTML files (`html5`)
    - HTML files are uploaded as zip files and extracted into this bucket
    - Return the corresponding URL(s) of the extracted zip location.
  - H5P files (`h5p`)
    - Return the corresponding URL(s)
  - ZIM files (`zim`)
    - Return the corresponding URL(s)
  - Document files (`pdf`, `epub`)
    - Return the corresponding URL(s)
  - Exercise files (Perseus)
    - Return the corresponding URL(s)
- Making a request
  - Make a request to the recommendations backend. For example:

    ```python
    body = {
        'resources': resources,
        'metadata': {}
    }
    embed_content_request = EmbedContentRequest(
        headers={},  # Leaving this to allow for passing of headers to external api
        params={},  # Same for this
        body=body
    )
    return self.backend.make_request(embed_content_request)
    ```

    Where `resources` is the updated embed_content_request.json
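The file-selection rules listed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual adapter code: `StubNode` and `StubFile` are hypothetical stand-ins for the `ContentNode` and `File` models, the kind strings and the `video_subtitle` preset id are assumed values, and both base URLs are placeholders because the real bucket locations are not given here.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed preset id for subtitle files; verify against the real preset constants.
SUBTITLE_PRESET = "video_subtitle"
# Placeholder base URLs; the real bucket locations are not stated in this issue.
FILES_BASE = "https://storage.example/files"
EXTRACTED_BASE = "https://storage.example/extracted"


@dataclass
class StubFile:
    """Hypothetical stand-in for the File model: a preset and a stored filename."""
    preset: str
    filename: str


@dataclass
class StubNode:
    """Hypothetical stand-in for ContentNode: a kind and its attached files."""
    kind: str
    files: List[StubFile] = field(default_factory=list)


def file_url(f: StubFile) -> str:
    """Build the URL of a stored file under the placeholder files bucket."""
    return f"{FILES_BASE}/{f.filename}"


def select_file_urls(node: StubNode) -> List[str]:
    """Return the URLs to extract text from for one node."""
    if node.kind == "video":
        subtitles = [f for f in node.files if f.preset == SUBTITLE_PRESET]
        # Prefer subtitles; fall back to the node's other files if none exist.
        # A real implementation would also need to skip thumbnails and similar.
        chosen = subtitles or node.files
        return [file_url(f) for f in chosen]
    if node.kind == "html5":
        # HTML5 zips are referenced by their extracted location instead.
        return [f"{EXTRACTED_BASE}/{f.filename}" for f in node.files]
    # audio, h5p, zim, document and exercise: return the files' own URLs.
    return [file_url(f) for f in node.files]


if __name__ == "__main__":
    video = StubNode("video", [
        StubFile("high_res_video", "a.mp4"),
        StubFile(SUBTITLE_PRESET, "a.vtt"),
    ])
    print(select_file_urls(video))
```

Keeping the selection in a small pure function like this would make the acceptance criterion easy to check with unit tests, independent of how `embed_content` later shapes `resources` for the request body.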
Acceptance Criteria
- All file URLs associated with the nodes passed to `embed_content` are extracted correctly
Assumptions and Dependencies
- Is blocked
Scope
The scope of this task is limited to:
- Updating `embed_content` to gather all file URLs required for content extraction.
Accessibility Requirements
NA