Description
I've run into glacier-cli failing because git-annex, when using the SHA256E backend, appends anything that looks like a file extension to the key. As a result, some keys end with characters that are not really part of the file's extension, including non-ASCII ones.
Example:
% ls 12.\ Change\ The\ World\ \(feat.\ 웅산\).mp3
12. Change The World (feat. 웅산).mp3
% git annex info 12.\ Change\ The\ World\ \(feat.\ 웅산\).mp3
file: 12. Change The World (feat. 웅산).mp3
size: 7.48 megabytes
key: SHA256E-s7479642--957208748ae03fe4fc8d7877b2c9d82b7f31be0726e4a3dec9063b84cc64cf09.웅산.mp3
present: true
% git annex calckey 12.\ Change\ The\ World\ \(feat.\ 웅산\).mp3
SHA256E-s7479642--957208748ae03fe4fc8d7877b2c9d82b7f31be0726e4a3dec9063b84cc64cf09.웅산.mp3
I've opened an issue with git-annex here:
https://git-annex.branchable.com/bugs/git-annex_adds_unicode_characters_at_end_of_checksum/
There will be a fix for the case with brackets, but there are other cases in which something that looks like a file extension is not plain ASCII. When that happens, glacier-cli fails like this:
% git annex copy 12.\ Change\ The\ World\ \(feat.\ 웅산\).mp3 --to glacier
copy 12. Change The World (feat. 웅산).mp3 (checking glacier...) Traceback (most recent call last):
File "/usr/local/bin/glacier", line 737, in <module>
main()
File "/usr/local/bin/glacier", line 733, in main
App().main()
File "/usr/local/bin/glacier", line 719, in main
self.args.func()
File "/usr/local/bin/glacier", line 600, in archive_checkpresent
self.args.vault, self.args.name)
File "/usr/local/bin/glacier", line 161, in get_archive_last_seen
result = self._get_archive_query_by_ref(vault, ref).one()
File "/usr/local/bin/glacier", line 136, in _get_archive_query_by_ref
if ref.startswith('id:'):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xec in position 83: ordinal not in range(128)
(user error (glacier ["--region=eu-west-1","archive","checkpresent","music","--quiet","SHA256E-s7479642--957208748ae03fe4fc8d7877b2c9d82b7f31be0726e4a3dec9063b84cc64cf09.\50885\49328.mp3"] exited 1)) failed
git-annex: copy: 1 failed
As the bug report says, you can work around this by switching your backend from SHA256E to SHA256, which does not add extensions to keys. Still, I think it would be good for glacier-cli to handle non-ASCII keys anyway.
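One possible direction, as a minimal sketch rather than a patch against the actual glacier-cli source: the traceback looks like a Python 2 byte string (the key from the command line) being compared with a unicode literal, which triggers an implicit ASCII decode and fails on byte 0xec, the first UTF-8 byte of 웅 at position 83 of the key. Decoding the argument explicitly before any string comparison avoids that. The helper names `to_text` and `is_id_ref` are hypothetical, and UTF-8 is assumed as the encoding of the incoming key.

```python
# -*- coding: utf-8 -*-
# Sketch only: `to_text` and `is_id_ref` are hypothetical helpers,
# not functions that exist in glacier-cli.


def to_text(value, encoding="utf-8"):
    """Return `value` as a text string, decoding byte strings explicitly.

    Assumes the key arrives UTF-8 encoded, which matches the 0xec byte
    reported in the traceback.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def is_id_ref(ref):
    """Mirror the `ref.startswith('id:')` check from the traceback,
    but on decoded text so no implicit ASCII decode can happen."""
    return to_text(ref).startswith(u"id:")


# The key from the report, as the raw bytes a Python 2 program would
# receive on its command line.
raw_key = (
    u"SHA256E-s7479642--"
    u"957208748ae03fe4fc8d7877b2c9d82b7f31be0726e4a3dec9063b84cc64cf09"
    u".웅산.mp3"
).encode("utf-8")

if is_id_ref(raw_key):
    print("reference is an archive id")
else:
    print("reference is an archive name")
```

Normalizing once, at the point where arguments are parsed, would keep the rest of the code working with text only. Whether that fits glacier-cli's structure is something the maintainer would need to judge.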