.
├── delete-media-duplicates.sh # Main script
├── test-delete-media-duplicates.sh # Test suite
├── docs/
│   ├── usage.md # End-user documentation
│   ├── development.md # This file
│   └── testing.md # Testing guide
├── readme.md
├── license.txt
└── .editorconfig
The script uses a tab character (`$'\t'`) as a delimiter between MD5 hashes and file paths in the internal array. This avoids issues with colons in filenames (which broke the original colon-delimited approach).
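A minimal sketch of how a tab-delimited entry can be built and split again. The helper names (`make_entry`, `entry_hash`, `entry_path`) and the sample hash and path are illustrative assumptions, not taken from the script itself.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helpers for illustration; the real script may differ.
readonly TAB=$'\t'

make_entry() {
    # Join hash and path with a single tab.
    printf '%s\t%s' "$1" "$2"
}

entry_hash() {
    # Everything before the first tab.
    printf '%s' "${1%%"$TAB"*}"
}

entry_path() {
    # Everything after the first tab; colons in the path are left alone.
    printf '%s' "${1#*"$TAB"}"
}

entry=$(make_entry "d41d8cd98f00b204e9800998ecf8427e" "/media/clip 12:30:45.mp4")
echo "hash: $(entry_hash "$entry")"
echo "path: $(entry_path "$entry")"
```

Because the hash is hexadecimal and never contains a tab, splitting on the first tab is unambiguous even when the path contains colons.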
Files are discovered with `find -print0` and read with `read -d ''` to correctly handle filenames containing spaces, newlines, or other special characters.
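The pattern can be sketched as follows. The function name `collect_files`, the array `FOUND_FILES`, and the use of process substitution are assumptions made for this example; the script's own loop may be structured differently.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: collect every regular file under a directory into the
# global array FOUND_FILES, one element per file.
collect_files() {
    local dir=$1 file
    FOUND_FILES=()
    # IFS= preserves leading/trailing whitespace, -r keeps backslashes literal,
    # and -d '' makes read stop at the NUL bytes emitted by -print0.
    while IFS= read -r -d '' file; do
        FOUND_FILES+=("$file")
    done < <(find "$dir" -type f -print0)
}

# Demo with awkward filenames in a throwaway directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.mp4" "$demo_dir/with space.mp4" "$demo_dir/with"$'\n'"newline.mp4"
collect_files "$demo_dir"
echo "found ${#FOUND_FILES[@]} files"
rm -rf "$demo_dir"
```

A newline-delimited loop would count the third file twice, since its name spans two lines; NUL is the only byte that cannot appear in a path.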
The `ffmpeg` command uses `< /dev/null` to prevent it from consuming stdin, which would otherwise interfere with the `while read` loop.
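The effect can be reproduced without ffmpeg. In the sketch below, `greedy_command` is a hypothetical stand-in that, like ffmpeg, reads whatever is on its stdin, and `count_lines` is an illustrative loop, not the script's actual code.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for ffmpeg (assumption): it consumes everything on its stdin.
greedy_command() {
    cat > /dev/null
}

# Count how many input lines the loop sees, with and without the redirect.
count_lines() {
    local mode=$1 line count=0
    while IFS= read -r line; do
        count=$((count + 1))
        if [[ $mode == safe ]]; then
            greedy_command < /dev/null   # inner command sees an empty stdin
        else
            greedy_command               # inner command swallows the remaining lines
        fi
    done
    printf '%s\n' "$count"
}

printf 'a\nb\nc\n' | count_lines safe     # prints 3
printf 'a\nb\nc\n' | count_lines unsafe   # prints 1
```

Without the redirect, the inner command shares the loop's stdin and drains it after the first iteration, so the loop ends early.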
If ffmpeg cannot process a file (non-media, corrupt, etc.), the script catches the failure, prints a warning, and continues processing remaining files.
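One way to get this behavior under `set -e` is to test the command inside an `if`, so that a failure is handled instead of aborting the run. This is a sketch only: `fake_hash` is a hypothetical stand-in for the ffmpeg hashing step, and the warning text is invented for the example.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for the ffmpeg hashing step (assumption): fails for any path
# ending in .txt, otherwise prints a fake hash.
fake_hash() {
    case $1 in
        *.txt) return 1 ;;
        *)     printf 'hash-of-%s\n' "${1##*/}" ;;
    esac
}

process_files() {
    local file hash
    for file in "$@"; do
        # A command tested by `if` does not trigger the `set -e` exit.
        if ! hash=$(fake_hash "$file"); then
            printf 'Warning: could not process %s, skipping\n' "$file" >&2
            continue
        fi
        printf '%s\t%s\n' "$hash" "$file"
    done
}

process_files "/media/a.mp4" "/media/notes.txt" "/media/b.mkv"
```

Here the warning for `notes.txt` goes to stderr, and the two media files are still reported on stdout.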
- Shell: Bash with `set -euo pipefail`
- Formatting: Follow `.editorconfig` settings (spaces, final newline)
- Quoting: All variables are double-quoted to prevent word splitting
- Comments: Inline comments for non-obvious logic only
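The quoting rule matters most for media filenames, which often contain spaces. A small illustration, where `count_args` and the sample filename are hypothetical:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print how many arguments were received.
count_args() { printf '%s\n' "$#"; }

path="my holiday video.mp4"

count_args "$path"   # prints 1: the filename stays one argument
# shellcheck disable=SC2086
count_args $path     # prints 3: the unquoted expansion is split on spaces
```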
- Edit `delete-media-duplicates.sh`
- Run the test suite to verify nothing is broken (see testing.md)
- Test manually with a real media directory if the change affects file handling
- Commit with a descriptive message