imgdd is a performance-first perceptual hashing library that combines Rust's speed with Python's accessibility, making it well suited to large datasets. It is designed to quickly process nested folder structures, which are common in image datasets.
- Multiple Hashing Algorithms: Supports aHash, dHash, mHash, pHash, wHash.
- Multiple Filter Types: Supports Nearest, Triangle, CatmullRom, Gaussian, Lanczos3.
- Identify Duplicates: Quickly identify duplicate hash pairs.
- Simplicity: Simple interface, robust performance.
imgdd is inspired by imagehash and aims to be a lightning-fast replacement with additional features. To verify its performance, imgdd has been benchmarked against imagehash. In Python, imgdd consistently outperforms imagehash by ~60%–95%, a significant reduction in hashing time per image.
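If the benchmark figure is read as a reduction in time per image, the arithmetic behind it can be sketched as below. This is only an illustration, not the project's benchmark code: `seconds_per_item` and `percent_reduction` are hypothetical helper names, and any callable that hashes one image path can be passed in.

```python
import time
from typing import Callable, Iterable


def seconds_per_item(fn: Callable[[str], object], items: Iterable[str]) -> float:
    """Mean wall-clock seconds that fn spends per item (hypothetical helper)."""
    items = list(items)
    if not items:
        raise ValueError("items must not be empty")
    start = time.perf_counter()
    for item in items:
        fn(item)
    return (time.perf_counter() - start) / len(items)


def percent_reduction(baseline: float, candidate: float) -> float:
    """Percentage by which the candidate time is lower than the baseline time."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - candidate) / baseline * 100.0
```

With made-up timings, a drop from 10 ms to 2 ms per image corresponds to an 80% reduction under this definition.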
Install with pip:

pip install imgdd

Hash images:

import imgdd as dd

results = dd.hash(
    path="path/to/images",
    algo="dhash",  # Optional: default = dhash
    filter="triangle",  # Optional: default = triangle
    sort=False  # Optional: default = False
)
print(results)

Identify duplicates:

import imgdd as dd
duplicates = dd.dupes(
    path="path/to/images",
    algo="dhash",  # Optional: default = dhash
    filter="triangle",  # Optional: default = triangle
    remove=False  # Optional: default = False
)
print(duplicates)

Supported algorithms:

- aHash: Average Hash
- mHash: Median Hash
- dHash: Difference Hash
- pHash: Perceptual Hash
- wHash: Wavelet Hash
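
To give a feel for what the two simplest algorithms above compute, here is a minimal pure-Python sketch of the general aHash and dHash techniques. It is not imgdd's Rust implementation: the function names are hypothetical, and the input is assumed to be a small grayscale grid that has already been resized. pHash, mHash and wHash are not shown.

```python
from typing import List, Sequence

Grid = Sequence[Sequence[int]]  # rows of grayscale pixel values (assumed pre-resized)


def dhash_bits(gray: Grid) -> List[int]:
    """Difference hash: 1 where a pixel is brighter than its left neighbour."""
    bits = []
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits.append(1 if right > left else 0)
    return bits


def ahash_bits(gray: Grid) -> List[int]:
    """Average hash: 1 where a pixel is brighter than the mean of all pixels."""
    pixels = [px for row in gray for px in row]
    mean = sum(pixels) / len(pixels)
    return [1 if px > mean else 0 for px in pixels]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length bit lists differ."""
    if len(a) != len(b):
        raise ValueError("hashes must have the same length")
    return sum(x != y for x, y in zip(a, b))


original = [[10, 20, 30], [30, 20, 10]]
brighter = [[15, 25, 35], [35, 25, 15]]  # same gradients, uniformly brighter

# The gradients are unchanged, so both grids produce the same dHash bits.
print(hamming(dhash_bits(original), dhash_bits(brighter)))
```

Because dHash only looks at the direction of change between neighbouring pixels, a uniform brightness shift leaves the hash unchanged, which is why identical hashes are a useful signal for duplicate images.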
Supported filter types: Nearest, Triangle, CatmullRom, Gaussian, Lanczos3
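
The filter decides how source pixels are weighted when an image is shrunk before hashing. As a rough sketch, the snippet below uses the textbook definitions of the triangle (linear) and Lanczos3 kernels and applies them to a single row of pixels. It is an assumption-laden illustration rather than imgdd's actual resizing code: `triangle`, `lanczos3` and `resample_row` are hypothetical names, and the other filters are omitted.

```python
import math
from typing import Callable, List, Sequence


def triangle(x: float) -> float:
    """Triangle (linear) kernel with support 1."""
    return max(0.0, 1.0 - abs(x))


def lanczos3(x: float) -> float:
    """Lanczos kernel with a = 3: sinc(x) * sinc(x / 3) for |x| < 3."""
    if x == 0.0:
        return 1.0
    if abs(x) >= 3.0:
        return 0.0
    px = math.pi * x
    return (math.sin(px) / px) * (math.sin(px / 3.0) / (px / 3.0))


def resample_row(row: Sequence[float], dst_len: int,
                 kernel: Callable[[float], float], support: float) -> List[float]:
    """Resize one row of pixels with a normalised, kernel-weighted average."""
    src_len = len(row)
    if src_len == 0 or dst_len <= 0:
        raise ValueError("row and dst_len must be non-empty / positive")
    scale = src_len / dst_len
    stretch = max(scale, 1.0)          # widen the kernel when downsampling
    radius = support * stretch
    out = []
    for i in range(dst_len):
        center = (i + 0.5) * scale - 0.5   # output pixel centre in source coords
        lo = max(0, math.floor(center - radius))
        hi = min(src_len - 1, math.ceil(center + radius))
        idx = range(lo, hi + 1)
        weights = [kernel((j - center) / stretch) for j in idx]
        total = sum(weights)
        out.append(sum(w * row[j] for w, j in zip(weights, idx)) / total)
    return out


# Shrink an 8-pixel row to 4 pixels with each kernel.
row = [0, 0, 0, 0, 255, 255, 255, 255]
print(resample_row(row, 4, triangle, 1.0))
print(resample_row(row, 4, lanczos3, 3.0))
```

Wider kernels such as Lanczos3 average over more source pixels than Triangle, which typically costs more time per image; that trade-off is the reason a filter option is useful for a speed-focused library.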
Contributions are always welcome! 🚀
Found a bug or have a question? Open a GitHub issue. Pull requests for new features or fixes are encouraged!
- https://github.com/JohannesBuchner/imagehash
- https://github.com/commonsmachinery/blockhash-python
- https://github.com/acoomans/instagram-filters
- https://pippy360.github.io/transformationInvariantImageSearch/
- https://www.phash.org/
- https://pypi.org/project/dhash/
- https://github.com/thorn-oss/perception (based on imagehash code, depends on opencv)
- https://docs.opencv.org/3.4/d4/d93/group__img__hash.html