Unsupervised Hashing algorithms in python

This repository contains the python implementation of some notable unsupervised hashing algorithms mostly based on spectral hashing formulation. If you found this code useful in your research, please consider citing one of the following papers:

@article{ESH,
title = {A non-alternating graph hashing algorithm for large-scale image search},
journal = {Computer Vision and Image Understanding},
pages = {103415},
year = {2022},
issn = {1077-3142},
doi = {https://doi.org/10.1016/j.cviu.2022.103415},
url = {https://www.sciencedirect.com/science/article/pii/S1077314222000418},
author = {Sobhan Hemati and Mohammad Hadi Mehdizavareh and Shojaeddin Chenouri and Hamid R. Tizhoosh},
keywords = {Hashing, Graph hashing, Image search, Spectral hashing},
}

@INPROCEEDINGS{9506441,
  author    = {Hemati, Sobhan and Mehdizavareh, Mohammad Hadi and Babaie, Morteza and Kalra, Shivam and Tizhoosh, H.R.},
  booktitle = {2021 IEEE International Conference on Image Processing (ICIP)}, 
  title     = {A Simple Supervised Hashing Algorithm Using Projected Gradient and Oppositional Weights}, 
  year      = {2021},
  volume    = {},
  number    = {},
  pages     = {2748-2752},
  doi       = {10.1109/ICIP42928.2021.9506441}}

@article{HEMATI202244,
title = {Beyond neighbourhood-preserving transformations for quantization-based unsupervised hashing},
journal = {Pattern Recognition Letters},
volume = {153},
pages = {44-50},
year = {2022},
issn = {0167-8655},
doi = {https://doi.org/10.1016/j.patrec.2021.11.007},
url = {https://www.sciencedirect.com/science/article/pii/S0167865521003974},
author = {Sobhan Hemati and H.R. Tizhoosh},
keywords = {Image Search, Unsupervised Hashing, Quantization, Binary Representation},
}

What algorithms are available?

The following algorithms have been implemented:

1- Iterative Quantization (ITQ) - See demo_ITQ.py

2- Spectral Hashing (SH) - See demo_SH.py

3- Kernelized Spectral Hashing (KSH) - See demo_KSH.py

4- Discrete Graph Hashing (DGH) - See demo_DGH.py

5- Large Graph Hashing with Spectral Rotation (LGHSR) also known as "Spectral Hashing with Spectral Rotation" (SHSR) - See demo_LGHSR.py

6- Discrete Spectral Hashing (DSH) - See demo_DSH.py

7- Reversed Spectral Hashing (ReSH) - See demo_ReSH.py

Also we have proposed a new algorithm based on spectral hashing, called Efficient Spectral Hashing (ESH). You can find the source code of this algorithm here.

Requirements

You should have scipy, numpy, and scikit-learn installed. Also, for running ReSH algorithm, you need tensorflow 2.1 or higher.

Usage

1- You should download one of the datasets in the Datasets section, put them in a folder (e.g. "C:\test") and set dataset_name parameter in the demo scripts to the dataset you downloaded. Also, you should change the path parameter to the dataset folder. For example if you want to use "labelme_vggfc7" dataset, you should change dataset_name and path parameters as follows:

path = r'C:\test' # folder containing dataset
dataset_name = 'labelme_vggfc7'

2- K is number of bits.

Datasets

The following datasets are available:

colorectal_EfficientNet

SUN397 (based on SCQ paper)

Some notes about our implementation

1- To reduce the computational complexity of calculating the affinity matrix in SH formulation, Most algorithms use the low-rank approximation proposed in AGH paper. We also used it in all of the algorithms which are based on SH formulation (except ReSH). You can change the parameters related to this affinity approximation technique in "parameter initialization" cell in the demo scripts.

2- There are multiple ways to generate binary codes for unseen samples (samples which are not in training data). One technique is learning a linear projection matrix (or linear hash functions) using least-square estimation and projecting unseen samples onto this matrix. We have implemented this technique in RRC function (see utilities script). However, most papers have used another technique called out-of-sample extension proposed again by AGH paper. We considered this technique as the default way for generating binary codes of unseen samples.

3- You can compute affinity matrix for each dataset only once and save it. This reduces the time required for training multiple models on the same data (e.g. for different Ks). We have implemented this in the demo scripts of SH-based methods (except ReSH).

4- For ReSH method, we couldn't get the same results with the paper. There are high chances that hyperparameters haven't been set properly in our implementation. Please consider this if you are going to use it and tune it in your case.

Acknowledgements

We would like to thank the authors of KNNH paper for facilitating access to most datasets.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
saved_arrays		saved_arrays
Affinity_matrix.py		Affinity_matrix.py
Datasets.py		Datasets.py
Discrete_hashing_methods.py		Discrete_hashing_methods.py
ITQ.py		ITQ.py
README.md		README.md
ReSH.py		ReSH.py
SH.py		SH.py
demo_DGH.py		demo_DGH.py
demo_DSH.py		demo_DSH.py
demo_ITQ.py		demo_ITQ.py
demo_KSH.py		demo_KSH.py
demo_LGHSR.py		demo_LGHSR.py
demo_ReSH.py		demo_ReSH.py
demo_SH.py		demo_SH.py
evaluate.py		evaluate.py
init.py		init.py
out_of_sample.py		out_of_sample.py
utilities.py		utilities.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Unsupervised Hashing algorithms in python

What algorithms are available?

Requirements

Usage

Datasets

Some notes about our implementation

Acknowledgements

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 2

Uh oh!

Languages

zavareh89/unsupervised-hashing

Folders and files

Latest commit

History

Repository files navigation

Unsupervised Hashing algorithms in python

What algorithms are available?

Requirements

Usage

Datasets

Some notes about our implementation

Acknowledgements

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

Packages