Siamese Networks: AI’s Dynamic Duo for Smarter Similarity Learning!

By Team Algo
Reading Time: 4 minutes

By Sahil Kavitake

“Recognizing the difference between good and bad is the first step towards wisdom.”

Similarity Learning

Think about how you choose your friends. You tend to stay close to people who are kind and positive, and you distance yourself from those who aren’t. This natural behavior is similar to a concept in machine learning called similarity learning.

Congratulations! You’ve just learned what similarity learning is. Siamese neural networks work in a similar way. They help computers figure out if two things are similar or different. Let’s explore how these networks mimic our brains in identifying good and bad relationships.

Siamese Neural Network

In deep learning, neural networks often need a lot of data to work well. But for tasks like face recognition and signature verification, it’s not always easy to get enough data.

Traditionally, a neural network learns to predict a fixed set of classes. This poses a problem when we need to add or remove classes: the network must be modified and retrained on the whole dataset. That is where siamese networks come into the picture.

Siamese networks focus on comparing pairs of images (reference image and test image) and use a special similarity function to decide how similar two images are. This helps in tasks where accuracy matters more than just labeling things into groups.
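The core idea above can be sketched in a few lines of plain Python. This is a toy illustration, not a real network: the "embedding" here is just a small linear map with a ReLU standing in for the CNN branch, and the key point is that both images pass through the *same* shared weights before their embeddings are compared.

```python
import math

def embed(x, w):
    """Shared embedding branch: both images go through the SAME weights w.
    A toy stand-in for the real CNN: linear map followed by ReLU."""
    return [max(sum(wi * xi for wi, xi in zip(row, x)), 0.0) for row in w]

def pair_distance(x1, x2, w):
    """Similarity score for a (reference, test) pair:
    Euclidean distance between the two shared embeddings."""
    e1, e2 = embed(x1, w), embed(x2, w)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(e1, e2)))

# Hypothetical feature vectors for two photos of Alice and one of Bob:
w = [[0.5, -1.0, 0.3], [1.2, 0.4, -0.7]]
alice_a = [0.9, 0.1, 0.4]
alice_b = [0.8, 0.2, 0.5]
bob = [-0.6, 1.3, -0.2]

# The same-person pair should score closer than the different-person pair:
print(pair_distance(alice_a, alice_b, w) < pair_distance(alice_a, bob, w))  # True
```

Because the two branches share weights, identical inputs always map to identical embeddings (distance zero), and the comparison is symmetric in the two images.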

Loss function in Siamese networks

Let us go through the two main loss functions used with siamese networks: contrastive loss and triplet loss.

Contrastive loss function

This function evaluates how well the siamese network distinguishes a given pair of images: it penalizes similar pairs whose embeddings are far apart, and dissimilar pairs whose embeddings are closer than a chosen margin.

Example in Practice:

Imagine we are training a siamese network to recognize students for attendance:

  1. Positive Pair: We show the network two images of the same student, Alice, labeled as a positive pair. The network extracts features from both images and calculates a small distance between them, meaning they are similar. Contrastive loss is low because the network correctly identifies them as the same person.
  2. Negative Pair: We then show the network an image of Alice and an image of Bob, labeled as a negative pair. The network extracts features from both images and calculates a large distance between them, meaning they are different. Contrastive loss is low again because the network correctly identifies them as different people.

By minimizing contrastive loss, the siamese network learns to accurately determine whether pairs of images are of the same person or not.
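A common form of contrastive loss (the margin value here is an arbitrary choice for illustration) can be written directly from the two cases above: for a positive pair the loss grows with the distance, and for a negative pair it is zero once the distance exceeds the margin.

```python
def contrastive_loss(d, y, margin=1.0):
    """d: embedding distance for the pair; y: 1 = same person, 0 = different.
    Similar pairs are penalized for being far apart; dissimilar pairs are
    penalized only when they come closer than the margin."""
    return y * d ** 2 + (1 - y) * max(margin - d, 0.0) ** 2

print(contrastive_loss(0.1, 1))  # low loss: positive pair (Alice, Alice) is close
print(contrastive_loss(2.0, 0))  # zero loss: negative pair (Alice, Bob) is well separated
print(contrastive_loss(0.2, 0))  # high loss: negative pair violates the margin
```

Minimizing this quantity over many labeled pairs pulls same-person embeddings together and pushes different-person embeddings apart.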

Triplet loss function

Triplet of Images: For triplet loss, we use three images: an anchor, a positive, and a negative.

  • Anchor Image: A reference image of a student, for example, Alice.
  • Positive Image: Another image of the same student, Alice.
  • Negative Image: An image of a different student, Bob.

Example in Practice:

  1. Anchor and Positive Pair: We show the network an anchor image of Alice and a positive image of Alice. The network extracts features from both and calculates a small distance between them because they are the same person. This part of the triplet loss encourages the network to recognize them as similar.
  2. Anchor and Negative Pair: We then show the network an anchor image of Alice and a negative image of Bob. The network extracts features from both and calculates a large distance between them because they are different people. This part of the triplet loss encourages the network to recognize them as dissimilar.

Triplet Loss: The loss is minimized when the positive distance is smaller than the negative distance by a margin. This encourages the network to correctly distinguish between similar and dissimilar images.
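The margin condition above translates into a one-line hinge loss. This sketch assumes the anchor-positive and anchor-negative distances have already been computed by the shared embedding branches; the margin value is illustrative.

```python
def triplet_loss(d_pos, d_neg, margin=0.2):
    """d_pos: distance(anchor, positive); d_neg: distance(anchor, negative).
    Zero loss once the positive pair is closer than the negative pair by at
    least `margin`; otherwise the violation is penalized linearly."""
    return max(d_pos - d_neg + margin, 0.0)

# Alice's anchor-positive pair is much closer than Alice vs. Bob:
print(triplet_loss(0.3, 1.5))  # 0.0 -- margin satisfied, nothing to correct
# Distances nearly equal: the loss pushes the embeddings to separate further
print(triplet_loss(1.0, 1.1))
```

Training on many such (anchor, positive, negative) triplets arranges the feature space so that each person's images cluster together, separated from everyone else's by at least the margin.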

Pros and Cons of Siamese Networks:

Pros:

  • Robustness to Class Imbalance: Effective with very little training data, useful for one-shot learning.
  • Ensemble Compatibility: Can be combined with other classifiers for improved performance.
  • Semantic Similarity: Learns to place similar classes close together in the feature space.

Cons:

  • Training Time: Requires more training time due to pairwise learning.
  • No Probability Outputs: Outputs distances instead of probabilities, using measures like Euclidean distance.

Implementation in Keras:

Image similarity estimation using a Siamese Network:

https://keras.io/examples/vision/siamese_network/

Conclusion:

As AI continues to revolutionize industries, Siamese networks are emerging as a powerful tool in similarity learning tasks, including face recognition and signature verification. These networks offer a significant advantage in handling class imbalances and understanding deep semantic relationships, making them invaluable for improving accuracy in various applications. Despite challenges like longer training times and the lack of probabilistic outputs, their potential is driving advancements in many sectors.

At AlgoAnalytics, we remain at the forefront of these developments, leveraging techniques like Siamese networks when addressing complex problems in similarity learning. This approach allows us to continually innovate and enhance the solutions we provide.

References:

https://www.mygreatlearning.com/blog/siamese-networks/

https://medium.com/@rinkinag24/a-comprehensive-guide-to-siamese-neural-networks-3358658c0513