By Mugdha Thigle, Associate Data Scientist at AlgoAnalytics
While generic object detectors perform well on medium and large objects, they perform poorly at recognizing small objects. A few examples of small objects are ships seen in satellite images (as shown in Fig. 1) or traffic signs seen from far away in drone imagery. Small object detection is a challenging task in computer vision because small objects offer limited resolution and information. In this article we explore Feature Pyramid Networks for small object detection and Super Resolution GANs for data augmentation and performance improvement.
Feature Pyramid Networks[1]
Figure 2. Feature Pyramid Network
Feature pyramids[2] are a basic component of recognition systems for detecting objects at different scales. However, recent deep learning object detectors have avoided pyramid representations, in part because they are compute and memory intensive. The Feature Pyramid Network (FPN), shown in Fig. 2, addresses this with a top-down architecture with lateral connections that builds high-level semantic feature maps at all scales. FPN shows significant improvement as a generic feature extractor in several applications. When implemented on the Airbus ship dataset[3], a collection of satellite images of ships in the ocean, it achieved a recall of 0.954 and an mAP of 0.911. Sample results are shown in Figs. 3 and 4.
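To make the top-down pathway with lateral connections concrete, here is a minimal NumPy sketch. It is not the implementation used for the results above: the backbone feature maps are random arrays, the names `lateral_1x1`, `upsample_nearest_2x` and `fpn_top_down` are hypothetical, and the 3x3 smoothing convolution that the FPN paper applies to each merged map is left out for brevity.

```python
import numpy as np

def lateral_1x1(feature, weight):
    """1x1 convolution: mixes channels independently at every pixel.
    feature: (C_in, H, W), weight: (C_out, C_in) -> (C_out, H, W)."""
    return np.einsum("oc,chw->ohw", weight, feature)

def upsample_nearest_2x(feature):
    """Doubles height and width by nearest-neighbour repetition."""
    return feature.repeat(2, axis=1).repeat(2, axis=2)

def fpn_top_down(backbone_maps, lateral_weights):
    """backbone_maps is ordered fine to coarse, each level half the
    spatial size of the previous one. Returns pyramid maps in the same
    order, all with the same number of channels."""
    # The coarsest level has no top-down input, only its lateral projection.
    pyramid = [lateral_1x1(backbone_maps[-1], lateral_weights[-1])]
    # Walk from coarse to fine, merging each lateral map with the
    # upsampled map from the level above it.
    finer_levels = zip(reversed(backbone_maps[:-1]), reversed(lateral_weights[:-1]))
    for feature, weight in finer_levels:
        top_down = upsample_nearest_2x(pyramid[0])
        pyramid.insert(0, lateral_1x1(feature, weight) + top_down)
    return pyramid

rng = np.random.default_rng(0)
# Stand-ins for three backbone stages (channels, height, width).
c_maps = [
    rng.standard_normal((8, 32, 32)),
    rng.standard_normal((16, 16, 16)),
    rng.standard_normal((32, 8, 8)),
]
# Random lateral weights projecting every stage to 4 channels.
weights = [rng.standard_normal((4, c.shape[0])) for c in c_maps]
p_maps = fpn_top_down(c_maps, weights)
print([p.shape for p in p_maps])  # [(4, 32, 32), (4, 16, 16), (4, 8, 8)]
```

The point of the merge is that the finest map keeps its high spatial resolution, which small objects need, while also receiving the stronger semantics of the coarser levels.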
Figure 3. Small objects (ships) detected by FPN
Figure 4. Small objects (ships) detected by FPN
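As a rough illustration of what the recall figure quoted above measures, the sketch below counts a ground-truth box as found when some detection overlaps it with an intersection over union (IoU) of at least 0.5. The threshold, the greedy one-to-one matching and the box coordinates are assumptions made for this example, since the article does not state the evaluation protocol. mAP additionally depends on detection confidence scores and is not computed here.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def recall(ground_truth, detections, iou_threshold=0.5):
    """Fraction of ground-truth boxes matched by a detection.
    Each detection can be matched to at most one ground-truth box."""
    unused = list(detections)
    matched = 0
    for gt_box in ground_truth:
        best = max(unused, key=lambda det: iou(gt_box, det), default=None)
        if best is not None and iou(gt_box, best) >= iou_threshold:
            matched += 1
            unused.remove(best)
    return matched / len(ground_truth) if ground_truth else 0.0

# Made-up boxes: the first ship is detected, the second is missed.
ships = [(0, 0, 10, 10), (20, 20, 30, 30)]
predicted = [(1, 1, 10, 10), (50, 50, 60, 60)]
print(recall(ships, predicted))  # 0.5
```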
Super Resolution[4]
Super Resolution is the process of recovering a High Resolution (HR) image from a given Low Resolution (LR) image. An image may have a "lower resolution" because of a smaller spatial resolution (i.e. size) or because of degradation (such as blurring). SR has received substantial attention from the computer vision research community and has a wide range of applications. Since one of the main issues in small object detection is the lack of picture clarity and resolution, we expected that applying super resolution to the images might help. For this, SRGAN[5] was used. During training, a high-resolution (HR) image is downsampled to a low-resolution (LR) image. A GAN generator upsamples the LR image to a super-resolution (SR) image. A discriminator tries to distinguish the HR images from the generated SR images, and the GAN loss is backpropagated to train both the discriminator and the generator, as shown in Fig. 5. SRGAN uses a perceptual loss whose content term is the MSE between features extracted by a VGG-19 network: for a chosen VGG-19 layer, the features of the SR image should match those of the HR image (minimum MSE between features).
Figure 5. Basic SRGAN architecture
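The training signal described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the SRGAN code used in the experiments: `standin_features` is a hypothetical replacement for a VGG-19 layer (simple image gradients), the "generator" is plain nearest-neighbour upsampling, and the discriminator output is a made-up probability. The 1e-3 weight on the adversarial term is the value reported in the SRGAN paper [5].

```python
import numpy as np

def downsample(hr, factor=4):
    """Creates the LR training input by average-pooling a grayscale HR image.
    Height and width must be divisible by factor."""
    h, w = hr.shape
    return hr.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def standin_features(image):
    """Hypothetical stand-in for a VGG-19 feature layer:
    vertical and horizontal gradients, cropped to a common shape."""
    dy = np.diff(image, axis=0)[:, :-1]
    dx = np.diff(image, axis=1)[:-1, :]
    return np.stack([dy, dx])

def content_loss(hr, sr, feature_fn=standin_features):
    """MSE between the features of the HR image and those of the SR image."""
    return float(np.mean((feature_fn(hr) - feature_fn(sr)) ** 2))

def adversarial_loss(d_prob_sr):
    """Generator's adversarial term: -log D(G(LR)), averaged over the batch.
    d_prob_sr holds the discriminator's probability that each SR image is real."""
    return float(np.mean(-np.log(d_prob_sr)))

def generator_loss(hr, sr, d_prob_sr, adv_weight=1e-3):
    """Perceptual loss: content term plus a weighted adversarial term."""
    return content_loss(hr, sr) + adv_weight * adversarial_loss(d_prob_sr)

rng = np.random.default_rng(0)
hr = rng.random((16, 16))             # stand-in HR image
lr = downsample(hr, factor=4)         # 4x4 LR image
sr = np.kron(lr, np.ones((4, 4)))     # naive "generator": nearest-neighbour upsampling
d_prob_sr = np.array([0.3])           # made-up discriminator output for the SR image
print(lr.shape, sr.shape)
print(generator_loss(hr, sr, d_prob_sr))
```

Comparing images in feature space instead of pixel space is what lets the generator be rewarded for plausible texture even when individual pixels do not match the HR image exactly.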
However, on the Airbus dataset, super resolution showed no improvement in performance. This is most likely because image quality was not the limiting factor for this dataset. The comparison table is shown in Fig. 6.
Figure 6. Comparison Table
Small object detection is a challenging problem in computer vision. Showcased here is one of the many ways to approach it. Feature Pyramid Networks show significant improvement over more popular object detection methods such as YOLOv3 and thus show promise for small object detection. Small object detection has been widely applied in defense, military, transportation, industry, and other domains. It is extensively used in self-driving cars to recognize street signs and pedestrians from a long distance and avoid accidents. Another major application is in manufacturing, where detecting a small defect early in assembly costs less in repairs or replacement than finding it at a later stage of the assembly process. SRGAN may not have improved performance on the Airbus dataset, but it should not be dismissed when detecting small objects in lower-quality images.
We, at AlgoAnalytics, have used innovative techniques for small object detection in satellite imaging using Feature Pyramid Networks and created a demo for the same.
Demos and Contact Information
For a demo, visit https://algoanalytics.com/demoapp
For further information, please contact: info@algoanalytics.com
References
[1] https://github.com/DetectionTeamUCAS/FPN_Tensorflow
[2] https://arxiv.org/pdf/1612.03144.pdf
[3] https://www.kaggle.com/c/airbus-ship-detection/data
[4] https://github.com/krasserm/super-resolution
[5] https://arxiv.org/pdf/1609.04802.pdf