Multi-label image classification with novel object from single-label dataset by mask conformer model
Abstract
Trained on single-label datasets, Convolutional Neural Networks (CNNs) and, more recently, Transformer models have proven successful at classifying single-label images. For multi-label image classification, however, the lack of multi-label datasets for model training is a significant obstacle. In this paper, we propose a Conformer model and a BERT-like mask method for multi-label image classification based on the ImageNet single-label dataset and the COCO multi-label dataset. ImageNet is used to train the model to recognize the "main" object in an image (the ImageNet object), and COCO is used to train it to recognize "secondary" objects. When combined with a small amount of multi-label context data, a "hybrid" of COCO and ImageNet objects that connects the two datasets, the proposed model can identify both the "main" object and other common objects in an image. In addition, the model can be applied to multi-label reassignment of the ImageNet dataset with specific context information.
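To make the idea of training on a single-label and a multi-label dataset together more concrete, the sketch below shows one common way to handle labels that are unknown for a given image: a per-label mask that excludes them from a binary cross-entropy loss. This is an illustrative assumption, not the mask method of this paper, whose details are not given in the abstract. The joint label space, the class counts, and the names `masked_bce` and `imagenet_sample` are hypothetical.

```python
import math

def masked_bce(logits, targets, mask):
    """Mean binary cross-entropy over the label slots where mask == 1.

    logits  : raw scores, one per class in the joint label space
    targets : 0/1 labels (ignored wherever mask == 0)
    mask    : 1 if the label is known for this image, 0 if unknown
    """
    total, count = 0.0, 0
    for z, y, m in zip(logits, targets, mask):
        if not m:
            continue  # unknown label: contributes nothing to the loss
        # Numerically stable form: max(z, 0) - z*y + log(1 + exp(-|z|))
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
        count += 1
    return total / count if count else 0.0

# Hypothetical joint label space: 3 ImageNet classes, then 2 COCO classes.
NUM_IMAGENET, NUM_COCO = 3, 2

def imagenet_sample(class_index):
    """Single-label image: only the ImageNet slots carry known labels."""
    targets = [0] * (NUM_IMAGENET + NUM_COCO)
    targets[class_index] = 1
    mask = [1] * NUM_IMAGENET + [0] * NUM_COCO  # COCO labels are unknown
    return targets, mask
```

Under this assumption, an ImageNet image never penalizes the model for predicting a COCO object that happens to be present but unlabeled, because those slots are masked out of the loss.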