Multi-label image classification with novel object from single-label dataset by mask conformer model

  • Nghiêm Văn Triệu
  • Ngô Quốc Tạo
Keywords: Conformer model, ImageNet dataset, multi-label image classification, re-label ImageNet, single-label dataset.

Abstract

Convolutional Neural Networks (CNNs) and, more recently, Transformer models have been shown to classify single-label images successfully when trained on single-label datasets. For multi-label image classification, however, the lack of multi-label training datasets is a significant obstacle. In this paper, we propose a Conformer model and a BERT-like mask method for multi-label image classification based on the single-label ImageNet dataset and the multi-label COCO dataset. ImageNet is used to train recognition of the "main" object in the image (the ImageNet object), and COCO is used to recognize "secondary" objects. When combined with a small amount of multi-label context data, a "hybrid" of COCO and ImageNet objects that connects the two datasets, the proposed model can identify the "main" object and other common objects in an image. In addition, the model can be applied to relabel the ImageNet dataset with multiple labels using specific context information.
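
The abstract does not spell out how the mask is applied, so the following is only an illustrative sketch of one common way to train on a joint label space when each source dataset annotates only part of it: a binary cross-entropy loss that skips label positions the dataset does not annotate. It is not the authors' implementation. The function name `masked_bce`, the four-class joint label space, and the example values are all assumptions made for illustration.

```python
import math


def masked_bce(logits, targets, known_mask):
    """Mean binary cross-entropy over label positions marked as known.

    Hypothetical helper: positions with mask 0 are ignored instead of
    being treated as negatives.
    """
    total, count = 0.0, 0
    for z, y, m in zip(logits, targets, known_mask):
        if not m:
            continue  # label not annotated by this image's source dataset
        # Numerically stable BCE with logits:
        # max(z, 0) - z * y + log(1 + exp(-|z|))
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
        count += 1
    return total / count if count else 0.0


# Assumed joint label space: two ImageNet-style classes, then two COCO-style classes.
LABELS = ["tabby_cat", "golden_retriever", "person", "chair"]

# An ImageNet-style sample: only the "main" object is annotated, so the
# COCO-style positions are masked out rather than counted as absent.
logits = [2.0, -1.0, 0.5, -0.5]
targets = [1, 0, 0, 0]
known_mask = [1, 1, 0, 0]
loss = masked_bce(logits, targets, known_mask)
```

Under this assumption, a single-label image never penalizes the model for predicting a "secondary" object that simply was not annotated, which is the kind of gap that the small set of "hybrid" context data would then have to fill.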

Authors

Nghiêm Văn Triệu

MobiFone Telecommunications Corporation

Ngô Quốc Tạo

Institute of Information Technology, Vietnam Academy of Science and Technology

Published
2023-03-28
Section
Articles