TEXTKNOCKOFF: KNOCKOFF NETS FOR STEALING FUNCTIONALITY OF TEXT SENTIMENT MODELS

Xuan Cong Pham; Trung Nguyen Hoang; Cao Truong Tran; Viet Binh Do

Xuan Cong Pham Military Institute of Information Technology, 17 Hoang Sam, Nghia Do, Cau Giay, Ha Noi
Trung Nguyen Hoang Institute of Information and Communication Technology, Le Quy Don Technical University
Cao Truong Tran Institute of Information and Communication Technology, Le Quy Don Technical University
Viet Binh Do Military Institute of Information Technology, 17 Hoang Sam, Nghia Do, Cau Giay, Ha Noi

Keywords: Text classification, text sentiment, black-box model stealing, knockoff model

Abstract

Most commercial machine learning models today require significant amounts of time, money, and human effort to build. Therefore, intrinsic information about a model (such as its architecture, hyperparameters, and training data) needs to be kept confidential. Such models are referred to as black boxes, and a growing body of research focuses on both attacking and protecting them. Recent publications have concentrated mostly on computer vision; in contrast, there is still relatively little research on attacking black-box models that operate on textual data. This article introduces a method for extracting the functionality of a black-box model in the task of text sentiment analysis. The method uses random sampling techniques to reconstruct a new model with functionality equivalent to that of the original model. In testing, the new model achieved high accuracy (94.46% compared to 94.92%) and high similarity (96.82%).
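The general idea summarized in the abstract (query a black box with randomly sampled text, then train a substitute on the returned labels) can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword-based `victim_predict`, the synthetic `make_pool`, and the bag-of-words perceptron used as the knockoff are all hypothetical stand-ins chosen so that the example runs with the Python standard library only. Similarity is taken here to mean the fraction of held-out inputs on which the two models agree, which is also an assumption.

```python
import random
from collections import Counter

# Hypothetical black box: the attacker observes only the returned label.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "awful", "hate", "terrible"}
NEUTRAL = ["movie", "plot", "actor", "the"]


def victim_predict(text):
    """Return 1 (positive) or 0 (negative) for a piece of text."""
    words = text.split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return 1 if score > 0 else 0


def make_pool(n, rng):
    """Build n synthetic six-word 'reviews' (illustrative unlabeled data)."""
    vocab = sorted(POSITIVE | NEGATIVE) + NEUTRAL
    return [" ".join(rng.choice(vocab) for _ in range(6)) for _ in range(n)]


def knockoff_predict(model, text):
    """Predict with the substitute model (a linear bag-of-words scorer)."""
    weights, bias = model
    return 1 if bias + sum(weights[w] for w in text.split()) > 0 else 0


def train_knockoff(texts, labels, epochs=50):
    """Fit a perceptron on (query, victim label) pairs."""
    weights, bias = Counter(), 0
    for _ in range(epochs):
        for text, label in zip(texts, labels):
            delta = label - knockoff_predict((weights, bias), text)
            if delta != 0:  # update only on a mistake
                for w in text.split():
                    weights[w] += delta
                bias += delta
    return weights, bias


def agreement(model, texts):
    """Fraction of inputs on which knockoff and victim give the same label."""
    same = sum(knockoff_predict(model, t) == victim_predict(t) for t in texts)
    return same / len(texts)


rng = random.Random(0)
pool = make_pool(2000, rng)                     # attacker's unlabeled text
queries = rng.sample(pool, 500)                 # random sampling of queries
labels = [victim_predict(t) for t in queries]   # black-box responses
model = train_knockoff(queries, labels)

held_out = make_pool(500, rng)
similarity = agreement(model, held_out)
print(f"agreement with the black box: {similarity:.2%}")
```

In this toy setting the victim is itself a linear rule, so a linear substitute can imitate it closely; a real sentiment model would call for a more expressive knockoff and a realistic query corpus.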

