Detect and process outliers for temperature data at 3h monitoring stations in Vietnam

  • Nam Van Dang
  • Oanh Thi Nong
  • Hoai Xuan Nguyen
  • Manh Van Ngo
  • Hien Thi Nguyen
Keywords: Outliers, Anomalies, Z-Score, Box-plot

Abstract

Data preparation is a compulsory process in any data science project. Many studies have shown that it accounts for 80% of the time, effort, and resources of a data science project. Depending on the particular project and data type, the data preparation step may require different methods. Detecting and processing outliers is one of the important preprocessing steps in data preparation, especially for time series data. This paper reviews two methods for detecting outliers in low-dimensional data, namely the Z-score and the box plot. We also present the results of experiments that apply these methods to temperature data collected at 3-hour intervals from 43 monitoring stations in Vietnam over six years, from 01/01/2014 to 31/12/2019.
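
As an illustration only, the two detection rules named in the abstract might be sketched in Python as follows. The function names, the sample readings, the Z-score cutoff of 3, and the box-plot multiplier of 1.5 are assumptions based on common convention, not values taken from the paper.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return indices of values whose absolute Z-score exceeds `threshold`.

    The threshold of 3 is a conventional choice, assumed here.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical: nothing can be flagged
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]


def boxplot_outliers(values, k=1.5):
    """Return indices of values outside the box-plot fences.

    Fences are Q1 - k*IQR and Q3 + k*IQR; k = 1.5 is the usual
    box-plot convention, assumed here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical 3-hourly temperatures (degrees Celsius) for two days at one
# station; the reading at index 10 is an injected sensor glitch.
readings = [24.1, 23.8, 24.5, 27.2, 30.1, 29.4, 26.8, 25.0,
            24.3, 23.9, 95.0, 27.5, 30.4, 29.0, 26.5, 24.9]

z_flags = zscore_outliers(readings)
box_flags = boxplot_outliers(readings)
print(z_flags, box_flags)
```

On short series a single extreme value inflates the standard deviation, so the Z-score rule can miss outliers that the box-plot rule, which relies on quartiles, still catches.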

Published
2020-06-17
Section
Articles