Detect and process outliers for temperature data at 3h monitoring stations in Vietnam

  • Affiliations:

    1 Faculty of Information Technology, Hanoi University of Mining and Geology, Vietnam;
    2 AI Academy Vietnam, Vietnam;
    3 Center for Hydro - Meteorological Data and Information, Vietnam;
    4 Faculty of Information Technology Technical University, Vietnam

  • Received: 15th-Nov-2019
  • Revised: 6th-Jan-2020
  • Accepted: 28th-Feb-2020
  • Online: 28th-Feb-2020
Pages: 132 - 146

Abstract:

Data preparation is a compulsory step in any data science project. Many studies have shown that it accounts for 80% of the time, effort, and resources of a data science project. Depending on the particular project and data type, data preparation may require different methods and steps. Detecting and processing outliers is one of the important preprocessing steps in data preparation, especially for time series data. This paper reviews two methods for detecting outliers in low-dimensional data, namely the Z-score and box plots. We also present the results of experiments that applied these methods to temperature data collected at 3-hour intervals from 43 monitoring stations in Vietnam over 6 years, from 01/01/2014 to 31/12/2019.
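
The two detection rules named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the Z-score threshold of 3, the box-plot whisker factor of 1.5, and the sample readings are all assumptions chosen for illustration and do not come from the paper's data.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return indices of values whose absolute Z-score exceeds threshold.

    The threshold of 3.0 is a common convention, assumed here.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, nothing can be flagged
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]


def boxplot_outliers(values, k=1.5):
    """Return indices of values outside the box-plot whiskers.

    Whiskers are placed at Q1 - k*IQR and Q3 + k*IQR; k=1.5 is the usual
    box-plot convention, assumed here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical 3-hourly temperatures (degrees Celsius) with one faulty reading.
temps = [24.1, 24.5, 25.0, 26.2, 27.8, 28.4, 27.9, 26.5,
         25.3, 24.8, 24.4, 24.0, 99.9, 24.2, 24.6, 25.1]

print(zscore_outliers(temps))   # indices flagged by the Z-score rule
print(boxplot_outliers(temps))  # indices flagged by the box-plot rule
```

The Z-score rule depends on the mean and standard deviation, which the outliers themselves inflate, so in short series a single extreme value can mask others. The box-plot rule uses quartiles and is less sensitive to that effect.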

How to Cite
Dang, N.Van, Nong, O.Thi, Nguyen, H.Xuan, Ngo, M.Van and Nguyen, H.Thi 2020. Detect and process outliers for temperature data at 3h monitoring stations in Vietnam (in Vietnamese). Journal of Mining and Earth Sciences. 61, 1 (Feb, 2020), 132-146. DOI:https://doi.org/10.46326/JMES.2020.61(1).15.