Validity of Homogeneity Tests for Meteorological Time Series Data: A Simulation Study
Ceylan Yozgatligil, Ceyda Yazici, Vilda Purutcuoglu, Inci Batmaz
Statistics, Middle East Technical University, Ankara, Cankaya, Turkey

Climate change and related problems have a serious impact on nature. Analyzing, modeling and forecasting meteorological variables are crucial for preventing disasters, and reliable results depend on the quality of the meteorological data. One of the important quality control methods for meteorological data is the test of homogeneity. A series is called non-homogeneous if it contains changes due to non-climatic causes. Non-homogeneity can take several forms, such as abrupt discontinuities, gradual or instantaneous changes, or changes in variability. Abrupt discontinuities result from changes in the location of the station, in the instrumentation, or in the way averages are calculated. Gradual changes can result from changes in the surroundings of the station, urbanization, or changes in the instrumental characteristics. In any case, non-homogeneity must be detected and corrected before the data are analyzed.

In the literature, there are two groups of homogeneity tests: those that assess the homogeneity of a series on its own, and those that assess it using the relationship with neighboring stations. In this study, we focus only on the former. Many previous studies on homogeneity use tests developed for independent data. However, meteorological series are time series and are therefore autocorrelated. In this study, we aim to evaluate, using Monte Carlo simulation, the validity on time series data of homogeneity tests used in the literature, such as the Kruskal-Wallis (KW) test, and to compare them with tests that account for data dependency, such as the Friedman test and the Augmented Dickey Fuller (ADF) unit root test.
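
As an illustration of the kind of test being evaluated, the following is a minimal sketch of the Kruskal-Wallis H statistic applied to consecutive segments of a single series. The segmentation scheme and the helper names (`average_ranks`, `kruskal_wallis_h`, `split_into_segments`) are assumptions made for this example; the abstract does not specify how the test was applied. The tie correction is omitted on the assumption that the data are continuous.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for a list of groups (no tie correction)."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, pos = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[pos:pos + len(g)])
        total += rank_sum ** 2 / len(g)
        pos += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def split_into_segments(series, k):
    """Split a series into k equal-length consecutive segments.

    Hypothetical helper: trailing values that do not fill a segment are dropped.
    """
    size = len(series) // k
    return [series[i * size:(i + 1) * size] for i in range(k)]


# Usage: a series whose level rises from segment to segment.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
h_stat = kruskal_wallis_h(split_into_segments(series, 3))
print(h_stat)
```

For independent observations, H is compared with a chi-square distribution with k - 1 degrees of freedom, where k is the number of groups. That reference distribution relies on independence, which is exactly the assumption that autocorrelated meteorological series violate.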

To accomplish this, we first fit a time series model to Turkish monthly average temperature data from 1950 to 2006, and then generate monthly temperature data from the fitted model with Normally distributed errors. Next, several non-homogeneity scenarios are applied to the yearly data. The first scenario considers mean shifts at the beginning, in the middle, and at the end of the series; the second contains an outlier; the third considers a gradual change and increasing trends in the series. The simulation results indicate that the ADF test outperforms the others. The KW test detects mean shifts and sharp trends easily but fails to detect gradual changes, and it sometimes flags incorrect points as break points. Hence, KW may not be a reliable test for detecting non-homogeneity, especially when no metadata are available.
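
The simulation design described above can be sketched roughly as follows. This is not the authors' code: the fitted model is not given in the abstract, so a zero-mean AR(1) process with Normal errors stands in for it. The coefficient `phi`, the shift size, the outlier magnitude, the slope, and all function names are hypothetical values and names chosen only for illustration.

```python
import random


def simulate_ar1(n, phi=0.5, sigma=1.0, seed=0):
    """Generate n values of a zero-mean AR(1) process with Normal errors.

    Assumption: AR(1) is a stand-in for the fitted model, which is not given.
    """
    rng = random.Random(seed)
    series, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, sigma)
        series.append(prev)
    return series


def yearly_means(monthly):
    """Average consecutive blocks of 12 monthly values into yearly values."""
    return [sum(monthly[i:i + 12]) / 12 for i in range(0, len(monthly) - 11, 12)]


def add_mean_shift(series, start, delta):
    """Scenario 1: shift the mean by delta from index start onward."""
    return [x + delta if t >= start else x for t, x in enumerate(series)]


def add_outlier(series, index, magnitude):
    """Scenario 2: add a single outlier at the given index."""
    out = list(series)
    out[index] += magnitude
    return out


def add_trend(series, start, slope):
    """Scenario 3: add a linear trend that begins at index start."""
    return [x + slope * (t - start + 1) if t >= start else x
            for t, x in enumerate(series)]


# Usage: 57 years of monthly data (1950 to 2006 inclusive), aggregated to years.
n_years = 57
clean = yearly_means(simulate_ar1(n_years * 12))
scenarios = {
    "shift_beginning": add_mean_shift(clean, 5, 1.0),
    "shift_middle": add_mean_shift(clean, n_years // 2, 1.0),
    "shift_end": add_mean_shift(clean, n_years - 5, 1.0),
    "outlier": add_outlier(clean, n_years // 2, 3.0),
    "trend": add_trend(clean, n_years // 2, 0.05),
}
print({name: len(values) for name, values in scenarios.items()})
```

In a Monte Carlo study, these steps would be repeated over many seeds, with each homogeneity test applied to every generated series and its detection rate recorded per scenario.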

Keywords: Homogeneity; Kruskal-Wallis Test; Friedman Test; Augmented Dickey Fuller Test

Biography: I was born in Istanbul, Turkey, in 1974. After receiving my BS and MSc degrees from the Department of Statistics at Middle East Technical University, I received my PhD degree from the Department of Statistics at Temple University, USA. My main research interest is time series analysis. Currently, I am an assistant professor in the Department of Statistics, Middle East Technical University, Turkey.