Statistical Aspects of Analysing Effects of Rainfall on Water Quality
Spiridon Penev1, Daniela Leonte2, Zdravetz Lazarov3
1Statistics, The University of New South Wales, Sydney, NSW, Australia; 2Risk Analytics Group Pty Ltd, Sydney, NSW, Australia; 3Boronia Capital Pty Ltd, Sydney, NSW, Australia

We discuss some novel statistical methods for analysing trends in water quality. Analysis of these trends is an important activity in catchment management. Such analysis deals with large and complex data sets containing different classes of variables, such as water quality, hydrological and meteorological variables.

Distinguishing features of water quality records include irregular sampling days, multiple observations on a single day, changing detection limits and differing frequencies of data collection. Non-normality of many of the variables, missing data and seasonal patterns must also be accounted for in the analysis.

In this talk we concentrate on analysing the effect of rainfall on trends in water quality variables.

Our approach is to use a flexible statistical model which, until now, seems to have been used only in the financial and econometric literature. The model is called Mixed Data Sampling (MIDAS). The name reflects the mixed frequencies of data collection: typically, water quality variables are sampled fortnightly, whereas rainfall is recorded daily. Rainfall can affect water quality on the day the water quality variable is measured, or through a weighted influence of rainfall on previous days.

The advantage of MIDAS regression lies in its simple, flexible and parsimonious modelling of the influence of rain on trends in water quality variables. Our most successful application involves beta weights. In this case, only three additional parameters are needed to model and fit a variety of monotone and humped shapes for the weight coefficients of the lags in the rain impact. We will discuss the theoretical formulation of the model, its implementation on a water quality data set, and some outcomes that justify its benefits.
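
As a rough illustration of the beta-weight idea, the Python sketch below builds normalised lag weights from two shape parameters and combines them with daily rainfall into a single regressor for each water quality sampling day. It is a minimal sketch under stated assumptions, not the implementation used in this work: the function names, the 14-day lag window, and the reading of the three parameters as one slope plus two shape parameters (theta1, theta2) are all illustrative assumptions. The beta density's normalising constant is omitted because it cancels when the weights are rescaled to sum to one.

```python
import numpy as np


def beta_weights(n_lags, theta1, theta2, eps=1e-6):
    """Normalised beta-shaped lag weights (illustrative helper, not from the source).

    The grid stays strictly inside (0, 1) so that shape parameters below 1
    do not produce a division by zero at the endpoints.
    """
    x = np.linspace(eps, 1.0 - eps, n_lags)
    raw = x ** (theta1 - 1.0) * (1.0 - x) ** (theta2 - 1.0)
    return raw / raw.sum()


def midas_rain_term(daily_rain, sample_days, n_lags, slope, theta1, theta2):
    """Weighted rainfall regressor for each low-frequency sampling day.

    daily_rain  : 1-D array of daily rainfall totals
    sample_days : indices into daily_rain on which water quality was measured
                  (each index is assumed to be at least n_lags - 1)
    slope       : overall size of the rain effect (assumed third parameter)
    """
    daily_rain = np.asarray(daily_rain, dtype=float)
    w = beta_weights(n_lags, theta1, theta2)
    terms = []
    for t in sample_days:
        # Reverse the window so that position 0 is the sampling day itself
        # and position k is rainfall k days earlier.
        lags = daily_rain[t - n_lags + 1 : t + 1][::-1]
        terms.append(slope * np.dot(w, lags))
    return np.array(terms)


if __name__ == "__main__":
    # Synthetic rainfall, used only to show the call pattern.
    rng = np.random.default_rng(0)
    rain = rng.gamma(shape=0.5, scale=4.0, size=60)
    fortnightly = [13, 27, 41, 55]
    print(beta_weights(14, 1.0, 5.0))   # monotone decaying weights
    print(beta_weights(14, 2.0, 5.0))   # humped weights
    print(midas_rain_term(rain, fortnightly, 14, 0.8, 2.0, 5.0))
```

With theta1 = 1 and theta2 > 1 the weights decay monotonically from the sampling day backwards, while theta1 > 1 and theta2 > 1 give a humped profile in which rain a few days earlier carries the most weight. In a full analysis the slope and shape parameters would be estimated jointly with the trend and seasonal terms, for example by nonlinear least squares.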


Ghysels, E., Sinko, A., and Valkanov, R. (2007) MIDAS Regressions: Further Results and New Directions. Econometric Reviews 26 (1), 53–90.

Keywords: Water quality; Mixed data sampling; Regression; Parsimony

Biography: Spiridon Penev is Associate Professor at the Department of Statistics, The University of New South Wales, Sydney. His research interests are in nonparametric statistics, asymptotic statistics and structural equation models. He has wide-ranging interests in applications of statistical methodology in diverse areas such as finance and risk management, behavioural sciences, and engineering.