Segmentation of Time Series with Heteroskedastic Components
Christian Derquenne
Research and Development, Electricité de France, Clamart, France

Time series can be decomposed into several general types of components: trend, seasonality, volatility and noise. These components may be more or less regular depending on the application domain. The behavioral changes that characterize these series are mainly of a few types: peaks, jumps in level or trend, and jumps in variability. Modeling such series is delicate and requires considerable experience in the application domain. Detecting breaks in behavior can be essential for many purposes: making a time series stationary through segmentation, building symbolic curves for cluster analysis, modeling multivariate time series, etc. Many segmentation methods [2,3] have been developed to address various problems in economics, finance, human sequencing, meteorology, energy management, etc. Most of these methods use dynamic programming to drastically reduce the number of segmentations that must be examined. These break-point detection methods are designed to solve three problems [3]: detecting a change in the mean with a constant variance, detecting a change in the variance with a constant mean, and detecting changes in the entire distribution of the phenomenon, without distinguishing between changes in level, in variability and in the distribution of the errors. We have introduced a method [1] that not only reduces complexity compared with other methods but, above all, proposes segmentations of the series into a sequence of increasing, decreasing and constant segments with different variances. The originality of our method lies in its approach: it proceeds in successive stages to provide decision support for data segmentation. There are two phases: data preparation, which provides a reasonable way to segment the data, and successive adaptive modeling (heteroskedastic linear model). This method has been tested on many datasets (simulated and real) and has yielded very good results. It competes strongly with algorithms based on dynamic programming, especially when the variability differs from one segment to another.
However, it turned out that, although our method is generally more efficient in this case, it can still fail. We therefore propose an improvement consisting of two new phases. The first is a segmentation of the variances on a transformation of the data, while the second applies the method of [1] again, taking this information into account. A comparative study is made between dynamic programming algorithms, our previous method and the new one. The quality of the results was significantly improved.
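
As an illustration of two ideas mentioned above (dynamic programming over segmentations, and segmenting variances on a transformation of the data), the following Python sketch may help. It is neither the method of [1] nor the new method proposed here: it is a textbook least-squares dynamic program for a fixed number of segments, which examines on the order of K·n² candidate segments instead of enumerating all C(n-1, K-1) segmentations of n points into K pieces. The transformation used for the variances (squared first differences) is an assumption made for this example, since the abstract does not specify which transformation is applied. The function names segment_mean_dp and segment_variance_dp are hypothetical.

```python
import numpy as np


def segment_mean_dp(y, n_segments):
    """Least-squares split of y into n_segments contiguous pieces.

    Returns (breaks, cost): breaks[i] is the index where segment i + 1 starts.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Prefix sums make each segment cost an O(1) computation.
    s1 = np.concatenate(([0.0], np.cumsum(y)))
    s2 = np.concatenate(([0.0], np.cumsum(y ** 2)))

    def cost(i, j):
        # Sum of squared deviations of y[i:j] from its own mean.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # best[k, j]: minimal cost of splitting y[:j] into k segments.
    best = np.full((n_segments + 1, n + 1), np.inf)
    last = np.zeros((n_segments + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = best[k - 1, i] + cost(i, j)
                if c < best[k, j]:
                    best[k, j] = c
                    last[k, j] = i

    # Walk back through the stored split points.
    breaks, j = [], n
    for k in range(n_segments, 1, -1):
        j = int(last[k, j])
        breaks.append(j)
    return sorted(breaks), float(best[n_segments, n])


def segment_variance_dp(y, n_segments):
    """Variance segmentation via a mean segmentation of squared differences.

    Assumption (not from the abstract): z[t] = (y[t + 1] - y[t]) ** 2 is the
    transformation, so a change in variance of y shows up as a change in the
    mean of z.
    """
    z = np.diff(np.asarray(y, dtype=float)) ** 2
    return segment_mean_dp(z, n_segments)


# Usage: a level jump after five points is recovered as a single break.
y = np.concatenate((np.zeros(5), np.full(5, 10.0)))
breaks, cost = segment_mean_dp(y, 2)  # breaks == [5]
```

The triple loop is kept plain for readability. Break indices returned by segment_variance_dp refer to the differenced series, so they can be off by one position relative to the original series.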


[1] Derquenne C. (2011), An Explanatory Segmentation Method for Time Series, in Proceedings of Compstat'2010, Y. Lechevallier & G. Saporta (eds.), 1st Edition, pp. 935-942.

[2] Guédon Y. (2008), Exploring the segmentation space for the assessment of multiple change-point models, National Institute for Research in Computer Science and Control, Research Paper 6619.

[3] Lavielle M. and Teyssières G. (2006), Detecting multiple breaks in multivariate time series, Lietuvos Matematikos Rinkinys, vol. 46.

Keywords: Time series analysis; Change-point; Heteroskedastic Linear Gaussian Model; Variance components

Biography: PhD in Statistics, Senior Researcher at Electricité de France R&D. Main research interests: Categorical Data Analysis, Generalized Linear Models, Structural Equation Modelling, Bayesian Networks, Clustering Methods, Robust Regression, Methods for Combining Data from Different Sources, Goodness-of-Fit Tests, Time Series, Segmentation. More than seventy publications and conference communications. Current main application area: energy management. Teaches statistics at several statistics schools and universities. Head of training courses of the French Statistical Society, Elected Member of the International Statistical Institute (ISI), Past President of the French Members Group of ISI, expert for the French Research Agency and Member of the Scientific Council of KXEN.