Several approaches to estimating a heavy-tailed probability density function (pdf) are summarized: (1) a combined parametric-nonparametric method, (2) methods based on data transformations, and (3) a variable bandwidth kernel estimator.
The first method estimates the 'tail' and the 'body' of the pdf separately, by parametric and nonparametric methods, respectively. We consider a Pareto-type model to fit the 'tail' and a finite series expansion in terms of trigonometric functions as the 'body' estimate. To fit the body of a multi-modal pdf better, we use a structural risk minimization method to select the parameters.
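The combined estimate can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Hill estimator stands in for the Pareto-type tail fit, a cosine basis on [0, u] stands in for the trigonometric expansion, and the number of series terms `n_terms` is fixed by hand where the abstract uses structural risk minimization. The names `hill_estimate`, `cosine_series_density` and `combined_density` are hypothetical, and the data are assumed to be positive.

```python
import math


def hill_estimate(sample, k):
    """Hill estimate of gamma = 1/alpha from the k largest observations.

    Returns (gamma, u), where u is the (k+1)-th largest value; it is used
    here as the threshold separating 'body' and 'tail' (an assumption).
    """
    xs = sorted(sample)
    u = xs[-k - 1]
    gamma = sum(math.log(x / u) for x in xs[-k:]) / k
    return gamma, u


def cosine_series_density(points, u, n_terms):
    """Density on [0, u] from a finite cosine series with empirical coefficients."""
    m = len(points)
    scale = math.sqrt(2.0 / u)
    # Empirical coefficient of the basis function scale * cos(pi * j * x / u).
    coefs = [
        sum(scale * math.cos(math.pi * j * p / u) for p in points) / m
        for j in range(1, n_terms + 1)
    ]

    def body(x):
        series = sum(
            a * scale * math.cos(math.pi * j * x / u)
            for j, a in enumerate(coefs, start=1)
        )
        # The constant term 1/u integrates to 1 on [0, u]; the cosine terms
        # integrate to 0. A truncated series may dip below zero.
        return 1.0 / u + series

    return body


def combined_density(sample, k, n_terms):
    """Series 'body' on [0, u] glued to a Pareto-type 'tail' above u."""
    n = len(sample)
    gamma, u = hill_estimate(sample, k)
    body = cosine_series_density([x for x in sample if x <= u], u, n_terms)
    p_tail = k / n  # empirical share of observations above the threshold

    def pdf(x):
        if x < 0:
            return 0.0
        if x <= u:
            return (1.0 - p_tail) * body(x)
        # Pareto-type tail: P(X > x | X > u) = (x / u) ** (-1 / gamma).
        return p_tail * (x / u) ** (-1.0 / gamma - 1.0) / (gamma * u)

    return pdf, gamma, u
```

Calling `combined_density(data, k=100, n_terms=5)` on a positive sample returns the glued density together with the tail index estimate and the threshold. The choices of `k` and `n_terms` are illustrative only.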
The second approach requires a special data transformation that improves the estimation in the 'tails', namely, the transformation from a Generalized Pareto distribution function (df), assumed as the fitted df, to a triangular df selected as the target df. This transformation is robust to uncertainty in the tail index estimate. The triangular pdf can be estimated by a nonparametric estimator, e.g., a Parzen kernel estimator or a polygram. For heavy-tailed pdf estimation, a kernel estimator with a variable bandwidth is usually recommended because its bandwidth adapts to each observation. It is demonstrated that this estimator works better if a preliminary data transformation is used.
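The transformation step and the variable bandwidth estimator can be sketched together. The sketch assumes the Generalized Pareto df Ψ(x) = 1 - (1 + γx)^(-1/γ) with γ > 0 and, as the target, the triangular df Φ(t) = 2t - t² on [0, 1] (pdf 2(1 - t)); the abstract does not specify which triangular df is used, so this choice is an assumption. Under it, T(x) = Φ⁻¹(Ψ(x)) = 1 - (1 + γx)^(-1/(2γ)). The variable bandwidth rule h_i = h / sqrt(pilot(X_i)) is the common square-root law with a Gaussian kernel, also an assumption, and boundary effects on [0, 1] are ignored. All function names are hypothetical.

```python
import math


def gpd_to_triangular(x, gamma):
    """T(x) = 1 - (1 + gamma * x) ** (-1 / (2 * gamma)), mapping [0, inf) to [0, 1)."""
    return 1.0 - (1.0 + gamma * x) ** (-1.0 / (2.0 * gamma))


def gpd_to_triangular_deriv(x, gamma):
    """Derivative T'(x), needed to map the density back to the original scale."""
    return 0.5 * (1.0 + gamma * x) ** (-1.0 / (2.0 * gamma) - 1.0)


def gauss_kernel(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def parzen_estimate(points, h):
    """Fixed-bandwidth Parzen kernel estimate (used here as the pilot)."""
    n = len(points)

    def g(t):
        return sum(gauss_kernel((t - p) / h) for p in points) / (n * h)

    return g


def variable_bandwidth_estimate(points, h):
    """Kernel estimate with one bandwidth per observation."""
    pilot = parzen_estimate(points, h)
    # Narrow kernels where the pilot density is high, wide where it is low.
    widths = [h / math.sqrt(pilot(p)) for p in points]
    n = len(points)

    def g(t):
        return sum(gauss_kernel((t - p) / w) / w for p, w in zip(points, widths)) / n

    return g


def transformed_density(sample, gamma, h):
    """Estimate the pdf of the original data via the transformed sample."""
    t_points = [gpd_to_triangular(x, gamma) for x in sample]
    g = variable_bandwidth_estimate(t_points, h)

    def pdf(x):
        # Change of variables: f(x) = g(T(x)) * T'(x).
        return g(gpd_to_triangular(x, gamma)) * gpd_to_triangular_deriv(x, gamma)

    return pdf
```

In practice γ would come from a tail index estimator and h from a data-driven rule; here both are passed in by the caller, for example `transformed_density(data, gamma=0.5, h=0.1)`.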
To select data-driven smoothing parameters for the mentioned estimators, a discrepancy method is considered as an alternative to the cross-validation method. The discrepancy method is based on nonparametric statistics such as the Kolmogorov-Smirnov or the von Mises-Smirnov statistics, and it uses quantiles of their limit distributions. Moreover, the convergence rates of the resulting estimates are discussed.
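A minimal sketch of the discrepancy idea with the Kolmogorov-Smirnov statistic is given below. It assumes a Gaussian kernel estimate with a fixed bandwidth h and picks h so that sqrt(n) * sup|F_n(x) - F_h(x)| equals a chosen value `target`, where F_n is the empirical df and F_h is the df of the kernel estimate. The default `target=0.5` is a placeholder for a quantile of the limiting Kolmogorov distribution and is an assumption, as are the function names and the bisection search.

```python
import math


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ks_discrepancy(sample, h):
    """sqrt(n) * sup_x |F_n(x) - F_h(x)| for a Gaussian kernel estimate."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        # df of the kernel estimate at the i-th order statistic
        f_h = sum(normal_cdf((x - xi) / h) for xi in xs) / n
        # F_h is continuous, so the supremum is attained at a jump of F_n.
        d = max(d, i / n - f_h, f_h - (i - 1) / n)
    return math.sqrt(n) * d


def select_bandwidth(sample, target=0.5, lo=1e-3, hi=100.0, iters=40):
    """Find h with ks_discrepancy(sample, h) close to target by bisection on log h."""
    if not ks_discrepancy(sample, lo) < target < ks_discrepancy(sample, hi):
        raise ValueError("target is not bracketed by [lo, hi]")
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # geometric midpoint
        if ks_discrepancy(sample, mid) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)
```

The statistic is continuous in h, small for very small h and large for very large h, so the bisection converges to a bandwidth at which it crosses `target`. The same scheme would apply to the von Mises-Smirnov statistic with its own limit distribution.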
Keywords: Heavy-tailed probability density function; Nonparametric estimation; Smoothing parameter; Discrepancy method
Biography: Natalia Markovich received the M.S. degree in Mathematics (with distinction) from the Applied Mathematics and Cybernetics Department of Tver University, Russia, and the Ph.D. and Doctor of Sciences degrees in Statistics from the Institute of Control Sciences of the Russian Academy of Sciences in Moscow. She is a Main Scientist at the Institute of Control Sciences, and her research interests include extreme value theory, nonparametrics, and the statistical analysis of measurements in telecommunication systems, including the Internet.
She has published over 60 papers and the book “Nonparametric Analysis of Univariate Heavy-Tailed Data: Research and Practice”, Wiley, 2007.