JOURNAL ARTICLE

Estimation of a probability density function using interval aggregated data

Jianhua Z. Huang, Xueying Wang, Ximing Wu, Lan Zhou

Year: 2016 Journal: Journal of Statistical Computation and Simulation Vol: 86 (15) Pages: 3093-3105 Publisher: Taylor & Francis

Abstract

In economics and government statistics, aggregated data rather than individual-level data are usually reported, both for data confidentiality and for simplicity. In this paper we develop a method of flexibly estimating the probability density function of the population using aggregated data obtained as group averages when individual-level data are grouped according to quantile limits. The kernel density estimator has been commonly applied to such data without taking into account the data aggregation process and has been shown to perform poorly. Our method models the quantile function as an integral of the exponential of a spline function and deduces the density function from the quantile function. We match the aggregated data to their theoretical counterparts using least squares, and regularize the estimation by using the squared second derivatives of the density function as the penalty function. A computational algorithm is developed to implement the method. Applications to simulated data and US household income survey data show that our penalized spline estimator can accurately recover the density function of the underlying population, whereas the commonly used kernel density estimator is severely biased. The method is applied to study the dynamics of China's urban income distribution using published interval aggregated data for 1985–2010.
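The two relationships the abstract relies on can be illustrated with a small sketch. This is not the authors' penalized spline algorithm; it only checks, for a known distribution, that (a) the average of observations grouped between quantile limits p_lo and p_hi matches the average of the quantile function Q over [p_lo, p_hi], and (b) the density follows from the quantile function as f(Q(u)) = 1 / Q'(u). The normal "income" distribution, the quintile limits, the sample size, and all function names are assumptions made for illustration.

```python
import random
from statistics import NormalDist


def empirical_group_means(sample, probs):
    """Mean of the observations lying between consecutive quantile limits."""
    xs = sorted(sample)
    cuts = [round(p * len(xs)) for p in probs]
    return [sum(xs[a:b]) / (b - a) for a, b in zip(cuts[:-1], cuts[1:])]


def theoretical_group_means(quantile_fn, probs, steps=2000):
    """Midpoint-rule value of (1 / (p_hi - p_lo)) * integral of Q(u) du per group."""
    means = []
    for lo, hi in zip(probs[:-1], probs[1:]):
        h = (hi - lo) / steps
        # Averaging Q at the cell midpoints equals the integral divided by (hi - lo).
        means.append(sum(quantile_fn(lo + (i + 0.5) * h) for i in range(steps)) / steps)
    return means


def density_at_quantile(quantile_fn, u, eps=1e-5):
    """f(Q(u)) = 1 / Q'(u), with Q'(u) approximated by a central difference."""
    dq = (quantile_fn(u + eps) - quantile_fn(u - eps)) / (2 * eps)
    return 1.0 / dq


# Hypothetical population: normal with mean 50 and standard deviation 10.
dist = NormalDist(mu=50.0, sigma=10.0)
probs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]  # quintile limits (assumed grouping)

rng = random.Random(0)
sample = [rng.gauss(50.0, 10.0) for _ in range(100_000)]

empirical = empirical_group_means(sample, probs)
theoretical = theoretical_group_means(dist.inv_cdf, probs)
for e, t in zip(empirical, theoretical):
    print(f"group mean: empirical {e:.3f}, theoretical {t:.3f}")

median_density = density_at_quantile(dist.inv_cdf, 0.5)
print(f"density at the median: {median_density:.6f} (exact {dist.pdf(50.0):.6f})")
```

In an estimation setting the quantile function is unknown, so the theoretical group means would be computed from a parametrized Q and compared with the reported averages in a least-squares criterion. Here Q is known, which keeps the sketch to a consistency check.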

Keywords:
Kernel density estimation; Mathematics; Estimator; Probability density function; Quantile function; Density estimation; Quantile; Variable kernel density estimation; Statistics; Multivariate kernel density estimation; Population; Kernel method; Mathematical optimization; Cumulative distribution function; Computer science; Artificial intelligence; Support vector machine

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 0.19
Refs: 26
Citation Normalized Percentile: 0.65

Topics

Hydrology and Drought Analysis
Physical Sciences →  Environmental Science →  Global and Planetary Change
Statistical Methods and Inference
Physical Sciences →  Mathematics →  Statistics and Probability
Advanced Statistical Methods and Models
Physical Sciences →  Mathematics →  Statistics and Probability

Related Documents

JOURNAL ARTICLE

Probability density estimation using data projection

Mindaugas Kavaliauskas

Journal: Lietuvos matematikos rinkinys Year: 2009 Vol: 50
JOURNAL ARTICLE

Probability density estimation using incomplete data

Kamalaldin Morad, William Y. Svrcek, Ian McKay

Journal: ISA Transactions Year: 2000 Vol: 39 (4) Pages: 379-399
JOURNAL ARTICLE

Probability Density Function Estimation Using Gamma Kernels

Song Xi Chen

Journal: Annals of the Institute of Statistical Mathematics Year: 2000 Vol: 52 (3) Pages: 471-480
JOURNAL ARTICLE

Probability Density Function Estimation Using Orthogonal Forward Regression

Sheng Chen, Xia Hong, C.J. Harris

Journal:   IEEE International Conference on Neural Networks/IEEE ... International Conference on Neural Networks Year: 2007 Vol: 36 Pages: 2492-2497
JOURNAL ARTICLE

Probability density function estimation using the MinMax measure

M. Srikanth, H. K. Kesavan, P.H. Roe

Journal: IEEE Transactions on Systems Man and Cybernetics Part C (Applications and Reviews) Year: 2000 Vol: 30 (1) Pages: 77-83