statistics-0.15.0.0: A library of statistical types, data, and functions

Copyright: (c) 2009 Bryan O'Sullivan
License: BSD3
Maintainer: bos@serpentine.com
Stability: experimental
Portability: portable
Safe Haskell: None
Language: Haskell98

Statistics.Sample.KernelDensity.Simple

Description

Deprecated: Use Statistics.Sample.KernelDensity instead.

Kernel density estimation code, providing non-parametric ways to estimate the probability density function of a sample.

The techniques used by functions in this module are relatively fast, but they generally give inferior results to the kde function in Statistics.Sample.KernelDensity, because the bandwidth estimate used here tends to oversmooth (see bandwidth below).

Simple entry points

epanechnikovPDF #

Arguments

:: Vector v Double 
=> Int

Number of points at which to estimate

-> v Double

Data sample

-> (Points, Vector Double) 

Simple Epanechnikov kernel density estimator. Returns the uniformly spaced points from the sample range at which the density function was estimated, and the estimates at those points.

gaussianPDF #

Arguments

:: Vector v Double 
=> Int

Number of points at which to estimate

-> v Double

Data sample

-> (Points, Vector Double) 

Simple Gaussian kernel density estimator. Returns the uniformly spaced points from the sample range at which the density function was estimated, and the estimates at those points.
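
As a rough usage sketch, both entry points can be called with a point count and a sample, and the result unpacked through the fromPoints field of Points. The sample values and the printPairs helper are made up for illustration, and the unboxed vector for the input is just one choice of Vector instance. Per the deprecation notice above, GHC will warn when this module is imported.

```haskell
import qualified Data.Vector.Generic as G
import qualified Data.Vector.Unboxed as U
import Statistics.Sample.KernelDensity.Simple
  (Points (..), epanechnikovPDF, gaussianPDF)

-- Hypothetical sample data, chosen only for illustration.
sample :: U.Vector Double
sample = U.fromList [1.2, 1.9, 2.1, 2.4, 3.3, 3.8, 4.0, 5.6]

-- Hypothetical helper: print each estimation point next to its estimate.
printPairs :: String -> [Double] -> [Double] -> IO ()
printPairs label xs ys = do
  putStrLn label
  mapM_ print (zip xs ys)

main :: IO ()
main = do
  -- Estimate the density at 16 uniformly spaced points with each kernel.
  let (ptsE, densE) = epanechnikovPDF 16 sample
      (ptsG, densG) = gaussianPDF 16 sample
  printPairs "Epanechnikov" (G.toList (fromPoints ptsE)) (G.toList densE)
  printPairs "Gaussian" (G.toList (fromPoints ptsG)) (G.toList densG)
```

The two results generally cover different ranges, so the point vectors should not be assumed interchangeable between kernels.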

Building blocks

Choosing points from a sample

newtype Points #

Points from the range of a Sample.

Constructors

Points

Fields

  • fromPoints :: Vector Double

Instances
Eq Points # 
Instance details

Defined in Statistics.Sample.KernelDensity.Simple

Methods

(==) :: Points -> Points -> Bool #

(/=) :: Points -> Points -> Bool #

Data Points # 
Instance details

Defined in Statistics.Sample.KernelDensity.Simple

Methods

gfoldl :: (forall d b. Data d => c (d -> b) -> d -> c b) -> (forall g. g -> c g) -> Points -> c Points #

gunfold :: (forall b r. Data b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c Points #

toConstr :: Points -> Constr #

dataTypeOf :: Points -> DataType #

dataCast1 :: Typeable t => (forall d. Data d => c (t d)) -> Maybe (c Points) #

dataCast2 :: Typeable t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c Points) #

gmapT :: (forall b. Data b => b -> b) -> Points -> Points #

gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> Points -> r #

gmapQr :: (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> Points -> r #

gmapQ :: (forall d. Data d => d -> u) -> Points -> [u] #

gmapQi :: Int -> (forall d. Data d => d -> u) -> Points -> u #

gmapM :: Monad m => (forall d. Data d => d -> m d) -> Points -> m Points #

gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> Points -> m Points #

gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> Points -> m Points #

Read Points # 
Instance details

Defined in Statistics.Sample.KernelDensity.Simple

Show Points # 
Instance details

Defined in Statistics.Sample.KernelDensity.Simple

Generic Points # 
Instance details

Defined in Statistics.Sample.KernelDensity.Simple

Associated Types

type Rep Points :: Type -> Type #

Methods

from :: Points -> Rep Points x #

to :: Rep Points x -> Points #

ToJSON Points # 
Instance details

Defined in Statistics.Sample.KernelDensity.Simple

FromJSON Points # 
Instance details

Defined in Statistics.Sample.KernelDensity.Simple

Binary Points # 
Instance details

Defined in Statistics.Sample.KernelDensity.Simple

Methods

put :: Points -> Put #

get :: Get Points #

putList :: [Points] -> Put #

type Rep Points # 
Instance details

Defined in Statistics.Sample.KernelDensity.Simple

type Rep Points = D1 (MetaData "Points" "Statistics.Sample.KernelDensity.Simple" "statistics-0.15.0.0-KYJLg9h4jsl1bBm8KLc3A8" True) (C1 (MetaCons "Points" PrefixI True) (S1 (MetaSel (Just "fromPoints") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 (Vector Double))))

choosePoints #

Arguments

:: Vector v Double 
=> Int

Number of points to select, n

-> Double

Sample bandwidth, h

-> v Double

Input data

-> Points 

Choose a uniform range of points at which to estimate a sample's probability density function.

If you are using a Gaussian kernel, multiply the sample's bandwidth by 3 before passing it to this function.

If this function is passed an empty vector, it returns values of positive and negative infinity.
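
The advice about Gaussian kernels can be illustrated with a short sketch that computes a bandwidth with bandwidth and passes it to choosePoints, unchanged for the Epanechnikov case and multiplied by 3 for the Gaussian case. The sample values and the point count of 5 are arbitrary.

```haskell
import qualified Data.Vector.Generic as G
import qualified Data.Vector.Unboxed as U
import Statistics.Sample.KernelDensity.Simple
  (Points (..), bandwidth, choosePoints, epanechnikovBW, gaussianBW)

-- Hypothetical sample data, chosen only for illustration.
sample :: U.Vector Double
sample = U.fromList [1.2, 1.9, 2.1, 2.4, 3.3, 3.8, 4.0, 5.6]

main :: IO ()
main = do
  let hE = bandwidth epanechnikovBW sample
      hG = bandwidth gaussianBW sample
      -- Epanechnikov kernel: pass the bandwidth unchanged.
      ptsE = choosePoints 5 hE sample
      -- Gaussian kernel: multiply the bandwidth by 3 first.
      ptsG = choosePoints 5 (3 * hG) sample
  print (G.toList (fromPoints ptsE))
  print (G.toList (fromPoints ptsG))
```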

Bandwidth estimation

type Bandwidth = Double #

The width of the convolution kernel used.

bandwidth :: Vector v Double => (Double -> Bandwidth) -> v Double -> Bandwidth #

Compute the optimal bandwidth from the observed data for the given kernel.

This function uses an estimate based on the standard deviation of a sample (due to Deheuvels), which performs reasonably well for unimodal distributions but leads to oversmoothing for more complex ones.

epanechnikovBW :: Double -> Bandwidth #

Bandwidth estimator for an Epanechnikov kernel.

gaussianBW :: Double -> Bandwidth #

Bandwidth estimator for a Gaussian kernel.
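
To make the shape of these functions concrete, here is a self-contained sketch of a standard-deviation-based bandwidth that uses only base. It relies on two assumptions not confirmed by this page: that the Double passed to an estimator is the sample size, and that the result is multiplied by the sample standard deviation. The constant in ruleOfThumbBW is the familiar normal-reference rule of thumb for Gaussian kernels, shown for illustration rather than as the library's definition of gaussianBW. All names below are hypothetical.

```haskell
-- Sample standard deviation (n - 1 in the denominator).
-- Needs at least two values to be meaningful.
sampleStdDev :: [Double] -> Double
sampleStdDev xs = sqrt (sum [(x - mean) ^ (2 :: Int) | x <- xs] / (n - 1))
  where
    n = fromIntegral (length xs)
    mean = sum xs / n

-- Normal-reference rule of thumb for a Gaussian kernel: (4 / (3 n)) ^ (1/5).
-- Assumption: the argument is the sample size.
ruleOfThumbBW :: Double -> Double
ruleOfThumbBW n = (4 / (3 * n)) ** 0.2

-- Scale the estimator's result by the spread of the data.
-- Assumption: mirrors the type of `bandwidth`, not its actual source.
sketchBandwidth :: (Double -> Double) -> [Double] -> Double
sketchBandwidth estimator xs =
  sampleStdDev xs * estimator (fromIntegral (length xs))

main :: IO ()
main = print (sketchBandwidth ruleOfThumbBW [1.2, 1.9, 2.1, 2.4, 3.3, 3.8, 4.0, 5.6])
```

Because the estimate depends on the data only through the standard deviation and the sample size, two well-separated clusters inflate the bandwidth, which is consistent with the oversmoothing noted for bandwidth.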

Kernels

type Kernel = Double -> Double -> Double -> Double -> Double #

The convolution kernel. Its parameters are as follows:

  • Scaling factor, 1/nh
  • Bandwidth, h
  • A point at which to sample the input, p
  • One sample value, v
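
Since Kernel is a plain function type, a user-defined kernel with the same four parameters can be passed to estimatePDF. The sketch below defines a hypothetical triangular kernel, which is not part of the library, and uses an arbitrary fixed bandwidth of 0.8 instead of an estimated one.

```haskell
import qualified Data.Vector.Generic as G
import qualified Data.Vector.Unboxed as U
import Statistics.Sample.KernelDensity.Simple
  (Kernel, Points (..), choosePoints, estimatePDF)

-- Hypothetical triangular kernel, not provided by the library.
-- f is the scaling factor 1/nh, h the bandwidth, p the estimation point,
-- and v one sample value.
triangularKernel :: Kernel
triangularKernel f h p v = f * max 0 (1 - abs u)
  where
    u = (v - p) / h

-- Hypothetical sample data, chosen only for illustration.
sample :: U.Vector Double
sample = U.fromList [1.2, 1.9, 2.1, 2.4, 3.3, 3.8, 4.0, 5.6]

main :: IO ()
main = do
  let h = 0.8 -- arbitrary bandwidth, for illustration only
      pts = choosePoints 10 h sample
      dens = estimatePDF triangularKernel h sample pts
  mapM_ print (zip (G.toList (fromPoints pts)) (G.toList dens))
```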

epanechnikovKernel :: Kernel #

Epanechnikov kernel for probability density function estimation.

gaussianKernel :: Kernel #

Gaussian kernel for probability density function estimation.

Low-level estimation

estimatePDF #

Arguments

:: Vector v Double 
=> Kernel

Kernel function

-> Bandwidth

Bandwidth, h

-> v Double

Sample data

-> Points

Points at which to estimate

-> Vector Double 

Kernel density estimator, providing a non-parametric way of estimating the PDF of a random variable.
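
The general technique can be sketched without the library: for each estimation point, sum the kernel's contribution from every sample value, using the scaling factor 1/nh. This is a minimal list-based illustration of kernel density estimation, not the library's implementation. KernelSketch, gaussianSketch and estimateSketch are hypothetical names, and the bandwidth of 0.6 is arbitrary.

```haskell
-- Same four-argument shape as the Kernel type in this module:
-- scaling factor, bandwidth, estimation point, sample value.
type KernelSketch = Double -> Double -> Double -> Double -> Double

-- Standard normal density of u = (v - p) / h, scaled by f = 1/(n h).
gaussianSketch :: KernelSketch
gaussianSketch f h p v = f * exp (-0.5 * u * u) / sqrt (2 * pi)
  where
    u = (v - p) / h

-- For each point p, add up the kernel's contribution from every sample value.
estimateSketch :: KernelSketch -> Double -> [Double] -> [Double] -> [Double]
estimateSketch kernel h sample points =
  [sum [kernel f h p v | v <- sample] | p <- points]
  where
    f = 1 / (fromIntegral (length sample) * h)

main :: IO ()
main = do
  let sample = [1.2, 1.9, 2.1, 2.4, 3.3, 3.8, 4.0, 5.6]
      grid = [0, 0.5 .. 7]
  mapM_ print (zip grid (estimateSketch gaussianSketch 0.6 sample grid))
```

Each point costs one pass over the sample, so the total work grows with the number of points times the sample size.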

simplePDF #

Arguments

:: Vector v Double 
=> (Double -> Double)

Bandwidth function

-> Kernel

Kernel function

-> Double

Bandwidth scaling factor (3 for a Gaussian kernel, 1 for all others)

-> Int

Number of points at which to estimate

-> v Double

Sample data

-> (Points, Vector Double) 

A helper for creating a simple kernel density estimation function with automatically chosen bandwidth and estimation points.
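
A brief sketch of calling simplePDF directly, pairing each bandwidth function with its kernel and the scaling factor suggested in the argument description (3 for Gaussian, 1 otherwise). The sample values and the point count are arbitrary. Whether these calls match epanechnikovPDF and gaussianPDF exactly is an assumption, not something stated on this page.

```haskell
import qualified Data.Vector.Generic as G
import qualified Data.Vector.Unboxed as U
import Statistics.Sample.KernelDensity.Simple
  ( Points (..), epanechnikovBW, epanechnikovKernel
  , gaussianBW, gaussianKernel, simplePDF )

-- Hypothetical sample data, chosen only for illustration.
sample :: U.Vector Double
sample = U.fromList [1.2, 1.9, 2.1, 2.4, 3.3, 3.8, 4.0, 5.6]

main :: IO ()
main = do
  -- Gaussian kernel: bandwidth scaling factor 3.
  let (ptsG, densG) = simplePDF gaussianBW gaussianKernel 3 16 sample
  -- Epanechnikov kernel: bandwidth scaling factor 1.
      (ptsE, densE) = simplePDF epanechnikovBW epanechnikovKernel 1 16 sample
  mapM_ print (zip (G.toList (fromPoints ptsG)) (G.toList densG))
  mapM_ print (zip (G.toList (fromPoints ptsE)) (G.toList densE))
```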

References

  • Deheuvels, P. (1977) Estimation non paramétrique de la densité par histogrammes généralisés. <http://archive.numdam.org/article/RSA_1977__25_3_5_0.pdf>