statistics-0.15.0.0: A library of statistical types, data, and functions

Copyright: (c) 2009 Bryan O'Sullivan
License: BSD3
Maintainer: bos@serpentine.com
Stability: experimental
Portability: portable
Safe Haskell: None
Language: Haskell98

Statistics.Quantile

Description

Functions for approximating quantiles, i.e. points taken at regular intervals from the cumulative distribution function of a random variable.

The number of quantiles is described below by the variable q, so with q=4, a 4-quantile (also known as a quartile) has 4 intervals and contains 5 points. The parameter k describes the desired point, where 0 ≤ k ≤ q.
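
As a small worked illustration of this convention (an interpretation of the text above, not an additional definition from the library), the pair (k, q) corresponds to the probability p = k/q, so for quartiles:

\[ p = \frac{k}{q}, \qquad q = 4: \quad k = 0, 1, 2, 3, 4 \;\mapsto\; p = 0,\ 0.25,\ 0.5,\ 0.75,\ 1 \]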

Quantile estimation functions

Below is a family of functions that use the same algorithm to estimate sample quantiles. The algorithm approximates the empirical CDF as a continuous piecewise linear function that interpolates between the points \((X_k,p_k)\), where \(X_k\) is the k-th order statistic (the k-th smallest element) and \(p_k\) is the probability corresponding to it. ContParam determines how \(p_k\) is chosen. For a more detailed explanation, see [Hyndman1996].

This is the method used by most statistical software, such as R, Mathematica, SPSS, and S.
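
The interpolation described above can be sketched with lists and base only. This is a minimal illustration, not the library's implementation: the name quantileSketch is hypothetical, the sample is assumed to be nonempty and free of NaN, and the plotting positions are assumed to follow the piecewise linear family in [Hyndman1996], \(p_k = (k-\alpha)/(n+1-\alpha-\beta)\).

```haskell
import Data.List (sort)

-- | Hypothetical helper (not part of the library): estimate the k-th
-- q-quantile by interpolating linearly between order statistics.
quantileSketch
  :: Double    -- ^ alpha
  -> Double    -- ^ beta
  -> Int       -- ^ k, the desired quantile
  -> Int       -- ^ q, the number of quantiles
  -> [Double]  -- ^ sample; assumed nonempty and free of NaN
  -> Double
quantileSketch alpha beta k q xs = (1 - g) * item (j - 1) + g * item j
  where
    sorted = sort xs
    n      = length sorted
    p      = fromIntegral k / fromIntegral q
    -- Real-valued, 1-based position t that solves p_t = p for
    -- p_k = (k - alpha) / (n + 1 - alpha - beta).
    t      = alpha + p * (fromIntegral n + 1 - alpha - beta)
    j      = floor t :: Int
    g      = t - fromIntegral j
    -- 'item' takes a 0-based index and clamps it to the sample range,
    -- so 'item (j - 1)' is X_j and 'item j' is X_(j+1).
    item i = sorted !! max 0 (min (n - 1) i)
```

For example, quantileSketch (1/3) (1/3) 3 4 [1,1,2,2,3] interpolates one third of the way between the fourth and fifth order statistics.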

data ContParam #

Parameters α and β to the continuousBy function. The exact meaning of the parameters is described in [Hyndman1996], in the section "Piecewise linear functions".

Constructors

ContParam !Double !Double 
Instances
Eq ContParam # 
Instance details

Defined in Statistics.Quantile

Data ContParam # 
Instance details

Defined in Statistics.Quantile

Methods

gfoldl :: (forall d b. Data d => c (d -> b) -> d -> c b) -> (forall g. g -> c g) -> ContParam -> c ContParam #

gunfold :: (forall b r. Data b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c ContParam #

toConstr :: ContParam -> Constr #

dataTypeOf :: ContParam -> DataType #

dataCast1 :: Typeable t => (forall d. Data d => c (t d)) -> Maybe (c ContParam) #

dataCast2 :: Typeable t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c ContParam) #

gmapT :: (forall b. Data b => b -> b) -> ContParam -> ContParam #

gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> ContParam -> r #

gmapQr :: (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> ContParam -> r #

gmapQ :: (forall d. Data d => d -> u) -> ContParam -> [u] #

gmapQi :: Int -> (forall d. Data d => d -> u) -> ContParam -> u #

gmapM :: Monad m => (forall d. Data d => d -> m d) -> ContParam -> m ContParam #

gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> ContParam -> m ContParam #

gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> ContParam -> m ContParam #

Ord ContParam # 
Instance details

Defined in Statistics.Quantile

Show ContParam # 
Instance details

Defined in Statistics.Quantile

Generic ContParam # 
Instance details

Defined in Statistics.Quantile

Associated Types

type Rep ContParam :: Type -> Type #

ToJSON ContParam # 
Instance details

Defined in Statistics.Quantile

FromJSON ContParam # 
Instance details

Defined in Statistics.Quantile

Binary ContParam # 
Instance details

Defined in Statistics.Quantile

Default ContParam #

We use s as the default value, which is the same as R's default.

Instance details

Defined in Statistics.Quantile

Methods

def :: ContParam #

type Rep ContParam # 
Instance details

Defined in Statistics.Quantile

type Rep ContParam = D1 (MetaData "ContParam" "Statistics.Quantile" "statistics-0.15.0.0-KYJLg9h4jsl1bBm8KLc3A8" False) (C1 (MetaCons "ContParam" PrefixI False) (S1 (MetaSel (Nothing :: Maybe Symbol) SourceUnpack SourceStrict DecidedStrict) (Rec0 Double) :*: S1 (MetaSel (Nothing :: Maybe Symbol) SourceUnpack SourceStrict DecidedStrict) (Rec0 Double)))

class Default a where #

A class for types with a default value.

Minimal complete definition

Nothing

Methods

def :: a #

The default value for this type.

Instances
Default Double 
Instance details

Defined in Data.Default.Class

Methods

def :: Double #

Default Float 
Instance details

Defined in Data.Default.Class

Methods

def :: Float #

Default Int 
Instance details

Defined in Data.Default.Class

Methods

def :: Int #

Default Int8 
Instance details

Defined in Data.Default.Class

Methods

def :: Int8 #

Default Int16 
Instance details

Defined in Data.Default.Class

Methods

def :: Int16 #

Default Int32 
Instance details

Defined in Data.Default.Class

Methods

def :: Int32 #

Default Int64 
Instance details

Defined in Data.Default.Class

Methods

def :: Int64 #

Default Integer 
Instance details

Defined in Data.Default.Class

Methods

def :: Integer #

Default Ordering 
Instance details

Defined in Data.Default.Class

Methods

def :: Ordering #

Default Word 
Instance details

Defined in Data.Default.Class

Methods

def :: Word #

Default Word8 
Instance details

Defined in Data.Default.Class

Methods

def :: Word8 #

Default Word16 
Instance details

Defined in Data.Default.Class

Methods

def :: Word16 #

Default Word32 
Instance details

Defined in Data.Default.Class

Methods

def :: Word32 #

Default Word64 
Instance details

Defined in Data.Default.Class

Methods

def :: Word64 #

Default () 
Instance details

Defined in Data.Default.Class

Methods

def :: () #

Default All 
Instance details

Defined in Data.Default.Class

Methods

def :: All #

Default Any 
Instance details

Defined in Data.Default.Class

Methods

def :: Any #

Default CShort 
Instance details

Defined in Data.Default.Class

Methods

def :: CShort #

Default CUShort 
Instance details

Defined in Data.Default.Class

Methods

def :: CUShort #

Default CInt 
Instance details

Defined in Data.Default.Class

Methods

def :: CInt #

Default CUInt 
Instance details

Defined in Data.Default.Class

Methods

def :: CUInt #

Default CLong 
Instance details

Defined in Data.Default.Class

Methods

def :: CLong #

Default CULong 
Instance details

Defined in Data.Default.Class

Methods

def :: CULong #

Default CLLong 
Instance details

Defined in Data.Default.Class

Methods

def :: CLLong #

Default CULLong 
Instance details

Defined in Data.Default.Class

Methods

def :: CULLong #

Default CFloat 
Instance details

Defined in Data.Default.Class

Methods

def :: CFloat #

Default CDouble 
Instance details

Defined in Data.Default.Class

Methods

def :: CDouble #

Default CPtrdiff 
Instance details

Defined in Data.Default.Class

Methods

def :: CPtrdiff #

Default CSize 
Instance details

Defined in Data.Default.Class

Methods

def :: CSize #

Default CSigAtomic 
Instance details

Defined in Data.Default.Class

Methods

def :: CSigAtomic #

Default CClock 
Instance details

Defined in Data.Default.Class

Methods

def :: CClock #

Default CTime 
Instance details

Defined in Data.Default.Class

Methods

def :: CTime #

Default CUSeconds 
Instance details

Defined in Data.Default.Class

Methods

def :: CUSeconds #

Default CSUSeconds 
Instance details

Defined in Data.Default.Class

Methods

def :: CSUSeconds #

Default CIntPtr 
Instance details

Defined in Data.Default.Class

Methods

def :: CIntPtr #

Default CUIntPtr 
Instance details

Defined in Data.Default.Class

Methods

def :: CUIntPtr #

Default CIntMax 
Instance details

Defined in Data.Default.Class

Methods

def :: CIntMax #

Default CUIntMax 
Instance details

Defined in Data.Default.Class

Methods

def :: CUIntMax #

Default RiddersParam 
Instance details

Defined in Numeric.RootFinding

Methods

def :: RiddersParam #

Default NewtonParam 
Instance details

Defined in Numeric.RootFinding

Methods

def :: NewtonParam #

Default ContParam #

We use s as the default value, which is the same as R's default.

Instance details

Defined in Statistics.Quantile

Methods

def :: ContParam #

Default [a] 
Instance details

Defined in Data.Default.Class

Methods

def :: [a] #

Default (Maybe a) 
Instance details

Defined in Data.Default.Class

Methods

def :: Maybe a #

Integral a => Default (Ratio a) 
Instance details

Defined in Data.Default.Class

Methods

def :: Ratio a #

Default a => Default (IO a) 
Instance details

Defined in Data.Default.Class

Methods

def :: IO a #

(Default a, RealFloat a) => Default (Complex a) 
Instance details

Defined in Data.Default.Class

Methods

def :: Complex a #

Default (First a) 
Instance details

Defined in Data.Default.Class

Methods

def :: First a #

Default (Last a) 
Instance details

Defined in Data.Default.Class

Methods

def :: Last a #

Default a => Default (Dual a) 
Instance details

Defined in Data.Default.Class

Methods

def :: Dual a #

Default (Endo a) 
Instance details

Defined in Data.Default.Class

Methods

def :: Endo a #

Num a => Default (Sum a) 
Instance details

Defined in Data.Default.Class

Methods

def :: Sum a #

Num a => Default (Product a) 
Instance details

Defined in Data.Default.Class

Methods

def :: Product a #

Default r => Default (e -> r) 
Instance details

Defined in Data.Default.Class

Methods

def :: e -> r #

(Default a, Default b) => Default (a, b) 
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b) #

(Default a, Default b, Default c) => Default (a, b, c) 
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b, c) #

(Default a, Default b, Default c, Default d) => Default (a, b, c, d) 
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b, c, d) #

(Default a, Default b, Default c, Default d, Default e) => Default (a, b, c, d, e) 
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b, c, d, e) #

(Default a, Default b, Default c, Default d, Default e, Default f) => Default (a, b, c, d, e, f) 
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b, c, d, e, f) #

(Default a, Default b, Default c, Default d, Default e, Default f, Default g) => Default (a, b, c, d, e, f, g) 
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b, c, d, e, f, g) #

quantile #

Arguments

:: Vector v Double 
=> ContParam

Parameters α and β.

-> Int

k, the desired quantile.

-> Int

q, the number of quantiles.

-> v Double

x, the sample data.

-> Double 

O(n·log n). Estimate the kth q-quantile of a sample x, using the continuous sample method with the given parameters.

The following properties must hold; otherwise an error is thrown.

  • input sample must be nonempty
  • the input does not contain NaN
  • 0 ≤ k ≤ q
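
Because a violated precondition raises an error, callers may want to check the arguments first. The following is a minimal sketch of such a check over a plain list. The function checkQuantileArgs is hypothetical and not part of the library, and it only mirrors the three conditions listed above.

```haskell
-- | Hypothetical pre-check (not a library function) that mirrors the
-- documented preconditions of 'quantile'.
checkQuantileArgs
  :: Int       -- ^ k, the desired quantile
  -> Int       -- ^ q, the number of quantiles
  -> [Double]  -- ^ the sample data
  -> Either String ()
checkQuantileArgs k q xs
  | null xs        = Left "sample is empty"
  | any isNaN xs   = Left "sample contains NaN"
  | k < 0 || k > q = Left "k must satisfy 0 <= k <= q"
  | otherwise      = Right ()
```

Returning Either keeps the failure explicit, so the caller can decide how to report it before invoking the estimator.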

quantiles :: (Vector v Double, Foldable f, Functor f) => ContParam -> f Int -> Int -> v Double -> f Double #

O(k·n·log n). Estimate a set of kth q-quantiles of a sample x, using the continuous sample method with the given parameters. This is faster than calling quantile repeatedly, since the sample needs to be sorted only once.

The following properties must hold; otherwise an error is thrown.

  • input sample must be nonempty
  • the input does not contain NaN
  • for every k in set of quantiles 0 ≤ k ≤ q

quantilesVec :: (Vector v Double, Vector v Int) => ContParam -> v Int -> Int -> v Double -> v Double #

O(k·n·log n). Same as quantiles, but uses a Vector container instead of a Foldable one.
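
A short usage sketch of the three estimation functions, based only on the signatures documented above. It assumes the statistics and vector packages are installed. The names sample, medianViaQuantile, quartiles and quartilesU are illustrative and not part of the library.

```haskell
import qualified Data.Vector.Unboxed as U
import Statistics.Quantile

-- Illustrative sample data.
sample :: U.Vector Double
sample = U.fromList [1, 1, 2, 2, 3]

-- The median expressed as the 2nd 4-quantile, with the default
-- parameters ('def', documented above to be the same as 's').
medianViaQuantile :: Double
medianViaQuantile = quantile def 2 4 sample

-- All three quartiles in one call; a list serves as the Foldable/Functor.
quartiles :: [Double]
quartiles = quantiles s [1, 2, 3] 4 sample

-- The same request, with the indices held in an unboxed vector.
quartilesU :: U.Vector Double
quartilesU = quantilesVec s (U.fromList [1, 2, 3]) 4 sample
```

Passing a different ContParam, such as hazen or medianUnbiased, changes only the first argument.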

Parameters for the continuous sample method

cadpw :: ContParam #

California Department of Public Works definition, α=0, β=1. Gives a linear interpolation of the empirical CDF. This corresponds to method 4 in R and Mathematica.

hazen :: ContParam #

Hazen's definition, α=0.5, β=0.5. This is claimed to be popular among hydrologists. This corresponds to method 5 in R and Mathematica.

spss :: ContParam #

Definition used by the SPSS statistics application, with α=0, β=0 (also known as Weibull's definition). This corresponds to method 6 in R and Mathematica.

s :: ContParam #

Definition used by the S statistics application, with α=1, β=1. The interpolation points divide the sample range into n-1 intervals. This corresponds to method 7 in R and Mathematica and is the default in R.

medianUnbiased :: ContParam #

Median unbiased definition, α=1/3, β=1/3. The resulting quantile estimates are approximately median unbiased regardless of the distribution of x. This corresponds to method 8 in R and Mathematica.

normalUnbiased :: ContParam #

Normal unbiased definition, α=3/8, β=3/8. An approximately unbiased estimate if the empirical distribution approximates the normal distribution. This corresponds to method 9 in R and Mathematica.
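
For reference, the piecewise linear family in [Hyndman1996] places the k-th order statistic of a sample of size n at \(p_k = (k-\alpha)/(n+1-\alpha-\beta)\). Substituting the parameter values listed above gives the following summary, which is a worked substitution rather than a statement about the library's internals:

\[
\begin{aligned}
\text{cadpw } (\alpha=0,\ \beta=1): \quad & p_k = \frac{k}{n} \\
\text{hazen } (\alpha=\beta=1/2): \quad & p_k = \frac{k - 1/2}{n} \\
\text{spss } (\alpha=\beta=0): \quad & p_k = \frac{k}{n+1} \\
\text{s } (\alpha=\beta=1): \quad & p_k = \frac{k-1}{n-1} \\
\text{medianUnbiased } (\alpha=\beta=1/3): \quad & p_k = \frac{k - 1/3}{n + 1/3} \\
\text{normalUnbiased } (\alpha=\beta=3/8): \quad & p_k = \frac{k - 3/8}{n + 1/4}
\end{aligned}
\]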

Other algorithms

weightedAvg #

Arguments

:: Vector v Double 
=> Int

k, the desired quantile.

-> Int

q, the number of quantiles.

-> v Double

x, the sample data.

-> Double 

O(n·log n). Estimate the kth q-quantile of a sample, using the weighted average method. Up to rounding errors, it is the same as quantile s.

The following properties must hold; otherwise an error is thrown.

  • the length of the input is greater than 0
  • the input does not contain NaN
  • k ≥ 0 and k ≤ q
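
One common formulation of a weighted average estimator, consistent with the α=1, β=1 case, can be sketched as follows. This is an assumption about the method for illustration, not the library's code: weightedAvgSketch is a hypothetical name, it works on plain lists, and it assumes a nonempty, NaN-free sample with 0 ≤ k ≤ q.

```haskell
import Data.List (sort)

-- | Hypothetical sketch (not the library function): weighted average of the
-- two order statistics around the 0-based position h = (n - 1) * k / q.
weightedAvgSketch
  :: Int       -- ^ k, the desired quantile
  -> Int       -- ^ q, the number of quantiles
  -> [Double]  -- ^ sample; assumed nonempty and free of NaN
  -> Double
weightedAvgSketch k q xs = xj + g * (xj1 - xj)
  where
    sorted = sort xs
    n      = length sorted
    h      = fromIntegral (n - 1) * fromIntegral k / fromIntegral q :: Double
    j      = floor h :: Int
    g      = h - fromIntegral j              -- weight of the upper neighbour
    xj     = sorted !! j
    xj1    = sorted !! min (n - 1) (j + 1)   -- clamped so k == q stays in range
```

With the sample [1,2,3,4], k=1 and q=4, the position is h = 0.75, so the estimate lies three quarters of the way from 1 to 2.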

Median & other specializations

median #

Arguments

:: Vector v Double 
=> ContParam

Parameters α and β.

-> v Double

x, the sample data.

-> Double 

O(n·log n). Estimate the median of a sample.

mad #

Arguments

:: Vector v Double 
=> ContParam

Parameters α and β.

-> v Double

x, the sample data.

-> Double 

O(n·log n). Estimate the median absolute deviation (MAD) of a sample x using continuousBy. It is a robust estimate of the variability in a sample and is defined as:

\[ MAD = \operatorname{median}(| X_i - \operatorname{median}(X) |) \]
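
The definition above can be sketched over plain lists. This is an illustration only: medianSketch and madSketch are hypothetical names, the sample is assumed to be nonempty and free of NaN, and the median here is the textbook one (the mean of the two middle values for even lengths) rather than one derived from a ContParam as in the library.

```haskell
import Data.List (sort)

-- | Textbook sample median; hypothetical helper, not the library's 'median'.
medianSketch :: [Double] -> Double
medianSketch xs
  | odd n     = sorted !! mid
  | otherwise = (sorted !! (mid - 1) + sorted !! mid) / 2
  where
    sorted = sort xs
    n      = length sorted
    mid    = n `div` 2

-- | MAD = median (| X_i - median X |), following the formula above.
madSketch :: [Double] -> Double
madSketch xs = medianSketch [abs (x - m) | x <- xs]
  where
    m = medianSketch xs
```

For the sample [1,1,2,2,4,6,9] the median is 2, the absolute deviations are [1,1,0,0,2,4,7], and their median is 1.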

midspread #

Arguments

:: Vector v Double 
=> ContParam

Parameters α and β.

-> Int

q, the number of quantiles.

-> v Double

x, the sample data.

-> Double 

O(n·log n). Estimate the range between q-quantiles 1 and q-1 of a sample x, using the continuous sample method with the given parameters.

For instance, the interquartile range (IQR) can be estimated as follows:

midspread medianUnbiased 4 (U.fromList [1,1,2,2,3])
==> 1.333333
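
The value above can be checked by hand, assuming the plotting positions \(p_k = (k-\alpha)/(n+1-\alpha-\beta)\) from [Hyndman1996]. With α=β=1/3 and n=5 this gives \(p_k = (3k-1)/16\), so the sorted sample 1, 1, 2, 2, 3 sits at p = 0.125, 0.3125, 0.5, 0.6875, 0.875. Interpolating linearly between neighbouring points, with Q(p) denoting the estimated quantile at probability p:

\[
\begin{aligned}
Q(0.25) &= X_1 + \frac{0.25 - 0.125}{0.3125 - 0.125}\,(X_2 - X_1) = 1 + \tfrac{2}{3}\,(1 - 1) = 1 \\
Q(0.75) &= X_4 + \frac{0.75 - 0.6875}{0.875 - 0.6875}\,(X_5 - X_4) = 2 + \tfrac{1}{3}\,(3 - 2) \approx 2.3333 \\
Q(0.75) - Q(0.25) &\approx 1.3333
\end{aligned}
\]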

Deprecated

continuousBy #

Arguments

:: Vector v Double 
=> ContParam

Parameters α and β.

-> Int

k, the desired quantile.

-> Int

q, the number of quantiles.

-> v Double

x, the sample data.

-> Double 

Deprecated: Use quantile instead

References