statistics-0.15.0.0: A library of statistical types, data, and functions

Copyright: (c) 2011 Bryan O'Sullivan
License: BSD3
Maintainer: bos@serpentine.com
Stability: experimental
Portability: portable
Safe Haskell: None
Language: Haskell98

Statistics.Sample.Histogram

Description

Functions for computing histograms of sample data.

Documentation

histogram

Arguments

:: (Vector v0 Double, Vector v1 Double, Num b, Vector v1 b) 
=> Int

Number of bins (must be positive).

-> v0 Double

Sample data (cannot be empty).

-> (v1 Double, v1 b) 

O(n) Compute a histogram over a data set.

The result consists of a pair of vectors:

  • The lower bound of each interval.
  • The number of samples within the interval.

Interval (bin) sizes are uniform, and the upper and lower bounds are chosen automatically using the range function. To specify these parameters directly, use the histogram_ function.
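
As an illustration of the description above, here is a minimal usage sketch. It assumes the vector package's Data.Vector.Unboxed for both the input and the output; the names sample and result are chosen for this example and are not part of the library. Because the result is polymorphic in both the output vector type and the count type, the sketch pins them down with a type signature.

```haskell
import qualified Data.Vector.Unboxed as U
import Statistics.Sample.Histogram (histogram)

-- Example data (an assumption for illustration); the sample must be non-empty.
sample :: U.Vector Double
sample = U.fromList [1, 2, 2, 3, 3, 3, 4, 4, 5]

-- The signature fixes the output vector type (v1) and the count type (b),
-- which would otherwise be ambiguous.
result :: (U.Vector Double, U.Vector Int)
result = histogram 4 sample

main :: IO ()
main = do
  let (lowerBounds, counts) = result
  print lowerBounds  -- lower bound of each of the 4 bins
  print counts       -- number of samples in each bin
```

Both vectors have one entry per bin, and the counts add up to the size of the sample.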

Building blocks

histogram_

Arguments

:: (Num b, RealFrac a, Vector v0 a, Vector v1 b) 
=> Int

Number of bins. This value must be positive. A zero or negative value will cause an error.

-> a

Lower bound on interval range. Sample data less than this will cause an error.

-> a

Upper bound on interval range. This value must not be less than the lower bound. Sample data that falls above the upper bound will cause an error.

-> v0 a

Sample data.

-> v1 b 

O(n) Compute a histogram over a data set.

Interval (bin) sizes are uniform, based on the supplied upper and lower bounds.
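
A minimal usage sketch with explicit bounds follows. It again assumes Data.Vector.Unboxed from the vector package; the name counts and the sample values are illustrative only. Every sample lies inside the supplied bounds, since values outside them cause an error.

```haskell
import qualified Data.Vector.Unboxed as U
import Statistics.Sample.Histogram (histogram_)

-- Five uniform bins of width 2 covering the range from 0 to 10.
-- The sample values are an assumption for illustration; all of them
-- lie within the supplied bounds.
counts :: U.Vector Int
counts = histogram_ 5 0 10 (U.fromList [0.5, 1.5, 2.5, 7, 9.9 :: Double])

main :: IO ()
main = print counts
```

Unlike histogram, this function returns only the counts. Since the bins are uniform, the lower bound of bin i can be recovered as lo + i * (hi - lo) / bins.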

range

Arguments

:: Vector v Double 
=> Int

Number of bins (must be positive).

-> v Double

Sample data (cannot be empty).

-> (Double, Double) 

O(n) Compute decent defaults for the lower and upper bounds of a histogram, based on the desired number of bins and the range of the sample data.

The lower and upper bounds used are (lo-d, hi+d), where lo and hi are the minimum and maximum of the sample, and

d = (maximum sample - minimum sample) / ((bins - 1) * 2)

If all elements in the sample are the same and equal to x, the range is set to (x - |x|/10, x + |x|/10). If x is equal to 0, the range is set to (-1, 1). This is needed to avoid creating a histogram with zero bin size.
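
The formula for d can be checked by hand with a short sketch. The helper below, rangeSketch, is hypothetical and uses only base. It covers only the general case (more than one bin, and a sample whose minimum and maximum differ) and is not the library's implementation.

```haskell
-- | Hypothetical helper that evaluates the documented formula directly.
-- Assumes bins > 1 and a non-empty sample whose elements are not all equal.
rangeSketch :: Int -> [Double] -> (Double, Double)
rangeSketch bins xs = (lo - d, hi + d)
  where
    lo = minimum xs
    hi = maximum xs
    -- d = (maximum sample - minimum sample) / ((bins - 1) * 2)
    d  = (hi - lo) / (fromIntegral (bins - 1) * 2)

main :: IO ()
main = print (rangeSketch 5 [1, 2, 3, 4, 5])  -- prints (0.5,5.5)
```

For 5 bins over the sample 1 to 5, d = 4 / 8 = 0.5, so the bounds are (0.5, 5.5) and each bin has width 1. In general d is half a bin width, which places the minimum and maximum samples at the centres of the first and last bins.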