Copyright	(c) 2006-2008 Duncan Coutts
License	BSD-style
Maintainer	duncan@haskell.org
Stability	provisional
Portability	portable (H98 + FFI)
Safe Haskell	None
Language	Haskell98

Codec.Compression.Zlib.Internal

Contents

Compression
Decompression
The compression parameter types
Low-level API to get explicit error reports

Description

Pure stream-based interface to the lower-level zlib wrapper.

Synopsis

Compression

compress :: Format -> CompressParams -> ByteString -> ByteString

Compress a data stream.

There are no expected error conditions: all input data streams are valid. Unexpected errors can still occur, such as running out of memory or finding the wrong version of the zlib C library; these are thrown as exceptions.

data CompressParams

The full set of parameters for compression. The defaults are defaultCompressParams.

The compressBufferSize is the size of the first output buffer containing the compressed data. If you know an approximate upper bound on the size of the compressed data then setting this parameter can save memory. The default compression output buffer size is 16k. If your estimate is wrong it does not matter too much: the default buffer size will be used for the remaining chunks.

Constructors

CompressParams
Fields compressLevel :: !CompressionLevel compressMethod :: !Method compressWindowBits :: !WindowBits compressMemoryLevel :: !MemoryLevel compressStrategy :: !CompressionStrategy compressBufferSize :: !Int compressDictionary :: Maybe ByteString

defaultCompressParams :: CompressParams

The default set of parameters for compression. This is typically used with the compressWith function with specific parameters overridden.
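Overriding individual fields with record-update syntax might look like the following minimal sketch. The helper name gzipSmall and the 4 KB first-buffer size are illustrative assumptions, not part of this module; the fields and functions used are the ones listed above.

```haskell
import qualified Data.ByteString.Lazy as BL
import Codec.Compression.Zlib.Internal

-- Hypothetical helper: gzip a payload that is expected to compress to a
-- few kilobytes, trading speed for the best compression ratio.
gzipSmall :: BL.ByteString -> BL.ByteString
gzipSmall = compress gzipFormat params
  where
    -- Start from the defaults and override only what differs.
    params = defaultCompressParams
      { compressLevel      = bestCompression
      , compressBufferSize = 4 * 1024  -- assumed estimate of the output size
      }
```

If the estimate is too small, the remaining output simply spills into further chunks of the default size.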

Decompression

decompress :: Format -> DecompressParams -> ByteString -> ByteString

Decompress a data stream.

It will throw an exception if any error is encountered in the input data. If you need more control over error handling then use decompressWithErrors.

data DecompressParams

The full set of parameters for decompression. The defaults are defaultDecompressParams.

The decompressBufferSize is the size of the first output buffer, containing the uncompressed data. If you know an exact or approximate upper bound on the size of the decompressed data then setting this parameter can save memory. The default decompression output buffer size is 32k. If your estimate is wrong it does not matter too much: the default buffer size will be used for the remaining chunks.

One particular use case for setting the decompressBufferSize is if you know the exact size of the decompressed data and want to produce a strict ByteString. The compression and decompression functions use lazy ByteStrings, but if you set the decompressBufferSize correctly then you can generate a lazy ByteString with exactly one chunk, which can be converted to a strict ByteString in O(1) time using concat . toChunks.
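The strict ByteString use case can be sketched as follows. The helper name decompressStrict is hypothetical, and the sketch assumes the caller really does know the decompressed size (for example from a length field stored next to the data) and that the input is in zlib format.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import Codec.Compression.Zlib.Internal

-- Hypothetical helper: decompress zlib-format data whose uncompressed
-- size in bytes is known in advance, yielding a strict ByteString.
decompressStrict :: Int -> BL.ByteString -> B.ByteString
decompressStrict knownSize =
    B.concat . BL.toChunks . decompress zlibFormat params
  where
    -- Size the first output buffer to hold the whole result, so the lazy
    -- ByteString is expected to consist of a single chunk.
    params = defaultDecompressParams { decompressBufferSize = knownSize }
```

If knownSize is too small the result is still correct; the conversion just has to copy more than one chunk.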

Constructors

DecompressParams
Fields decompressWindowBits :: !WindowBits decompressBufferSize :: !Int decompressDictionary :: Maybe ByteString

defaultDecompressParams :: DecompressParams

The default set of parameters for decompression. This is typically used with the decompressWith function with specific parameters overridden.

The compression parameter types

data Format

The format used for compression or decompression. There are three variations.

Constructors

GZip	Deprecated: Use gzipFormat. Format constructors will be hidden in version 0.7
Zlib	Deprecated: Use zlibFormat. Format constructors will be hidden in version 0.7
Raw	Deprecated: Use rawFormat. Format constructors will be hidden in version 0.7
GZipOrZlib	Deprecated: Use gzipOrZlibFormat. Format constructors will be hidden in version 0.7

Instances

Eq Format

gzipFormat :: Format

The gzip format uses a header with a checksum and some optional meta-data about the compressed file. It is intended primarily for compressing individual files but is also sometimes used for network protocols such as HTTP. The format is described in detail in RFC #1952 http://www.ietf.org/rfc/rfc1952.txt

zlibFormat :: Format

The zlib format uses a minimal header with a checksum but no other meta-data. It is especially designed for use in network protocols. The format is described in detail in RFC #1950 http://www.ietf.org/rfc/rfc1950.txt

rawFormat :: Format

The 'raw' format is just the compressed data stream without any additional header, meta-data or data-integrity checksum. The format is described in detail in RFC #1951 http://www.ietf.org/rfc/rfc1951.txt

gzipOrZlibFormat :: Format

This is not a format as such. It enables zlib or gzip decoding with automatic header detection. This only makes sense for decompression.
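A minimal sketch of automatic header detection, assuming the input may arrive in either gzip or zlib format. The name decompressAuto is hypothetical.

```haskell
import qualified Data.ByteString.Lazy as BL
import Codec.Compression.Zlib.Internal

-- Hypothetical helper: accept either a gzip or a zlib stream and let the
-- header decide which decoder is used.
decompressAuto :: BL.ByteString -> BL.ByteString
decompressAuto = decompress gzipOrZlibFormat defaultDecompressParams
```

The same function can then be applied to data produced with compress gzipFormat or with compress zlibFormat. Raw streams carry no header and so cannot be detected this way.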

data CompressionLevel

The compression level parameter controls the amount of compression. This is a trade-off between the amount of compression and the time required to do the compression.

Constructors

DefaultCompression	Deprecated: Use defaultCompression. CompressionLevel constructors will be hidden in version 0.7
NoCompression	Deprecated: Use noCompression. CompressionLevel constructors will be hidden in version 0.7
BestSpeed	Deprecated: Use bestSpeed. CompressionLevel constructors will be hidden in version 0.7
BestCompression	Deprecated: Use bestCompression. CompressionLevel constructors will be hidden in version 0.7
CompressionLevel Int

defaultCompression :: CompressionLevel

The default compression level is 6 (that is, biased towards higher compression at the expense of speed).

noCompression :: CompressionLevel

No compression, just a block copy.

bestSpeed :: CompressionLevel

The fastest compression method (less compression).

bestCompression :: CompressionLevel

The slowest compression method (best compression).

compressionLevel :: Int -> CompressionLevel

A specific compression level between 0 and 9.

data Method

The compression method.

Constructors

Deflated

Deprecated: Use deflateMethod. Method constructors will be hidden in version 0.7

deflateMethod :: Method

'Deflate' is the only method supported in this version of zlib. Indeed it is likely to be the only method that ever will be supported.

data WindowBits

This specifies the size of the compression window. Larger values of this parameter result in better compression at the expense of higher memory usage.

The compression window size is 2 raised to the power of the window bits. The window bits must be in the range 8..15, which corresponds to compression window sizes of 256 bytes to 32 KB. The default is 15, which is also the maximum size.
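As a worked illustration, the window size in bytes is two raised to the number of window bits. The function windowSizeBytes below is a hypothetical helper written for this example and is not exported by the module.

```haskell
-- Size in bytes of the compression window for a given number of window
-- bits, or Nothing when the value is outside the documented 8..15 range.
windowSizeBytes :: Int -> Maybe Int
windowSizeBytes bits
  | bits >= 8 && bits <= 15 = Just (2 ^ bits)
  | otherwise               = Nothing
```

For instance, windowSizeBytes 8 gives Just 256 and windowSizeBytes 15 gives Just 32768.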

The total amount of memory used depends on the window bits and the MemoryLevel. See the MemoryLevel for the details.

Constructors

WindowBits Int
DefaultWindowBits	Deprecated: Use defaultWindowBits. WindowBits constructors will be hidden in version 0.7

defaultWindowBits :: WindowBits

The default WindowBits is 15 which is also the maximum size.

windowBits :: Int -> WindowBits

A specific compression window size, specified in bits in the range 8..15

data MemoryLevel

The MemoryLevel parameter specifies how much memory should be allocated for the internal compression state. It is a tradeoff between memory usage, compression ratio and compression speed. Using more memory allows faster compression and a better compression ratio.

The total amount of memory used for compression depends on the WindowBits and the MemoryLevel. For decompression it depends only on the WindowBits. The totals are given by the functions:

compressTotal windowBits memLevel = 4 * 2^windowBits + 512 * 2^memLevel
decompressTotal windowBits = 2^windowBits

For example, compression with the default windowBits = 15 and memLevel = 8 uses 256 KB. A network server with 100 concurrent compressed streams would therefore use 25 MB. The memory per stream can be halved (at the cost of somewhat degraded and slower compression) by reducing the windowBits and memLevel by one.

Decompression takes less memory: the default windowBits = 15 corresponds to just 32 KB.
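The arithmetic above can be checked with a small self-contained sketch. The two formulas are transcribed from this section; serverTotal is a hypothetical helper added for the 100-stream example, and all results are in bytes.

```haskell
-- Bytes used by one compression stream.
compressTotal :: Int -> Int -> Int
compressTotal windowBits memLevel = 4 * 2 ^ windowBits + 512 * 2 ^ memLevel

-- Bytes used by one decompression stream.
decompressTotal :: Int -> Int
decompressTotal windowBits = 2 ^ windowBits

-- Hypothetical helper: total for a server holding the given number of
-- concurrent compression streams.
serverTotal :: Int -> Int -> Int -> Int
serverTotal streams windowBits memLevel =
  streams * compressTotal windowBits memLevel
```

With the defaults, compressTotal 15 8 is 262144 (256 KB) and serverTotal 100 15 8 is 26214400 (25 MB). Reducing both parameters by one gives compressTotal 14 7 = 131072, which is half the per-stream figure.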

Constructors

DefaultMemoryLevel	Deprecated: Use defaultMemoryLevel. MemoryLevel constructors will be hidden in version 0.7
MinMemoryLevel	Deprecated: Use minMemoryLevel. MemoryLevel constructors will be hidden in version 0.7
MaxMemoryLevel	Deprecated: Use maxMemoryLevel. MemoryLevel constructors will be hidden in version 0.7
MemoryLevel Int

defaultMemoryLevel :: MemoryLevel

The default memory level. (Equivalent to memoryLevel 8)

minMemoryLevel :: MemoryLevel

Use minimum memory. This is slow and reduces the compression ratio. (Equivalent to memoryLevel 1)

maxMemoryLevel :: MemoryLevel

Use maximum memory for optimal compression speed. (Equivalent to memoryLevel 9)

memoryLevel :: Int -> MemoryLevel

A specific level in the range 1..9

data CompressionStrategy

The strategy parameter is used to tune the compression algorithm.

The strategy parameter affects only the compression ratio, not the correctness of the compressed output, even if it is not set appropriately.

Constructors

DefaultStrategy	Deprecated: Use defaultStrategy. CompressionStrategy constructors will be hidden in version 0.7
Filtered	Deprecated: Use filteredStrategy. CompressionStrategy constructors will be hidden in version 0.7
HuffmanOnly	Deprecated: Use huffmanOnlyStrategy. CompressionStrategy constructors will be hidden in version 0.7

defaultStrategy :: CompressionStrategy

Use this default compression strategy for normal data.

filteredStrategy :: CompressionStrategy

Use the filtered compression strategy for data produced by a filter (or predictor). Filtered data consists mostly of small values with a somewhat random distribution. In this case, the compression algorithm is tuned to compress them better. The effect of this strategy is to force more Huffman coding and less string matching; it is somewhat intermediate between defaultStrategy and huffmanOnlyStrategy.

huffmanOnlyStrategy :: CompressionStrategy

Use the Huffman-only compression strategy to force Huffman encoding only (no string match).

Low-level API to get explicit error reports

decompressWithErrors :: Format -> DecompressParams -> ByteString -> DecompressStream

Like decompress but returns a DecompressStream data structure that contains an explicit representation of the error conditions that one may encounter when decompressing.

Note that in addition to errors in the input data, it is possible for other unexpected errors to occur, such as running out of memory or finding the wrong version of the zlib C library. These are still thrown as exceptions (because representing them as data would make this function impure).

data DecompressStream

A sequence of chunks of data produced from decompression.

The difference from a simple list is that it contains a representation of errors as data rather than as exceptions. This allows you to handle error conditions explicitly.

Constructors

StreamEnd
StreamChunk ByteString DecompressStream
StreamError DecompressError String	An error code and a human readable error message.

data DecompressError

The possible error cases when decompressing a stream.

Constructors

TruncatedInput	The compressed data stream ended prematurely. This may happen if the input data stream was truncated.
DictionaryRequired	It is possible to do zlib compression with a custom dictionary. This allows slightly higher compression ratios for short files. However such compressed streams require the same dictionary when decompressing. This error is for when we encounter a compressed stream that needs a dictionary, and it's not provided.
DataError	If the compressed data stream is corrupted in any way then you will get this error, for example if the input data just isn't a compressed zlib data stream. In particular, if the data checksum turns out to be wrong then you will get all the decompressed data but this error at the end, instead of the normal successful StreamEnd.

foldDecompressStream :: (ByteString -> a -> a) -> a -> (DecompressError -> String -> a) -> DecompressStream -> a

Fold a DecompressStream. Just like foldr but with an extra error case. For example, to convert to a list and translate the errors into exceptions:

foldDecompressStream (:) [] (\code msg -> error msg)
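The same fold can return errors as values instead of throwing them. The sketch below is self-contained: it defines local stand-ins that mirror the constructors and the fold signature documented here, so that it compiles without the zlib package. It assumes the chunks are strict ByteStrings; toEither and sampleBad are hypothetical names.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC

-- Local stand-ins mirroring the types documented in this section.
data DecompressError = TruncatedInput | DictionaryRequired | DataError
  deriving (Eq, Show)

data DecompressStream
  = StreamEnd
  | StreamChunk ByteString DecompressStream
  | StreamError DecompressError String

-- A right fold with one extra case for the error constructor.
foldDecompressStream
  :: (ByteString -> a -> a) -> a -> (DecompressError -> String -> a)
  -> DecompressStream -> a
foldDecompressStream chunk end err = go
  where
    go StreamEnd              = end
    go (StreamChunk bs rest)  = chunk bs (go rest)
    go (StreamError code msg) = err code msg

-- Collect every chunk, or report the error code and message as a value.
toEither :: DecompressStream -> Either (DecompressError, String) [ByteString]
toEither =
  foldDecompressStream (\bs -> fmap (bs :)) (Right [])
                       (\code msg -> Left (code, msg))

-- A hand-built stream that fails after delivering one chunk.
sampleBad :: DecompressStream
sampleBad = StreamChunk (BC.pack "ab") (StreamError DataError "bad checksum")
```

Here toEither sampleBad evaluates to Left (DataError, "bad checksum"). Note that deciding between Left and Right requires walking the whole stream, so this variant gives up incremental processing in exchange for a simple result type.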

fromDecompressStream :: DecompressStream -> ByteString

Convert a DecompressStream to a lazy ByteString. If any decompression errors are encountered then they are thrown as exceptions.

This is a special case of foldDecompressStream.