accelerate-0.15.0.0: An embedded language for accelerated array processing
Data.Array.Accelerate defines an embedded array language for high-performance
computing in Haskell. Computations on multi-dimensional, regular arrays are
expressed in the form of parameterised collective operations, such as maps,
reductions, and permutations. These computations may then be compiled online
(at program runtime) and executed on a range of architectures.
- A simple example
As a simple example, consider the computation of a dot product of two vectors of floating point numbers:
    dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
    dotp xs ys = fold (+) 0 (zipWith (*) xs ys)
Except for the type, this code is almost the same as the corresponding Haskell
code on lists of floats. The types indicate that the computation may be
compiled online for performance. For example, using Data.Array.Accelerate.CUDA
it may be off-loaded to the GPU on the fly.
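For comparison, the list version referred to above might look like the following minimal sketch. It uses only the standard Prelude; the name dotpList is hypothetical and chosen here purely for illustration.

```haskell
-- Dot product on ordinary Haskell lists, for comparison with the
-- Accelerate version. 'dotpList' is an illustrative name, not part of
-- the accelerate package.
dotpList :: [Float] -> [Float] -> Float
dotpList xs ys = foldl (+) 0 (zipWith (*) xs ys)
```

The body has the same shape as dotp: a pairwise multiplication followed by a reduction with (+) starting from 0. The differences lie in the types (Acc (Vector Float) rather than [Float]) and in the fact that Accelerate's fold may combine elements in parallel, so it expects an associative operator.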
- Available backends
Currently, there are two backends:
- An interpreter that serves as a reference implementation of the intended semantics of the language, which is included in this package.
- A CUDA backend generating code for CUDA-capable NVIDIA GPUs: http://hackage.haskell.org/package/accelerate-cuda
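As a rough sketch, running the dot product from the simple example through the reference interpreter might look as follows. It assumes the accelerate package is installed; runDotp is a hypothetical helper introduced only for this illustration, while run, use, fromList and toList are the functions exported by Data.Array.Accelerate.Interpreter and Data.Array.Accelerate.

```haskell
import qualified Data.Array.Accelerate             as A
import qualified Data.Array.Accelerate.Interpreter as I
import Data.Array.Accelerate (Acc, Vector, Scalar, Z(..), (:.)(..))

-- The dot product from the simple example, with qualified names to avoid
-- clashes with the Prelude's zipWith.
dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = A.fold (+) 0 (A.zipWith (*) xs ys)

-- Hypothetical helper: copy two Haskell lists into Accelerate vectors,
-- evaluate 'dotp' with the interpreter backend, and read back the result.
runDotp :: [Float] -> [Float] -> Float
runDotp xs ys = head (A.toList result)  -- a Scalar holds exactly one element
  where
    toVec :: [Float] -> Vector Float
    toVec zs = A.fromList (Z :. length zs) zs

    result :: Scalar Float
    result = I.run (dotp (A.use (toVec xs)) (A.use (toVec ys)))
```

Under the assumption that the CUDA backend exposes a run function of the same shape, switching backends would amount to importing Data.Array.Accelerate.CUDA in place of the interpreter module.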
Several experimental and/or incomplete backends also exist. If you are particularly interested in any of these, especially in helping to finish them, please contact us.
- Cilk/ICC and OpenCL: https://github.com/AccelerateHS/accelerate-backend-kit
- Another OpenCL backend: https://github.com/HIPERFIT/accelerate-opencl
- A backend to the Repa array library: https://github.com/blambo/accelerate-repa
- An infrastructure for generating LLVM code, with backends targeting multicore CPUs and NVIDIA GPUs: https://github.com/AccelerateHS/accelerate-llvm/
- Additional components
The following support packages are available:
- accelerate-cuda: A high-performance parallel backend targeting CUDA-enabled NVIDIA GPUs. Requires the NVIDIA CUDA SDK and, for full functionality, hardware with compute capability 1.1 or greater. See the table on Wikipedia for supported GPUs: http://en.wikipedia.org/wiki/CUDA#Supported_GPUs
- accelerate-examples: Computational kernels and applications showcasing Accelerate, as well as performance and regression tests.
- accelerate-io: Fast conversion between Accelerate arrays and other formats, including vector and repa.
- accelerate-fft: Computation of Discrete Fourier Transforms.
Install them from Hackage with cabal install PACKAGE
- Examples and documentation
Haddock documentation is included in the package, and a tutorial is available on the GitHub wiki: https://github.com/AccelerateHS/accelerate/wiki
The accelerate-examples package demonstrates a range of computational
kernels and several complete applications, including:
- An implementation of the Canny edge detection algorithm
- An interactive Mandelbrot set generator
- A particle-based simulation of stable fluid flows
- An n-body simulation of gravitational attraction between solid particles
- A cellular automata simulation
- A "password recovery" tool, for dictionary lookup of MD5 hashes
- A simple interactive ray tracer
- Mailing list and contacts
- Mailing list: accelerate-haskell@googlegroups.com (discussion of both use and development welcome).
- Sign up for the mailing list here: http://groups.google.com/group/accelerate-haskell
- Bug reports and issue tracking: https://github.com/AccelerateHS/accelerate/issues
- Release notes
- 0.15.0.0: Bug fixes and performance improvements.
- 0.14.0.0: New iteration constructs. Additional Prelude-like functions. Improved code generation and fusion optimisation. Concurrent kernel execution. Bug fixes.
- 0.13.0.0: New array fusion optimisation. New foreign function interface for array and scalar expressions. Additional Prelude-like functions. New example programs. Bug fixes and performance improvements.
- 0.12.0.0: Full sharing recovery in scalar expressions and array computations. Two new example applications in package accelerate-examples: Real-time Canny edge detection and fluid flow simulator (both including a graphical frontend). Bug fixes.
- 0.11.0.0: New Prelude-like functions zip*, unzip*, fill, enumFrom*, tail, init, drop, take, slit, gather*, scatter*, and shapeSize. New simplified AST (in package accelerate-backend-kit) for backend writers who want to avoid the complexities of the type-safe AST.
- 0.10.0.0: Complete sharing recovery for scalar expressions (but currently disabled by default). Also bug fixes in array sharing recovery and a few new convenience functions.
- 0.9.0.0: Streaming, precompilation, Repa-style indices, stencils, more scans, rank-polymorphic fold, generate, block I/O & many bug fixes.
- 0.8.1.0: Bug fixes and some performance tweaks.
- 0.8.0.0: replicate, slice and foldSeg supported in the CUDA backend; frontend and interpreter support for stencil. Bug fixes.
- 0.7.1.0: The CUDA backend and a number of scalar functions.
- Hackage note
The module documentation list generated by Hackage is incorrect. The only exposed modules should be: