* unit tests
* SIMD support
* more wavelets (Haar)
* more data types (integer, fixed point)
* integrate performance measurement
* more example applications
* more efficient lifting implementation (take a look at other implementations, e.g. OpenJPEG or JasPer)
* support for full decomposition in 1D
* relicense to GNU LGPL
* OpenCV wrapper (C)
* C++ interface
* drop deprecated interface
* remove duplicate code
* tiling (e.g., as in JPEG 2000)
* remove "lib" prefix from source file names
* group lifting steps into one for-loop (suitable for SIMD implementation and UTIA ASVP platform)
* 2D DWT: single-loop approach (iteration of horizontal part is immediately fed into iteration of vertical part)
* memory access through incremented pointers instead of indexed arrays
* cache optimizations (stride should be a prime?)
* use size_t, ssize_t, etc. instead of unsigned int, int, ...
* 3D data support
* ARM support
* ASVP: upload new data when waiting for preceding operation (use memory bank "D", then copy to bank "A")
* ASVP: number of workers should be adjustable by environment variable like number of threads in OpenMP
* ASVP: compile newer cross compiler (GCC for MicroBlaze)
* MS Windows support
* code cleanup
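
A possible shape for three of the items above (Haar wavelet, lifting steps grouped into one for-loop, memory access through incremented pointers) is sketched below. This is only an illustration under assumptions: the function names `haar_lift_forward` and `haar_lift_inverse` and their signatures are hypothetical, not part of the existing interface, and the unnormalized float lifting scheme used here is just one common way to factor the Haar transform.

```c
#include <stddef.h>

/* Hypothetical helpers, not part of the library's current API.
 *
 * Forward Haar transform via lifting. src holds 2*n_pairs interleaved
 * samples; low and high each receive n_pairs coefficients. Predict and
 * update are done in the same loop iteration, and all buffers are walked
 * with incremented pointers instead of indexed arrays. */
void haar_lift_forward(const float *src, float *low, float *high,
                       size_t n_pairs)
{
    const float *end = src + 2 * n_pairs;

    while (src != end) {
        const float even = *src++;
        const float odd = *src++;
        const float d = odd - even;   /* predict: detail coefficient */

        *high++ = d;
        *low++ = even + 0.5f * d;     /* update: average of the pair */
    }
}

/* Inverse transform: undoes the update step, then the predict step. */
void haar_lift_inverse(const float *low, const float *high, float *dst,
                       size_t n_pairs)
{
    const float *end = low + n_pairs;

    while (low != end) {
        const float d = *high++;
        const float even = *low++ - 0.5f * d;   /* undo update */

        *dst++ = even;
        *dst++ = d + even;                      /* undo predict */
    }
}
```

For wavelets with longer filters (e.g. CDF 5/3) the update step also needs the detail coefficient from the previous iteration, so a single-loop version would have to carry that value between iterations; the sketch does not cover this.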
