DjVu is a web-centric format and software platform for distributing documents and images. DjVu can advantageously replace PDF, PS, TIFF, JPEG, and GIF for distributing scanned documents, digital documents, or high-resolution pictures. DjVu content downloads faster, displays and renders faster, looks nicer on a screen, and consumes fewer client resources than competing formats. DjVu images display instantly and can be smoothly zoomed and panned with no lengthy re-rendering. DjVu is used by hundreds of academic, commercial, governmental, and non-commercial web sites around the world.
DjVuLibre is an open source (GPL'ed) implementation of DjVu, including viewers, browser plugins, decoders, simple encoders, and utilities.
What's in DjVuLibre
DjVuLibre contains:
* a standalone viewer for X11 that is built with the Qt GUI toolkit, and a small program that makes the viewer behave like a plug-in for all major Web browsers on Unix (Netscape-4.x, Netscape-6.x, Mozilla, Galeon, Konqueror, and Opera).
* a bunch of command line tools and scripts to create, manipulate and convert DjVu images and documents.
* a C++ library around which all of the above is built. This library can be used to build new viewers, new utilities, new compression algorithms, and even new codecs. It could (should) also be used to enable popular open source packages to support DjVu (Gimp, ImageMagick, ...).
Here is a non-exhaustive list of the commands included with DjVuLibre:
* c44: a wavelet-based continuous-tone image encoder (à la JPEG-2000).
* cjb2: single page encoder for bitonal images (black and white scans).
* cpaldjvu: encoder for palettized images (à la GIF, but better).
* bzz: a general-purpose data compressor (à la bzip2).
* djvused: a powerful command interpreter for manipulating DjVu documents.
* ddjvu: converts DjVu documents to PBM/PGM/PPM images.
* djvudump: displays the structure of a DjVu file.
* djvuextract: extracts chunks from a DjVu file.
* djvumake: assembles chunks into a DjVu file.
* djvutxt: extracts the "hidden text" from a previously OCRed DjVu document.
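As a rough illustration of how a few of these commands might be chained from a script, here is a minimal Python sketch. It only builds the argument lists and runs them when the tool is found on the PATH. The helper names (`encode_bitonal_cmd`, `encode_photo_cmd`, `dump_structure_cmd`, `extract_text_cmd`, `run_if_available`) and the file names are hypothetical, and the `-dpi` option of cjb2 is taken from the current man page, so it may differ in older releases.

```python
import shutil
import subprocess
from typing import List, Optional


def encode_bitonal_cmd(src_pbm: str, dst_djvu: str, dpi: int = 300) -> List[str]:
    """argv for cjb2: encode a black and white scan as a single-page DjVu file."""
    # Assumption: -dpi is accepted by the installed cjb2 (see its man page).
    return ["cjb2", "-dpi", str(dpi), src_pbm, dst_djvu]


def encode_photo_cmd(src_ppm: str, dst_djvu: str) -> List[str]:
    """argv for c44: encode a continuous-tone image with the wavelet encoder."""
    return ["c44", src_ppm, dst_djvu]


def dump_structure_cmd(djvu: str) -> List[str]:
    """argv for djvudump: print the chunk structure of a DjVu file."""
    return ["djvudump", djvu]


def extract_text_cmd(djvu: str) -> List[str]:
    """argv for djvutxt: print the hidden text layer, if the file has one."""
    return ["djvutxt", djvu]


def run_if_available(argv: List[str]) -> Optional[str]:
    """Run argv and return its standard output, or None if the tool is not installed."""
    if shutil.which(argv[0]) is None:
        return None
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical file names; replace with real paths before running.
    for argv in (
        encode_bitonal_cmd("scan.pbm", "scan.djvu"),
        dump_structure_cmd("scan.djvu"),
        extract_text_cmd("scan.djvu"),
    ):
        print(" ".join(argv))
```

Keeping the argument lists separate from their execution makes it easy to print or log the exact command before running it, and to check the options against the man pages of the installed version.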
What's not in DjVuLibre
DjVu is a bit like MPEG in its asymmetry between the decoders and the encoders. Decoders and simple/experimental encoders are open sourced and included in DjVuLibre, but the best encoders (as of today) are owned by LizardTech Inc and kept proprietary. The smarts in the encoder can make a big difference in terms of file size and image quality. Building smart or specialized commercial DjVu encoders (and applications around them) is what companies like LizardTech do for a living.
LizardTech is in fact building its business around selling high-performance encoders, OCR, indexing tools, server software, OEM software development kits, customized systems, specialized viewers, and support. LizardTech builds its high-performance commercial compressors around four pieces of technology:
* a fast and high-performance multipage bitonal document encoder (acquired from AT&T)
* a foreground/background layer segmenter for scanned color documents (acquired from AT&T)
* a direct converter from PS/PDF to DjVu (licensed from AT&T)
* an OCR engine (licensed from a 3rd party).
Although it is conceivable that adequate open source replacements for these will eventually become available, the AT&T/LizardTech technologies listed above are not included in DjVuLibre. This means that, at the moment, certain types of document compressed with LizardTech's commercial compressors or with the on-line conversion services (such as Any2DjVu) will end up smaller (and in some cases higher-quality) than the ones compressed with the DjVuLibre encoders.