December 3, 2007
The venerable PDF document format has been around for about 14 years now, longer than most people have been using the Internet, and despite a few foibles it has proven itself a highly effective means of distributing formatted content. But for the recipient, the format leaves very little flexibility to get information out and re-use it, which is where Docudesk's DeskUNPDF software comes in handy. The software lets you quickly and easily convert entire PDF documents (or selected parts of them) into a broad range of formats including Word and Excel files, HTML files and JPG images.
The software is simple and intuitive to use: just drag a PDF file into the main window, select which pages you wish to convert, and choose the format you want them to come out in. There are plenty of options for how images and text blocks are treated.
We first tested DeskUNPDF Pro with a large, all-image, 168-page PDF file of a scanned manual. Scaling the images down to half size and setting an output directory, we chose to output the images as JPG files. It worked quickly and very effectively. We did encounter a minor glitch, however, when we attempted to convert only one page of the document: the software converted the selected page perfectly but then proceeded to fire out 167 blank JPGs. Not a huge problem, but a little annoying.
Our second test file was a PDF of a CD cover booklet, with lots of photos, tiny text, vector images and fairly intricate formatting. The software did a far better job than we expected of replicating the document in Word and HTML formats, although font substitutions seemed to push the formatting out in places, and the engine seemed to struggle with complex vector images. But to be fair, the input document was asking a lot more of the software than would reasonably be expected in 99% of productivity applications.
Once in Word format, all text blocks were immediately editable, and plain-text conversion options make it pretty easy to extract just the text from a PDF where necessary. There is a line break between every line of text, just like when you copy and paste from a PDF, but it's easy enough to get around this sort of thing with textfixer's online line break removal tool and other similar services.
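For readers who would rather clean up the line breaks locally than paste text into an online tool, the same idea can be sketched in a few lines of Python. This is only an illustration and has nothing to do with how DeskUNPDF or textfixer work internally; the function name `unwrap_lines` is our own invention, and the sketch assumes that paragraphs in the extracted text are separated by at least one blank line.

```python
import re


def unwrap_lines(text: str) -> str:
    """Join hard-wrapped lines into paragraphs.

    Assumption: a blank line marks a paragraph break. If the extracted
    text has no blank lines at all, everything ends up in one paragraph.
    """
    # Normalise Windows and old Mac line endings to "\n" first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Split into paragraphs on blank (or whitespace-only) lines.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Inside each paragraph, collapse every run of whitespace,
    # including the unwanted line breaks, into a single space.
    return "\n\n".join(" ".join(p.split()) for p in paragraphs)
```

Calling `unwrap_lines` on the plain-text output turns each block of short lines back into one flowing paragraph, while leaving the gaps between paragraphs in place. Words hyphenated across a line break are not rejoined by this sketch.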
Docudesk's DeskUNPDF Professional can output PDF documents in quite a range of formats, including ODT, DOC, XLS, CSV, XML, HTML, XHTML, SVG, BMP, PNG, TIFF and JPG. It can operate as a simple drag-and-drop conversion icon on the desktop, or run batch conversions if desired. A free trial download is available at the Docudesk website, and the full Pro license will set you back US$59.95.