Archiving Old Books

Looking through my bookshelf, it appears I have a few titles that the Internet Archive, at least, doesn't have.

I've enjoyed using books and magazines that others have scanned and uploaded online, so I thought it only fair that I do the same. My printer has an ADF (automatic document feeder), so I figured I'd give it a go!

Preparation

Unless you want to scan individual pages on a flatbed, or with some kind of camera setup, you'll need to prepare your books. Essentially, this involves cutting off or removing the spine of each book.
This is clearly a destructive process, so it may not be something everyone is willing to do.

There's lots of advice online on how to do this. I didn't want to spend a lot of time or money on this step.

Ready to Scan

Instead, I went down to my local Officeworks, which has a service for this. For $1 a book, they'll use their fancy guillotine to cut and remove the spine. I suspect thicker books might need to be sliced a couple of times to fit (I was advised they could do up to 250 80gsm pages at a time!).
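As a rough way to guess how many cuts a thick book might need, here is a minimal Python sketch. It assumes the 250 figure refers to sheets of paper (each sheet carrying two printed pages), which is my reading of the advice rather than something Officeworks confirmed. The function name is made up for illustration.

```python
import math


def guillotine_batches(sheets: int, max_sheets_per_cut: int = 250) -> int:
    """Return how many stacks a book must be split into to fit under the blade.

    Assumption: the quoted limit of 250 counts sheets, not printed pages.
    """
    if sheets < 0 or max_sheets_per_cut <= 0:
        raise ValueError("sheets must be >= 0 and the limit must be > 0")
    return math.ceil(sheets / max_sheets_per_cut)


# A 600-page book printed on both sides has 300 sheets.
print(guillotine_batches(600 // 2))  # 2
```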

Scanning

I'm lucky enough to have an MFC at home with an automatic document feeder (a Konica Minolta Bizhub C35).

Scanner goes brrr

I used the built-in "Windows Fax and Scan" application to connect to the scanner via WIA, and scanned directly to TIFF at 600 DPI. The books I scanned were all black-and-white, so I used black-and-white mode.
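To get a feel for what 600 DPI means in practice, this small Python sketch works out the pixel dimensions and uncompressed size of one scanned side. The A4 page size is an assumption for illustration (the books themselves may be a different size), and real TIFF files are typically compressed, so they will usually be much smaller than this figure.

```python
import math

MM_PER_INCH = 25.4


def scan_size(width_mm: float, height_mm: float, dpi: int = 600,
              bits_per_pixel: int = 1) -> tuple[int, int, int]:
    """Return (width_px, height_px, uncompressed_bytes) for one scanned side.

    bits_per_pixel=1 corresponds to black-and-white mode; 8 would be greyscale.
    Each row of pixels is padded up to a whole number of bytes.
    """
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    bytes_per_row = math.ceil(width_px * bits_per_pixel / 8)
    return width_px, height_px, bytes_per_row * height_px


# Assumed page size: A4 (210 mm x 297 mm).
w, h, size = scan_size(210, 297)
print(f"{w} x {h} px, about {size / 1_000_000:.1f} MB uncompressed")
```

Calling `scan_size(210, 297, bits_per_pixel=8)` shows why black-and-white mode is worth using for text-only books: the raw data is roughly eight times larger in greyscale.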

Preparing the PDF

I couldn't find a good free option for this. Instead, I used Foxit PDF Editor to convert the scanned TIFF images to PDF. It does deskewing and OCR for me, which is really handy.

I made sure to fix up the page numbers in the PDF to match the pages in the books (it's a pet hate of mine when PDFs don't do this), and added some missing metadata.
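PDF page labels are defined as ranges: each range names the page index where it starts, a numbering style, and the number to start counting from. The Python sketch below mimics that idea so you can plan the labels before entering them in a PDF editor. The book layout in `LABEL_RANGES` is entirely hypothetical, and the code only computes labels; it doesn't write anything to a PDF.

```python
def to_roman(n: int) -> str:
    """Convert a positive integer to lowercase roman numerals."""
    numerals = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
                (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
                (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]
    parts = []
    for value, symbol in numerals:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


# (first PDF page index, style, first number). Hypothetical layout:
LABEL_RANGES = [
    (0, "none", 0),      # cover scans, no printed number
    (2, "roman", 1),     # front matter: i, ii, iii, ...
    (10, "decimal", 1),  # main text: 1, 2, 3, ...
]


def page_label(index: int, ranges=LABEL_RANGES) -> str:
    """Return the label shown for the PDF page at a zero-based index."""
    candidates = [r for r in ranges if r[0] <= index]
    if not candidates:
        raise ValueError(f"no label range covers page index {index}")
    start, style, first = max(candidates, key=lambda r: r[0])
    number = first + (index - start)
    if style == "roman":
        return to_roman(number)
    if style == "decimal":
        return str(number)
    return ""  # unnumbered pages


print([page_label(i) for i in (0, 2, 5, 10, 12)])
```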

Issues

I've only done two so far and it's been reasonably smooth. Lessons so far:

For books with glued spines, make sure you flip through every page and check that each one is free. It'll save you rescanning pages that went through the feeder stuck together.
Archive.org doesn't like the output from Foxit PDF Editor. I had to run it through PDFtk to make the Internet Archive happy :/
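For the PDFtk step, here is a minimal Python sketch. It assumes the `pdftk` command is installed and on your PATH, and it uses the plain `pdftk in.pdf output out.pdf` form, which simply reads the file and writes it back out. I haven't recorded the exact command used for these uploads, so treat this as one plausible approach. The function names and file names are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def pdftk_rewrite_command(src: str, dst: str) -> list[str]:
    """Build the argument list for `pdftk <src> output <dst>`."""
    return ["pdftk", str(src), "output", str(dst)]


def rewrite_pdf(src: str, dst: str) -> None:
    """Have PDFtk read src and write a fresh copy to dst."""
    # Refuse to overwrite the input while it is still being read.
    if Path(src).resolve() == Path(dst).resolve():
        raise ValueError("input and output must be different files")
    if shutil.which("pdftk") is None:
        raise RuntimeError("pdftk was not found on PATH")
    # check=True raises CalledProcessError if PDFtk reports a failure.
    subprocess.run(pdftk_rewrite_command(src, dst), check=True)


# Example call (hypothetical file names):
# rewrite_pdf("book-foxit.pdf", "book-for-archive.pdf")
```

Keeping the original Foxit output and writing to a new file means you can go back to it if the rewritten copy has problems.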
