Looking through the bookshelf, it appears I have a few titles which at least the Internet Archive doesn't have.
I've enjoyed using books and magazines that others have scanned and uploaded, so I thought it only right that I do the same. My printer has an automatic document feeder (ADF), so I figured I'd give it a go!
Preparation
Unless you want to scan individual pages on a flatbed, or with some kind of camera setup, you'll need to prepare your books. Essentially, this involves cutting off or removing the spine of each book.
This is clearly a destructive process, so it may not be something everyone is willing to do.
There's lots of advice online on how to do this, but I didn't want to spend a lot of time or money on this step.
Instead, I went down to my local Officeworks, which has a service for this. For $1 a book, they'll use their fancy guillotine to cut off the spine. I suspect thicker books might need to be sliced a couple of times to fit (I was advised they could do up to 250 80gsm pages at a time!).
Scanning
I'm lucky enough to have an MFC at home with an automatic document feeder (a Konica Minolta Bizhub C35).
Scanner goes brr
I used the built-in "Windows Fax and Scan" application to connect to the scanner via WIA, and scanned directly to TIFF at 600 DPI. Note that the books I'm scanning were all black-and-white, so I used black-and-white mode.
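To get a feel for what 600 DPI means in practice, here is a rough back-of-the-envelope sketch. It assumes an A4-sized page (210 x 297 mm), which is only an illustration as the books themselves will be a different size, and it ignores TIFF compression, so real files will be smaller. The function names are my own.

```python
# Rough size of one scanned page. The A4 page size is an assumption,
# and the byte counts are for raw, uncompressed pixel data.
MM_PER_INCH = 25.4


def pixel_dimensions(width_mm, height_mm, dpi):
    """Return (width_px, height_px) for a page scanned at the given DPI."""
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))


def uncompressed_bytes(width_px, height_px, bits_per_pixel):
    """Raw image size in bytes, with each row padded to a whole byte."""
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


if __name__ == "__main__":
    w, h = pixel_dimensions(210, 297, 600)
    for name, bpp in [("black-and-white", 1), ("greyscale", 8), ("colour", 24)]:
        size_mb = uncompressed_bytes(w, h, bpp) / 1_000_000
        print(f"{name}: {w} x {h} px, about {size_mb:.1f} MB uncompressed")
```

The gap between 1 bit per pixel and 24 bits per pixel is the main reason black-and-white mode is worth using when the source material allows it.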
Preparing the PDF
I couldn't find a good free option for this. Instead, I used Foxit PDF Editor to convert the scanned TIFF images to PDF. It does deskewing and OCR for me, which is really handy.
I made sure to fix up the page numbers in the PDF to match the pages in the books (it's a pet hate of mine when PDFs don't do this), and to add some missing metadata.
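To make the page-number point concrete: a PDF viewer counts from the first scanned sheet, while the printed numbering often restarts after the front matter. The sketch below assumes a hypothetical book with Roman-numbered front matter followed by Arabic-numbered body pages, which may not match how any particular book is laid out. It only works out what each label should be; it doesn't write anything to a PDF, and the function names are illustrative.

```python
def to_roman(number):
    """Convert a positive integer to a lowercase Roman numeral."""
    numerals = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ]
    result = ""
    for value, symbol in numerals:
        while number >= value:
            result += symbol
            number -= value
    return result


def page_label(pdf_index, front_matter_pages):
    """Printed label for the PDF page at zero-based pdf_index.

    Assumes the first front_matter_pages pages are numbered i, ii, iii...
    and the rest restart at 1.
    """
    if pdf_index < front_matter_pages:
        return to_roman(pdf_index + 1)
    return str(pdf_index - front_matter_pages + 1)


if __name__ == "__main__":
    # Hypothetical book with 4 front-matter pages.
    print([page_label(i, 4) for i in range(7)])
```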
Issues
I've only done two books so far and it's been reasonably smooth. Lessons so far:
- For glue-bound books, make sure you flip through every page and check that each one is free. It'll save having to rescan pages that went through the feeder stuck together.
- Archive.org doesn't like the output from Foxit PDF Editor. I had to run it through PDFtk to make the Internet Archive happy :/
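For the PDFtk step, a minimal sketch of what this could look like if scripted. This assumes a plain pass-through (read the PDF and write it straight back out) is enough, and that `pdftk` is installed and on the PATH; the exact options needed may differ, and the file names in the usage note are placeholders.

```python
import shutil
import subprocess


def build_pdftk_command(source_pdf, output_pdf):
    """Build a pdftk command that reads one PDF and writes it back out."""
    return ["pdftk", source_pdf, "output", output_pdf]


def rewrite_pdf(source_pdf, output_pdf):
    """Re-save source_pdf as output_pdf using pdftk.

    Assumes a simple pass-through rewrite is sufficient.
    """
    if shutil.which("pdftk") is None:
        raise FileNotFoundError("pdftk was not found on PATH")
    subprocess.run(build_pdftk_command(source_pdf, output_pdf), check=True)
```

Calling `rewrite_pdf("foxit-output.pdf", "for-archive.pdf")` would then produce a re-saved copy to upload, leaving the original untouched.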