Wednesday, March 17, 2010

Archiving Documents

A few months ago, we took a look at the filing cabinet in our office.  We have a lot of documents that we saved just in case we ever needed the information.  Unfortunately, the amount of paper keeps growing over time.  So I decided to look for a way to digitize all of this content.

The Evernote blog had a post about Pixily/OfficeDrop.  This is a service where you mail your documents in, and they scan them and run OCR software on them.  They then host the files in a digital locker, where you can search or organize them.  This seemed like a reasonable service, but it works best for scanning documents as you receive them.  For us to scan through our backlog of documents, it would have cost several hundred dollars.

We decided not to go with OfficeDrop, not only because of the cost of scanning our backlog of documents, but also because of the ongoing cost of the service.  I feel that the amount of storage you get is not worth the monthly charge.

We decided to do the scanning ourselves, so we bought a Fujitsu ScanSnap S1500M document scanner.  This scanner will quickly scan a batch of multi-page, double-sided documents.  The included software has some pretty useful features:

  1. Automatically detects the number of printed sides on a sheet

  2. OCR and creation of searchable PDFs




When I was uploading photos to my Picasa Web account, I purchased extra space in my Google Account, so I decided that I could also use this space to upload the PDFs generated by the scanner into Google Docs.  The copy stored in Google Docs gives me an offsite backup of these files.

When the files are stored in Google Docs, I can share either individual documents or whole folders with other people.  Also, I am able to search for the documents based on the file content.

There are some things that I would like to see added to the ScanSnap software:

  1. Ability to scan several multi-page documents of varying lengths at one time, with the documents separated by a "separator page".  Currently you can either scan a batch of documents into a single PDF or have the software create a PDF for every x pages

  2. Ability to have the resulting file named based on text marked with a highlighter
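
To make the first wish more concrete, here is a rough Python sketch of the difference between the two splitting strategies.  This is not how the ScanSnap software works internally.  The function names are made up for illustration, and I am assuming that some earlier step (for example, spotting a blank or specially printed sheet) has already marked which scanned pages are separator pages.

```python
def split_every(page_count, pages_per_file):
    """Current behavior: start a new file every `pages_per_file` pages.

    Returns a list of files, each a list of zero-based page indices.
    """
    return [
        list(range(start, min(start + pages_per_file, page_count)))
        for start in range(0, page_count, pages_per_file)
    ]


def split_on_separators(is_separator):
    """Wished-for behavior: start a new file at each separator page.

    `is_separator` is a list of booleans, one per scanned page, where True
    marks a separator page (how it gets detected is assumed, not shown).
    Separator pages are dropped from the output.
    """
    documents = []
    current = []
    for index, separator in enumerate(is_separator):
        if separator:
            # Close the document in progress, if it has any pages.
            if current:
                documents.append(current)
            current = []
        else:
            current.append(index)
    # Keep the last document when the batch does not end with a separator.
    if current:
        documents.append(current)
    return documents


# A batch of 9 scanned pages: a 2-page document, a separator,
# a 1-page document, two separators in a row, then a 3-page document.
batch = [False, False, True, False, True, True, False, False, False]
print(split_on_separators(batch))  # [[0, 1], [3], [6, 7, 8]]
print(split_every(5, 2))           # [[0, 1], [2, 3], [4]]
```

With fixed-size splitting, every document in the batch has to be the same length.  With separator pages, each document can be as long as it needs to be.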