For the last 6-8 months I’ve been on a quest to get rid of all the hard-copy paperwork that I keep at home. I had a massive filing cabinet with paperwork that included:
- Tax filings from previous years
- Receipts and paperwork for upcoming tax year
- Car maintenance records
The list just goes on and on. It was so easy to have a filing cabinet and just throw things in there that, the next thing I knew, I had years’ worth of paperwork accumulated. For some of it the originals were important to keep, for some they weren’t, and most of it was useless. So I decided I wanted to go all digital with my paperwork, not only to reduce physical clutter, but also to make it easier to find specific documents. I had visions of all my paperwork scanned, OCR’d and tagged so I could do a search in Vista for a particular keyword and all the pertinent documents would be shown.
So I decided that I would go through nearly 10 years of paperwork, throw away what was irrelevant, and sort the remaining into 2 buckets: what needed to be kept in original hardcopy and what could be scanned.
I did this sorting over a 4-day period, and the net result was 3 full garbage bags of paper that I took in for professional shredding and disposal. The pile of documents left to keep or scan was very, very small, which wasn’t surprising.
Now I had to move on to the task of going digital with my paperwork life. In order to achieve this, I needed the following things:
- A folder structure and file naming convention that would make sense without looking at the document contents or tags.
- A scanner that made it easy to scan multiple sheets, double-sided in a single pass and gave me some sort of automation to easily get to a PDF
- Reliable backup and storage method
I decided to make top-level folders for all major buckets of documents:
- Model Releases
- Business License
The first 5 root folders fit nearly all my documents very neatly and the last miscellaneous category was created to hold outliers that just didn’t fit in the main set.
File naming convention
For the file name of each scanned document, I wanted the name to convey both the purpose of the document and its date. That way each file is self-describing: if I’m ever looking at an individual file, I don’t need other files or the containing folder structure to tell me what it is. The other added benefit is that if I ever change the folder structure, the file names can stay the same.
I ultimately settled on:
<YYYY><MM><DD> – <Document name>.pdf
The date prefix always matches the date on the document itself, or if that’s missing, the date I received it. I never wanted to make the date the day that I scanned it since that date is never relevant.
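To make the convention concrete, here is a minimal Python sketch of how such a file name could be built and read back. It is not part of my actual workflow (I type the names by hand in the scanner software); the function names and the sample document name are made up for illustration, and I’m assuming the separator is a space, an en dash, and a space, exactly as in the pattern above.

```python
import re
from datetime import date
from typing import Tuple

# Assumed separator: space, en dash (U+2013), space, as in the pattern above.
SEPARATOR = " \u2013 "

# Matches "<YYYY><MM><DD> – <Document name>.pdf" and captures each part.
_NAME_RE = re.compile(
    r"^(\d{4})(\d{2})(\d{2})" + re.escape(SEPARATOR) + r"(.+)\.pdf$"
)


def build_scan_filename(document_date: date, document_name: str) -> str:
    """Return a file name in the form '<YYYY><MM><DD> – <Document name>.pdf'.

    document_date should be the date printed on the document, or the date
    it was received if the document has none (never the scan date).
    """
    # Collapse stray whitespace so names stay tidy and consistent.
    cleaned = " ".join(document_name.split())
    if not cleaned:
        raise ValueError("document name must not be empty")
    return f"{document_date:%Y%m%d}{SEPARATOR}{cleaned}.pdf"


def parse_scan_filename(filename: str) -> Tuple[date, str]:
    """Split a file name that follows the convention into (date, name)."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"not in the expected format: {filename!r}")
    year, month, day, name = match.groups()
    # date() raises ValueError for impossible dates such as month 13.
    return date(int(year), int(month), int(day)), name


if __name__ == "__main__":
    # Hypothetical example document, not one from my archive.
    example = build_scan_filename(date(2008, 4, 15), "Car insurance renewal")
    print(example)
    print(parse_scan_filename(example))
```

Because the date comes first and is zero-padded, sorting a folder by name also sorts it chronologically, and the parsing function shows that a file can be understood without knowing which folder it lives in.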
With the help of my friend Fil, who also went all digital (months and months before me), I decided on a Fujitsu ScanSnap S300, which satisfied all my needs. It’s a newer model than his scanner and much more compact, which also fit the bill for what I was looking for.
Features that were important to me:
- Scans double-sided in a single pass
- 10-page document feeder
- 8 pages per minute
- One-touch scanning
- Auto PDF and OCR after scanning
- Supports Vista
- Very, very compact
Basically, this scanner has it all! I love the convenience of its small size and the fact that it will create a PDF and OCR it automatically after I initiate a scan. The only manual thing I have to do is choose the folder to save the file in and specify the file name.
The things I don’t like about the scanner:
- Doesn’t have TWAIN drivers which means it isn’t recognized as a scanning device through Windows natively – in other words, you need their software to drive the scanner. Fortunately, the software operates well enough and also has a good OCR engine
- It’s not clear from their website whether or not it supports 64-bit Windows, which could impede my eventual move to 64-bit Vista.
Reliable backup and storage
Backing up all the scanned documents is obviously a key part of this whole process. There’s no sense in scanning all this paperwork and then having it stored on a single hard drive in my computer that is bound to fail at some point. Luckily, due to my past investments, this was the easiest part for me. I have a Windows Home Server machine that backs up all my documents to its local RAID. I also have all my data (including scanned paperwork) backed up off-site using Carbonite. (Update: As of May 2009, I have switched back to Mozy)
Overall, the Fujitsu ScanSnap S300 document scanner is a dream and everything I was looking for. It’s the best on the market that I’ve seen, and its price point is quite low in my opinion for the value that I’m deriving from it. My entire document scanning process now is that as I receive mail or documents that I want to keep, I put them on the edge of my desk, then at the end of each week I scan the lot of them. I’ve found this works better for my lifestyle than scanning each document as it comes in on a daily basis.
Now, with my investment in a workflow at home, I can do a search in File Explorer in Vista for a particular word, and the contents of the scanned documents are searched along with the rest of the files on my system. So my vision of searching for a particular keyword and getting back all pertinent documents is now a reality.
If any of you are still keeping hoards of paperwork at home in massive filing cabinets, I encourage you to at least experiment with going the (nearly) all digital path. It’s made things SO much easier for me.
Special thanks to Fil for providing advice along the way and the original scanner recommendation.