A major corporation in the transportation and moving industry brought Six Feet Up a unique challenge: how can AI and machine learning techniques speed up the routing of transportation-related paperwork? Currently, these documents go through an entirely manual process. A transportation professional scans the documents or, in many cases, photographs them with a mobile device, and uploads the result as a PDF. These PDFs often suffer from poor image quality, and pages may be out of order, upside down, flipped, or missing altogether. After upload, the audit team pulls the documents down one at a time and breaks out the individual PDF pages for filing.
At certain times of the year, this can lead to a significant backlog of documents awaiting processing. Relying on people to process the PDFs also introduces human error into the review and classification phases.
Six Feet Up’s goal was to provide an automated document processing system that would speed up document classification, reduce human error, and eventually predict the categories of new and unknown document types within a PDF.
Initial Proof of Concept
Six Feet Up began with a review of the AI and machine learning landscape to identify the processor that would provide the most useful document processing capabilities. We knew that our client was leaning toward the Google ecosystem, so Google Cloud Vision and Google’s AutoML were at the top of our list of technologies to consider. We also wanted to evaluate Amazon’s document recognition capabilities along with potential Python options.
To tackle this challenge, Six Feet Up began with the review of a single type of document within a PDF document, to create a consistent test across multiple document recognition technologies. At this phase of the project, our test was focused on identifying document types based on elements within the content of the PDF pages themselves, so a good Optical Character Recognition (OCR) system was critical. Many of the pages within the PDFs were leveraging QR codes and other key page content identifiers, so playing off of those elements was also an option.
After our review of the available platforms that could handle this level of processing, Google’s Cloud Platform and Google Cloud Vision stood out from the rest in the results of our tests. Google Cloud Vision's OCR capabilities allowed for a comprehensive analysis of the documents. Six Feet Up was able to combine both the QR and OCR analysis along with custom-coded Python scripts to generate a simple proof of concept application for quickly processing a large group of documents.
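To make the idea of combining QR and OCR signals concrete, here is a minimal Python sketch. It is not the code used on the project: the category names, QR prefixes, and keyword lists are all hypothetical, and it assumes the OCR text and any QR payload have already been extracted from the page by some earlier step.

```python
from typing import Optional

# Hypothetical mapping from QR payload prefixes to page categories.
QR_PREFIX_TO_CATEGORY = {
    "BOL-": "bill_of_lading",
    "INV-": "inventory",
}

# Hypothetical phrases we might expect in the OCR text of each category.
CATEGORY_KEYWORDS = {
    "bill_of_lading": ["bill of lading", "shipper", "consignee"],
    "inventory": ["inventory", "item condition"],
    "weight_ticket": ["weight ticket", "gross weight", "tare"],
}


def classify_page(ocr_text: str, qr_payload: Optional[str] = None) -> str:
    """Return a category for one page, or "unknown" if nothing matches."""
    # A recognized QR code is treated as the strongest signal.
    if qr_payload:
        for prefix, category in QR_PREFIX_TO_CATEGORY.items():
            if qr_payload.startswith(prefix):
                return category

    # Otherwise, count how many known phrases appear in the OCR text.
    text = ocr_text.lower()
    scores = {
        category: sum(phrase in text for phrase in phrases)
        for category, phrases in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"
```

A simple phrase count like this also shows where the approach runs into limits: two page types that mention the same phrases will score alike.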
Next-Level Processing
Looking beyond the initial proof of concept, Six Feet Up took the initial findings to the next level by configuring automatic document processing within the Google Cloud Platform using AutoML. While OCR proved useful for identifying content within a page, it was less consistent at determining the proper page category and separating a page from others that mention similarly worded phrases. Using the first set of 2,000 sample PDFs, we trained a document classification model using Google AutoML. By leveraging the power of machine learning, Six Feet Up pushed the boundary of document identification and page category recognition. We found that Google’s AutoML processor was capable of recognizing page types quickly and accurately with a properly curated training set.
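Curating a training set largely comes down to pairing each sample page with a label and deciding which samples are held back for validation and testing. The sketch below shows one way that could look in Python. It assumes an import file made of CSV rows in the form split, storage URI, label; that layout, the bucket paths, and the labels are assumptions for illustration and should be checked against the current AutoML documentation.

```python
import csv
import hashlib
import io


def assign_split(uri: str, validation_pct: int = 10, test_pct: int = 10) -> str:
    """Deterministically place a sample in TRAIN, VALIDATION, or TEST.

    Hashing the URI means the same page always lands in the same split,
    even when the import file is regenerated later.
    """
    bucket = int(hashlib.sha256(uri.encode("utf-8")).hexdigest(), 16) % 100
    if bucket < test_pct:
        return "TEST"
    if bucket < test_pct + validation_pct:
        return "VALIDATION"
    return "TRAIN"


def build_import_csv(labeled_pages) -> str:
    """Render (uri, label) pairs as CSV rows: split, uri, label."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for uri, label in labeled_pages:
        writer.writerow([assign_split(uri), uri, label])
    return buffer.getvalue()


# Hypothetical sample pages and labels.
samples = [
    ("gs://example-bucket/pages/doc001_p1.png", "bill_of_lading"),
    ("gs://example-bucket/pages/doc001_p2.png", "inventory"),
]
import_csv = build_import_csv(samples)
```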
Leveraging Google Cloud Vision for categorizing documents, Six Feet Up was able to consistently route each document along with its identifying metadata. During the image routing process, we were also able to correct many of the page errors introduced when documents are uploaded. We took our processes to Google’s Cloud Platform to automate the document classification process, created a training model from our initial sample documents and leveraged AutoML to sort and categorize document types.
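Once each page has a category, reordering and rotation fixes can be planned from the per-page results. The following Python sketch is only an illustration of that step, not the project's implementation. The `PageResult` fields are hypothetical and assume an earlier step has already detected each page's category, page number, and clockwise rotation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PageResult:
    source_index: int  # position of the page in the uploaded PDF
    category: str      # predicted page category
    page_number: int   # page number detected within its document
    rotation: int      # detected clockwise rotation: 0, 90, 180, or 270


def plan_corrections(pages: List[PageResult]) -> List[Tuple[int, int]]:
    """Return (source_index, rotation_to_apply) pairs in corrected order.

    Pages are grouped by category and sorted by page number; the rotation
    to apply is whatever brings the page back to upright.
    """
    ordered = sorted(pages, key=lambda p: (p.category, p.page_number))
    return [(p.source_index, (360 - p.rotation) % 360) for p in ordered]


def missing_pages(pages: List[PageResult], expected_count: int) -> List[int]:
    """List the page numbers from 1..expected_count that were never seen."""
    seen = {p.page_number for p in pages}
    return [n for n in range(1, expected_count + 1) if n not in seen]
```

For example, an upload whose second page arrives first and upside down would produce a plan that swaps the two pages and rotates that page by 180 degrees, while `missing_pages` flags any page that never arrived.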
Through document categorization utilizing machine learning technology, Six Feet Up has proven that AI/ML structures can be used to successfully speed up mundane human accounting and document processing tasks. A manual process that may have taken an individual or team days can now be completed in a matter of hours. Utilizing machine learning can also reduce the factor of human error found in a commonly manual process, while also providing a level of sophistication and category prediction that rivals human review.