Invoice OCR

Invoice processing can be done in three ways: manual, semi-automated, and automated. In this post, we cover semi-automated invoice processing (invoice OCR) and automated invoice processing (invoice automation).

What is Invoice OCR?

Invoice OCR is the processing of invoices with Optical Character Recognition (OCR) technology, which relies on pre-defined rules and templates. Legacy invoice OCR solutions depend on standard layout templates, which is why we call this semi-automated invoice processing. OCR is a reliable technology for extracting data from invoices. It reads the dark and light patterns in a document, matches each pattern to the closest character, and turns an image or PDF of an invoice into text-based digital data. This is a big step towards invoice automation, since manual data entry is eliminated for invoices that match a known template.
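To make the "match each pattern to the closest character" step concrete, here is a toy sketch in Python. It is not how any particular OCR engine is implemented: the 3x5 glyph templates and the function names are invented for illustration, and real engines work on much richer features.

```python
# Hypothetical 3x5 glyph templates: "1" marks a dark pixel, "0" a light one.
TEMPLATES = {
    "0": ("111", "101", "101", "101", "111"),
    "1": ("010", "110", "010", "010", "111"),
    "7": ("111", "001", "010", "010", "010"),
}

def hamming(a, b):
    """Count the pixels that differ between two glyph grids of equal size."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def recognize(glyph):
    """Return the template character whose pattern is closest to the glyph."""
    return min(TEMPLATES, key=lambda ch: hamming(glyph, TEMPLATES[ch]))

# A scanned "7" where one light pixel in the bottom row came out dark.
noisy_seven = ("111", "001", "010", "010", "110")
print(recognize(noisy_seven))  # 7
```

The noisy glyph differs from the "7" template in a single pixel and from the other templates in several, so the closest match is still the right character.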

However, invoice OCR is not capable of performing two key functions for invoice automation:

  • To read and evaluate unfamiliar invoice layouts.
  • To deliver extracted data to the relevant agents and integrate it with the ERP system.

Working Principles of Invoice OCR Software

The first step of invoice OCR is to provide a picture or a PDF of an invoice. When you upload the image to an OCR API, the system extracts data according to pre-defined rules, which tell the OCR software where to look for each piece of invoice content. It is easy to train OCR for a single layout. You can define rules such as "find the vendor name at the top left of the invoice" or "the total sum is located at coordinates (x, y), with a maximum deviation of 10%".
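A rule of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the API of any specific OCR product: we assume the OCR step has already returned words with their positions as fractions of the page size, and the names Word, ZoneRule and extract_fields are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    x: float  # horizontal centre, as a fraction of page width (0.0 to 1.0)
    y: float  # vertical centre, as a fraction of page height (0.0 to 1.0)

@dataclass
class ZoneRule:
    name: str                # field to extract, e.g. "total"
    x: float                 # expected position of the field on the page
    y: float
    tolerance: float = 0.10  # maximum deviation (10% of the page size)

def extract_fields(words, rules):
    """Return {field name: text}, taking the word nearest to each rule's position."""
    result = {}
    for rule in rules:
        candidates = [
            w for w in words
            if abs(w.x - rule.x) <= rule.tolerance
            and abs(w.y - rule.y) <= rule.tolerance
        ]
        if candidates:
            best = min(
                candidates,
                key=lambda w: (w.x - rule.x) ** 2 + (w.y - rule.y) ** 2,
            )
            result[rule.name] = best.text
        else:
            result[rule.name] = None  # nothing found where the rule expects it
    return result

# Made-up OCR output for one invoice layout.
words = [
    Word("Acme Ltd", 0.12, 0.06),
    Word("Invoice", 0.50, 0.05),
    Word("1,250.00", 0.86, 0.91),
]
rules = [
    ZoneRule("vendor_name", x=0.10, y=0.05),  # top left of the page
    ZoneRule("total", x=0.85, y=0.90),        # bottom right of the page
]
fields = extract_fields(words, rules)
```

With this sample data, fields maps vendor_name to "Acme Ltd" and total to "1,250.00". If the vendor moves the total to a different part of the page, the rule finds nothing and returns None for that field.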

Limitations of Invoice Processing with OCR

Invoices arrive at the accounts payable (AP) department in many different layouts. Each vendor has its own layout, and sometimes the same vendor adds fields or changes its general layout. In either case, you need to define new rules and teach them to the OCR software. This is a time-consuming process that also requires expertise. If you need human intervention each time you receive an invoice in a new layout, can you call it automated invoice processing?
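The maintenance burden can be pictured with a small Python sketch. The registry, the layout identifiers and the next_step function are all hypothetical, and we simply assume that something upstream has already identified the vendor and the layout.

```python
# Hypothetical registry of hand-written rule sets, keyed by vendor and layout.
# Each rule maps a field name to the (x, y) page position where it is expected.
RULES_BY_LAYOUT = {
    ("Acme Ltd", "2023-layout"): {"vendor_name": (0.10, 0.05), "total": (0.85, 0.90)},
    ("Globex", "default"): {"vendor_name": (0.50, 0.04), "total": (0.80, 0.95)},
}

def next_step(vendor, layout_id):
    """Decide what happens to an incoming invoice."""
    if (vendor, layout_id) in RULES_BY_LAYOUT:
        return "extract_with_rules"
    # Unknown vendor or changed layout: a person has to write new rules first.
    return "define_new_rules"

incoming = [
    ("Acme Ltd", "2023-layout"),  # known layout
    ("Acme Ltd", "2024-layout"),  # same vendor, redesigned invoice
    ("Initech", "default"),       # new vendor
]
steps = [next_step(vendor, layout_id) for vendor, layout_id in incoming]
```

Only the first of the three invoices is processed without a person stepping in. The other two wait until someone defines rules for them.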

Another important function that invoice OCR does not perform is data delivery and integration. On its own, OCR can't interpret the extracted data or integrate it with related business processes and core systems (ERP, DMS, etc.); for that, it needs the support of machine learning.

Invoice Automation with Machine Learning

Machine learning (ML), a subfield of AI, helps solve both of these problems. ML is the key element of invoice automation. It enables the system to process any invoice layout, because an ML model can cover new templates by learning from thousands of examples, and it can also interpret shapes, line items, and even noisy documents.

Thanks to its AI components, machine learning adds another dimension to invoice processing: data delivery and integration. When we combine ML technologies with invoice OCR, we can automatically send data to authorized agents for approval or control, and also transfer each data set to the accounting system, ERP, or DMS.
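As a rough illustration of the delivery step, the Python sketch below routes an extracted invoice and prepares a record for a downstream system. Everything in it is an assumption made for the example: the field names, the confidence score, the approval threshold and the JSON payload shape do not come from any particular ERP or DMS.

```python
import json
from decimal import Decimal

APPROVAL_LIMIT = Decimal("1000.00")  # hypothetical: larger invoices need approval
MIN_CONFIDENCE = 0.90                # hypothetical: below this, a person checks the data

def route_invoice(invoice):
    """Pick the next step for an extracted invoice."""
    if invoice.get("confidence", 0.0) < MIN_CONFIDENCE:
        return "manual_control"
    if Decimal(invoice["total"]) > APPROVAL_LIMIT:
        return "manager_approval"
    return "post_to_erp"

def to_erp_payload(invoice):
    """Serialize the fields a downstream system might expect (shape is made up)."""
    return json.dumps(
        {
            "vendor": invoice["vendor"],
            "invoice_number": invoice["number"],
            "amount": invoice["total"],
        },
        sort_keys=True,
    )

invoice = {
    "vendor": "Acme Ltd",
    "number": "INV-0042",
    "total": "1250.00",
    "confidence": 0.97,
}
step = route_invoice(invoice)
payload = to_erp_payload(invoice)
```

Here the invoice is above the approval limit, so it goes to manager_approval. A smaller invoice with the same confidence would go straight to post_to_erp, and a low-confidence extraction would go to manual_control.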