When you use invoice OCR software to automate the business processes of the AP department, you expect it to perform certain functions: collecting invoices from different sources, extracting data from invoices with high accuracy, reading both existing and new invoice templates, and integrating fully with ERP and accounting systems.

Invoice OCR software with these features greatly reduces the time lost and the human error caused by manual data entry, and significantly simplifies audit and control processes.

However, OCR technology alone cannot deliver the benefits outlined above. Given the variety of data that invoices contain (whether on paper or in a digital format such as PDF) and the flexibility in how this data is presented, OCR needs machine learning support for complete invoice automation.

Contact us if you need free consultancy on accounts payable automation.

Invoice OCR and Machine Learning

Machine learning comes into play because invoice templates can change frequently. Invoices from different countries or suppliers, or even a new line item or table added to the same supplier’s invoice, can render a rule-based invoice OCR system inoperable. For this reason, you need software that can interpret such varied invoice samples by itself and extract data without human intervention. Software with this capability needs a strong machine learning infrastructure. Only then can the invoice OCR system be fully automated.

For example, consider a supplier whose invoices the invoice OCR system has processed before. Things get complicated when that supplier changes its stock system and switches to a new product coding scheme. An invoice OCR system without machine learning support cannot process a new data field added to the product table, even if all other fields of the invoice remain the same. OCR software that uses machine learning, however, can make sense of such new data types and can process even an invoice it has never encountered before, thanks to its ability to compare it with other invoices.
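The failure mode described above can be illustrated with a minimal Python sketch. It assumes a rule-based extractor built on a single regular expression written for one supplier's line-item layout; the rule, the field names and the sample lines are all hypothetical and do not come from any particular OCR product.

```python
import re

# Hypothetical rule written for one supplier's layout:
# description, quantity, unit price (in that order, nothing else).
LINE_ITEM_RULE = re.compile(
    r"^(?P<description>.+?)\s+(?P<quantity>\d+)\s+(?P<unit_price>\d+\.\d{2})$"
)

def extract_line_item(line):
    """Return the parsed fields, or None when the line does not fit the rule."""
    match = LINE_ITEM_RULE.match(line)
    if match is None:
        return None
    return {
        "description": match["description"],
        "quantity": int(match["quantity"]),
        "unit_price": float(match["unit_price"]),
    }

old_layout = "Steel bolts M8   200   0.15"
# The supplier has added a product code column at the end of the row.
new_layout = "Steel bolts M8   200   0.15   SKU-48213"

print(extract_line_item(old_layout))
print(extract_line_item(new_layout))
```

The first line is parsed into its three fields, while the second returns `None` even though the description, quantity and price are unchanged. A rule-based system would need someone to write a new rule at this point.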

Cloud Access Is Indispensable for Invoice OCR Systems

Beyond its technical features, invoice OCR software is expected to meet certain conditions for user experience and access. The first is that the software is cloud-based. Software running in the cloud allows employees to work from anywhere, at any time.

Being cloud-based also simplifies user management and eliminates installation and hardware costs. Another advantage is that cloud-based invoice OCR systems, such as onVision Invoice Extraction, are not limited by the number of invoices you process.

Integration with ERP

The last stage of AP automation is, of course, transferring the data to the ERP or accounting system once invoice processing is finished. The invoice OCR software should perform this step automatically as well. To do so, the OCR software you use must have a range of connectors that work smoothly. Integration protocols developed for ERP systems such as SAP and MS Dynamics enable a seamless integration without interrupting your workflow even for a day.

Both OCR (short for optical character recognition) and ICR (short for intelligent character recognition) are solutions for extracting data from images. Any type of document can be handled with these two technologies, which basically read the document and convert images to processable electronic data units.

What are the pros and cons of OCR?

OCR is a reliable technology that is usually used to extract data from printed documents. Since printed documents mostly use uniform fonts and characters, OCR can easily analyze the dark and light patterns in a scanned document and work out which characters they represent. In this way it turns the image of each letter and number into text. However, OCR is not well suited to handwriting or noisy images. Because OCR systems need well-defined rules to match dark and light patterns with the correct characters (letters and numbers), non-uniform content such as handwriting, shapes, tables, lines and QR codes is out of its scope.
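As a rough illustration of matching dark and light patterns, here is a toy Python sketch of template matching. The 3x5 glyph bitmaps are invented for this example, and real OCR engines are considerably more sophisticated, so treat it only as a way to see why uniform print is easy and irregular strokes are not.

```python
# Hypothetical 3x5 glyph templates: "#" is a dark pixel, "." is a light pixel.
TEMPLATES = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
}

def pixel_distance(glyph, template):
    """Count the pixels where the scanned glyph and the template disagree."""
    return sum(
        g != t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )

def recognize(glyph):
    """Return the best-matching character and its pixel distance."""
    best = min(TEMPLATES, key=lambda ch: pixel_distance(glyph, TEMPLATES[ch]))
    return best, pixel_distance(glyph, TEMPLATES[best])

printed_seven = ["###", "..#", ".#.", ".#.", ".#."]
# The same digit drawn with a slant, as it might appear in handwriting.
slanted_seven = ["##.", ".#.", ".#.", "#..", "#.."]

print(recognize(printed_seven))
print(recognize(slanted_seven))
```

The printed glyph matches the "7" template exactly. The slanted glyph ends up closer to the "1" template than to the "7" template, which is the kind of misread that fixed pattern rules produce on handwriting.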

How does ICR work?

ICR is an advanced form of OCR that can recognize non-uniform characters. It is especially useful for handling handwriting. Think about the tons of handwritten documents that otherwise need to be processed manually: forms, invoices, receipts, delivery notes, registration documents etc. Processing them manually takes a huge amount of time, and manual data entry often leads to critical errors.

Machine learning is the key feature that sets ICR apart. Machine learning capabilities (especially neural networks) let an ICR system learn by itself and interpret images without relying on pre-defined rules or templates. This method is called cognitive data capture: extracting data from any type of document by understanding the context of the document and comparing it with many other variations.

Comparison of OCR and ICR

  • OCR systems are template- or rule-based and don’t use AI. That’s why OCR frequently needs human supervision. ICR, by contrast, raises a warning only when an anomaly occurs.
  • While OCR is useful for companies that process documents with fixed layouts, ICR is adaptive and is trained to handle frequent layout changes.
  • Templates, rules or layouts have to be created manually for OCR. ICR doesn’t require templates.
  • Outputs of ICR systems are more easily integrated into ERP systems.
  • The accuracy of OCR depends on its supporting database. ICR improves its own accuracy over time.

If you need any further assistance on OCR or ICR solutions, you can visit our products page or contact us immediately.

Why is form extraction so important? Let’s list some of the paper form types that contain valuable data for organizations.

  • Account opening forms (Banking, Insurance)
  • Customer satisfaction forms (Retail, HORECA, Services)
  • Job application forms (All industries)
  • Proof of delivery forms (Transportation, Courier)
  • Medical record forms (Health, Medicine)
  • Complaint forms (Public sector, Aviation)
  • Registration forms (Education, Travel)
  • Surveys (All industries)
  • Maintenance forms (Logistics, Aviation, Automotive)

These are only a few examples of the most frequently used form types. The list goes on.

Now, consider that almost all of these forms are filled in by hand, and you need to read, understand and classify every piece of data. You can imagine how much time it takes to complete the task manually. Moreover, you need to process the data properly to make it ready for use. You can use a data entry layout or tool to speed up processing, but you would still have to prepare a new layout and workflow for each new form type, even for a single input field added to an existing form.

AI Makes It Possible to Automate Form Extraction

Fortunately, there is a cost-efficient solution to this intimidating business problem. Automated form extraction (also called form capture or form ICR) combines ICR (intelligent character recognition) technology with AI to extract any type of data easily from handwritten forms.

ICR is the muscle of automated form processing, while AI is the brain. Without AI, extracting handwritten content would be less useful. Since you need to interpret, validate, classify and integrate the extracted data, AI plays an essential role in automating form processing. Another asset of AI is its ability to process new form layouts that the system has not seen before. This reduces the need for human intervention and minimizes errors.


How Does Form Automation Work?

onVision Intelligent Form Capture is a cloud-based solution that can be used through an API or by Web scan. After you scan the form or take a photo of it with your mobile device, you can upload it to the platform. We can also monitor an e-mail inbox or watch a folder to collect forms. Once the document reaches the platform, the handwritten data is extracted in seconds and structured output is generated in XML or JSON format.
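To give a feel for working with structured output of this kind, the Python sketch below parses a JSON document into typed fields. The payload is entirely hypothetical: the field names, the `confidence` score and the review threshold are assumptions made for illustration and are not the documented onVision output schema.

```python
import json
from dataclasses import dataclass

# Hypothetical payload: these field names are invented for illustration and
# are NOT the documented onVision output schema.
SAMPLE_OUTPUT = """
{
  "form_type": "proof_of_delivery",
  "fields": [
    {"name": "recipient_name", "value": "J. Smith", "confidence": 0.97},
    {"name": "delivery_date", "value": "2023-04-12", "confidence": 0.58}
  ]
}
"""

@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float

def parse_form_output(raw):
    """Turn the JSON text into a form type and a list of typed fields."""
    payload = json.loads(raw)
    fields = [ExtractedField(**item) for item in payload["fields"]]
    return payload["form_type"], fields

def needs_review(fields, threshold=0.8):
    """Names of fields whose (assumed) confidence is below the threshold."""
    return [f.name for f in fields if f.confidence < threshold]

form_type, fields = parse_form_output(SAMPLE_OUTPUT)
print(form_type, needs_review(fields))
```

In this sketch only the low-confidence date field would be routed to a person, while the rest of the form flows on automatically.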

From there on it becomes possible to use data as you wish, according to business rules and workflows. You can also employ RPA solutions to optimize any process. Thus, your organization will not only save time and money, but also improve its way of doing business and customer/employer satisfaction.

Paper invoices still occupy a huge place in day-to-day business operations. The accounts payable team needs to collect, process and distribute invoices properly to manage payments. When performed manually, each step of invoice processing takes a lot of time and usually causes delays and bottlenecks. There is an effective solution to this common problem: invoice processing automation through invoice OCR.

How to use OCR for Invoice Automation?

The starting point for digitizing a paper invoice, and thereby automating invoice processing, is scanning the printed or handwritten text. OCR (optical character recognition) helps us turn the data on paper invoices (or PDFs) into digital units. However, if the invoice contains handwritten content, special characters, or non-uniform or distorted parts, OCR alone won’t suffice. In this case, ICR (intelligent character recognition) is the proper tool to use.

There are critical points at this stage. Unstructured data is not always easy to handle. The first obstacle is new invoice types. A rule-based OCR system can’t extract data from layouts that have not previously been added to the system. In such a case, human intervention becomes necessary. Clearly, we can’t call invoice processing automated if it needs human supervision every time a new layout arrives.

Machine Learning Supports OCR

To solve this problem, onVision utilizes deep learning. Our invoice extraction tool uses a powerful machine learning algorithm to adapt itself to unfamiliar invoice formats. The system is always ready to learn by itself and to interpret any type of document in order to identify relevant data fields, line items etc.

Another significant difficulty with OCR technology is accuracy. Even the most advanced OCR engines struggle to offer an accuracy rate over 80%. Shadowy or noisy text can only be extracted properly with the help of more advanced technology. That’s why we strengthen invoice OCR with machine learning.

Once we have a digitized data set, we are ready to process invoices. After this first step, artificial intelligence has to interpret the data and classify it according to business rules. Once the data has been validated, AI parses it into JSON output that can easily be processed through an API or on a server. Hence, we can easily integrate with any ERP or DMS software.
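The validation and hand-off steps described above might look something like the following Python sketch. It is only an illustration under stated assumptions: the two business rules, the extracted field names and the ERP record keys (`DocumentNo`, `Vendor`, `NetAmount`) are hypothetical and do not correspond to a specific ERP connector.

```python
from decimal import Decimal

def validate_invoice(invoice):
    """Return a list of rule violations; an empty list means the invoice passes."""
    problems = []
    if not invoice.get("invoice_number"):
        problems.append("missing invoice number")
    # Use Decimal so that currency arithmetic is exact.
    line_sum = sum(
        Decimal(item["quantity"]) * Decimal(item["unit_price"])
        for item in invoice["line_items"]
    )
    if line_sum != Decimal(invoice["net_total"]):
        problems.append(
            f"line items sum to {line_sum}, expected {invoice['net_total']}"
        )
    return problems

def to_erp_record(invoice):
    """Map extracted fields onto a hypothetical ERP import record."""
    return {
        "DocumentNo": invoice["invoice_number"],
        "Vendor": invoice["supplier"],
        "NetAmount": invoice["net_total"],
    }

invoice = {
    "invoice_number": "INV-1001",
    "supplier": "Acme Ltd",
    "net_total": "35.00",
    "line_items": [
        {"quantity": "200", "unit_price": "0.15"},
        {"quantity": "1", "unit_price": "5.00"},
    ],
}

problems = validate_invoice(invoice)
print(to_erp_record(invoice) if not problems else problems)
```

Keeping validation separate from the mapping step means an invoice that fails a rule can be sent for review without ever reaching the ERP system.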

Language support and cloud access for invoice OCR

onVision Invoice Extraction supports 119 languages for printed invoices and can also extract data from handwritten invoices in almost every language that uses the Latin alphabet. The cloud-based platform offers great flexibility and speed. User management and accessibility become so easy that the AP team can work from anywhere, at any time. You pay for onVision by buying credits, so you pay only for what you use.