When you use invoice OCR software to automate the business processes of the accounts payable (AP) department, you expect it to perform certain functions: collecting invoices from different sources, extracting data from invoices with high accuracy, reading both existing and new invoice templates, and integrating fully with ERP and accounting systems.

An invoice OCR software with these features greatly reduces the loss of time and human error caused by manual data entry, and significantly simplifies the audit and control processes.

However, OCR technology alone cannot deliver the benefits outlined above. Given the variety of data that invoices contain (whether on paper or in a digital format such as PDF) and the flexibility in how that data is presented, OCR needs machine learning support to achieve complete invoice automation.

Contact us if you need free consultancy about accounts payable automation. 

Invoice OCR and Machine Learning

Machine learning comes into play because invoice templates change frequently. Invoices from different countries or suppliers, or even a new line item or table added to the same supplier’s invoice, can render a rule-based invoice OCR system inoperable. What is needed is software that can interpret such varied invoice samples by itself and extract data without human intervention. Software with this capability requires a strong machine learning infrastructure. Only then can the invoice OCR system be fully automated.

For example, consider a supplier whose invoices the OCR system has processed before. Things get complicated when that supplier changes its stock system and adopts a new product coding scheme. An invoice OCR system without machine learning support cannot process a new data field added to the product table, even if all other fields of the invoice remain the same. OCR software that uses machine learning, however, can make sense of such new data types and can process even an invoice it has never encountered before, by comparing it with the invoices it has already seen.
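To make the problem concrete, here is a minimal sketch of why a fixed rule breaks when a supplier adds a column. The table contents, column names and both functions are hypothetical, and the header lookup is only a simple stand-in for what a learned model would do, not machine learning itself.

```python
# Hypothetical line-item tables, as rows of already-recognized cell text.
OLD_LAYOUT = [
    ["Description", "Qty", "Unit Price"],
    ["Widget", "2", "10.00"],
]
# The same supplier's invoice after a product-code column is added.
NEW_LAYOUT = [
    ["Product Code", "Description", "Qty", "Unit Price"],
    ["WX-100", "Widget", "2", "10.00"],
]


def extract_by_position(rows):
    """Rule-based: assumes description, qty, price sit in columns 0, 1, 2."""
    return [
        {"description": r[0], "qty": int(r[1]), "unit_price": float(r[2])}
        for r in rows[1:]
    ]


def extract_by_header(rows):
    """Layout-tolerant stand-in: finds each column by its header text."""
    header = [cell.strip().lower() for cell in rows[0]]
    d, q, p = (header.index(name) for name in ("description", "qty", "unit price"))
    return [
        {"description": r[d], "qty": int(r[q]), "unit_price": float(r[p])}
        for r in rows[1:]
    ]


print(extract_by_position(OLD_LAYOUT))  # works on the layout the rule was written for
print(extract_by_header(NEW_LAYOUT))    # still works after the column is added
try:
    extract_by_position(NEW_LAYOUT)
except ValueError as err:
    print("fixed-position rule failed:", err)
```

The fixed-position rule fails on the new layout because it tries to read the description text as a quantity, which is exactly the kind of breakage that forces human intervention in a rule-based system.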

Cloud Access Is Indispensable for Invoice OCR Systems

Beyond its technical features, invoice OCR software is expected to meet certain conditions for user experience and access. The first is that the software should be cloud-based. Software running in the cloud lets employees work from anywhere, at any time.

This feature also simplifies user management and eliminates installation/hardware costs. Another advantage is that cloud-based invoice OCR systems, such as onVision Invoice Extraction, scale with the number of invoices you process, whether small or large.

Integration with ERP

The last stage of AP automation is the transfer of data to the ERP or accounting system once invoice processing is finished. The invoice OCR software should perform this step automatically as well. To do so, the OCR software you choose must offer a range of reliable connectors. Integrations developed for ERP systems such as SAP and MS Dynamics enable a seamless rollout without interrupting your workflow even for a day.
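As a rough illustration of what a connector does at this stage, the sketch below renames extracted invoice fields into the field names a target system expects. Every field name on both sides is hypothetical; real ERP interfaces such as those of SAP or MS Dynamics define their own schemas.

```python
import json

# Hypothetical extracted invoice; field names are illustrative only.
extracted = {
    "supplier_name": "Acme Ltd",
    "invoice_number": "INV-1001",
    "invoice_date": "2024-03-15",
    "total": "250.00",
    "currency": "EUR",
}

# Hypothetical mapping from extraction field names to the target system's names.
FIELD_MAP = {
    "supplier_name": "VendorName",
    "invoice_number": "DocumentNo",
    "invoice_date": "PostingDate",
    "total": "GrossAmount",
    "currency": "Currency",
}


def to_erp_payload(invoice, field_map=FIELD_MAP):
    """Rename fields for the target system and fail loudly if one is missing."""
    missing = [name for name in field_map if name not in invoice]
    if missing:
        raise KeyError(f"missing fields: {missing}")
    return {target: invoice[source] for source, target in field_map.items()}


payload = to_erp_payload(extracted)
print(json.dumps(payload, indent=2))  # body a connector could send to the ERP
```

Keeping the mapping in a plain dictionary means a new target system only needs a new map, not new code.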

Both OCR (short for optical character recognition) and ICR (short for intelligent character recognition) are solutions for extracting data from images. Any type of document can be handled with these two technologies, which basically read the document and convert images to processable electronic data units.

What are the pros and cons of OCR?

OCR is a reliable technology that is usually used to extract data from printed documents. Since printed documents mostly use uniform fonts and characters, OCR can analyze the dark and light patterns in a scanned document and determine which characters they represent. It can then turn the image of each letter and number into text. However, OCR is poorly suited to handwriting or noisy images. Because OCR systems need well-defined rules to match dark and light patterns with the correct characters (letters and numbers), non-uniform content such as handwriting, shapes, tables, lines and QR codes is out of its scope.
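The idea of matching dark and light patterns can be shown with a toy example. This is a deliberately simplified sketch using made-up 3x3 bitmaps (1 = dark, 0 = light); real OCR engines work on far larger images and use more sophisticated methods.

```python
# Hypothetical 3x3 reference patterns: "1" is a dark pixel, "0" a light one.
TEMPLATES = {
    "I": ("010", "010", "010"),
    "L": ("100", "100", "111"),
    "T": ("111", "010", "010"),
}


def similarity(a, b):
    """Count the pixels that agree between two bitmaps of equal size."""
    return sum(pa == pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


def recognize(glyph):
    """Return the character whose template agrees with the glyph on most pixels."""
    return max(TEMPLATES, key=lambda ch: similarity(glyph, TEMPLATES[ch]))


clean_t = ("111", "010", "010")
noisy_l = ("100", "100", "110")  # one dark pixel missing from the bottom row

print(recognize(clean_t))
print(recognize(noisy_l))
```

A uniform printed character survives a little noise, but handwriting varies so much from one writer to the next that no fixed set of templates covers it, which is the limitation described above.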

How does ICR work?

ICR is the advanced version of OCR: it can recognize non-uniform characters. ICR is especially useful for handling handwriting. Think about the tons of handwritten documents that otherwise have to be processed manually: forms, invoices, receipts, delivery notes, registration documents, etc. Processing them manually takes a huge amount of time, and manual data entry often leads to critical errors.

Machine learning is the key feature that sets ICR apart. Machine learning capabilities (especially neural networks) let an ICR system learn by itself and interpret images without relying on pre-defined rules or templates. This method is called cognitive data capture: extracting data from any type of document by understanding the document’s context and comparing it with many other variations.

Comparison of OCR and ICR

  • OCR systems are template- or rule-based and don’t use AI. That’s why OCR frequently needs human supervision. By contrast, ICR raises a warning only when an anomaly occurs.
  • While OCR is useful for companies that process documents with fixed layouts, ICR is adaptive and can be trained to handle frequent layout changes.
  • Templates, rules or layouts have to be created manually for OCR. ICR doesn’t require templates.
  • Outputs of ICR systems are more easily integrated into ERP systems.
  • The accuracy of OCR depends on its supporting database. ICR improves its own accuracy over time.
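Warning only when an anomaly occurs is commonly implemented by attaching a confidence score to each extracted field and routing only low-confidence fields to a person. The sketch below assumes such scores exist; the field names, the scores and the 0.90 threshold are all hypothetical.

```python
# Hypothetical extraction result: each field carries a value and a confidence score.
fields = {
    "invoice_number": {"value": "INV-1001", "confidence": 0.99},
    "invoice_date": {"value": "2024-03-15", "confidence": 0.97},
    "total": {"value": "25O.00", "confidence": 0.62},  # letter O misread for zero
}


def fields_needing_review(result, threshold=0.90):
    """Return the names of fields whose confidence falls below the threshold."""
    return [name for name, f in result.items() if f["confidence"] < threshold]


flagged = fields_needing_review(fields)
if flagged:
    print("needs human review:", flagged)
else:
    print("processed automatically")
```

With this pattern a reviewer sees only the doubtful field rather than the whole document.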

If you need any further assistance on OCR or ICR solutions, you can visit our products page or contact us immediately.

Why is form extraction so important? Let’s try to make a list of paper form types, which contain valuable data for organizations.

  • Account opening forms (Banking, Insurance)
  • Customer satisfaction forms (Retail, HORECA, Services)
  • Job application forms (All industries)
  • Proof of delivery forms (Transportation, Courier)
  • Medical record forms (Health, Medicine)
  • Complaint forms (Public sector, Aviation)
  • Registration forms (Education, Travel)
  • Surveys (All industries)
  • Maintenance Forms (Logistics, Aviation, Automotive)

These are only a few examples of the most frequently used form types. The list goes on.

Now, consider that almost all of these forms are filled in by hand and you need to read, understand and classify every piece of data. You can imagine how much time it takes to complete the task manually. Moreover, you need to process the data properly to make it ready for use. You can use a data entry layout or tool to speed up processing, but you would still have to prepare a new layout and workflow for each new form type, even when only a single input field is added to an existing form.

AI Makes It Possible to Automate Form Extraction

Fortunately, there is a cost-efficient solution to this intimidating business problem. Automated form extraction (also called form capture or form ICR) combines ICR (intelligent character recognition) technology with AI to extract any type of data easily from handwritten forms.

ICR is the muscle of automated form processing, while AI is the brain. Without AI, extracting handwritten content would be less useful. Since the extracted data has to be interpreted, validated, classified and integrated, AI plays an essential role in automating form processing. Another strength of AI is its ability to process new form layouts that the system has not seen before. This reduces the need for human intervention and minimizes errors.

How Does Form Automation Work?

onVision Intelligent Form Capture is a cloud-based solution that can be used through an API or by Web scan. After you scan the form or take a photo of it with your mobile device, you can upload it to the platform. We can also monitor an e-mail inbox or watch a folder to collect forms. Once the document reaches the platform, handwritten data is extracted in seconds and structured output is generated in XML or JSON format.
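To give a feel for working with structured output, here is a small sketch that reads a JSON result and turns it into a plain record. The JSON shape shown is purely an assumption for illustration; the actual onVision output schema is not described here and may look different.

```python
import json

# Hypothetical response body; the real output schema may differ.
response_text = """
{
  "form_type": "job_application",
  "fields": [
    {"name": "full_name", "value": "Jane Doe"},
    {"name": "phone", "value": "555-0100"},
    {"name": "start_date", "value": "2024-05-01"}
  ]
}
"""

document = json.loads(response_text)
# Flatten the field list into a plain dict for downstream business rules.
record = {f["name"]: f["value"] for f in document["fields"]}

print(document["form_type"])
print(record["full_name"], record["start_date"])
```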

From there on, you can use the data as you wish, according to your business rules and workflows. You can also employ RPA solutions to optimize any process. Thus, your organization will not only save time and money, but also improve its way of doing business and its customer and employee satisfaction.

Paper invoices still occupy a huge place in day-to-day business operations. The accounts payable team needs to collect, process and distribute invoices properly to manage payments. When performed manually, each step of invoice processing takes a lot of time and usually causes delays and bottlenecks. There is an effective solution to this frequent problem: invoice processing automation through invoice OCR.

How to Use OCR for Invoice Automation?


The starting point for digitizing a paper invoice, and thereby automating invoice processing, is scanning or photographing the printed or handwritten text. OCR (optical character recognition) turns the data on paper invoices (or PDFs) into digital form. However, if the invoice contains handwriting, special characters, or non-uniform or distorted parts, OCR alone won’t suffice. In this case, ICR (intelligent character recognition) is the proper tool to use.

There are critical points at this stage. Unstructured data is not always easy to handle. The first obstacle is new invoice types. A rule-based OCR system can’t extract data from layouts that have not been added to the system beforehand. In such cases, human intervention becomes necessary. Clearly, we can’t call invoice processing automated if it needs human supervision every time a new layout arrives.

Machine Learning Supports OCR


To solve this problem, onVision utilizes deep learning. Our invoice extraction tool uses a powerful machine learning algorithm to adapt itself to unfamiliar invoice formats. The system is always ready to learn by itself and interpret any type of document to identify relevant data fields, line items, etc.

Another significant difficulty with OCR technology is accuracy. Even the most advanced OCR engines struggle to offer an accuracy rate over 80%. Shadowed or noisy text can only be extracted properly with the help of more advanced technology. That’s why we strengthen invoice OCR with machine learning.

Once we have a digitized data set, we are ready to process invoices. After this first step, artificial intelligence has to interpret the data and classify it according to business rules. Once the data is validated, AI parses it into JSON output that can be easily processed through an API or on a server. Hence, we can easily integrate with any ERP or DMS software.
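Validation against business rules can be as simple as checking that the numbers on the invoice agree with each other. The sketch below shows two such checks; the invoice structure and the rules themselves are assumptions chosen for illustration, and a real deployment would use the rules your business defines.

```python
from decimal import Decimal

# Hypothetical extracted invoice; amounts are kept as strings, as recognized text.
invoice = {
    "line_items": [
        {"description": "Widget", "qty": "2", "unit_price": "10.00"},
        {"description": "Gadget", "qty": "1", "unit_price": "5.50"},
    ],
    "subtotal": "25.50",
    "tax": "4.59",
    "total": "30.09",
}


def validate(inv):
    """Return a list of rule violations; an empty list means the invoice passes."""
    errors = []
    lines_sum = sum(
        Decimal(i["qty"]) * Decimal(i["unit_price"]) for i in inv["line_items"]
    )
    if lines_sum != Decimal(inv["subtotal"]):
        errors.append(f"line items sum to {lines_sum}, subtotal says {inv['subtotal']}")
    if Decimal(inv["subtotal"]) + Decimal(inv["tax"]) != Decimal(inv["total"]):
        errors.append("subtotal + tax does not equal total")
    return errors


print(validate(invoice))  # an empty list: this sample passes both checks
```

Decimal is used instead of float so that monetary amounts compare exactly.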

Language support and cloud access for invoice OCR

onVision Invoice Extraction supports 119 languages for printed invoices and can also extract data from handwritten invoices in almost every language that uses the Latin alphabet. The cloud-based platform offers great flexibility and speed. User management and accessibility become so easy that the AP team can work from anywhere, at any time. You can use onVision by buying credits and pay only for what you use.

Automated invoice processing means extracting data from incoming invoices and transferring it to any ERP or financial system within seconds. To achieve this, companies need a well-designed framework that works seamlessly.

Processing invoices and payments is a complex and very time-consuming process, with hundreds or thousands of invoices arriving in various formats (e-mail attachments, PDFs, paper, etc.). If the accounts payable team processes invoices manually, errors and delays are unavoidable. For this reason, invoice automation software creates great benefits for companies.

How to Automate Invoice Processing?

Invoice processing as a business function may seem a relatively simple, low-priority job. That is not the case. First of all, invoice processing has a multi-channel workflow and sits in the middle of three critical processes: operations, production and finance. How can a company manage its supply chain without running a fast, productive and error-free accounts payable operation? And how can it keep production lines (or services) running without managing its supply chain properly?

Given its important role, the invoice processing workflow needs to be improved. To this end, invoice automation solutions have been developed over the last decade. Let’s dive into the details and see how to automate invoice processing.

Invoice Automation Basics

There are four main steps of invoice automation.

  • Digitizing physical invoices (scanning)
  • Extracting data from invoices (invoice OCR). Powerful OCR and ICR engines let us extract data from files in various formats such as PDF, JPG and TIFF.
  • Interpreting and analyzing the invoice content. AI and machine learning technologies help at this stage. Pre-trained invoice automation solutions are able to understand unstructured or non-uniform content and classify it. No human intervention is needed.
  • Integration with core systems such as SAP, MS Dynamics, Oracle, etc. (connectors and APIs)

Intelligent invoice automation software can handle these four steps seamlessly and help your company save time, money and other valuable resources.
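The four steps can be pictured as a chain of small functions, each feeding the next. The sketch below is heavily simplified and entirely hypothetical: the "scan" already contains text, extraction is done with two regular expressions instead of an OCR engine, and a Python list stands in for the ERP system.

```python
import re

erp_inbox = []  # stands in for the ERP or accounting system


def digitize(scan):
    """Step 1: a real system would scan or photograph; here the text is given."""
    return scan["text"]


def extract(text):
    """Step 2: pull raw field values out of the recognized text."""
    number = re.search(r"Invoice No:\s*(\S+)", text)
    total = re.search(r"Total:\s*([\d.]+)", text)
    return {
        "invoice_number": number.group(1) if number else None,
        "total": total.group(1) if total else None,
    }


def interpret(fields):
    """Step 3: type the values and decide whether the invoice is complete."""
    complete = all(value is not None for value in fields.values())
    total = float(fields["total"]) if fields["total"] else None
    return {**fields, "total": total, "complete": complete}


def integrate(invoice):
    """Step 4: hand complete invoices to the core system."""
    if invoice["complete"]:
        erp_inbox.append(invoice)
    return invoice["complete"]


scan = {"text": "Invoice No: INV-1001\nTotal: 250.00"}
posted = integrate(interpret(extract(digitize(scan))))
print(posted, erp_inbox)
```

Keeping each step separate makes it possible to swap one of them, for example replacing the regular expressions with a learned extraction model, without touching the others.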

Invoice OCR or Cognitive Data Capture?

OCR (optical character recognition) is the technology that makes automated invoice processing possible. When you scan a physical invoice or take a photo of it, OCR turns the printed characters into digital data. ICR (intelligent character recognition) is the advanced version of OCR. It can even turn handwritten text into digitally usable data.

However, invoice OCR alone does not suffice to fully automate invoice processing. The main reason is that OCR software without an AI component can only extract data from pre-defined templates. When you add a new vendor to the system, rule-based OCR software cannot extract data from the new invoice layout, since it doesn’t know how to process its fields, line items, shapes, etc.

Cognitive data capture, by contrast, uses artificial intelligence to read invoices. Its two key features are the ability of AI to learn by itself and the ability to understand patterns and layouts it has not seen before.

Thanks to these two key features, cognitive data capture doesn’t require human supervision or continuous checking. It therefore truly automates invoice processing and saves a great deal of manpower. You can learn details about the onVision Invoice Extraction solution here.

As the amount of data companies need to process in day-to-day operations becomes larger and larger, business processes get more sophisticated and intelligent automation needs gain urgency. There are many competencies that companies have to possess to keep pace with these requirements. Digital transformation is the overall framework which can be defined as the process of using digital resources to improve existing business models and culture or creating new ones.

Intelligent Automation as a Part of Digital Transformation

One of the main goals of an organization’s digital transformation is to automate operations and processes. This is not only about becoming faster or more profitable, but also about making better decisions and becoming more strategic.

Overall, we conclude that digital transformation is a more complex type of technology-enabled business transformation, which needs to address the strategic roles of new digital technologies and capabilities for successful digital innovation in the digital world (Yoo et al. 2010). We define it as the process through which companies converge multiple new digital technologies, enhanced with ubiquitous connectivity, with the intention of reaching superior performance and sustained competitive advantage, by transforming multiple business dimensions, including the business model, the customer experience (comprising digitally enabled products and services) and operations (comprising processes and decision-making), and simultaneously impacting people (including skills talent and culture) and networks (including the entire value system).

Digital Business Transformation and Strategy: What Do We Know So Far? by Mariam H. Ismail, Mohamed Khater, Mohamed Zaki

With this perspective in mind, we can understand why intelligent automation is such an important stage of digital transformation: it is impossible to digitize processes and operations without intelligent automation tools. Companies need to free their employees from manual work as much as possible, so that employees can become more productive and focus on high-value tasks instead of repetitive, time-consuming jobs.

Handling Data with Data Extraction

As we mentioned above, the amount of data companies need to handle is growing. It is not easy to process such a large amount of data, especially when it is generated through many types of documents. When data arrives in various formats and is mostly unstructured, we need a solution to extract, interpret and validate it. Data extraction tools are built to meet this demand, and OCR technology has been enriched with AI capabilities.

Since it is possible to run intelligent automation only after proper data extraction from documents, it is critical to choose a well-developed data extraction solution.

What are the Stages of Intelligent Automation?

Although requirements and KPIs for intelligent automation differ from industry to industry, there are some common principles and methodologies to be used as a guideline.

  • Data Extraction: This is the first step towards successful automation. OCR and ICR engines working through AI-based software make it possible to extract data in various formats from any type of document.
  • Data Processing: Once data is extracted and digitized, the second step is processing it. Machine learning technologies help at this stage, validating and classifying the data with high accuracy.
  • Data Distribution & Automation: As a final step, structured and validated data should be distributed to the relevant people and core systems. Integration with ERP or DMS software is carried out at this stage and the final automation tasks are completed.

There are two underlying reasons why invoice capture has become an essential tool for companies in recent years. First and foremost, invoice capture combines two powerful and effective technologies that help companies automate processes easily: Optical Character Recognition (OCR) and machine learning. The other reason is that, by automating one of the most time-consuming and critical processes, invoice capture promises a high ROI.

Invoice Capture Combines OCR and Machine Learning

Invoice capture (also called invoice extraction or invoice OCR) means extracting data from invoices so that invoice processing can be automated. Automating data extraction and invoice processing matters because there are tons of different invoice formats, and these formats involve many unstructured and non-uniform data types.


What makes the difference is using OCR (and also ICR) to recognize invoice content, and machine learning to understand the context of that content and validate each data item. In this sense, invoice capture goes beyond OCR. OCR by itself cannot automate invoice extraction, because it needs templates and rules to work properly. This means that every time OCR meets a new format, a different line item, complex tables or a low-quality invoice, a human operator’s supervision is necessary.

How to Choose an Invoice Extraction Solution?

We can classify invoice extraction solutions under 3 groups:

  • Template based solutions. (OCR level)
  • Pre-trained machine learning solutions. (OCR + machine learning)
  • Continuous training AI solutions. (OCR + machine learning + AI)

The last group is your best choice for fully automating invoice processing, because these solutions are format- and document-agnostic, can learn by themselves, and have very high data confidence ratings.

You can learn more about the continuously trained onVision Invoice Extraction solution.

High ROI of Invoice Capture

Companies that process 300-500 invoices per month and have not yet automated invoice processing can save $1500-2000 per month. *

*This calculation is based on various variables and therefore may differ from company to company. We took APQC benchmark report data as an average.
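To put these figures in per-invoice terms, the short calculation below derives rough bounds from the numbers quoted above. It is only arithmetic on the stated ranges; since the text does not say which saving goes with which invoice volume, the widest possible range is shown.

```python
# Figures quoted in the text above; everything derived here is simple arithmetic.
invoices_per_month = (300, 500)
saving_per_month = (1500, 2000)  # USD

# The pairing of savings and volumes is not stated, so compute the outer bounds.
lowest_per_invoice = saving_per_month[0] / invoices_per_month[1]
highest_per_invoice = saving_per_month[1] / invoices_per_month[0]
annual_saving = tuple(12 * s for s in saving_per_month)

print(f"per invoice: ${lowest_per_invoice:.2f} to ${highest_per_invoice:.2f}")
print(f"per year: ${annual_saving[0]} to ${annual_saving[1]}")
```

In other words, the quoted range corresponds to somewhere between about $3 and $7 saved per invoice, or $18,000 to $24,000 per year.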


Digital transformation has been around for almost two decades, yet there is still a lot to do to overcome overwhelming paperwork and manual document processing. The main reason why document processing (invoice processing, banking and financial document processing, legal document processing, etc.) has not reached a higher level of digitalization is the difficulty of transforming the contents of custom and arbitrary documents into structured, standardized data.

According to a recent survey, 92% of business leaders agree that companies need to enable process automation technologies. OCR (optical character recognition) technology plays a vital role in this story. It is still the best method for turning printouts into consistent digital data that computers can process. And the good news is that OCR technology is improving constantly. Nevertheless, companies need solutions that are more sophisticated and flexible than OCR, yet still accurate, to meet their challenging document management and digitalization needs.

How Does Intelligent Document Processing Work? 


Think about all of these various types of documents: forms, e-mail attachments, PDFs, invoices, receipts, work orders, bank and insurance statements, etc. Tons of valuable data are stored in these documents, yet it takes a great deal of time and effort to reach that data and use it meaningfully.


Intelligent document processing (IDP) can be defined as a set of solutions that work together to transform unstructured data into usable data and, ultimately, to create smooth, productive and cost-effective workflows. Most IDP tools can be easily integrated with enterprise solutions such as ERP, CRM and DMS.

The components of the IDP framework are:

  • Data capture & classification: OCR, NLP
  • Data extraction & interpretation: Deep learning, machine learning
  • Data validation & automation: AI, RPA

What Are the Benefits of Intelligent Document Processing?


When you run an IDP platform, your business gets a significant boost. The typical and obvious benefits of IDP are:

  • Time and cost savings
  • User friendly processes and efficiency
  • Data accuracy & security uplift
  • Higher satisfaction of employees and customers
  • High quality internal project management

These benefits can be tailored flexibly to a wide variety of use cases. One of the best assets of IDP is this flexibility, which is built on different levels of technology. For example, a medium-sized export company can limit the solution to scanning invoices and extracting invoice data for transfer to the relevant parties in seconds. A large international enterprise, on the other hand, could run AI-based intelligent document processing in a multilingual environment, and could also track, interpret and create each business process through robotic process automation (RPA) integration.