Receipt Bank now processes millions of documents every month. Photos of receipts, handwritten invoices, digital PDFs: we extract the key details, ready for review by the accountant or bookkeeper. When people ask how, we say “we use OCR receipt software to read the items”. But how does it actually work?
Some of you may have had the same question from your clients. And since we’ve never actually dug into how our baseline technology works, maybe it’s time we answered that question. Today we’re going to dig into the background of this startlingly useful tool, and the difference it makes for accountants.
One of the first things worth mentioning about OCR is that it’s not new.
OCR stands for Optical Character Recognition, and it’s been around in various forms since the early 20th century. Fun fact: one of its early applications was to read text to blind people at a rate of 60 words a minute.
The big change, however, has come from implementing this software in industries where data collection and data entry took up a large part of the work. Professions that handle lots of data, such as the legal sector and accounting, can now automate work that used to require large teams and lots of time. So how does the software actually work?
What is OCR and how does it work for receipts?
This section is going to be a bit technical, but details are important when it comes to handling this kind of data.
OCR software is designed to analyse images of printed text and turn that text into data that a computer can process more easily, or that can be put into a spreadsheet or general ledger.
Straight away, OCR faces a few problems. Firstly, computers can’t ‘see’ what an image is. All the computer is aware of is a jumble of pixels and colours. It therefore needs to identify any part of the image that might be text and parse that out.
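To make the "jumble of pixels" idea concrete, here is a minimal Python sketch of one simple way to find where text might be: scan the image row by row and group together the rows that contain dark pixels. The tiny image, the threshold of 128 and the function name `find_text_rows` are all made up for illustration; this is not a description of how Receipt Bank or any particular OCR product does it.

```python
# A tiny greyscale "image": 0 is black ink, 255 is white paper.
# The pixel values are invented purely for illustration.
IMAGE = [
    [255, 255, 255, 255, 255, 255],
    [255,  20, 255,  30,  25, 255],
    [255,  15,  40, 255,  35, 255],
    [255, 255, 255, 255, 255, 255],
    [255, 255, 255, 255, 255, 255],
    [255,  10,  10,  10, 255, 255],
    [255, 255, 255, 255, 255, 255],
]


def find_text_rows(image, threshold=128):
    """Return (first_row, last_row) pairs for each band of rows containing ink."""
    bands = []
    start = None  # row where the current band of ink began, if any
    for y, row in enumerate(image):
        has_ink = any(pixel < threshold for pixel in row)
        if has_ink and start is None:
            start = y                      # a new band of text starts here
        elif not has_ink and start is not None:
            bands.append((start, y - 1))   # the band ended on the previous row
            start = None
    if start is not None:                  # ink ran to the bottom edge
        bands.append((start, len(image) - 1))
    return bands


print(find_text_rows(IMAGE))  # [(1, 2), (5, 5)]
```

Each pair marks a horizontal band that probably holds a line of text. A real system would go on to split each band into individual characters and would have to cope with skewed photos, shadows and creases.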
Then we reach the second issue: there is no one way of producing letters. Not only does every person write the letter A in a slightly different way, but receipts and invoices can use a variety of fonts, with the letter A printed in many different forms.
There are two ways of dealing with this: either by recognising whole characters as discrete patterns (pattern recognition) or by mapping the component parts of the characters (feature detection).
Pattern recognition works for fonts that the computer already knows: just compare your scanned image with a stored version of your alphabet, and when an element matches, you know you’ve found a letter.
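As a rough illustration of that comparison, the Python sketch below stores a three-letter "alphabet" as small grids and scores a scanned character against each one, pixel by pixel. The 3x5 font, the 0.8 cut-off and the names `match_score` and `recognise` are assumptions made for this example, not part of any real OCR library.

```python
# Stored templates for a made-up mini font ('#' is ink, '.' is paper).
TEMPLATES = {
    "A": [".#.", "#.#", "###", "#.#", "#.#"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def match_score(glyph, template):
    """Fraction of pixels that agree between a scanned glyph and a template."""
    total = 0
    same = 0
    for glyph_row, template_row in zip(glyph, template):
        for g, t in zip(glyph_row, template_row):
            total += 1
            same += (g == t)
    return same / total


def recognise(glyph, min_score=0.8):
    """Return the best-matching letter, or None if nothing is close enough."""
    best_letter, best_score = None, 0.0
    for letter, template in TEMPLATES.items():
        score = match_score(glyph, template)
        if score > best_score:
            best_letter, best_score = letter, score
    return best_letter if best_score >= min_score else None


# A scanned "A" with one pixel of noise (the bottom-right ink is missing).
noisy_a = [".#.", "#.#", "###", "#.#", "#.."]
print(recognise(noisy_a))  # A
```

The weakness is easy to see: the approach only knows the letters it has templates for, so a new font or a handwritten character can fall below the cut-off and go unrecognised.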
Feature detection is a slightly more sophisticated way of finding characters. Here’s a handy explanation from a great tech resource called Explain that Stuff:
Suppose you're an OCR computer program presented with lots of different letters written in lots of different fonts; how do you pick out all the letter As if they all look slightly different? You could use a rule like this: If you see two angled lines that meet in a point at the top, in the center, and there's a horizontal line between them about halfway down, that's a letter A. Apply that rule and you'll recognize most capital letter As, no matter what font they're written in.
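The rule in that quote can be written down almost word for word. The Python sketch below assumes an earlier step has already reduced a character to a handful of straight strokes, each given as two (x, y) points in a box where x runs from 0 to 1 left to right and y runs from 0 at the top to 1 at the bottom. That stroke format, the angle limits and the tolerance are all hypothetical choices made for illustration.

```python
from math import atan2, degrees


def angle(stroke):
    """Angle of a stroke from horizontal, in degrees, folded into 0..90."""
    (x1, y1), (x2, y2) = stroke
    a = abs(degrees(atan2(y2 - y1, x2 - x1)))
    return min(a, 180 - a)


def top_point(stroke):
    """Endpoint nearest the top of the box (smallest y)."""
    return min(stroke, key=lambda p: p[1])


def bottom_point(stroke):
    """Endpoint nearest the bottom of the box (largest y)."""
    return max(stroke, key=lambda p: p[1])


def looks_like_capital_a(strokes, tol=0.15):
    """Apply the quoted rule: two angled lines meeting at the top centre,
    plus a horizontal line about halfway down."""
    slanted = [s for s in strokes if 30 <= angle(s) <= 85]
    bars = [s for s in strokes if angle(s) <= 15]

    def near_apex(stroke):
        x, y = top_point(stroke)
        return abs(x - 0.5) <= tol and y <= tol

    # One angled line runs down to the left, the other down to the right.
    left = any(near_apex(s) and bottom_point(s)[0] < 0.5 for s in slanted)
    right = any(near_apex(s) and bottom_point(s)[0] > 0.5 for s in slanted)

    # A roughly horizontal bar somewhere around the middle of the box.
    crossbar = any(0.3 <= (s[0][1] + s[1][1]) / 2 <= 0.8 for s in bars)
    return left and right and crossbar


wide_a = [((0.5, 0.0), (0.05, 1.0)), ((0.5, 0.0), (0.95, 1.0)),
          ((0.25, 0.55), (0.75, 0.55))]
letter_h = [((0.1, 0.0), (0.1, 1.0)), ((0.9, 0.0), (0.9, 1.0)),
            ((0.1, 0.5), (0.9, 0.5))]
print(looks_like_capital_a(wide_a))    # True
print(looks_like_capital_a(letter_h))  # False
```

Because the rule looks at the shape of the strokes rather than exact pixels, a narrower or slightly slanted A still passes, while an H fails because its two sides are vertical and never meet at the top.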
This is also the method used for recognising handwritten text. Combined, these two methods can do a reasonably good job of reading most receipts and invoices, but new technology is taking it further.
Receipt Bank has been investing heavily in Machine Learning and Artificial Intelligence development over the last three years, with the aim of making our data extraction smarter, faster and more accurate.
What difference does automating data entry make?
The exact amount of time an accountant or bookkeeper spends on data entry varies depending on their processes and how much work the client does themselves. However, when we talk to our partners, we always hear a similar story – collecting and entering data got in the way of helping their clients.
One Australian partner, Digit Books, used to spend over 60% of their time per client just entering data into their clients' books. Using Receipt Bank to automate their data entry, they cut this time in half.
And what did they do with this extra time? According to Managing Director Leah Moore, “Everything that we’ve saved we put back in the client themselves, making sure they feel valued and they have the information they need.”
Automating data entry gives you the time to actually use the numbers to help your clients, instead of spending all your time on manual work. And since automation is going to play an ever larger role in accounting, including the arrival of new, smarter AI tools, you need to make sure your firm stays competitive.
OCR receipt software is already driving some of the world’s top firms, and Receipt Bank can give you that same advantage.