address. Gaston Geenslaan 11 B4, 3001 Leuven

Generating handwritten text with an AI model

In this blog, we talk about how we generated handwritten text with an AI model in order to train another model to extract handwritten text. A hard nut to crack, but we'll show you how we solved this problem.

Written by
Daphné Vermeiren
Marketing Manager

As you may know, we have been developing and improving our intelligent document processing platform for a while now. Our objective is simple: we want to unburden you of repetitive, manual tasks like processing documents so you can spend your valuable time elsewhere. However, building a platform that can help you with this isn’t as easy as you might think, especially since we wanted to process any document.

Yes, you heard that right: any document. We wanted to process it all, including handwritten documents, particularly because one of our customers, HVW/CAPAC, could benefit from such a solution. This governmental organization processes many unemployment benefit cards, which contain several handwritten fields such as a social security number, date, address, etc. Processing these documents takes an enormous amount of time. Luckily for them, this is where Brainjar comes in to make this process more efficient!

In this blog, we will show you how we developed a solution.

Let’s write some text

Requirements for intelligent document processing

We first developed our tool so that it could extract relevant information from digital documents by accessing the digital text layer. These kinds of documents are the easiest to process.

In addition to processing purely digital documents, we had to extract information from scanned documents. For this, we used Tesseract, an optical character recognition (OCR) engine that can recognize more than 100 languages. With these two kinds of documents covered, we could already help our clients process a lot. However, we wanted to make our tool much more powerful. So, we had to tackle a third category that no one likes to talk about: documents containing handwritten text.

As we have mentioned, this is not a quick and dirty implementation. Because standard OCR engines struggle with handwritten text, we had to think outside the box and build a custom AI model. Training such a model requires loads of training data.

Based on our initial estimates, we needed approximately 400,000 handwritten documents. My hand starts to feel like it’s developing carpal tunnel syndrome just from typing this number. ☹️ This kind of data is not readily available, and I wouldn’t want to be the one writing those documents. Yeah, we had to find a better solution for this.

Ideation of a new model

Down the rabbit hole we went. We started thinking: what if we make a tool that can generate images containing handwritten text snippets? First, we would have much more control over our training data this way. In addition, our concerns about data volume would be a thing of the past. With such a tool, we could ask the system to generate a certain number of training samples, and the AI would generate the labeled training data for us.

Labeled? Yes! This working method enables us to generate the necessary labels automatically. For instance, the label is implied when you tell the system to write a handwritten date in DD-MM-YYYY format. This means that instead of processing and labeling 10,000 documents manually for days on end, we could do this in a matter of minutes. As a result, we have our handwritten text, we know exactly what kind of text is written, and we know what the corresponding label is. Same result, less effort.
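
To illustrate why the label comes for free, here is a minimal sketch in Python. It only covers the text side: it draws random dates, writes them in DD-MM-YYYY format, and stores the matching label next to each one. The step where a generator turns the text into a handwritten image is left out. The function names, the date range, and the record layout are assumptions made for this illustration, not a description of the actual pipeline.

```python
import random
from datetime import date, timedelta


def random_date(rng, start=date(1950, 1, 1), end=date(2030, 12, 31)):
    """Pick a random calendar date between start and end (inclusive)."""
    span_in_days = (end - start).days
    return start + timedelta(days=rng.randint(0, span_in_days))


def make_date_samples(n, seed=0):
    """Create n text snippets in DD-MM-YYYY format, each with its label.

    Because we choose the date ourselves, the label (here the ISO form
    of the same date) is known without any manual annotation.
    """
    rng = random.Random(seed)  # seeded so a dataset can be reproduced
    samples = []
    for _ in range(n):
        d = random_date(rng)
        samples.append({
            "text": d.strftime("%d-%m-%Y"),  # what the generator would write
            "field": "date",                 # what kind of field it is
            "label": d.isoformat(),          # ground truth, known up front
        })
    return samples


if __name__ == "__main__":
    for sample in make_date_samples(3):
        print(sample)
```

Asking for 10,000 samples instead of 3 is a matter of changing one argument, which is the whole point of generating the data rather than collecting it.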

Easier said than done, of course, because if you can’t yet build an AI system that recognizes handwritten text, how on earth are you going to make an AI that generates it?

Training the AI model

We’ve got our idea. Now we have to work toward the result: a labeled dataset of handwritten text. Time to get to the AI side of our solution, so we can begin generating our training data.

First, we have to decide on a model structure and train the AI models to write sentences themselves.

Training our text generator

In our model structure, the generator is the most crucial part, as it is the part we will use in the end. To give the generator its input, we first fed the target image to a text recognizer (to deal with spelling) and a style extractor (which captures whether the text is written in cursive, in capitals, or with other stylistic features). These deliver spaced text and style vectors, respectively. Once the generator has done its thing, the generated image is in turn fed to a text recognizer that checks the spelling, a discriminator that provides the adversarial loss, and an encoder that provides the perceptual loss.

On the one hand, the discriminator tries to distinguish between fake and real samples. Its adversarial loss tells the generator how convincing its images are, which makes it an excellent judge of the generator's performance.

On the other hand, the perceptual loss ensures that the generated images do not differ too much from the target image, constraining the network's results visually.

Model structure

Taken together, the generator's task is to generate images that look like the original input. The other parts of our model check whether this is done right. We do this for millions of samples, so the AI model is trained thoroughly.
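
One common way to turn checks like these into a single training signal is a weighted sum of the individual losses. The sketch below shows that idea in plain Python on toy numbers. The exact loss formulas, the weights, and all function names are assumptions chosen for illustration. The article does not state how the terms are actually computed or combined.

```python
import math

EPS = 1e-12  # avoids log(0) for extreme probabilities


def adversarial_loss(prob_real):
    """Generator-side adversarial term (assumed form).

    prob_real is the discriminator's probability that the generated
    image is real. The loss is small when the discriminator is fooled.
    """
    return -math.log(max(prob_real, EPS))


def perceptual_loss(features_generated, features_target):
    """Mean squared distance between the encoder features of both images."""
    assert len(features_generated) == len(features_target)
    squared = [(g - t) ** 2 for g, t in zip(features_generated, features_target)]
    return sum(squared) / len(squared)


def recognition_loss(char_probs):
    """Average negative log-likelihood of the intended characters.

    char_probs holds the probability the text recognizer assigns to
    each character that should appear in the generated image.
    """
    return -sum(math.log(max(p, EPS)) for p in char_probs) / len(char_probs)


def generator_loss(prob_real, features_generated, features_target, char_probs,
                   w_adv=1.0, w_perc=1.0, w_rec=1.0):
    """Weighted sum of the three terms (the weights are hypothetical)."""
    return (w_adv * adversarial_loss(prob_real)
            + w_perc * perceptual_loss(features_generated, features_target)
            + w_rec * recognition_loss(char_probs))


if __name__ == "__main__":
    # A convincing, correctly spelled sample scores lower than a poor one.
    good = generator_loss(0.9, [1.0, 1.0], [1.0, 1.0], [0.9, 0.95])
    poor = generator_loss(0.5, [0.0, 0.0], [1.0, 1.0], [0.5, 0.4])
    print(good < poor)
```

In a real setup these terms would be computed on tensors by a deep learning framework, and the weights would be tuned so that no single check dominates the others.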

Our own handwritten text

Hooray, we've got our solution. Now, we can generate handwritten text to train the platform. The beauty of this solution is that we can extend it. If a client needs to, for example, filter the dates out of a document, but they are all written differently (in full, shortened, American, British...), we can train our model to recognize these types of input.
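
As a rough sketch of what such an extension could look like on the data side, the Python snippet below writes one and the same date in several styles and attaches the same label to every variant. The style names, the chosen formats, and the helper functions are hypothetical. They only illustrate the idea that differently written dates can share one ground truth.

```python
from datetime import date

# Spelled out by hand so the output does not depend on the system locale.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def render_variants(d):
    """Write the same date in a few common styles."""
    return {
        "british_numeric": f"{d.day:02d}/{d.month:02d}/{d.year}",
        "american_numeric": f"{d.month:02d}/{d.day:02d}/{d.year}",
        "shortened": f"{d.day:02d}-{d.month:02d}-{d.year % 100:02d}",
        "in_full": f"{d.day} {MONTHS[d.month - 1]} {d.year}",
    }


def labeled_variants(d):
    """Pair every written form with the same label (the ISO date)."""
    return [
        {"text": text, "style": style, "label": d.isoformat()}
        for style, text in render_variants(d).items()
    ]


if __name__ == "__main__":
    for sample in labeled_variants(date(2023, 3, 7)):
        print(sample)
```

Each of these text snippets could then be handed to the generator, so the recognition model sees the same date in many notations and handwriting styles.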

