Picture Text Recognition: Jan 2025's Endpoint of the Month

Discover our Picture Text Recognition endpoint and how OCR extracts text from images. Learn about its benefits, how to set it up, and one creative use case.

Published
January 23, 2025

Optical Character Recognition (OCR) had its breakthrough in the 1970s, when Ray Kurzweil, an American computer scientist, developed and marketed "omni-font OCR". This technology could process printed text in almost any font, which seemed unthinkable at that time. OCR technology is now widely accessible as a cloud-based service on mobile devices and desktops.

It has also evolved further and can now process a wider range of inputs, such as camera images, scanned documents, and image-only PDF documents. As a result, individuals and businesses can simplify digitization by converting physical documents like study materials, contracts, invoices, forms, or IDs and making them available on digital devices.

With this in mind, we developed two endpoints that use Optical Character Recognition to identify and extract text from images or PDFs. In this blog post, we focus on the OCR endpoint that processes images because we believe it can be useful for many people, businesses, and situations. Therefore, January's endpoint of the month is the Picture Text Recognition feature.

Why It Matters

  • Efficiency is Everything: Many document formats don't allow text to be extracted easily from images or PDFs, forcing individuals to copy it out manually. Since our tool carries out this task automatically and accurately, individuals and businesses can save a great deal of time and money by automating this simple but tiresome process.
  • No Technical Skills Needed: Since this feature is no-code, anyone can use it. In fact, once the endpoint or scenario has been set up, you don't need to do anything but upload the images that you want to analyze. Below you can find a tutorial on how to set up the endpoint.

Set It Up

Note: This feature needs a URL to access the document. Therefore, we'll need to set up a file storage service, like Dropbox or Google Drive, and create a URL for the document with 0CodeKit.

The first step is to upload the desired document to one of these services. Then, we sign up for or log into one of the automation platforms where this feature is available (Make, Zapier, or n8n). After that, we set up the first Dropbox/Google Drive module and choose the feature called "Watch Files", which monitors a specified folder and triggers whenever a file is uploaded. Next, we add a second Dropbox/Google Drive module with the feature "Download File" so that 0CodeKit can access this document.

Once the Dropbox/Google Drive modules have been set up, we add the 0CodeKit app and find the feature "Create temporary URL to file", which lets 0CodeKit access the document via a URL. Here, we only have to click on the option "Dropbox/Google Drive - Download a File". After that, we add the last 0CodeKit module, "Detect Text in a Picture with OCR AI", and drag the "Temporary File URL" item to the "Image URL" field. Finally, we click on "Save" and can now execute the scenario.
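
For readers who are curious about what the last module does behind the scenes, here is a minimal Python sketch of the idea: send the temporary file URL to an OCR endpoint and read the text from the reply. It is only an illustration. The endpoint address, the API key header, and the "url" and "text" field names are all assumptions made for this example, not the documented 0CodeKit API, so check the official API reference before using anything like it.

```python
import json
import urllib.request

# Hypothetical values: this post does not give the real endpoint address,
# authentication header, or JSON field names. Replace them with the ones
# from the official API documentation.
OCR_ENDPOINT = "https://api.example.com/ocr/image"  # placeholder URL
API_KEY = "YOUR_API_KEY"  # placeholder key


def build_ocr_request(image_url, endpoint=OCR_ENDPOINT, api_key=API_KEY):
    """Build (but do not send) a POST request asking an OCR service to read image_url."""
    # The service fetches the image itself, so it needs a reachable http(s) URL.
    if not image_url.startswith(("http://", "https://")):
        raise ValueError("image_url must be an http(s) URL")
    body = json.dumps({"url": image_url}).encode("utf-8")  # assumed field name
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


def extract_text(response_json):
    """Return the recognized text, assuming the reply carries it in a 'text' field."""
    return str(response_json.get("text", "")).strip()


if __name__ == "__main__":
    request = build_ocr_request("https://example.com/tmp/scan.png")
    print(request.get_method(), request.full_url)
    # Sending it would look like this (needs a real endpoint and key):
    # with urllib.request.urlopen(request) as reply:
    #     print(extract_text(json.load(reply)))
```

The request is built separately from sending it so that the sketch can be read and tried without a real account. In the no-code scenario, the "Image URL" field plays the role of the image_url argument.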

Creative Use Cases

A while ago, with the help of our "Picture Text Recognition" feature, we built an automation that creates study materials from scanned documents, like handwritten notes. Check out this automation!
