OCR using C++
The purpose of this article is to teach you how to perform OCR using C++ by interfacing with an OCR SDK.
During our day-to-day development, we needed to perform OCR (Optical Character Recognition) on scanned images, screenshots, and other forms or files. We looked for an SDK that would allow that and evaluated the ABBYY Cloud OCR SDK. It didn't come with any C / C++ code samples, so I had to develop my own.
Creating an ABBYY Cloud OCR App
Here are the steps to take once a trial account has been created and verified.
1. Create a new App.
2. Check your email for the Application's password. You will need both the Application ID and the Application Password to start.
You should see these 2 placeholder lines in our code:
Replace these strings with your allocated Application ID and Application Password.
Initiating libCurl
First we initialize the Curl object and create the form with curl_mime_init() (https://curl.se/libcurl/c/curl_mime_init.html).
Next we generate the upload part in our request:
field = curl_mime_addpart(form);
curl_mime_name(field, "upload");
Then we set the file data using curl_mime_filedata(), which sets our MIME part's body data from our input file's contents:
curl_mime_filedata(field, file_to_upload.c_str());
Now we set the options by calling curl_easy_setopt(), which, as its name implies, sets the options for our request.
We need the following attributes:
- PROCESSING_URL - the URL given by the ABBYY SDK.
- headerlist - was set earlier.
- form - the upload part.
- APP_ID - an application-specific identifier provided by the ABBYY SDK for each piece of software we develop.
- PASSWORD - the application's password, which needs to be generated.
- CURLOPT_WRITEFUNCTION - a callback function for writing the result of the request. Data is written into readBuffer, which will hold the results we receive from the API.
curl_easy_setopt(curl, CURLOPT_URL, PROCESSING_URL);
curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headerlist);
curl_easy_setopt(curl, CURLOPT_MIMEPOST, form);
curl_easy_setopt(curl, CURLOPT_USERNAME, APP_ID);
curl_easy_setopt(curl, CURLOPT_PASSWORD, PASSWORD);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, CurlWrite_CallbackFunc_StdString);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &readBuffer);
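The body of CurlWrite_CallbackFunc_StdString is not shown here. As a rough sketch, assuming readBuffer is a std::string, a callback of this kind might look like the following. The parameter names are illustrative, and typing the last parameter as std::string* instead of void* is a common convenience rather than something libcurl requires.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Sketch of a CURLOPT_WRITEFUNCTION callback (assumed shape, not the
// article's actual code). libcurl may call it several times per response,
// so each chunk is appended to the string passed via CURLOPT_WRITEDATA.
size_t CurlWrite_CallbackFunc_StdString(void* contents, size_t size,
                                        size_t nmemb, std::string* out)
{
    const size_t total = size * nmemb;  // number of bytes in this chunk
    try
    {
        out->append(static_cast<const char*>(contents), total);
    }
    catch (const std::bad_alloc&)
    {
        // Returning a value different from 'total' tells libcurl to abort.
        return 0;
    }
    return total;
}
```

Because the chunks are appended, the buffer has to be cleared between requests if the same std::string is reused.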
Then we are ready to execute our request by calling curl_easy_perform().
Right after that, we check the results.
Checking the results
Next we obtain the Task_ID, which is given to each OCR task. You can initiate several tasks and then wait for each of them to be completed. We use the Task_ID we have obtained to check the status of the task and wait until it's completed.
while (1)
{
    // ask the server for the current status of the task
    res = curl_easy_perform(status_curl);
    if (res != CURLE_OK)
    {
        WriteLogFile(L"Error: %S", curl_easy_strerror(res));
    }
    else
    {
        WriteLogFile(L"Read Buffer:\n%S", readBuffer.c_str());
        task_status = ObtainStatus(readBuffer);
        WriteLogFile(L"task_status: %S", task_status.c_str());
    }
    if (task_status != "Completed")
    {
        // wait 2 seconds before the next check
        Sleep(2000);
    }
    else
    {
        setcolor(LOG_COLOR_DARKGREEN, 0);
        SetConsoleTitle(L"OCR completed");
        setcolor(LOG_COLOR_WHITE, 0);
        result_url = ObtainURL(readBuffer);
        // the URL arrives XML-escaped: replace every "&amp;" with "&"
        result_url = ReplaceAll(result_url, "&amp;", "&");
        // download the result file
        WriteLogFile(L"Downloading File from URL: %S", result_url.c_str());
        op_curl = curl_easy_init();
        if (op_curl)
        {
            headerlist = curl_slist_append(headerlist, buf);
            curl_easy_setopt(op_curl, CURLOPT_URL, result_url.c_str());
            curl_easy_setopt(op_curl, CURLOPT_HTTPHEADER, headerlist);
            curl_easy_setopt(op_curl, CURLOPT_HTTPGET, 1L);
            FILE* wfd = fopen(json_result_file.c_str(), "ab");
            if (wfd) // only download if the output file could be opened
            {
                fprintf(wfd, "\n");
                curl_easy_setopt(op_curl, CURLOPT_WRITEDATA, wfd);
                curl_easy_perform(op_curl);
                fclose(wfd);
                WriteLogFile(L"FILE saved");
            }
            curl_easy_cleanup(op_curl);
        }
        break;
    }
    // clear the buffer before the next status request
    readBuffer = "";
} // While
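The loop relies on the helpers ObtainStatus(), ObtainURL() and ReplaceAll(), whose bodies are not shown. Below is a minimal sketch of how they might be written, assuming the service answers with XML in which the task element carries attributes such as status="..." and resultUrl="...". ExtractAttribute is a hypothetical helper introduced only for this illustration, and the plain substring search is a stand-in for a real XML parser.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the value of attribute 'name', or "" if the
// attribute is not present. The leading space in the key keeps "status"
// from matching inside a longer attribute name.
std::string ExtractAttribute(const std::string& xml, const std::string& name)
{
    const std::string key = " " + name + "=\"";
    const std::size_t start = xml.find(key);
    if (start == std::string::npos) return "";
    const std::size_t valueStart = start + key.size();
    const std::size_t valueEnd = xml.find('"', valueStart);
    if (valueEnd == std::string::npos) return "";
    return xml.substr(valueStart, valueEnd - valueStart);
}

// Replaces every occurrence of 'from' in 'text' with 'to'.
std::string ReplaceAll(std::string text, const std::string& from,
                       const std::string& to)
{
    if (from.empty()) return text;
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos)
    {
        text.replace(pos, from.size(), to);
        pos += to.size();  // continue after the inserted text
    }
    return text;
}

// Assumed attribute names: "status" and "resultUrl".
std::string ObtainStatus(const std::string& response)
{
    return ExtractAttribute(response, "status");
}

std::string ObtainURL(const std::string& response)
{
    return ExtractAttribute(response, "resultUrl");
}
```

The URL returned by ObtainURL() is still XML-escaped at this point, which is why the loop passes it through ReplaceAll() to turn each "&amp;" back into "&" before downloading.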
Once we have the results, we just clean everything up.
Building Blocks
One of the key building blocks of such a project is libCurl. I used it as a static library. The .lib file is included in the article's source code; you can read about using libCurl as a static library here.
Note: WriteLogFile() is one of my old logging functions described in this article.
Using the code
You can use different export formats. See this link for the options.
You can define which languages you are expecting. Read this link for the options.
You can use many languages, most of which can also be recognized as handwritten text (ICR). You set the list of expected languages in the PROCESSING_URL string.
In this example, we expect English and Hebrew:
#define PROCESSING_URL "https://cloud-westus.ocrsdk.com/processImage?exportFormat=txt&language=English,Hebrew"
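If the export format or the languages need to change at run time, the same URL can be assembled from its parts instead of being hard-coded. The sketch below is only an illustration: BuildProcessingUrl is a hypothetical helper, and it assumes the values contain no characters that would need URL escaping.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds the processImage URL from an export format
// and a list of expected languages, joined with commas.
std::string BuildProcessingUrl(const std::string& exportFormat,
                               const std::vector<std::string>& languages)
{
    std::string url =
        "https://cloud-westus.ocrsdk.com/processImage?exportFormat=" +
        exportFormat;
    if (!languages.empty())
    {
        url += "&language=";
        for (std::size_t i = 0; i < languages.size(); ++i)
        {
            if (i != 0) url += ',';  // comma between languages, none trailing
            url += languages[i];
        }
    }
    return url;
}
```

Calling BuildProcessingUrl("txt", {"English", "Hebrew"}) yields the same string as the PROCESSING_URL definition above.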