This article deals with reading text from images using the Vision API Cognitive Service.
Introduction
Microsoft has come up with Cognitive Services, a set of machine learning algorithms developed to solve problems in the field of Artificial Intelligence (AI).
Earlier, we wrote the article Extract Text from Image using Tesseract in C#. This article deals with reading text using the Vision API Cognitive Service.
Let's do the experiment
Step 1
Let us first get the Computer Vision API key(s) from the Azure portal.
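The key, along with the endpoint URI of your resource, is used by the code-behind in Step 5 through two fields. A minimal sketch with placeholder values (the region shown is only an assumption; use the endpoint displayed for your resource in the Azure portal):
const string subscriptionKey = "<your-computer-vision-api-key>"; // placeholder - paste your key here
// Assumed endpoint for the Recognize Text operation; the region segment depends on where your resource was created.
const string uriBase = "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/recognizeText";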
Step 2
Open Visual Studio 2017 and create a new WPF application.
Step 3
Make the design as under
<Window x:Class="WpfApp2.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
        xmlns:local="clr-namespace:WpfApp2"
        mc:Ignorable="d"
        Title="MainWindow" Height="463.039" Width="525">
    <Grid>
        <Button Content="UploadPhoto" HorizontalAlignment="Left" Margin="110,113,0,0" VerticalAlignment="Top" Width="100" Click="btnUploadPhoto_Click"/>
        <Image x:Name="imgView" Stretch="Uniform" Margin="0,0,-702.333,49.667"/>
        <TextBlock Name="txtDescription" HorizontalAlignment="Left" Margin="10,157,0,0" TextWrapping="Wrap" VerticalAlignment="Top" Height="240" Width="412"/>
    </Grid>
</Window>
Step 4
From the NuGet Package Manager Console, run the command below.
PM > Install-Package Microsoft.ProjectOxford.Vision
Also install
PM > Install-Package Newtonsoft.Json
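These packages bring in the APIs used by the code-behind. The snippets in Step 5 assume roughly the following using directives in MainWindow.xaml.cs (a minimal set; Newtonsoft.Json is needed later for deserialization):
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
using System.Windows;
using System.Windows.Media.Imaging;
using Newtonsoft.Json;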
Step 5
In the code-behind, let us first write the below code.
private async Task<string> ExtractTextFromImage(string imageFilePath)
{
    HttpClient client = new HttpClient();
    HttpResponseMessage response = null;
    // Request headers.
    client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
    // Request parameter.
    string requestParameters = "handwriting=true";
    // URI formation for API invocation
    string uri = uriBase + "?" + requestParameters;
    // Read the image as a byte array - it's the request body
    byte[] byteData = GetImageAsByteArray(imageFilePath);
    // Set the header content type
    ByteArrayContent content = new ByteArrayContent(byteData);
    content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
    /*
    OCR reading needs two API calls
    a) The first REST API call analyzes the written text in the image
       and the response object contains the URI to retrieve the result of the process.
    b) The second REST API call retrieves the text written in the image.
    */
    // a) The first REST API call analyzes the written text in the image.
    response = await client.PostAsync(uri, content);
    // The response object contains the URI to retrieve the result of the process.
    string operationLocation = null;
    if (response.IsSuccessStatusCode)
        operationLocation = response.Headers.GetValues("Operation-Location").FirstOrDefault();
    // b) The second REST API call retrieves the text written in the image.
    // Recognition runs asynchronously on the service side, so wait briefly before fetching the result.
    string imageContent = string.Empty;
    await Task.Delay(2000);
    response = await client.GetAsync(operationLocation);
    imageContent = await response.Content.ReadAsStringAsync();
    return imageContent;
}
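The method relies on a GetImageAsByteArray helper that is not shown in the listing above. A minimal sketch, assuming the whole file can simply be read into memory:
// Minimal sketch of the helper used above: reads the image file into a byte array.
private byte[] GetImageAsByteArray(string imageFilePath)
{
    return File.ReadAllBytes(imageFilePath);
}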
Two REST API calls are needed for the OCR operation.
- The first REST API call analyzes the written text in the image, and the response object contains the URI to retrieve the result of the process.
// a) The first REST API call analyzes the written text in the image.
response = await client.PostAsync(uri, content);
// The response object contains the URI to retrieve the result of the process.
string operationLocation = null;
if (response.IsSuccessStatusCode)
    operationLocation = response.Headers.GetValues("Operation-Location").FirstOrDefault();
- The second REST API call retrieves the text written in the image.
// b) The second REST API call retrieves the text written in the image.
// Recognition runs asynchronously on the service side, so wait briefly before fetching the result.
string imageContent = string.Empty;
await Task.Delay(2000);
response = await client.GetAsync(operationLocation);
imageContent = await response.Content.ReadAsStringAsync();
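Because the recognition is an asynchronous operation, the result may not be ready immediately after the first call. Instead of a fixed delay, a more robust approach is to poll the Operation-Location URI until the status field reports a terminal state; a rough sketch (retry count and delay are arbitrary choices):
// Poll the Operation-Location URI until recognition finishes.
string imageContent = string.Empty;
for (int attempt = 0; attempt < 10; attempt++)
{
    response = await client.GetAsync(operationLocation);
    imageContent = await response.Content.ReadAsStringAsync();

    // "Succeeded" and "Failed" are terminal statuses; anything else means the work is still in progress.
    if (imageContent.Contains("\"Succeeded\"") || imageContent.Contains("\"Failed\""))
        break;

    await Task.Delay(1000);
}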
Once the textual content is obtained, it is deserialized to a .NET object. The .NET classes are as under
public class OCRResult
{
    public string status { get; set; }
    public RecognitionResult recognitionResult { get; set; }
}

public class RecognitionResult
{
    public List<Line> lines { get; set; }
}

public class Line
{
    public List<int> boundingBox { get; set; }
    public string text { get; set; }
    public List<Word> words { get; set; }
}

public class Word
{
    public List<int> boundingBox { get; set; }
    public string text { get; set; }
}
The process of deserialization is done as under
var deserializeObject = JsonConvert.DeserializeObject<OCRResult>(ocrInfo); // ocrInfo holds the JSON string returned by ExtractTextFromImage
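The XAML in Step 3 wires the button to a btnUploadPhoto_Click handler that is not shown in the listing above. A rough sketch of how it could tie everything together, assuming Microsoft.Win32.OpenFileDialog for the file picker and the control names from the XAML:
private async void btnUploadPhoto_Click(object sender, RoutedEventArgs e)
{
    // Let the user pick an image file from disk.
    var dialog = new Microsoft.Win32.OpenFileDialog
    {
        Filter = "Image files|*.jpg;*.jpeg;*.png;*.bmp"
    };
    if (dialog.ShowDialog() != true)
        return;

    // Show the selected image in the Image control.
    imgView.Source = new BitmapImage(new Uri(dialog.FileName));

    // Call the Vision API and deserialize the JSON it returns.
    string ocrInfo = await ExtractTextFromImage(dialog.FileName);
    var ocrResult = JsonConvert.DeserializeObject<OCRResult>(ocrInfo);

    // Display the recognized lines (recognitionResult can be null if the operation has not succeeded).
    if (ocrResult?.recognitionResult?.lines != null)
        txtDescription.Text = string.Join(Environment.NewLine,
            ocrResult.recognitionResult.lines.Select(l => l.text));
}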
Step 6
Now let's run the application.
As can be seen, the Vision API has correctly identified the text in the image.
Let's try with another image.
Again, the Vision API has correctly identified the text in the image.
Reference
- Azure Cognitive Services
- Computer Vision API - v1.0
Conclusion
In this article, we learned text recognition using the Vision API Cognitive Service with examples. Hope this helps. Thanks for reading. The zipped source file is attached.
Disclaimer: The images used in this article are for demo purposes only. They might be the copyrighted content of their respective owners.