This technical tip shows how to extract text from a specific part of an image. Aspose.OCR for .NET provides the OcrEngine class, which can restrict recognition to a chosen region of the image. The OcrEngine class requires the following three items for character recognition:
1. Source Image
2. Language
3. Resource file
Steps to Extract Text from a Specific Recognition Block
Below are the steps to perform OCR on part of an image using the OcrEngine class of Aspose.OCR for .NET.
1. Create an instance of OcrEngine using the default constructor.
2. Set the image on which OCR is to be performed using the OcrEngine.Image property.
3. Add one or more languages using the OcrEngine.Languages.AddLanguage() method.
4. Define the start point, width and height of the recognition block using the RecognitionBlock.FromRectangle method, then add the block to the engine with the OcrEngine.AddRecognitionBlock() method.
5. Set the resource file using the OcrEngine.Resource property.
6. Call the OcrEngine.Process() method to perform OCR on the recognition block.
7. If OcrEngine.Process() returns true, get the recognized text from the IRecognitionBlock.Text property.
Sample Code to Perform OCR on a Specific Block of an Image
[C#]
// Requires: using System; using System.IO; using Aspose.OCR;
const string resourceFileName = @"2011.07.02 v1.0 Aspose.OCR.Resources.zip";
try
{
    // Create the OcrEngine instance and assign the image, language and configuration
    OcrEngine ocrEngine = new OcrEngine();
    ocrEngine.Image = ImageStream.FromFile("Sample.bmp");
    ocrEngine.Languages.AddLanguage(Language.Load("english"));
    ocrEngine.Config.NeedRotationCorrection = false;
    ocrEngine.Config.UseDefaultDictionaries = true;

    // Define the block to recognize and register it with the engine
    int startX = 0, startY = 0, width = 120, height = 100;
    IRecognitionBlock rectangleBlock = Aspose.OCR.RecognitionBlock.FromRectangle(startX, startY, width, height);
    ocrEngine.AddRecognitionBlock(rectangleBlock);

    // Open the resource file; the stream is closed when the using block ends
    using (FileStream resourceStream = new FileStream(resourceFileName, FileMode.Open))
    {
        ocrEngine.Resource = resourceStream;

        // Process() returns true on success; the block then holds its recognized text
        if (ocrEngine.Process())
        {
            Console.WriteLine(rectangleBlock.Text.ToString());
        }
    }
}
catch (Exception ex)
{
    // Covers both setup failures (missing files) and recognition failures
    Console.WriteLine("Exception: " + ex.Message);
}
[VB.NET]
' Requires: Imports System, Imports System.IO, Imports Aspose.OCR
Const resourceFileName As String = "2011.07.02 v1.0 Aspose.OCR.Resources.zip"
Try
    ' Create the OcrEngine instance and assign the image, language and configuration
    Dim ocrEngine As OcrEngine = New OcrEngine()
    ocrEngine.Image = ImageStream.FromFile("Sample.bmp")
    ocrEngine.Languages.AddLanguage(Language.Load("english"))
    ocrEngine.Config.NeedRotationCorrection = False
    ocrEngine.Config.UseDefaultDictionaries = True

    ' Define the block to recognize and register it with the engine
    Dim startX As Integer = 0, startY As Integer = 0, width As Integer = 120, height As Integer = 100
    Dim rectangleBlock As IRecognitionBlock = Aspose.OCR.RecognitionBlock.FromRectangle(startX, startY, width, height)
    ocrEngine.AddRecognitionBlock(rectangleBlock)

    ' Open the resource file; the stream is closed when the Using block ends
    Using resourceStream As New FileStream(resourceFileName, FileMode.Open)
        ocrEngine.Resource = resourceStream

        ' Process() returns True on success; the block then holds its recognized text
        If ocrEngine.Process() Then
            Console.WriteLine(rectangleBlock.Text.ToString())
        End If
    End Using
Catch ex As Exception
    ' Covers both setup failures (missing files) and recognition failures
    Console.WriteLine("Exception: " & ex.Message)
End Try
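The samples above pass fixed coordinates (0, 0, 120, 100) to RecognitionBlock.FromRectangle. This tip does not say how the engine treats a block that extends past the edge of the image, so one cautious option is to clamp the rectangle to the image size before creating the block. The following is a minimal sketch of that idea in plain C#. BlockBounds and BlockBoundsHelper are hypothetical names introduced here for illustration; they are not part of Aspose.OCR, and the image dimensions are assumed to be known to the caller.

```csharp
using System;

// Hypothetical value type (not part of Aspose.OCR) describing a rectangular block.
public struct BlockBounds
{
    public readonly int X, Y, Width, Height;

    public BlockBounds(int x, int y, int width, int height)
    {
        X = x; Y = y; Width = width; Height = height;
    }
}

// Hypothetical helper (not part of Aspose.OCR).
public static class BlockBoundsHelper
{
    // Intersects the requested rectangle with the image area
    // (0, 0) to (imageWidth, imageHeight).
    // Returns false when nothing of the rectangle lies inside the image,
    // so the caller can skip recognition for that block.
    public static bool TryClamp(int imageWidth, int imageHeight,
                                int x, int y, int width, int height,
                                out BlockBounds clamped)
    {
        // long arithmetic avoids overflow when x + width exceeds int.MaxValue
        long left = Math.Max(0L, (long)x);
        long top = Math.Max(0L, (long)y);
        long right = Math.Min((long)imageWidth, (long)x + width);
        long bottom = Math.Min((long)imageHeight, (long)y + height);

        if (right <= left || bottom <= top)
        {
            clamped = new BlockBounds(0, 0, 0, 0);
            return false;
        }

        clamped = new BlockBounds((int)left, (int)top,
                                  (int)(right - left), (int)(bottom - top));
        return true;
    }
}
```

For example, with an assumed image size of 100 x 80 pixels, calling BlockBoundsHelper.TryClamp(100, 80, 0, 0, 120, 100, out bounds) returns true and yields a block of 100 x 80 at (0, 0). The clamped X, Y, Width and Height values can then be passed to RecognitionBlock.FromRectangle in place of the original ones.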
More about Aspose.OCR for .NET
- Homepage of Aspose.OCR for .NET: http://www.aspose.com/categories/.net-components/aspose.ocr-for-.net/default.aspx
Contact Information
Aspose Pty Ltd, Suite 163,
79 Longueville Road
Lane Cove, NSW, 2066
Australia
http://www.aspose.com/
sales@aspose.com
Phone: 888.277.6734
Fax: 866.810.9465