Regex Timeout Exception in C# 4.0

Posted by Patil_Rakesh under C# on 6/5/2013 | Points: 10 | Views : 2597 | Status : [Member] | Replies : 2
Hi,
I have a long string and a regular expression pattern.
When I call the Regex.Matches method on the long string, it throws a timeout exception.
Please help if you have any idea.




Responses

Posted by: lakhansin-22735 on: 7/4/2013 [Member] Starter | Points: 25

This happens because a RegexMatchTimeoutException is thrown when the execution time of a regular expression pattern-matching method exceeds its time-out interval.
You can catch this exception and handle it as your requirements dictate; see the code below. I hope this helps.

using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        RegexUtilities util = new RegexUtilities();
        string title = "Doyle - The Hound of the Baskervilles.txt";
        try {
            var info = util.GetWordData(title);
            Console.WriteLine("Words: {0:N0}", info.Item1);
            Console.WriteLine("Average Word Length: {0:N2} characters", info.Item2);
        }
        catch (IOException e) {
            Console.WriteLine("IOException reading file '{0}'", title);
            Console.WriteLine(e.Message);
        }
        catch (RegexMatchTimeoutException e) {
            Console.WriteLine("The operation timed out after {0:N0} milliseconds",
                              e.MatchTimeout.TotalMilliseconds);
        }
    }
}

public class RegexUtilities
{
    public Tuple<int, double> GetWordData(string filename)
    {
        const int MAX_TIMEOUT = 1000;   // Maximum timeout interval in milliseconds.
        const int INCREMENT = 350;      // Milliseconds increment of timeout.

        List<string> exclusions = new List<string>(new string[] { "a", "an", "the" });
        int[] wordLengths = new int[29];   // Allocate an array of more than ample size.
        string input = null;
        StreamReader sr = null;
        try {
            sr = new StreamReader(filename);
            input = sr.ReadToEnd();
        }
        catch (FileNotFoundException e) {
            string msg = String.Format("Unable to find the file '{0}'", filename);
            throw new IOException(msg, e);
        }
        catch (IOException e) {
            throw new IOException(e.Message, e);
        }
        finally {
            if (sr != null) sr.Close();
        }

        int timeoutInterval = INCREMENT;
        bool init = false;
        Regex rgx = null;
        Match m = null;
        int indexPos = 0;
        do {
            try {
                if (!init) {
                    // (Re)create the regex with the current timeout and resume at indexPos.
                    rgx = new Regex(@"\b\w+\b", RegexOptions.None,
                                    TimeSpan.FromMilliseconds(timeoutInterval));
                    m = rgx.Match(input, indexPos);
                    init = true;
                }
                else {
                    m = m.NextMatch();
                }
                if (m.Success) {
                    if (!exclusions.Contains(m.Value.ToLower()))
                        wordLengths[m.Value.Length]++;

                    // Remember where this match ended so a retry resumes right after it.
                    indexPos = m.Index + m.Length;
                }
            }
            catch (RegexMatchTimeoutException e) {
                if (e.MatchTimeout.TotalMilliseconds < MAX_TIMEOUT) {
                    // Retry with a longer timeout.
                    timeoutInterval += INCREMENT;
                    init = false;
                }
                else {
                    // Rethrow the exception.
                    throw;
                }
            }
        // m is still null if the very first match attempt timed out; keep retrying in that case.
        } while (m == null || m.Success);

        // If regex completed successfully, calculate number of words and average length.
        int nWords = 0;
        long totalLength = 0;

        for (int ctr = wordLengths.GetLowerBound(0); ctr <= wordLengths.GetUpperBound(0); ctr++) {
            nWords += wordLengths[ctr];
            totalLength += ctr * wordLengths[ctr];
        }
        // Cast to double to avoid integer division, and guard against an empty input.
        double average = nWords > 0 ? (double)totalLength / nWords : 0.0;
        return new Tuple<int, double>(nWords, average);
    }
}


See the link below for more detail:

http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regexmatchtimeoutexception.aspx

Lakhan Singh
Tech Lead
BeyondKey System Pvt. Ltd.
Indore, M.P.
India


Posted by: Patil_Rakesh on: 7/4/2013 [Member] Starter | Points: 25

Thanks, Lakhansin, for your response.
I guess you are using .NET 4.5 for this code.
I want to solve this problem in 4.0, because the Regex constructor in 4.0 does not have a timeout parameter.
If you have any idea, please let me know.
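
One possible workaround for 4.0, offered only as a rough sketch: run the match on a separate thread and wait for it with Thread.Join(TimeSpan), which is available in .NET 4.0. The class name RegexTimeoutHelper and the method TryMatches are hypothetical names made up for this example; they are not part of the framework. Note that this only limits how long the caller waits. It does not stop the regex engine, so a runaway match keeps using CPU on the background thread until it finishes or the process exits.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Threading;

// Hypothetical helper (not a framework type): bounds the time the caller waits for Regex.Matches.
public static class RegexTimeoutHelper
{
    // Returns true and fills 'result' if matching finished within 'timeout'; otherwise returns false.
    public static bool TryMatches(string input, string pattern, TimeSpan timeout, out string[] result)
    {
        string[] found = null;
        Exception error = null;

        Thread worker = new Thread(() =>
        {
            try
            {
                // MatchCollection is evaluated lazily, so enumerate it here
                // to make sure the expensive work happens on the worker thread.
                List<string> values = new List<string>();
                foreach (Match m in Regex.Matches(input, pattern))
                    values.Add(m.Value);
                found = values.ToArray();
            }
            catch (Exception ex)
            {
                error = ex;   // report failures (for example an invalid pattern) to the caller
            }
        });
        worker.IsBackground = true;   // do not keep the process alive for a runaway match
        worker.Start();

        if (!worker.Join(timeout))
        {
            // Timed out: the worker is left running; this sketch does not try to stop it.
            result = null;
            return false;
        }
        if (error != null)
            throw new InvalidOperationException("Regex matching failed.", error);

        result = found;
        return true;
    }
}
```

A call would look like RegexTimeoutHelper.TryMatches(longString, pattern, TimeSpan.FromSeconds(2), out values), treating a false return the same way the 4.5 code treats RegexMatchTimeoutException. If the pattern itself causes heavy backtracking, simplifying the pattern is likely the more reliable fix, since the abandoned thread still does the work.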


