Solution - HtmlParser doesn't exist in iTextSharp 5.0.2.0 (Free .NET .pdf generator)

Posted by SheoNarayan under Error and Solution on 7/31/2010 | Views : 31443 | Status : [Administrator] | Replies : 8

Earlier versions of iTextSharp (a free .NET .pdf generator) had an HtmlParser class that was used to parse HTML from a string. The latest version, which I downloaded from http://sourceforge.net/projects/itextsharp/, doesn't have it. I struggled to work out how to parse the HTML and couldn't find anything by googling, so here is the workaround. This is needed when you want to export any HTML to .pdf, for example a GridView to .pdf. I shall write a separate article on this topic soon.

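In the code snippet below, I am assuming that strB is the StringBuilder that holds all the HTML and that document is an iTextSharp Document that has already been opened. You will need the System.IO and iTextSharp.text.html.simpleparser namespaces (plus System.Collections.Generic for List<T> and iTextSharp.text for IElement).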

// now read the Grid html one by one and add into the document object
using (TextReader sReader = new StringReader(strB.ToString()))
{
    List<IElement> list = HTMLWorker.ParseToList(sReader, new StyleSheet());
    foreach (IElement elm in list)
    {
        document.Add(elm);
    }
}
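
For context, the snippet above could sit inside a complete program along the lines of the sketch below. It assumes iTextSharp 5.x is referenced; the class name HtmlToPdfSketch, the output file name output.pdf and the sample HTML are placeholders that do not come from the original post. The document is opened before any element is added and closed afterwards.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using iTextSharp.text;
using iTextSharp.text.html.simpleparser;
using iTextSharp.text.pdf;

class HtmlToPdfSketch
{
    static void Main()
    {
        // Placeholder HTML (assumption); in the post this would be the rendered GridView.
        StringBuilder strB = new StringBuilder();
        strB.Append("<p>Hello <b>world</b></p>");

        Document document = new Document();
        // output.pdf is a hypothetical file name chosen for this sketch.
        using (FileStream fs = new FileStream("output.pdf", FileMode.Create))
        {
            // Attach a writer so that everything added to the document ends up in the file.
            PdfWriter.GetInstance(document, fs);
            document.Open();

            // Parse the HTML into iText elements and add them one by one.
            using (TextReader sReader = new StringReader(strB.ToString()))
            {
                List<IElement> list = HTMLWorker.ParseToList(sReader, new StyleSheet());
                foreach (IElement elm in list)
                {
                    document.Add(elm);
                }
            }

            // Closing the document finishes the PDF.
            document.Close();
        }
    }
}
```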


Hope this will help somebody.

Thanks

Regards,
Sheo Narayan
http://www.dotnetfunda.com



Responses

Posted by: Vickyaa on: 10/6/2010 [Member] Starter | Points: 25

Wonderful. Works fine.


Posted by: SheoNarayan on: 10/6/2010 [Administrator] HonoraryPlatinum | Points: 25

Glad to know this. Thanks

Regards,
Sheo Narayan
http://www.dotnetfunda.com


Posted by: Lsochia on: 4/16/2011 [Member] Starter | Points: 25

Sheo,

Your solution for html parsing in iTextSharp is the closest I've come to this problem yet.

My need is a little different: I'm filling a PDF form with data from my ASP.NET app. Below is the relevant code, with "MySummary" being an object that contains HTML.

I don't have a list, so I tried "HTMLWorker.ReferenceEquals", but then the PDF form field just gets filled with "False". I'm not sure how to interpret this, other than that I've obviously got something wrong. Can you help?

Thanks in advance... Lee

-----start code--------------
'instantiate the PDFReader object.
Dim reader As New PdfReader(CoverPageTemplate)

Using fs As New FileStream(pdfFile, FileMode.Create)
    'Get the PDF file into stamper object
    Dim stamper As New PdfStamper(reader, fs)
    'Get fields from the PDF file
    Dim fields As AcroFields = stamper.AcroFields
    'Set form fields
    fields.SetField("pdfID", MyID)
    fields.SetField("pdfType", MyType)
    fields.SetField("pdfUnit", MyUnit)
    fields.SetField("pdfID", MyID)
    fields.SetField("pdfCreateDate", MyCreateDate)

    'instantiate the TXTreader object.
    Dim TXTreader As New StringReader(MySummary())
    MySummary = HTMLWorker.ReferenceEquals(TXTreader, New StyleSheet())

    fields.SetField("pdfSummary", HttpUtility.HtmlDecode(MySummary))
    fields.SetField("pdfSummary", HttpUtility.HtmlDecode(MySummary))
    fields.SetField("pdfRelitem", MyID)
    'flatten form fields and close document
    stamper.FormFlattening = True
    stamper.Close()
End Using
----------end code---------------
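
A note on why the field shows "False": HTMLWorker.ReferenceEquals is the static Object.ReferenceEquals method that every .NET class inherits. It only checks whether two references point to the same object and returns a Boolean; it does not parse anything. The small C# sketch below reproduces that behaviour without iTextSharp. The class and variable names are illustrative only, and the plain object stands in for the StyleSheet argument.

```csharp
using System;
using System.IO;

class ReferenceEqualsSketch
{
    static void Main()
    {
        // Two unrelated objects, standing in for the StringReader and the StyleSheet.
        object reader = new StringReader("<b>summary</b>");
        object styles = new object();

        // Same inherited static method that HTMLWorker.ReferenceEquals resolves to.
        bool same = Object.ReferenceEquals(reader, styles);

        // Converting the Boolean to a string is what puts "False" into the form field.
        Console.WriteLine(same.ToString()); // prints "False"
    }
}
```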


Posted by: Sarathy on: 5/18/2011 [Member] Starter | Points: 25

Hi,

When I use the code below, I get an error at document.Add(elm): "Unable to cast object of type 'iTextSharp.text.html.simpleparser.TableWrapper' to type 'iTextSharp.text.Meta'."
Please help me find where I made a mistake.
// now read the Grid html one by one and add into the document object
using (TextReader sReader = new StringReader(strB.ToString()))
{
    List<IElement> list = HTMLWorker.ParseToList(sReader, new StyleSheet());
    foreach (IElement elm in list)
    {
        document.Add(elm);
    }
}




Parthasarathy M


Posted by: Zondo on: 10/31/2011 [Member] Starter | Points: 25

@Sheo Narayan,
Thank you very much, it works great. Thanks.


Posted by: Awaisusmanbutt on: 2/14/2012 [Member] Starter | Points: 25

Thanks a lot, it works perfectly fine.


Posted by: Bhupendra on: 2/8/2013 [Member] Starter | Points: 25

Below is my code. I am also facing the same error, so please help me solve it.
Thanks in advance.

protected void btn_PDF_Click(object sender, EventArgs e)
{
    //fn_test();

    Uri strurl = Request.Url;
    string url = strurl.ToString();
    string text = GetPageText(url);
    string filepath = Server.MapPath("images\\test.htm"); //"c:\\test.htm";
    StreamWriter writer = new StreamWriter(filepath);
    writer.Write(text);
    writer.Close();

    htmltopdf();
}

public string GetPageText(string url)
{
    string htmlText = string.Empty;
    string FILE_NAME = Server.MapPath("images\\test.xml"); //"c:\\test.xml";

    try
    {
        HttpWebRequest requestIP = (HttpWebRequest)WebRequest.Create(url);
        requestIP.Timeout = 10000;
        using (HttpWebResponse responseIP = (HttpWebResponse)requestIP.GetResponse())
        {
            using (Stream streamIP = responseIP.GetResponseStream())
            {
                using (StreamReader readerText = new StreamReader(streamIP))
                {
                    htmlText = readerText.ReadToEnd();
                    string text = htmlText;

                    StreamWriter writer = new StreamWriter(FILE_NAME);
                    writer.Write(text);
                    writer.Close();
                }
            }
        }
    }
    finally
    {
    }
    return htmlText;
}

public void htmltopdf()
{
    iTextSharp.text.Document doc = new iTextSharp.text.Document();
    PdfWriter.GetInstance(doc, new FileStream(Server.MapPath("image\\test.pdf"), System.IO.FileMode.Create));

    HTMLParser.Parse(doc, Server.MapPath("images\\test.htm"));
    //XmlParser.Parse(doc, Server.MapPath("image\\test.xml"));
    //ITextHandler h = new ITextHandler(doc, new TagMap("c:\\test.xml"));
    //h.Parse("c:\\test.xml");

    if (File.Exists(Server.MapPath("images\\test.htm")))
        File.Delete(Server.MapPath("images\\test.htm"));
    if (File.Exists(Server.MapPath("images\\test.xml")))
        File.Delete(Server.MapPath("images\\test.xml"));
}

Bhupendra


Posted by: Qaisarawan on: 8/22/2013 [Member] Starter | Points: 25

I tried the code below and I am also getting an error, so please help me solve it.
Thanks.

public MemoryStream CreatePdf(string html)
{
    MemoryStream m = new MemoryStream();
    try
    {
        Document document = new Document(PageSize.LETTER);
        PdfWriter.GetInstance(document, new FileStream(@"E:\Traning Projects\ConvertDocFileToPDF\ExampleDoc.pdf", FileMode.OpenOrCreate));

        //StringReader sr = new StringReader(html);
        //XmlTextReader xtr = new XmlTextReader(sr);

        document.Open();
        StringBuilder sb = new StringBuilder(html);
        using (TextReader sReader = new StringReader(sb.ToString()))
        {
            List<IElement> list = HTMLWorker.ParseToList(sReader, new StyleSheet());

            foreach (IElement elm in list)
            {
                document.Add(elm);
            }
        }

        // HtmlParser.Parse(document, xtr);

        //xtr.Close();
        document.Close();
    }
    catch (Exception ex)
    {
        System.Diagnostics.EventLog.WriteEntry("Application", ex.Message);
        throw ex;
    }
    return m;
}

I got this error: "The given key was not present in the dictionary."

Kindly help me. Thanks
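
Separately from the exception, the method above returns a MemoryStream that nothing ever writes to, because the PdfWriter is attached to a FileStream. Below is a minimal sketch of one way to build the PDF in memory instead, assuming iTextSharp 5.x. The class name PdfSketch is hypothetical, and the sketch does not by itself address the "given key was not present in the dictionary" error, whose cause cannot be determined from the post.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.html.simpleparser;
using iTextSharp.text.pdf;

public static class PdfSketch
{
    public static MemoryStream CreatePdf(string html)
    {
        byte[] pdfBytes;
        using (MemoryStream buffer = new MemoryStream())
        {
            Document document = new Document(PageSize.LETTER);
            // Write into the in-memory buffer rather than a file on disk.
            PdfWriter.GetInstance(document, buffer);
            document.Open();

            using (TextReader sReader = new StringReader(html))
            {
                List<IElement> list = HTMLWorker.ParseToList(sReader, new StyleSheet());
                foreach (IElement elm in list)
                {
                    document.Add(elm);
                }
            }

            // Closing the document finishes the PDF and, by default, closes the buffer too.
            document.Close();

            // ToArray still works on a closed MemoryStream.
            pdfBytes = buffer.ToArray();
        }

        // Hand back a fresh, readable stream positioned at the start of the PDF bytes.
        return new MemoryStream(pdfBytes);
    }
}
```

If the exception is logged and rethrown as in the post, a bare throw; keeps the original stack trace, whereas throw ex; resets it.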

