Retrieve HTML Elements Mapping to PDF
This feature of the converter lets you obtain the position of any HTML element in the generated PDF document. Knowing that position allows you to create bookmarks for HTML elements, create internal links between HTML elements, place text or images over HTML elements, or assign a digital signature to a specific HTML element.
This feature is accessed through the HtmlElementsMappingOptions property of the PdfConverter object. Its HtmlElementSelectors property defines the list of page elements whose position you want to retrieve, expressed as CSS selectors of the HTML elements. For example, the selector for all the H1 elements is "H1", the selector for all the elements with the CSS class name 'myclass' is "*.myclass", and the selector for the element with the id 'myid' is "*#myid".
Read more about CSS selectors here.
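The three selector forms described above can be sketched as a small helper. This is only an illustration: SelectorBuilder and its methods are hypothetical names that are not part of the converter library, and the sketch assumes the selector list is a plain string array, as in the sample code in this document.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the converter library): builds the
// selector strings described in the text from tag names, classes and IDs.
public static class SelectorBuilder
{
    // Selector for every element with the given tag name, e.g. "H1".
    public static string ForTag(string tagName)
    {
        return tagName;
    }

    // Selector for every element with the given CSS class, e.g. "*.myclass".
    public static string ForClass(string className)
    {
        return "*." + className;
    }

    // Selector for the element with the given ID, e.g. "*#myid".
    public static string ForId(string id)
    {
        return "*#" + id;
    }

    // Combines tags, classes and IDs into one string array, in that order.
    public static string[] Build(string[] tags, string[] classes, string[] ids)
    {
        List<string> selectors = new List<string>();
        foreach (string tag in tags) selectors.Add(ForTag(tag));
        foreach (string cssClass in classes) selectors.Add(ForClass(cssClass));
        foreach (string id in ids) selectors.Add(ForId(id));
        return selectors.ToArray();
    }
}
```

The array returned by Build could then be assigned to pdfConverter.HtmlElementsMappingOptions.HtmlElementSelectors, in the same way the sample code assigns a literal string array.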
The HtmlElementsMappingOptions property must be set before calling the convert method.
The HTML elements mapping is returned in the HtmlElementsMappingResult property. The result is a collection of HtmlElementMapping objects. Each object provides the index of the PDF page where the converter mapped the element and the rectangle where the element was rendered inside that page, along with the element's HTML ID, tag name, text and outer HTML code. Because an HTML element can span several PDF pages, its location is given as a list of rectangles, one for each page where the element was rendered.
The code sample below, taken from the WinForms_HtmlElementsLocationInPdf demo application, shows how to use this feature to highlight a specified list of HTML elements in the generated PDF document. All the IMG, H1 and H2 elements selected in the code are highlighted with a green rectangle in the generated PDF.
// Requires the System, System.Drawing and System.Windows.Forms namespaces,
// plus the converter library's own namespaces.
private void btnConvert_Click(object sender, EventArgs e)
{
    try
    {
        PdfConverter pdfConverter = new PdfConverter();

        // inform the converter about the HTML elements for which we want the location in PDF
        // in this sample we want the location of IMG, H1 and H2 elements
        pdfConverter.HtmlElementsMappingOptions.HtmlElementSelectors = new string[] { "IMG", "H1", "H2" };

        // call the converter and get a Document object from URL
        Document pdfDocument = pdfConverter.GetPdfDocumentObjectFromUrl(textBoxURL.Text.Trim());

        // iterate over the HTML elements locations and highlight each element with a green rectangle
        foreach (HtmlElementMapping elementMapping in pdfConverter.HtmlElementsMappingOptions.HtmlElementsMappingResult)
        {
            // because an HTML element can span many PDF pages, the mapping
            // of the HTML element in the PDF document consists of a list of rectangles,
            // one rectangle for each PDF page where this element was rendered
            foreach (HtmlElementPdfRectangle elementLocationInPdf in elementMapping.PdfRectangles)
            {
                // get the PDF page
                PdfPage pdfPage = pdfDocument.Pages[elementLocationInPdf.PageIndex];
                RectangleF pdfRectangleInPage = elementLocationInPdf.Rectangle;

                // create a RectangleElement to highlight the HTML element
                RectangleElement highlightRectangle = new RectangleElement(pdfRectangleInPage.X, pdfRectangleInPage.Y,
                    pdfRectangleInPage.Width, pdfRectangleInPage.Height);
                highlightRectangle.ForeColor = Color.Green;

                pdfPage.AddElement(highlightRectangle);
            }
        }

        // save the PDF bytes in a file on disk
        string outFilePath = System.IO.Path.Combine(Application.StartupPath, "Result.pdf");
        try
        {
            pdfDocument.Save(outFilePath);
        }
        finally
        {
            // close the Document to release all the resources
            pdfDocument.Close();
        }

        // open the generated PDF document in an external viewer
        DialogResult dr = MessageBox.Show("Open the rendered file in an external viewer?",
            "Open Rendered File", MessageBoxButtons.YesNo);
        if (dr == DialogResult.Yes)
        {
            System.Diagnostics.Process.Start(outFilePath);
        }
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.Message);
        return;
    }
}