Working With Page Objects

Overview

Page objects let developers work with the text, path, and image content of a PDF page. Patagames PDF SDK provides APIs to add page objects to a page, delete them, and set their attributes. With page objects, developers can build a PDF page from object contents. Other uses include adding headers and footers to PDF documents, adding an image logo to each page, or generating a custom view of annotations.

PDF provides five types of graphics objects:

  • The PdfPathObject is an arbitrary shape made up of straight lines, rectangles, and cubic Bézier curves. A path may intersect itself and may have disconnected sections and holes.

  • The PdfTextObject consists of one character string that identifies a sequence of glyphs to be painted. Like a path, text can be stroked, filled, or used as a clipping boundary.

  • The PdfShadingObject describes a geometric shape whose color is an arbitrary function of position within the shape. A shading can also be treated as a color when painting other graphics objects; it is not considered to be a separate graphics object in that case.

  • The PdfImageObject consists of a rectangular array of color samples to be painted.

  • The PdfFormObject is used to group graphical elements together as a unit for various purposes.

Page Objects
Enumerate all page objects on a PDF page
C#
    // ...
    using (var doc = PdfDocument.Load("sample.pdf"))
    {
        PdfPage page = doc.Pages[0];
        //Recursively list and process all page objects on the page.
        EnumerateAllPageObjects(page.PageObjects);
    }
    // ...

public void EnumerateAllPageObjects(PdfPageObjectsCollection collection)
{
    foreach (PdfPageObject obj in collection)
    {
        if (obj is PdfFormObject formObject)
        {
            // A form object groups other page objects, so descend into it.
            EnumerateAllPageObjects(formObject.PageObjects);
        }
        else if (obj is PdfTextObject textObject)
        {
            // process text object.
        }
        else if (obj is PdfImageObject imageObject)
        {
            // process image object.
        }
        else if (obj is PdfPathObject pathObject)
        {
            // process path object.
        }
        else if (obj is PdfShadingObject shadingObject)
        {
            // process shading object.
        }
    }
}
Create a text object in a PDF page
C#
using (var doc = PdfDocument.Load("sample.pdf"))
{
    PdfPage page = doc.Pages[0];

    // Horizontal and vertical position in the user space coordinate system
    float xPos = 50;
    float yPos = 150;

    // Create fonts used for text objects
    PdfFont font = PdfFont.CreateStock(doc, FontStockNames.Arial);
    float fontSize = 12;

    // Create a text object and add it to the page
    PdfTextObject textObject = PdfTextObject.Create("Sample text", xPos, yPos, font, fontSize);
    page.PageObjects.Add(textObject);

    //Generate page content
    page.GenerateContent();
}
Change text in an existing text object
C#
using (var doc = PdfDocument.Load("sample.pdf"))
{
    PdfPage page = doc.Pages[0];
    // Get an existing text object. Index 5 is specific to this sample file; the cast yields null if that object is not a text object.
    var textObj = page.PageObjects[5] as PdfTextObject;
    // Assign a new font: in most cases the font used in a PDF has no glyphs for characters that are not already used on the page.
    textObj.Font = CreateNewFont(doc);
    // A text object may have a clipping path, so the new text may not fit into the visible area of the text object. You may need to remove it.
    textObj.RemoveClipPath();
    //Set new text
    textObj.TextUnicode = "Test";
    //After changing all the objects on the page, you need to generate new page content.
    page.GenerateContent();
}
C#
private PdfFont CreateNewFont(PdfDocument doc)
{
    // An embedded font is preferable, but one of the standard fonts also works.
    return PdfFont.CreateStock(doc, FontStockNames.Arial);

    // Alternative: embed a font from a file instead of using a stock font.
    // byte[] fontFileContent = System.IO.File.ReadAllBytes("font.ttf");
    // return PdfFont.CreateEmbeddedFont(doc, fontFileContent, FontCharSet.UNICODE_CHARSET, false);
}
Create an image object in a PDF page
C#
using (var doc = PdfDocument.Load("sample.pdf"))
{
    PdfPage page = doc.Pages[0];

    // Horizontal and vertical position in the user space coordinate system
    float xPos = 50;
    float yPos = 150;
    // The image resolution in DPI
    float horizontalResolution = 95;
    float verticalResolution = 95;
    // Number of default user space units (points) per inch
    float pdfUserUnit = 72;

    //Create a PdfBitmap from the specified image.
    using (PdfBitmap pdfBitmap = PdfBitmap.FromFile("sample.png"))
    {
        // Create an image object and add it to the page
        var imageObject = PdfImageObject.Create(doc, pdfBitmap, xPos, yPos);
        page.PageObjects.Add(imageObject);

        // Calculate size of image in PDF points
        FS_SIZEF size = new FS_SIZEF(
            pdfBitmap.Width * pdfUserUnit / horizontalResolution,
            pdfBitmap.Height * pdfUserUnit / verticalResolution);

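        // Set the image matrix: scale the image to its size in points and keep it at (xPos, yPos).
        // Passing 0, 0 as the last two values would move the image to the page origin.
        imageObject.Matrix = new FS_MATRIX(size.Width, 0, 0, size.Height, xPos, yPos);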

        //Generate page content
        page.GenerateContent();
    }
}
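
The image matrix in the example above is easier to read with the PDF imaging model in mind: an image is always painted into the unit square (0, 0) to (1, 1), and the matrix [a b c d e f] maps that square onto the page. The following sketch is independent of the SDK. It uses plain tuples instead of FS_MATRIX, the helper name Transform is hypothetical, and the sizes are illustrative values rather than values taken from the sample. It is written as a C# 9 top-level program.

```csharp
using System;

// Apply a PDF transformation matrix [a b c d e f] to a point:
//   x' = a*x + c*y + e
//   y' = b*x + d*y + f
// Transform is a hypothetical helper, not part of the SDK.
static (double X, double Y) Transform(
    double a, double b, double c, double d, double e, double f,
    double x, double y)
{
    return (a * x + c * y + e, b * x + d * y + f);
}

// Illustrative values: an image 300 x 150 points in size, placed at (50, 150).
double width = 300, height = 150, xPos = 50, yPos = 150;

// Corners of the unit square that an image occupies before the matrix is applied.
var corners = new (double X, double Y)[] { (0, 0), (1, 0), (1, 1), (0, 1) };

foreach (var corner in corners)
{
    // The matrix [width 0 0 height xPos yPos] scales and then translates.
    var p = Transform(width, 0, 0, height, xPos, yPos, corner.X, corner.Y);
    Console.WriteLine($"({corner.X}, {corner.Y}) -> ({p.X}, {p.Y})");
}
```

With these values the unit square maps to the rectangle from (50, 150) to (350, 300). The first and fourth matrix values set the image size in points, and the last two set the position of its lower-left corner.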
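Extract images from a PDF file

This example extracts the image objects found on each page of a PDF file and saves them to disk as PNG files. Only top-level page objects are examined; images nested inside form objects are not visited.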

C#
public void ExtractAllImages()
{
    //Open and load a PDF document from a file.
    using (var doc = PdfDocument.Load(@"sample.pdf"))
    {
        //Enumerate all pages sequentially in a given document
        foreach (var page in doc.Pages)
        {
            //Extract and save images
            ExtractImagesFromPage(page);

            // Dispose the page object to unload it from memory
            page.Dispose();
        }
    }
}

private void ExtractImagesFromPage(PdfPage page)
{
    int idx = 0;
    //Enumerate all objects on a page
    foreach (var obj in page.PageObjects)
    {
        var imageObject = obj as PdfImageObject;
        if (imageObject == null)
            continue; // not an image object, so there is nothing to do

        // Save the image to disk (the Output directory must already exist)
        var path = $"Output\\image_{page.PageIndex}_{idx++}.png";
        imageObject.Bitmap.GetImage().Save(path, ImageFormat.Png);
    }
}
Generate PDF From Multiple Images

This example shows how to generate a PDF document from a set of scanned images, placing each image on its own page.

C#
public void GeneratePdf()
{
    //Create a PDF document
    using (var doc = PdfDocument.CreateNew())
    {
        //Read images
        var files = System.IO.Directory.GetFiles(@"c:\Images\", "*.*",
                    System.IO.SearchOption.AllDirectories);
        foreach (var file in files)
        {
            //Create PdfBitmap from image file
            using (PdfBitmap pdfBitmap = PdfBitmap.FromFile(file))
            {
                //Create Image object
                var imageObject = PdfImageObject.Create(doc, pdfBitmap, 0, 0);
                //Calculate size of image in PDF points
                var size = CalculateSize(pdfBitmap.Width, pdfBitmap.Height);
                //Add empty page to PDF document
                var page = doc.Pages.InsertPageAt(doc.Pages.Count, size);
                //Insert image to newly created page
                page.PageObjects.Add(imageObject);
                //set image matrix
                imageObject.Matrix = new FS_MATRIX(size.Width, 0, 0, size.Height, 0, 0);
                //Generate PDF page content to content stream
                page.GenerateContent();
            }
        }
        // Save the PDF document as "test.pdf" in non-incremental mode
        doc.Save(@"c:\test.pdf", SaveFlags.NoIncremental);
    }
}
/// <summary>
/// Takes the width and height of the bitmap in pixels, together with the
/// horizontal and vertical DPI, and calculates the size of the PDF page.
/// The conversion relies on the following:
///     One inch contains exactly 72 PDF points;
///     The DPI of a scanned image may vary and depends on the scanning resolution.
/// </summary>
private FS_SIZEF CalculateSize(int width, int height, float dpiX = 300, float dpiY = 300)
{
    return new FS_SIZEF()
    {
        Width = width * 72 / dpiX,
        Height = height * 72 / dpiY
    };
}
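
As a check on the arithmetic in CalculateSize, the sketch below applies the same formula (points = pixels * 72 / DPI) to an A4 sheet scanned at 300 DPI, which is 2480 x 3508 pixels. It does not use the SDK: PixelsToPoints is a hypothetical helper that returns a plain float instead of filling an FS_SIZEF, and it is written as a C# 9 top-level program.

```csharp
using System;

// Convert a pixel dimension to PDF points: points = pixels * 72 / dpi.
// PixelsToPoints is a hypothetical helper, not part of the SDK.
static float PixelsToPoints(int pixels, float dpi)
{
    if (dpi <= 0)
        throw new ArgumentOutOfRangeException(nameof(dpi), "DPI must be positive.");
    return pixels * 72f / dpi;
}

// An A4 sheet scanned at 300 DPI is 2480 x 3508 pixels.
float widthPt = PixelsToPoints(2480, 300);
float heightPt = PixelsToPoints(3508, 300);
Console.WriteLine($"{widthPt:F1} x {heightPt:F1} pt");

// The same pixel data treated as 96 DPI produces a much larger page.
Console.WriteLine($"{PixelsToPoints(2480, 96):F1} pt wide at 96 DPI");
```

The result is roughly 595 x 842 points, the usual size of an A4 page. The second call shows why the DPI arguments matter: if the real scanning resolution differs from the 300 DPI default, the generated page will be too large or too small.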
Create dashed line
C#
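using (var doc = PdfDocument.Load("sample.pdf"))
{
    PdfPage page = doc.Pages[0];

    // End points of the line in user space coordinates
    float x1 = 100, y1 = 400;
    float x2 = 400, y2 = 300;
    var dashedline = PdfPathObject.Create(FillModes.None, true);
    dashedline.Path.AppendLine(x1, y1, x2, y2);

    // Dash pattern: alternating dash and gap lengths
    float[] dash = { 10, 5, 2, 5 };
    dashedline.LineWidth = 3.0f;
    dashedline.StrokeColor = FS_COLOR.Red;
    dashedline.SetDashArray(dash);
    page.PageObjects.Add(dashedline);

    // Generate page content
    page.GenerateContent();
}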
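
In the PDF graphics model, a dash array such as { 10, 5, 2, 5 } alternates dash and gap lengths in user space units: a 10-unit dash, a 5-unit gap, a 2-unit dash, a 5-unit gap, and then the pattern repeats. The sketch below illustrates this without the SDK. ExpandDashes is a hypothetical helper that lists the painted intervals along a line of a given length, assuming a dash phase of zero. It is written as a C# 9 top-level program.

```csharp
using System;
using System.Collections.Generic;

// Return the painted (start, end) intervals along a line of the given length
// for a PDF-style dash array, assuming a dash phase of zero.
// ExpandDashes is a hypothetical helper, not part of the SDK.
static List<(double Start, double End)> ExpandDashes(double[] dash, double length)
{
    var painted = new List<(double Start, double End)>();

    double total = 0;
    foreach (double d in dash) total += d;
    if (dash.Length == 0 || total <= 0)
    {
        // An empty (or all-zero) array is treated here as a solid line.
        painted.Add((0, length));
        return painted;
    }

    double pos = 0;
    bool on = true; // the pattern starts with a dash
    for (int i = 0; pos < length; i = (i + 1) % dash.Length)
    {
        // Clip the current dash or gap at the end of the line.
        double end = Math.Min(pos + dash[i], length);
        if (on && end > pos) painted.Add((pos, end));
        pos = end;
        on = !on; // alternate between dash and gap
    }
    return painted;
}

// The dash array from the example above, applied to a line 40 units long.
foreach (var segment in ExpandDashes(new double[] { 10, 5, 2, 5 }, 40))
{
    Console.WriteLine($"painted from {segment.Start} to {segment.End}");
}
```

For a 40-unit line this yields four painted intervals: 0 to 10, 15 to 17, 22 to 32, and 37 to 39.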