Saturday, June 15, 2013

Create PDF books with XMLtoPDFBook

By Vasudev Ram


XMLtoPDFBook is a program that lets you create simple PDF books from XML text content. It requires Python, ReportLab and my xtopdf toolkit for PDF creation.

(Use ReportLab v1.21, not the 2.x series; though 2.x has more features, xtopdf has not been tested with it; also, those additional features are not required for xtopdf.)

XMLtoPDFBook.py is released as open source software under the BSD license, and I'll be adding it to the tools in my xtopdf toolkit.

Here's how to use XMLtoPDFBook:

In a text editor, create a simple XML template for the book, like this:
<?xml version="1.0"?>
<book>
        <chapter>
        Chapter 1 content here.
        </chapter>

        <chapter>
        Chapter 2 content here.
        </chapter>
</book>
Add as many chapter elements as you need.

Then write or paste the text of one chapter inside each chapter element, in sequence.

Now you can convert the book content to PDF using this program, XMLtoPDFBook:
#--------------------------------------------------
# XMLtoPDFBook.py

# A program to convert a book in XML text format to a PDF book.
# Uses xtopdf and ReportLab.

# Author: Vasudev Ram - http://www.dancingbison.com
# Version: v0.1

#--------------------------------------------------

# imports

import sys
import os
import string
import time

from PDFWriter import PDFWriter

try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET

#--------------------------------------------------

# global variables

sysargv = None

#--------------------------------------------------

def debug(message):
    sys.stderr.write(message + "\n")

#--------------------------------------------------

def get_xml_filename(sysargv):
    return sysargv[1]

#--------------------------------------------------

def get_pdf_filename(sysargv):
    return sysargv[2]

#--------------------------------------------------

def XMLtoPDFBook():

    debug("Entered XMLtoPDFBook()")

    global sysargv

    xml_filename = get_xml_filename(sysargv)
    debug("xml_filename: " + xml_filename)
    pdf_filename = get_pdf_filename(sysargv)
    debug("pdf_filename: " + pdf_filename)

    pw = PDFWriter(pdf_filename)
    pw.setFont("Courier", 12)
    pw.setHeader(xml_filename + " to " + pdf_filename)
    pw.setFooter("Generated by ElementTree and xtopdf")

    tree = ET.ElementTree(file=xml_filename)
    debug("tree = " + repr(tree))

    root = tree.getroot()
    debug("root.tag = " + root.tag)
    if root.tag != "book":
        debug("Error: Root tag is not 'book'")
        sys.exit(2)

    debug("=" * 60)
    for root_child in root:
        if root_child.tag != "chapter":
            debug("Error: root_child tag is not 'chapter'")
            sys.exit(3)
        debug(root_child.text)
        lines = root_child.text.split("\n")
        for line in lines:
            pw.writeLine(line)
        pw.savePage()
        debug("-" * 60)
    debug("=" * 60)
    pw.close()

    debug("Exiting XMLtoPDFBook()")

#--------------------------------------------------

def main():

    debug("Entered main()")

    global sysargv
    sysargv = sys.argv

    # Check for right number of arguments.
    if len(sysargv) != 3:
        sys.exit(1)

    XMLtoPDFBook()

    debug("Exiting main()")

#--------------------------------------------------

if __name__ == "__main__":
    main()

#--------------------------------------------------

Here is an example run of XMLtoPDFBook, using my vi quickstart article earlier published in Linux For You magazine:

python XMLtoPDFBook.py vi_quickstart.xml vi_quickstart.pdf

This results in the contents of the article being published to PDF in the file vi_quickstart.pdf.

- Vasudev Ram - Dancing Bison Enterprises

Contact me

No comments: