Chapter 25. PDF417 (2D-Barcode)

Table of Contents

25.1. Principle of PDF417 Barcodes
25.1.1. PDF417 standard
25.1.2. Data capacity
25.1.3. Structure of PDF417 barcodes
25.2. Creating barcodes
25.2.1. Introduction
25.3. Creating barcodes
25.4. Specifying the PDF417 parameters
25.4.1. Specifying encoding and input data
25.4.2. Encoder option: Adjusting the number of data columns
25.4.3. Encoder option: Adjusting the error level
25.4.4. Truncated PDF417
25.5. Adjusting the output
25.5.1. Output format
25.5.2. Summary of user settings for the backend
25.6. A template to create barcodes
25.7. Method reference
25.7.1. Encoder methods
25.7.2. Common backend methods
25.7.3. Image backend methods
25.7.4. Postscript backend methods
25.8. Example scripts
25.8.1. Showing human readable text
25.8.2. Altering colors
25.8.3. Creating postscript output
25.8.4. Manually selecting compaction schema

Principle of PDF417 Barcodes

Note

This module is only available in the pro-version of the library.

Caution

To use the PDF417 barcode module, the PHP installation must support the function bcmod(). This is enabled by passing the option --enable-bcmath when configuring PHP at compile time.

This first section gives a very brief explanation of the general structure of PDF417 barcodes and some capacity figures.

PDF417 was one of the first publicly available high density two dimensional barcodes (capable of storing up to 2710 data characters). It was originally published by Symbol Technologies, Inc. but has since become an ISO standard. PDF417 belongs to the early two dimensional barcodes, which internally consist of a number of linear barcodes stacked on top of each other. This is in contrast to more modern two dimensional barcodes like Datamatrix and QR code, which are truly two dimensional in that their internal construction is no longer based on rows.

PDF417 barcodes are used extensively, for example within aviation, the automobile industry and health care.

Strictly speaking it is not necessary to know this level of detail to use the PDF417 barcode module, but we recommend reading through it at least once since some parameters (like the number of columns, explained below) are user adjustable.

PDF417 stands for Portable Data File 417, where 417 describes how a single data character is encoded (4 bars and 4 spaces in a structure 17 modules wide).

PDF417 standard

PDF417 is a high capacity two dimensional barcode and is fully described in the official standard ISO/IEC 15438:2001, which is available for purchase from ISO (the International Organization for Standardization).

Data capacity

PDF417 is a row based two dimensional barcode that consists of a maximum of 90 rows and 30 columns. The maximum amount of data that fits in a barcode depends on

  • The compaction mode used

  • The number of columns (and rows)

  • The error correction level

The maximum data size depends on both the compaction mode and the input data. The figures listed below give some idea of the capacity:

  • 2710 digits in numeric compaction mode

  • 1850 characters in text compaction mode

  • 1108 bytes in byte compaction mode

One barcode can hold a maximum of 928 codewords (data count + data + error correction). Each codeword takes one of 929 possible values (0 to 928).
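The capacity figures above can be reproduced with a little arithmetic. The sketch below is illustrative only and is not part of the library. It relies on details assumed from the standard rather than stated in this chapter: 925 codewords available for data at the lowest error correction level, one codeword spent on switching into numeric or byte compaction, numeric compaction packing 44 digits into 15 codewords, byte compaction packing 6 bytes into 5 codewords, and text compaction packing 2 characters per codeword. All function names are hypothetical.

```python
# Illustrative capacity estimate. Every constant here is an assumption taken
# from the PDF417 standard, not from the library described in this chapter.

def max_text_chars(data_codewords):
    """Text compaction is the default mode, so no switch codeword is needed."""
    return max(data_codewords, 0) * 2


def max_bytes(data_codewords):
    """One codeword switches to byte compaction; 6 bytes fit in 5 codewords."""
    usable = max(data_codewords - 1, 0)
    full_groups, rest = divmod(usable, 5)
    # Codewords left over after the full groups hold one byte each.
    return full_groups * 6 + rest


def max_digits(data_codewords):
    """One codeword switches to numeric compaction; 44 digits fit in 15 codewords."""
    usable = max(data_codewords - 1, 0)
    full_groups, rest = divmod(usable, 15)
    # A partial group of n digits needs n // 3 + 1 codewords, so `rest`
    # codewords can hold at most 3 * (rest - 1) + 2 digits.
    tail = 3 * (rest - 1) + 2 if rest > 0 else 0
    return full_groups * 44 + tail


if __name__ == "__main__":
    for name, fn in (("digits", max_digits), ("text", max_text_chars), ("bytes", max_bytes)):
        print(name, fn(925))
```

With 925 data codewords these helpers give 2710 digits, 1850 text characters and 1108 bytes, which matches the figures listed above.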

Structure of PDF417 barcodes

A high level overview of the structure of a PDF417 barcode is shown in Figure 25.1. PDF417 Structure - Overview. A PDF417 barcode can be thought of as a number of linear barcodes stacked on top of each other. Each row in the barcode is constructed in a similar way.

Each data word (symbol character) consists of 4 bars and 4 spaces in a 17 module structure, hence the name PDF417. A more detailed explanation of a real PDF417 barcode is shown in Figure 25.2. PDF417 Structure - Details of a real barcode.
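To make the 17 module structure concrete, the following sketch checks whether a sequence of bar and space widths has the shape of a symbol character and expands it into individual modules. It is an illustration only and not library code. The function names are hypothetical, the limit of 1 to 6 modules per bar or space is an assumption taken from the standard, and the width pattern used in the example is purely illustrative.

```python
# Illustrative only: shape check for a PDF417 symbol character.
# The 1..6 module limit per element is an assumption taken from the standard.

def is_symbol_character(widths):
    """True if widths describes 4 bars and 4 spaces totalling 17 modules."""
    return (
        len(widths) == 8                       # bar, space, bar, space, ...
        and all(1 <= w <= 6 for w in widths)   # each element is 1 to 6 modules wide
        and sum(widths) == 17                  # total width is 17 modules
    )


def to_modules(widths):
    """Expand alternating bar/space widths into a string of 1 (bar) and 0 (space)."""
    parts = []
    for index, width in enumerate(widths):
        parts.append(("1" if index % 2 == 0 else "0") * width)
    return "".join(parts)


if __name__ == "__main__":
    pattern = (3, 1, 1, 1, 1, 1, 3, 6)  # illustrative width pattern
    print(is_symbol_character(pattern), to_modules(pattern))
```

The expanded string always starts with a bar and ends with a space, which is why symbol characters can be placed side by side in a row without extra separators.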

There are four distinct areas in a barcode:

  1. Start and stop pattern (light red background color). Used to help the scanner find the start and end of the barcode. These patterns are static and are the same for all barcodes.

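  2. Left and right row indicators. Used to help the scanner orient itself in the barcode. These patterns vary from row to row and between barcodes, since they encode the row number together with symbol parameters such as the number of rows and columns and the error correction level.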

  3. Data and data count. This is unique for each barcode and represents the encoded data. PDF417 specifies several ways to encode the characters in the input data to achieve the maximum compression level based on knowledge of (and restrictions on) the input alphabet. For example, if the data is known to be only numeric, the encodation can take advantage of this and compact the data more efficiently than if alphabetical letters also have to be encoded.

    The way the data is encoded is user specifiable. By default the library analyses the input data and determines an optimal mix of encoding suitable for this particular data.

  4. Error correction codewords. Each PDF417 barcode has a user selectable error correction level. Since the barcode has a specified size, the more error correction codewords that are used, the less data can fit. The error correction codewords are added to the end of the payload data. Each barcode has a minimum of 2 error detection codewords, and up to 510 additional error correction codewords can be added for maximum error correction.

Figure 25.1. PDF417 Structure - Overview


In Figure 25.2. PDF417 Structure - Details of a real barcode, the distinct data columns (each of which holds one data word per row) are indicated at the bottom (w1, w2, w3, w4, w5). This particular barcode has 8 rows and 5 columns, which means that the total number of data words plus error correction words encoded is 8 x 5 = 40.
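For a symbol of this size, the split between payload and error correction can be sketched as follows. This is an illustrative helper, not library code, and its function names are hypothetical. It assumes, following the standard, that error correction level n (0 to 8) uses 2^(n+1) error correction codewords, which is consistent with the minimum of 2 and the maximum of 2 + 510 = 512 mentioned earlier, and that one codeword is reserved for the data count.

```python
# Illustrative codeword budget for a PDF417 symbol.
# Assumption (from the standard): level n uses 2 ** (n + 1) error codewords.

def error_codewords(level):
    """Number of error correction codewords for a given error level (0..8)."""
    if not 0 <= level <= 8:
        raise ValueError("error correction level must be between 0 and 8")
    return 2 ** (level + 1)


def payload_codewords(rows, columns, level):
    """Codewords left for encoded data after the data count and error correction."""
    total = rows * columns
    payload = total - 1 - error_codewords(level)  # 1 codeword holds the data count
    if payload < 0:
        raise ValueError("symbol is too small for this error correction level")
    return payload


if __name__ == "__main__":
    # An 8 x 5 symbol with a hypothetical error level of 2.
    print(payload_codewords(8, 5, 2))
```

Raising the error level by one doubles the number of error correction codewords, so in a small symbol such as 8 x 5 the higher levels quickly leave no room for data.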

Figure 25.2. PDF417 Structure - Details of a real barcode


The data to be converted into a barcode has to go through a number of steps which are handled by the library:

  1. The first step is a high level compression schema known as compaction. This schema translates the input string into a number of codewords. Each codeword has a numeric value between 0 and 928. To achieve the highest possible compaction and flexibility the PDF417 standard defines three different compaction schemas:

    • numeric (encodes only digits 0-9, ASCII 0x30-0x39). This schema can compact up to 2.9 digits per codeword (and has the highest density)

    • byte (encodes all byte values 0-255). This schema can only compact up to 1.2 bytes per codeword (and has the lowest density)

    • text (encodes ASCII 32-126). This schema can compact up to 2 characters per codeword.

  2. The second step is the transformation of each codeword into a (4,17) symbol. The exact symbol used depends on which row the codeword is placed in. Three different sets of symbol patterns, known as clusters, are used in rotation, which ensures that two adjacent rows always use different clusters. This allows the barcode to be scanned without using specific divider symbols between the rows.

  3. In the third step the error correction codewords are calculated and added to the end of the payload data. The error correction uses Reed-Solomon coding (the same schema as used on CDs) to achieve a good balance between error correcting efficiency, computational effort and space requirements.

  4. Finally, these codewords are positioned sequentially row by row, starting at the top left corner and ending at the bottom right, in between the left and right row indicators and the start and stop patterns.
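Parts of the steps above can be sketched in a few lines. The code below is a simplified illustration and not the library's implementation. It covers numeric compaction of a single group of digits, the choice of cluster per row, and the sequential placement of codewords in rows, and it leaves out mode switching, padding and the Reed-Solomon calculation entirely. The function names are hypothetical. The base 900 conversion with a leading 1 and the cluster numbers 0, 3 and 6 are assumptions taken from the standard.

```python
# Simplified illustration of compaction, cluster selection and layout.
# Details (base 900, leading "1", clusters 0/3/6) are assumed from the standard.

def compact_digits(digits):
    """Numeric compaction of one group of 1 to 44 digits into codewords."""
    if not (digits.isascii() and digits.isdigit()) or len(digits) > 44:
        raise ValueError("expected 1 to 44 ASCII digits")
    value = int("1" + digits)  # the leading 1 preserves leading zeros
    codewords = []
    while value:
        value, remainder = divmod(value, 900)
        codewords.append(remainder)
    return codewords[::-1]     # most significant codeword first


def cluster_for_row(row):
    """Rows cycle through three clusters, so adjacent rows never share one."""
    return (row % 3) * 3       # 0, 3, 6, 0, 3, 6, ...


def layout(codewords, columns):
    """Place codewords row by row, left to right, top to bottom."""
    if columns < 1 or len(codewords) % columns != 0:
        raise ValueError("codeword count must fill complete rows")
    return [codewords[i:i + columns] for i in range(0, len(codewords), columns)]


if __name__ == "__main__":
    words = compact_digits("000213298174000")
    for row, data in enumerate(layout(words, 3)):
        print("row", row, "cluster", cluster_for_row(row), data)
```

In this sketch the 15 digits become 6 codewords, each with a value below 900, and those 6 codewords fill two rows of three columns that use clusters 0 and 3 respectively.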

By necessity this is a fairly shallow description which omits many technical details of the encodation process. For those details we refer to the official standard.