Perl pack Function
The pack function converts a list of values into a string according to a user-defined template (for example, a binary representation of the list).
The pack function takes a TEMPLATE and a LIST of values. It converts each value according to the format specified in the template, concatenates the results, and returns the resulting string.
Its main purpose is to turn data (numbers and strings) into a sequence of bytes that external applications can use easily. You can use the Perl pack function either to write binary data to a file or to prepare data for network transmission.
The reverse of this function is the unpack function, which takes a sequence of bytes and converts it back into the numbers and strings needed for further processing.
The syntax form of the pack function is as follows:
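The syntax listing is missing at this point. The general form is pack TEMPLATE, LIST. A minimal sketch follows; the template "A3 n C" and the values are chosen only for illustration and are not from the original article.

```perl
use strict;
use warnings;

# General form: pack TEMPLATE, LIST
# Illustrative template: a 3-byte string, a 16-bit big-endian number, one byte.
my $packed = pack("A3 n C", "abc", 258, 65);

# Print the value of every byte, separated by dots.
printf "%vd\n", $packed;   # 97.98.99.1.2.65
```

Whitespace inside the template is ignored, so it can be used to separate the individual format characters.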
The following table shows some of the most frequently used template characters:
Character | Description
a | A string with arbitrary binary data, null padded
A | A text (ASCII) string, space padded
b | A bit string (ascending bit order inside each byte, like vec())
B | A bit string (descending bit order inside each byte)
c | A signed char (8-bit) value
C | An unsigned char (octet) value
d | A double-precision float in the native format
f | A single-precision float in the native format
h | A hex string (low nybble first)
H | A hex string (high nybble first)
i | A signed integer value
I | An unsigned integer value
l | A signed long (32-bit) value
L | An unsigned long (32-bit) value
n | An unsigned short (16-bit) in "network" (big-endian) order
N | An unsigned long (32-bit) in "network" (big-endian) order
s | A signed short (16-bit) value
S | An unsigned short (16-bit) value
U | A Unicode character number
v | An unsigned short (16-bit) in "VAX" (little-endian) order
V | An unsigned long (32-bit) in "VAX" (little-endian) order
x | A null byte
X | Back up a byte
The following example shows how to use the Perl pack function with the 'a' template:
The following lines of code let you see the content of $str as hexadecimal characters.
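The original listings are not reproduced here. A minimal sketch, assuming the input string "123a" and a field width of 7 (both chosen to match the 'A' example later in the text), could look like this:

```perl
use strict;
use warnings;

# 'a7' reserves 7 bytes; the 4-character input is padded with 3 null bytes.
my $str = pack("a7", "123a");

# Show each byte of $str as two hexadecimal digits.
my $hex = join " ", map { sprintf "%02x", ord($_) } split //, $str;
print "$hex\n";   # 31 32 33 61 00 00 00
```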
If you have a list of strings to convert, you can use the x (repetition) operator to build the template, as in the following line of code:
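One way this might look, assuming a hypothetical list @words and a field width of 4 bytes per string:

```perl
use strict;
use warnings;

my @words = ("ab", "cde", "f");

# "a4" x @words repeats the format once per list element: "a4a4a4".
my $template = "a4" x @words;
my $str      = pack($template, @words);

print length($str), "\n";   # 12 (three null-padded fields of 4 bytes)
```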
The 'A' template is similar to the 'a' template, except that the string is padded with spaces instead of nulls (see the example shown for the 'a' template). For instance, you can use the line:
The output is 31 32 33 61 20 20 20, where 20 is the hex code for the space character.
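A sketch that reproduces this output, assuming the template 'A7' and the input "123a" (inferred from the seven bytes shown above):

```perl
use strict;
use warnings;

# 'A7' pads the 4-character input with 3 spaces (hex 20).
my $str = pack("A7", "123a");

my $hex = join " ", map { sprintf "%02x", ord($_) } split //, $str;
print "$hex\n";   # 31 32 33 61 20 20 20
```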
The 'b' format of the pack function packs strings consisting of 0 and 1 characters into bytes. A byte is a group of 8 bits, as in the following figure:
Bits: 1 0 1 1 0 0 1 0
Position: MSB (leftmost bit) to LSB (rightmost bit)
LSB here means the least significant bit, sometimes referred to as the rightmost bit. MSB is the most significant bit, sometimes referred to as the leftmost bit. In the above example, MSB = 1 and LSB = 0.
The 'b' format means that the bits are specified in ascending order, from LSB to MSB: the first character of the string becomes the least significant bit of the byte. For instance, in the next line of code:
In this representation, the count refers to the number of bits to be packed; in the above example the count is 8.
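A minimal sketch, assuming the bit string 10110010 from the figure is the input. With 'b8' the first character of the string becomes bit 0 (the LSB) of the byte:

```perl
use strict;
use warnings;

# Characters are consumed LSB first: bits 0, 2, 3 and 6 are set.
my $byte = pack("b8", "10110010");

printf "0x%02X\n", ord($byte);   # 0x4D (1 + 4 + 8 + 64 = 77)
```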
You can use the Perl pack function with the 'b*' format to translate a string of 0’s and 1’s into a bit string, and the Perl unpack function to get back the list of 0’s and 1’s from the bit string.
Here’s an example:
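The listing is missing here. A sketch of the round trip, assuming an arbitrary 16-bit input string:

```perl
use strict;
use warnings;

my $bits = "1011001001000001";

my $packed = pack("b*", $bits);        # 16 bits become 2 bytes
my $back   = unpack("b*", $packed);    # back to a string of 0s and 1s
my @list   = split //, $back;          # the individual 0/1 values

print "$back\n";   # 1011001001000001
```

If the input length is not a multiple of 8, the last byte is padded with zero bits, so the unpacked string can be longer than the original.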
The 'B' template is similar to the 'b' template, except that the bits are specified in descending order, from MSB to LSB: the first character of the string becomes the most significant bit of the byte. For instance, in the next line of code:
You can use the Perl pack function with the 'B*' format in the same way as shown in the example for the 'b' format:
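A sketch of both uses, again assuming the bit string from the figure as input. With 'B8' the string reads like an ordinary binary number:

```perl
use strict;
use warnings;

# Characters are consumed MSB first, so "10110010" is binary for 178.
my $byte = pack("B8", "10110010");
printf "0x%02X\n", ord($byte);   # 0xB2

# Round trip with 'B*', analogous to the 'b*' case.
my $bits = "1011001001000001";
my $back = unpack("B*", pack("B*", $bits));
print "$back\n";                 # 1011001001000001
```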
The 'c' template format is for signed char values. Its usage is similar to that of the 'C' template format; see the 'C' template format for examples.
The 'C' format is used for unsigned characters. Here are a few examples:
The split function creates an array from the string 'This is Perl', where each character becomes an element of the array. The map function runs the ord function on each element of the array and returns a list with the ASCII values of the characters.
Finally, the pack function with the 'C*' template (for unsigned characters) is applied to all the numbers in the list. If you put a * character in the template, you don't need to count the elements of the list argument.
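The steps described above can be sketched as follows; the variable names are illustrative:

```perl
use strict;
use warnings;

# Split the string into characters and take the character code of each one.
my @codes = map { ord($_) } split //, 'This is Perl';

# Pack all the codes back into a string, one unsigned byte per code.
my $str = pack("C*", @codes);
print "$str\n";   # This is Perl

# With an explicit count instead of '*'.
my $abc = pack("C3", 65, 66, 67);
print "$abc\n";   # ABC
```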
The 'd' format of the pack function is for 64-bit floating-point numbers in the native machine format. Its usage is similar to that of the 'f' template format.
The 'f' format of the Perl pack function is for 32-bit floating-point numbers in the native machine format. Because of the variety of floating-point formats in use, floating-point data written on one machine may not be readable on another, for example when the two machines have different endianness.
You can use this format as in the following line of code:
Or you can follow the 'f' specifier with a count, if you know how many floats you want to pack:
Here the Perl pack function returns a string with 3 single-precision floating-point numbers packed in the native machine format. The unpack function unpacks the 3 numbers from the packed string into an array.
Finally, the array with the result is printed. The unpacked values match the originals only approximately: extra decimal digits can appear because most decimal fractions cannot be represented exactly in single precision.
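The listings are missing here. A minimal sketch, assuming three arbitrary sample values:

```perl
use strict;
use warnings;

# A single float.
my $one = pack("f", 3.14);

# Three floats with an explicit count, then the reverse operation.
my @values = (1.1, 2.2, 3.3);
my $packed = pack("f3", @values);
my @back   = unpack("f3", $packed);

# The printed values are close to the originals but usually carry
# extra digits caused by rounding to single precision.
print "@back\n";
```

Using 'd' instead of 'f' keeps full double precision, at the cost of twice the storage per number.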
The 'h' template format packs a hex string with the low nibble first. Its usage is similar to that of the 'H' template format.
The 'H' template format of the pack function packs a hex string with the high nibble first. To get back the unaltered value of the string, use the unpack function with the same template format.
If you use unpack with the 'h' format instead, you get the bytes in the same order but with their nibbles swapped, as you can see in the next snippet:
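A sketch of this behavior, assuming the hex string "1a2b" as sample input:

```perl
use strict;
use warnings;

my $packed = pack("H4", "1a2b");        # two bytes: 0x1a 0x2b

my $same    = unpack("H4", $packed);    # high nibble first
my $swapped = unpack("h4", $packed);    # low nibble first

print "$same\n";      # 1a2b
print "$swapped\n";   # a1b2
```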
The 'i' template format of the pack function generates a signed integer, and you can use it like this:
Here the Perl pack function returns a string with 5 integers packed in your machine's native integer format. The unpack function unpacks the 5 integers from the packed string into an array.
Finally, the array with the result is printed. Its content is equal to the content of the initial array.
However, the 'i' format is machine dependent: if you pack a list of integers into a string on one machine and unpack it on another, you may get back different values.
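The round trip described above might look like this on a single machine; the five sample values are an assumption:

```perl
use strict;
use warnings;

my @numbers = (-2, -1, 0, 1, 2);

my $packed = pack("i5", @numbers);     # 5 native signed integers
my @back   = unpack("i5", $packed);

print "@back\n";   # -2 -1 0 1 2
```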
If you need to pack unsigned integers, you can use the 'I' template format of the Perl pack function. Its usage is similar to that of the 'i' template format.
The 'l' format generates a signed long value, which is exactly 32 bits (four bytes) wide. The byte order of the result depends on whether the machine is little- or big-endian. See the following lines of code for a short example:
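A short sketch, assuming an arbitrary negative sample value:

```perl
use strict;
use warnings;

my $packed = pack("l", -123456);
print length($packed), "\n";    # 4

my ($number) = unpack("l", $packed);
print "$number\n";              # -123456
```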
The 'L' format of the pack function generates an unsigned long value; its usage is similar to that of the signed long format. Its length is exactly 32 bits, which may differ from the size of a long in the local C compiler.
The 'n' format tells the Perl pack function to create an unsigned short in network byte order. This byte order is used in TCP/IP protocol headers, so you need this format (or 'N' for 32-bit values) for certain types of TCP/IP communication.
You can use it as in the following line of code:
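For illustration, a sketch that packs a hypothetical port number, 8080, in network byte order:

```perl
use strict;
use warnings;

my $port = pack("n", 8080);          # 8080 is 0x1F90

# The most significant byte comes first, whatever the machine's byte order.
print unpack("H*", $port), "\n";     # 1f90
```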
The 'N' format tells the pack function to create an unsigned long in network byte order. You can use it in the same way as the 'n' template format.
Here’s a short example:
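The example is missing here. A sketch, assuming the 32-bit value 0xC0A80001 (the IPv4 address 192.168.0.1 written as a number) as sample input:

```perl
use strict;
use warnings;

my $packed = pack("N", 0xC0A80001);

print unpack("H*", $packed), "\n";              # c0a80001

# Reading the four bytes individually gives the dotted form of the address.
my $dotted = join ".", unpack("C4", $packed);
print "$dotted\n";                              # 192.168.0.1
```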
The 's' format is for signed short numbers. If you transfer data across the network or onto the disk of another computer, you must consider the endianness of both computers, because integers and floating-point numbers can be stored in memory in different byte orders. Take this into consideration when you use the 's' format.
A short example of how to use it:
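A minimal sketch with two assumed sample values:

```perl
use strict;
use warnings;

my $packed = pack("s2", -1, 300);    # two signed 16-bit values
print length($packed), "\n";         # 4

my @back = unpack("s2", $packed);
print "@back\n";                     # -1 300
```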
You can determine the endianness of your system by using this format, as you can see in the example below:
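One possible check, offered as a sketch rather than the original listing: pack the number 1 with 's' and look at the first byte of the result.

```perl
use strict;
use warnings;

# On a little-endian machine the low byte (1) is stored first;
# on a big-endian machine the first byte is 0.
my $first_byte = unpack("C", pack("s", 1));
my $order      = $first_byte == 1 ? "little-endian" : "big-endian";

print "This system is $order\n";
```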
The 'S' format is for unsigned short integers; its usage is similar to that of the 's' format.
The 'U' template format of the pack function packs a Unicode character number (code point) into a character that Perl stores as UTF-8. Unicode associates each character with an integer, and UTF-8 is a variable-length encoding, so each character takes only as many bytes as it needs.
The most common case is a character encoded in only one or two bytes; other characters, such as the smiley faces, need three or four. For instance, the next example converts a smiley face Unicode character into UTF-8:
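The listing is missing here. A sketch, assuming the smiley in question is U+263A (WHITE SMILING FACE). The call to utf8::encode is added only to make the UTF-8 bytes visible and is not necessarily what the original example did:

```perl
use strict;
use warnings;

my $smiley = pack("U", 0x263A);      # a string holding one character

# Copy the character and convert the copy to its UTF-8 bytes in place.
my $bytes = $smiley;
utf8::encode($bytes);

print length($smiley), " character, ", length($bytes), " bytes\n";  # 1 character, 3 bytes
print join(" ", map { sprintf "%02x", ord($_) } split //, $bytes), "\n";  # e2 98 ba
```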
Because systems differ in endianness, multi-byte integers and floating-point numbers may be stored in a different byte order on different machines, so moving binary data across the network can cause format problems.
One way to avoid this is to use 'U', the Unicode character number, because UTF-8 does not depend on byte order. You can use the Perl pack function to pack a sequence of character numbers into a UTF-8 string on one computer and use the unpack function on another.
See the following example, where we use the pack function to pack a few integers into a UTF-8 format:
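A sketch of such a round trip, with three assumed code points (a Latin letter, a Greek letter and a symbol):

```perl
use strict;
use warnings;

my @code_points = (0x48, 0x3A9, 0x263A);

my $str  = pack("U*", @code_points);    # a 3-character string
my @back = unpack("U*", $str);          # the code points again

printf "U+%04X\n", $_ for @back;        # U+0048, U+03A9, U+263A (one per line)
```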
The 'v' format is for 16-bit unsigned short numbers. It is similar to the 'n' format but uses little-endian order. Use this format when you need to pack unsigned short numbers in little-endian order.
The next line of code shows how to use the Perl pack function with this format:
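A sketch with assumed sample values, showing 'v' next to 'n' and the 32-bit counterpart 'V':

```perl
use strict;
use warnings;

my $little = pack("v", 0x1234);
my $big    = pack("n", 0x1234);

print unpack("H*", $little), "\n";   # 3412 (low byte first)
print unpack("H*", $big), "\n";      # 1234 (high byte first)

my $little32 = pack("V", 0x12345678);
print unpack("H*", $little32), "\n"; # 78563412
```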
The 'V' template format is for unsigned long (32-bit) numbers in little-endian order; its usage is similar to that of the 'v' template format.
The 'x' format is used to pack a null byte. The following example puts a null between the a b c characters. The result is stored in the $str variable.
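The listing is missing here. A sketch of what it might have looked like, assuming the 'A' format for the three characters:

```perl
use strict;
use warnings;

# Each 'x' inserts one null byte between the packed characters.
my $str = pack("A x A x A", "a", "b", "c");

print unpack("H*", $str), "\n";   # 6100620063
```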
The 'X' format of the Perl pack function is used to move one byte backwards in the string.
Here’s an example:
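A minimal sketch with assumed inputs: 'A3' writes three bytes, 'X' steps back over the last one, and the following 'A' overwrites it.

```perl
use strict;
use warnings;

my $str = pack("A3 X A", "abc", "d");

print "$str\n";   # abd
```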
To reverse the bits in each character of a string, you can use the split function to turn the string into an array of characters. Then you can use a foreach loop to iterate through this array.
Inside the foreach loop, for each character from the array, calling the Perl unpack and pack functions does the job.
For more details, see the code:
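The code is missing here. The steps above can be sketched as follows, assuming the sample string "ab". Unpacking with 'B8' reads the bits MSB first, and packing the same bit string with 'b8' writes them LSB first, which reverses the bits of each byte:

```perl
use strict;
use warnings;

my $string   = "ab";
my $reversed = "";

foreach my $char (split //, $string) {
    my $bits = unpack("B8", $char);      # e.g. "a" gives "01100001"
    $reversed .= pack("b8", $bits);      # same bits, opposite order
}

print unpack("H*", $reversed), "\n";     # 8646
```

This sketch assumes single-byte characters; characters above code point 255 would need to be encoded to bytes first.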