Perl pack Function

The pack function converts a list of values into a string according to a user-defined template (for example, into a binary representation of the list).

The pack function takes a TEMPLATE and a LIST of values. It converts each value according to the format specified by the template, concatenates the results and returns the resulting string.

Its main purpose is to turn data (numbers and strings) into a sequence of bytes that can be easily used by external applications. You can use the Perl pack function either to write binary data to a file or to prepare it for network transmission.

The reverse of this function is the unpack function, which takes a sequence of bytes and converts it back into numbers and strings for further processing.
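
As a quick illustration of that round trip, here is a minimal sketch that packs three values into one string and unpacks them again. The record layout (a 16-bit number, a 5-byte name and a one-byte flag) is made up for this example only.

```perl
use strict;
use warnings;

# hypothetical record: 16-bit big-endian number, 5-byte text field, 1-byte flag
my $template = 'n A5 C';

my $packed = pack $template, 1234, 'perl', 7;
print length($packed), "\n";        # 8 (2 + 5 + 1 bytes)

# the same template reads the values back
my ($number, $name, $flag) = unpack $template, $packed;
print "$number $name $flag\n";      # 1234 perl 7
```

Whitespace inside a template is ignored, so it can be used to keep the fields readable.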

The syntax form of the pack function is as follows:

STRING = pack TEMPLATE, LIST
The TEMPLATE consists of a sequence of characters, as shown in the table below. Each character may be followed by a modifier: most commonly a number giving a repeat count, or a * meaning "use however many items are left".

The following table shows some of the most frequently used template characters:

Character   Description
a           A string with arbitrary binary data, null padded
A           A text (ASCII) string, space padded
b           A bit string (ascending bit order inside each byte, like vec())
B           A bit string (descending bit order inside each byte)
c           A signed char (8-bit) value
C           An unsigned char (octet) value
d           A double-precision float in the native format
f           A single-precision float in the native format
h           A hex string (low nybble first)
H           A hex string (high nybble first)
i           A signed integer value
I           An unsigned integer value
l           A signed long (32-bit) value
L           An unsigned long (32-bit) value
n           An unsigned short (16-bit) in "network" (big-endian) order
N           An unsigned long (32-bit) in "network" (big-endian) order
s           A signed short (16-bit) value
S           An unsigned short (16-bit) value
U           A Unicode character number
v           An unsigned short (16-bit) in "VAX" (little-endian) order
V           An unsigned long (32-bit) in "VAX" (little-endian) order
x           A null byte
X           Back up a byte

The following example shows you how to deal with the Perl pack function and the 'a' template:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my $str = pack 'a7', '123a';     # "123a\0\0\0"
 
# split the string into an array of characters
my @array = split //,$str;
 
# converts the elements of the array into their
# equivalent hex codes
@array = map( sprintf("%x", ord), @array);
 
# print the array with spaces between elements
print "@array\n";
 
# it prints: 31 32 33 61 0 0 0
The code begins by calling the Perl pack function. $str receives the result, 'a7' is the template and '123a' is the string to be converted. The 7 in the template is a count that sets the width of the field: null bytes are appended until the resulting string is 7 bytes long (a longer string would be truncated to 7 bytes).

The remaining lines of the code let you see the content of $str as hexadecimal codes.
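
The effect of the count can be checked in both directions. The sketch below is only an illustration: it uses unpack with the 'H*' template from the table as a shorter way to display the bytes in hex, and the input strings are arbitrary.

```perl
use strict;
use warnings;

# shorter than the field: padded with null bytes up to 7 bytes
my $padded = pack 'a7', '123a';

# longer than the field: cut down to 3 bytes
my $truncated = pack 'a3', '123456';

print unpack('H*', $padded), "\n";      # 31323361000000
print unpack('H*', $truncated), "\n";   # 313233
```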

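If you have a list of strings to be converted, you can use the x (repetition) operator to repeat the template character, as in the following line of code. Note that an 'a' without a count packs only the first character of each string, and template characters left without a list value produce null bytes: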

my $str = pack 'a' x 7, '12', '34', '56';     # "135\0\0\0\0"
You can use the Perl pack function with the 'a' template to convert a string into a null-terminated string that can be used in a C program:
 
my $cStr = pack ('a*x', $perlStr);
Here 'a*' packs the whole string (a bare 'a' would pack only its first character) and the x character appends a null byte as the rightmost character of the string.
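
To make the role of the count visible, the following sketch compares three templates on the same input. The 'Z*' template is not listed in the table above; it is a standard pack format for null-terminated strings, shown here only as an alternative. The sample string is arbitrary.

```perl
use strict;
use warnings;

my $perlStr = 'hello';

my $firstOnly = pack 'ax',  $perlStr;   # "h\0": a bare 'a' takes one character
my $whole     = pack 'a*x', $perlStr;   # "hello\0"
my $asciz     = pack 'Z*',  $perlStr;   # "hello\0": Z* adds the null itself

print length($firstOnly), " ", length($whole), "\n";      # 2 6
print $whole eq $asciz ? "same\n" : "different\n";        # same
```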

The 'A' template is similar to the 'a' template, except that spaces are used for padding instead of nulls (see the example shown for the 'a' template). For instance, you can use the line:

my $str = pack 'A7', '123a';     # "123a   "
instead of:
my $str = pack 'a7', '123a';     # "123a\0\0\0"

You’ll get as output: 31 32 33 61 20 20 20, where 20 is the hex code for the space character.
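
The two templates also differ when unpacking: 'A' strips the trailing padding, while 'a' returns the field unchanged. A minimal sketch, with variable names chosen only for this illustration:

```perl
use strict;
use warnings;

my $nulPadded = pack 'a7', '123a';   # "123a\0\0\0"
my $spcPadded = pack 'A7', '123a';   # "123a   "

# 'a' keeps the padding, 'A' removes trailing spaces and nulls
my $raw     = unpack 'a7', $nulPadded;
my $trimmed = unpack 'A7', $spcPadded;

print length($raw), "\n";       # 7
print length($trimmed), "\n";   # 4
```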

The 'b' format of the pack function packs strings consisting of 0 and 1 characters into bytes. A byte consists of a group of 8 bits, as in the following figure:

 1 0 1 1 0 0 1 0 

MSB           LSB

 

Here LSB means the least significant bit, sometimes referred to as the rightmost bit. MSB is the most significant bit, sometimes referred to as the leftmost bit. In the above example, MSB = 1 and LSB = 0.

The 'b' format means that the bits are specified in ascending order, from LSB to MSB: the first character of the string becomes the least significant bit of the byte. For instance, in the next line of code:

my $nr = ord pack ('b8', '10110010');
the $nr variable is assigned 77 = 1 + 4 + 8 + 64.

In this representation, the count refers to the number of bits to be packed - in the above example the count is 8.

You can use the Perl pack function with the 'b*' format to translate a string of 0’s and 1’s into a bit string, and the Perl unpack function to get back the list of 0’s and 1’s from the bit string.

Here’s an example:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my @bitArray = qw(1 0 0 0 1 1 1 1 0 0 1 1);
my $bitString = pack 'b*', join('', @bitArray);
 
@bitArray = split(//, unpack('b*', $bitString));
print "@bitArray\n";
# it prints:      1 0 0 0 1 1 1 1 0 0 1 1 0 0 0 0
Please note that our initial array of bits had only 12 elements, so the pack function padded the last 4 bits of $bitString with 0.
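
If the padding bits are not wanted, one option is to give unpack an explicit bit count instead of '*'. This is a small sketch under the assumption that the original number of bits is known (here it is taken from the input string):

```perl
use strict;
use warnings;

my $bits      = '100011110011';           # 12 bits
my $bitString = pack 'b*', $bits;

print length($bitString), "\n";           # 2 (bytes)
print unpack('b*', $bitString), "\n";     # 1000111100110000

# ask for exactly as many bits as were packed
my $count = length $bits;
print unpack("b$count", $bitString), "\n";   # 100011110011
```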

The 'B' template is similar to the 'b' template, except that the bits are specified in descending order, from MSB to LSB. For instance, in the next line of code:

my $nr = ord pack ('B8', '10110010');
the $nr variable is assigned 178 = 2 + 16 + 32 + 128.
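
One way to check both orders is to compare them with the oct function, which reads a "0b" binary literal with the most significant bit first. The sketch below is only a cross-check of the two values, not part of the original example:

```perl
use strict;
use warnings;

my $bits = '10110010';

my $ascending  = ord(pack('b8', $bits));   # first character is the LSB
my $descending = ord(pack('B8', $bits));   # first character is the MSB
print "$ascending $descending\n";          # 77 178

# oct reads the digits MSB first, like 'B'
print oct("0b$bits"), "\n";                # 178

# reversing the string gives the 'b' interpretation
my $reversed = reverse $bits;
print oct("0b$reversed"), "\n";            # 77
```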

You can use the Perl pack function with the 'B*' format in a similar way as shown in the example for the 'b' format:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my @bitArray = qw(1 0 0 0 1 1 1 1 0 0 1 1);
my $bitString = pack 'B*', join('', @bitArray);
@bitArray = split(//, unpack('B*', $bitString));
print "@bitArray\n";
# it prints:      1 0 0 0 1 1 1 1 0 0 1 1 0 0 0 0
 

The 'c' template format is for signed char values. Its usage is similar to that of the 'C' template format; see the 'C' template format for examples.
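
The difference between the signed and the unsigned format shows up when the same byte is read back. A minimal sketch, using -1 as an arbitrary sample value:

```perl
use strict;
use warnings;

# -1 as a signed 8-bit value is the byte 0xff
my $byte = pack 'c', -1;

print unpack('H*', $byte), "\n";   # ff
print unpack('c', $byte), "\n";    # -1  (read as signed)
print unpack('C', $byte), "\n";    # 255 (read as unsigned)
```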

The 'C' format is used for unsigned char values. Here are a few examples:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my $str = pack 'CCCC', 97, 98, 99, 100, 101, 102;
# 97 is the numeric value of the ASCII 'a' character
print "$str\n";     # abcd
 
# 3 is a count for the number of characters packed
$str = pack 'C3', 97, 98, 99, 100, 101, 102;
print "$str\n";     # abc
 
# x is the repetition operator
$str = pack 'C' x 5, 97, 98, 99, 100, 101, 102;
print "$str\n";     # abcde
 
# the '*' is like a wildcard for more of the same.
$str = pack 'C*', 97, 98, 99, 100, 101, 102;
print "$str\n";     # abcdef
The following example shows you how to use the Perl pack function with the 'C*' template in conjunction with other Perl functions.

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my $str = pack('C*', map ord, split(//,'This is Perl'));
print "$str\n";
# it prints: This is Perl

The split function creates a list from the string 'This is Perl', where each character becomes one element. The map function runs the ord function on each element and returns a list with the ASCII values of the characters.

Finally, the pack function with the 'C*' template (for unsigned characters) packs all the numbers of the list (if you put a * character inside the template, you don’t need to count the elements of the list argument).
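
For plain ASCII text, the split and map steps can also be replaced by unpack with the same template, which returns the character codes directly. A short sketch with an arbitrary sample string:

```perl
use strict;
use warnings;

# unpack 'C*' returns one number per byte of the string
my @codes = unpack 'C*', 'Perl';
print "@codes\n";     # 80 101 114 108

# pack 'C*' turns the numbers back into the string
my $again = pack 'C*', @codes;
print "$again\n";     # Perl
```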

The 'd' format of the pack function is for 64-bit (double-precision) floating-point numbers in native machine format. Its usage is similar to that of the 'f' template format.

The 'f' format of the Perl pack function is for 32-bit (single-precision) floating-point numbers in native machine format. Because of the variety of floating-point formats around, floating-point data written on one machine may not be readable on another, for example when the two machines have different endianness.

You can use this format like in the following line of code:

my $float = pack 'f', 23.13421;
where $float will contain the number in a native float format. To extract the number from this string, you need to use the unpack function:
 
my $nr = unpack 'f', $float;

Or you can follow the 'f' specifier with a count, if you know how many floats you want to pack:

my $floats = pack 'f2', 3.14, 2.287;
If you have several single-precision float numbers to pack, you can use the '*' repeat count, which packs all the available float numbers from the list:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my @floatArray = (23.13421, 112.78, 77.896);
@floatArray = unpack ('f*', pack('f*', @floatArray));
print "@floatArray\n";
# it displays: 23.1342105865479 112.779998779297 77.8960037231445

Here the Perl pack function returns a string with 3 single-precision float numbers packed in the native machine format. The unpack function unpacks the 3 numbers from that string into an array.

Finally, the array with the result is printed. As you can see, the values are close to, but not exactly equal to, those of the initial array: a single-precision float keeps only about 7 significant decimal digits, so the extra digits shown for each unpacked number are rounding error rather than added precision.
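
The rounding can be made explicit by sending the same number through 'f' and 'd'. The sketch below assumes a Perl built with the usual double-precision numbers (the default on most systems); on a build that uses long doubles the 'd' round trip might also round.

```perl
use strict;
use warnings;

my $value = 23.13421;

# round trip through single and double precision
my $single = unpack 'f', pack('f', $value);
my $double = unpack 'd', pack('d', $value);

# packed sizes in bytes
print length(pack('f', $value)), " ", length(pack('d', $value)), "\n";   # 4 8

printf "f: %s\n", $single == $value ? 'exact' : 'rounded';   # f: rounded
printf "d: %s\n", $double == $value ? 'exact' : 'rounded';   # d: exact (assuming double NVs)
```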

The 'h' template format packs a hex string, putting the low nibble first. Its usage is similar to that of the 'H' template format.

The 'H' template format of the pack function packs a hex string, putting the high nibble first. To get back the unaltered value of the string, use the unpack function with the same template format.

If you use unpack with the 'h' format, you’ll get the bytes in the same order but with their nibbles swapped, as you can see in the next snippet:

my $str = pack 'H*', '6162636465';
print unpack ('H*', $str), "\n";  # it prints: 6162636465
print unpack ('h*', $str), "\n";  # it prints: 1626364656
Here I put a * character inside the template, to avoid counting the hex characters of the string argument.
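
The packed string itself is ordinary text in this case, which a short sketch can confirm. It also shows that the nibble-swapped digits produce the same bytes when packed with 'h*'. The helper expression that prints the hex codes is only for illustration.

```perl
use strict;
use warnings;

# 61 62 63 64 65 are the hex ASCII codes of a, b, c, d, e
my $bytes = pack 'H*', '6162636465';
print "$bytes\n";                       # abcde

# the same bytes, written low nibble first
my $same = pack 'h*', '1626364656';
print "$same\n";                        # abcde

# one hex code per byte
print join(' ', map { sprintf('%02x', ord $_) } split //, $bytes), "\n";   # 61 62 63 64 65
```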

The 'i' template format of the pack function generates a signed integer in the native format, and you can use it like this:

my $integer = pack 'i', 150;
The number 150 will be converted into the format used to store integers on your machine and the result will be stored into the $integer variable. If you have many integers to pack, you can use the '*' repeat pack-format that will pack all the integers available in the list:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my @integerArray = (150, 160, 170, 180, 190);
@integerArray = unpack ('i*', pack('i*', @integerArray));
 
print "@integerArray\n";
# it displays: 150 160 170 180 190

Here the Perl pack function returns a string with 5 integers packed in your machine's native integer format. The unpack function unpacks the 5 integers from that string into an array.

Finally, the array with the result is printed. As you can see, its content is equal to the content of the initial array.

But the 'i' format is machine dependent, so if you pack a list of integers into a string on one machine and then unpack it on another, you may get back different values.
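
To see what "machine dependent" means in practice, the sketch below prints the size of a native integer and compares it with the value reported by the standard Config module. It then packs the same number with 'N' from the table, whose size and byte order are fixed. The output of the first line depends on your system, so no value is shown for it.

```perl
use strict;
use warnings;
use Config;

# size and byte order of 'i' come from the local C compiler
my $native = pack 'i', 150;
print "native int: ", length($native), " bytes (Config says $Config{intsize})\n";

# 'N' is always 4 bytes, big-endian
my $portable = pack 'N', 150;
print unpack('H*', $portable), "\n";    # 00000096
```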

If you need to pack unsigned integers, you can use the 'I' template format of the Perl pack function. Its usage is similar to that of the 'i' template format.

The 'l' format generates a signed long value, which is exactly 32 bits (four bytes) wide. The byte order of the result depends on whether the machine is little- or big-endian. See the following lines of code for a short example:

my $str = pack('l', 0x61626364);
print "$str\n";
This code creates a four-byte string consisting of either dcba if the machine is little-endian or abcd if the machine is big-endian. Here 0x61, 0x62, 0x63 and 0x64 are the hexadecimal ASCII codes of the characters a, b, c and d.

The 'L' format of the pack function generates an unsigned long value; its usage is similar to that of the signed long format. Its length is exactly 32 bits, which may differ from the size of a long in the local C compiler.
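
When the byte order must not depend on the machine, the integer formats accept the < (little-endian) and > (big-endian) modifiers. These modifiers are not covered in the table above and require Perl 5.10 or later; the sketch below is a small illustration using the same number as before.

```perl
use strict;
use warnings;

# force the byte order instead of using the native one
my $big    = pack 'l>', 0x61626364;
my $little = pack 'l<', 0x61626364;

print "$big $little\n";     # abcd dcba

# unpack with the matching modifier restores the number on any machine
printf "%x\n", unpack('l>', $big);      # 61626364
```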

The 'n' format tells the Perl pack function to create an unsigned short in network byte order. This format is specific to TCP/IP communications, and you need to use it (or 'N' for bigger numbers) if you do certain types of TCP/IP communication.

You can use it like in the following line of code:

my $nr = pack 'n', 1234, 235;
Because we didn’t provide a repeat count inside the template, the pack function packs just the first number and returns it in the $nr variable. The second number (235) from the list is ignored.
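
A repeat count (or '*') makes pack consume both numbers. The following sketch shows the resulting bytes in hex so that the big-endian order is visible; the variable names are chosen only for this example.

```perl
use strict;
use warnings;

my $one  = pack 'n',  1234, 235;    # only 1234 is packed
my $both = pack 'n2', 1234, 235;    # both numbers are packed

print unpack('H*', $one), "\n";     # 04d2
print unpack('H*', $both), "\n";    # 04d200eb

my @numbers = unpack 'n*', $both;
print "@numbers\n";                 # 1234 235
```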

The 'N' format tells the pack function to create an unsigned long in network byte order. You can use it in the same way as the 'n' template format.

Here’s a short example:

my $nrs = pack 'N*', 45320..45325;
 
my @array = unpack 'N*', $nrs;
print "@array\n";
# it displays: 45320 45321 45322 45323 45324 45325
If you use the '*' repeat pack-format, you don’t need to provide the count of the numbers you intend to pack. The unpack function was used to extract the numbers from the packed $nrs string and populate an array with them.
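
A typical use of 'N' in network code is a length prefix in front of a message. The frame layout below (a 4-byte length followed by the payload) is an assumption made for this sketch, not a specific protocol.

```perl
use strict;
use warnings;

my $payload = 'hello';

# hypothetical frame: 32-bit big-endian length, then the payload bytes
my $frame = pack 'N a*', length($payload), $payload;
print unpack('H*', $frame), "\n";       # 0000000568656c6c6f

# the receiver reads the length first, then that many bytes
my $size = unpack 'N', $frame;
my $body = substr $frame, 4, $size;
print "$size $body\n";                  # 5 hello
```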

The 's' format is for signed short numbers. If you transfer data across the network or onto a disk of another computer, you must consider the endianness of your computers, because integers and floating-point numbers could be stored in memory in different byte orders. So you must take this into consideration when you use the 's' format.

A short example about how to use it:

my $i16 = pack 's*', 21, 77, 100, 256;
In this example the 's' format is combined with '*', which lets the Perl pack function pack as many short integers as you have in your list.

You can determine the endianness of your system by using this format, as you can see in the example below:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my $v = unpack("h*", pack("s", 1));
if($v =~ /^1/) {
  print "Little endian system\n";
} elsif ($v =~ /01/) {
  print "Big endian system\n";
} else {
  print "Unknown endian format\n";
}
print "$v\n";  
# on my Windows system it displays: 1000
On my local Windows computer, after running this code I received the message: 'Little endian system'. The Perl unpack function was used to unpack the packed number in a hex format.
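
The same check can be written without a regular expression by comparing the native 's' result with the fixed-order 'v' and 'n' formats from the table. This is just an alternative sketch; its output depends on the machine it runs on.

```perl
use strict;
use warnings;

# 's' uses the native order, 'v' is always little-endian, 'n' always big-endian
my $native = pack 's', 1;

my $order = $native eq pack('v', 1) ? 'little-endian'
          : $native eq pack('n', 1) ? 'big-endian'
          :                           'unknown';

print "$order\n";
```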

The 'S' format is for unsigned short integers; its usage is similar to that of the 's' format.

The 'U' template format of the pack function allows you to pack a Unicode code point number into a character, which Perl stores in its UTF-8 representation. Unicode associates characters with integers, and converting Unicode characters to UTF-8 lets you store only the bytes that are needed for each character.

Most characters in common use are encoded in one to three bytes. For instance, the next example packs the smiley face Unicode character (U+263A), which takes three bytes in UTF-8:

my $utfSmiley = pack 'U', 0x263A;
 
print "length of \$utfSmiley = ", length ($utfSmiley),
      ", length of 0x263A = ", length(0x263A), "\n";
# it displays: length of $utfSmiley = 1, length of 0x263A = 4
Note that length counts characters, not bytes: $utfSmiley holds a single character, while length(0x263A) is 4 because the number is first converted to its decimal string, 9786. To get back the code point number, you can use the Perl unpack function.
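
To see the actual UTF-8 bytes, the character string can be encoded into bytes first. The sketch below uses the built-in utf8::encode function on a copy of the string; this is one possible way to do it, and the variable names are only illustrative.

```perl
use strict;
use warnings;

my $smiley = pack 'U', 0x263A;
print length($smiley), "\n";            # 1 (one character)

# encode a copy into its UTF-8 bytes
my $bytes = $smiley;
utf8::encode($bytes);
print length($bytes), "\n";             # 3 (three bytes)
print unpack('H*', $bytes), "\n";       # e298ba

# unpack 'U' gives back the code point
printf "%04X\n", unpack('U', $smiley);  # 263A
```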

Because of the endianness of a system, integers and floating-point numbers may be stored in a different byte order on different machines, so if you move binary data across the network, you can expect to run into format issues.

One way to avoid this for non-negative integers is to use 'U', the Unicode character number, because UTF-8 does not depend on byte order. You can use the Perl pack function to pack a sequence of numbers as UTF-8 encoded characters on one computer and use the unpack function on another.

See the following example where we use the pack function to pack a few integers into a UTF-8 format:

my @integers = (1234, 23, 456, 789);
my $utfIntegers = pack 'U*', @integers;
 
@integers = unpack 'U*', $utfIntegers;
print "@integers\n";
# it displays: 1234 23 456 789
You can use the 'U' format to encode the Unicode characters of an alphabet. For instance, the Unicode Hebrew alphabet ranges from 0x0590 to 0x05ff. The following example shows you how to pack and unpack the Hebrew Unicode alphabet:
 
my $utfHebr = pack 'U*', 0x0590..0x05ff;
my @UniHebr = unpack 'U*', $utfHebr;
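
A few checks on the result can confirm what was packed. The sketch below repeats the two lines above and adds the checks; the comparison with chr is an extra illustration that 'U' and chr produce the same character for a given code point.

```perl
use strict;
use warnings;

my $utfHebr = pack 'U*', 0x0590..0x05ff;
my @UniHebr = unpack 'U*', $utfHebr;

# the range holds 0x05ff - 0x0590 + 1 = 112 code points
print length($utfHebr), "\n";       # 112 (characters)
print scalar(@UniHebr), "\n";       # 112

printf "%04x %04x\n", $UniHebr[0], $UniHebr[-1];    # 0590 05ff

# for a single code point, pack 'U' matches chr
print pack('U', 0x05D0) eq chr(0x05D0) ? "same\n" : "different\n";   # same
```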

The 'v' format is for 16-bit unsigned short numbers. It is similar to the 'n' format but uses little-endian order. When you need to pack unsigned short numbers in little-endian format, you should use this format.

The next line of code shows you how to use the Perl pack function with it:

my $nr = pack 'v', 3167;
To get back the number, you can use the unpack function.
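
Printing the bytes in hex makes the difference from 'n' visible. A minimal sketch with the same number:

```perl
use strict;
use warnings;

# 3167 is 0x0c5f
my $little = pack 'v', 3167;    # low byte first
my $big    = pack 'n', 3167;    # high byte first

print unpack('H*', $little), "\n";   # 5f0c
print unpack('H*', $big), "\n";      # 0c5f

# unpack with the same format restores the number
print unpack('v', $little), "\n";    # 3167
```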

The 'V' template format is for unsigned long (32-bit) numbers; its usage is similar to that of the 'v' template format.

The 'x' format is used to pack a null byte. The following example puts a null between the a b c characters. The result is stored in the $str variable.

my $str = pack 'CxCxC', 97..99;
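
The null bytes can be seen by printing the string in hex. The sketch below also shows that 'x' accepts a repeat count and that, in unpack, it skips a byte instead of writing one.

```perl
use strict;
use warnings;

my $str = pack 'CxCxC', 97..99;
print unpack('H*', $str), "\n";     # 6100620063

# in unpack, 'x' skips over one byte
my @codes = unpack 'CxCxC', $str;
print "@codes\n";                   # 97 98 99

# 'x3' inserts three null bytes at once
print length(pack('Cx3C', 97, 98)), "\n";   # 5
```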

The 'X' format of the Perl pack function is used to move one byte backwards in the string.

Here’s an example:

my $binaryString = pack ('C4X2', 97..105);
print unpack ('C*', $binaryString), "\n";
# it displays: 9798
 
In this code 97..105 are the decimal values of the a-i ASCII characters. The template packs the first four values, then X2 backs up two bytes, which removes the characters 99 and 100. The values 101-105 were not packed at all because there isn’t any specifier for them inside the template. The output of the unpack function shows that only the first two characters remain in the string.
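
Template items that follow an 'X' continue writing at the new position, so the removed bytes can be replaced. A small sketch with arbitrary values (120 and 121 are the codes of x and y):

```perl
use strict;
use warnings;

# pack a b c d, back up two bytes, then pack x y in their place
my $str = pack 'C4 X2 C2', 97, 98, 99, 100, 120, 121;
print "$str\n";     # abxy
```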

To reverse the bits in each character of a string, you can use the split function to turn the string into an array of characters. Then you can use a foreach loop to iterate through this array.

Inside the foreach loop, for each character from the array, calling the Perl unpack and pack functions does the job.

For more details, see the code:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
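my $str = "abcdefghi";
my $reversed = '';
 
foreach my $ch (split //, $str) {
  # reverse bits in each character and collect the result
  $reversed .= pack "b*", unpack "B*", $ch;
}
 
# show the result as hex codes
print unpack("H*", $reversed), "\n";
# it prints: 8646c626a666e61696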