Perl sort Function

The sort function is used to return a list in a specific order.

The sort function sorts a LIST in string (alphabetical) or numerical order and returns the sorted list. Keep in mind that the argument list remains unchanged; a new sorted list is returned.

The syntax forms of the Perl sort function are as follows:

sort SUBNAME LIST
sort BLOCK LIST
sort LIST

The third syntax form is the simplest of the three and sorts in the default order, which is a string comparison.

See the following example:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my @array = sort qw(map 23 Perl 101 11 while 1 scalar 102);
print "@array\n";
# it prints: 1 101 102 11 23 Perl map scalar while
 
@array = sort qw(23 101 11 1 102);
print "@array\n";
# it prints: 1 101 102 11 23

The problem is that capital letters have lower ASCII values than lowercase letters, so words beginning with a capital letter come first, as you can see in our example. One alternative is to convert all the list elements to lowercase or uppercase for the comparison and then perform the sort. You’ll see an example below.

As you can see from the second example above, even if all the elements of the list are numbers, this simple sort compares them as strings, which is why 101 and 102 come before 11.

In practice you often need to tell sort how to compare the elements so that the order fits your task. This is where the first and second syntax forms of the Perl sort function come in. SUBNAME is the name of a subroutine in which you describe how to order the elements of the list. Instead of a subroutine, you can provide a BLOCK that acts as an anonymous in-line subroutine.

Inside the subroutine or block, sort provides two special package variables, $a and $b. They hold the two elements of the list that are currently being compared, and the value returned by your code tells sort how to order them. Do not declare them with my.

Besides these special variables, the comparison code usually relies on one of two operators: cmp (the string comparison operator) or <=> (the numerical comparison operator). So you can sort a list in either string or numerical order.

Please recall how these two operators work:

  • cmp returns -1, 0 or 1 depending on whether the left argument is stringwise less than, equal to, or greater than the right argument
  • <=> returns -1, 0 or 1 depending on whether the left argument is numerically less than, equal to, or greater than the right argument
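As a quick check of these return values, here is a minimal sketch that evaluates both operators on a few literal pairs. The variable names are ours, chosen only for illustration. Note the last pair: compared as strings, '2' is greater than '10'.

```perl
use strict;
use warnings;

# cmp compares its operands as strings
my @string_results = ('apple' cmp 'banana', 'pear' cmp 'pear', 'b' cmp 'a');

# <=> compares its operands as numbers
my @numeric_results = (2 <=> 10, 7 <=> 7, 10 <=> 2);

# the same pair can compare differently depending on the operator
my $two_vs_ten_as_strings = '2' cmp '10';   # string comparison: '2' gt '1'
my $two_vs_ten_as_numbers = 2 <=> 10;       # numeric comparison: 2 < 10

print "cmp: @string_results\n";
print "<=>: @numeric_results\n";
print "'2' cmp '10' gives $two_vs_ten_as_strings, ",
      "2 <=> 10 gives $two_vs_ten_as_numbers\n";
```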

Let’s go back to our previous example where we tried to sort a list of numbers. You can do this either by defining a subroutine where you describe how to order the list or by using an in-line subroutine in a block.

In the first case, you can see how this works in the following example:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# define a subroutine
sub numSort {
  if ($a < $b) { return -1; }
  elsif ($a == $b) { return 0;}
  elsif ($a > $b) { return 1; }
}
 
# invoke the Perl sort function with the subroutine
my @array = sort numSort qw(23 101 11 1 102);
 
print "@array\n";
# it prints: 1 11 23 101 102

The previous example uses the first syntax form. You can shorten this code by using the second syntax form of the sort function, as in the example below:
 
my @array = sort {$a <=> $b} qw(23 101 11 1 102);

Please note that when you use the cmp and <=> comparison operators explicitly, it matters whether $a or $b is on the left side of the operator. For instance, with $a <=> $b the list is sorted in ascending numerical order, and with $b <=> $a it is sorted in descending numerical order.
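To see the effect of swapping $a and $b, the following short sketch sorts the same list in both directions. The array names are ours; the reverse function is shown only as an alternative that gives the same result for this list.

```perl
use strict;
use warnings;

my @numbers = qw(23 101 11 1 102);

# $a on the left: ascending numerical order
my @ascending  = sort { $a <=> $b } @numbers;

# $b on the left: descending numerical order
my @descending = sort { $b <=> $a } @numbers;

# reversing an ascending sort is another way to get descending order
my @reversed = reverse sort { $a <=> $b } @numbers;

print "@ascending\n";    # 1 11 23 101 102
print "@descending\n";   # 102 101 23 11 1
print "@reversed\n";     # 102 101 23 11 1
```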

If you have a mixed list with both numerical and string elements, you can use the following code to sort it:

#!/usr/local/bin/perl
 
use strict;
#use warnings;
 
# Perl sort function with the second syntax form
my @array = sort {$a <=> $b || $a cmp $b}
            qw(map 23 Perl 101 11 while 1 scalar 102);
 
print "@array\n";
# it prints: Perl map scalar while 1 11 23 101 102
 
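The numbers are sorted in numerical order and the strings in ASCII order. The strings come first because a non-numeric string such as map counts as 0 in a numeric comparison, and all the numbers here are greater than 0. That comparison also triggers an "isn't numeric" warning for each such string, which is why use warnings is commented out in this example. Don’t enable warnings or the -w flag if you want to run it without those messages.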
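If you would rather keep warnings enabled for the rest of the script, one option is to switch off only the numeric warning category inside a small block. This is a sketch of that approach, not part of the original example; the @mixed and @sorted names are ours.

```perl
use strict;
use warnings;

my @mixed = qw(map 23 Perl 101 11 while 1 scalar 102);
my @sorted;

{
    # silence only the "isn't numeric" warnings, and only inside this block
    no warnings 'numeric';
    @sorted = sort { $a <=> $b || $a cmp $b } @mixed;
}

print "@sorted\n";   # Perl map scalar while 1 11 23 101 102
```

Outside the block, all warnings remain active.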

If you want to sort a list of strings in case-insensitive alphabetical order, you can, as mentioned before, use either the lc or the uc function, as in the following example:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my @array = sort {lc $a cmp lc $b} qw(map Perl while scalar);
 
print "@array\n";
# it prints: map Perl scalar while

In this case the strings are sorted in ascending lexicographical order, regardless of whether they contain capital letters. Please notice that the lc function doesn't modify the values of $a or $b; it returns a lowercase copy of them.
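When two words differ only in case, a case-insensitive comparison treats them as equal. The sketch below adds a second comparison as a tie-breaker so that the result is predictable; the word list and variable names are ours, picked for illustration.

```perl
use strict;
use warnings;

my @words = qw(perl Map map Perl);

# compare case-insensitively first; if two words are equal that way,
# fall back to a plain string comparison
my @sorted = sort { lc($a) cmp lc($b) or $a cmp $b } @words;

print "@sorted\n";   # Map map Perl perl
print "@words\n";    # perl Map map Perl (the original list is unchanged)
```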

As for hashes, you should know that Perl stores the items internally in its own order. So, generally, you can’t keep your hash items in a specific order, unless you use the Tie::IxHash Perl module, which preserves the order in which the hash elements were added. But you can access the items in any order you want.

The following code snippet shows you how to print the elements of a hash in a specific order of the keys:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# define a hash
my %hash = (one => 1, two => 2, three => 3, four => 4);
 
# using Perl sort function with a foreach loop
foreach my $key (sort keys %hash) {
  print "$key: $hash{$key}\n";
}

The order of the pairs inside the hash itself remains unchanged, but we process the hash elements in the order we need. In the above example, the keys function returns a list of the hash keys, the Perl sort function sorts this list in ascending string order (the default), the foreach loop traverses the sorted list, and the print function displays each key and value pair.

It produces the following output:

four: 4
one: 1
three: 3
two: 2
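If the keys are numbers, the default string order is usually not what you want. The following sketch uses a made-up %label hash to compare the default order with a numeric sort of the keys.

```perl
use strict;
use warnings;

# a hypothetical hash with numeric keys
my %label = (1 => 'one', 2 => 'two', 10 => 'ten', 20 => 'twenty');

# default sort: the keys are compared as strings
my @string_order  = sort keys %label;

# numeric sort: the keys are compared as numbers
my @numeric_order = sort { $a <=> $b } keys %label;

print "@string_order\n";    # 1 10 2 20
print "@numeric_order\n";   # 1 2 10 20

foreach my $key (@numeric_order) {
  print "$key: $label{$key}\n";
}
```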

You can access the hash elements in a specific order of their values. See the following code:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# define a hash
my %hash = (1 => 'compile', 2 => 'binary',
            3 => 'ascii',   4 => 'digit');
 
# print the hash ordered pairs
foreach (sort {$hash{$b} cmp $hash{$a}} keys %hash) {
  print "$_: $hash{$_}\n";
}

Here we used the sort function with the cmp (string comparison) operator, and the elements of the hash are printed in descending alphabetical order of their values.

The keys function returns a list with the hash keys.

The elements of this list (the keys of %hash) are assigned to $a and $b for comparisons and the notation $hash{$a} means the corresponding value of the $a hash key. The Perl sort function will return the sorted list as argument to the foreach loop. The foreach loop will iterate through this list using the $_ special variable.

The output is as follows:

4: digit
1: compile
2: binary
3: ascii
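The same idea works for numeric values. In the sketch below, the %score hash and its contents are invented for illustration: the keys are sorted by value in descending numerical order, and keys with the same value are ordered alphabetically.

```perl
use strict;
use warnings;

# a hypothetical hash with numeric values, two of them equal
my %score = (alice => 90, bob => 85, carol => 90, dave => 70);

# highest value first; equal values are ordered by key
my @ranking = sort { $score{$b} <=> $score{$a} or $a cmp $b } keys %score;

foreach my $name (@ranking) {
  print "$name: $score{$name}\n";
}
```

Without the second comparison, the relative order of alice and carol would depend on the order in which keys returns them.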
 

Perl hashes are not ordered and you must not rely on the order in which you added the hash items – internally, Perl uses its own way to store the items.

The following example shows you a simple way to print the hash elements in descending order of key length, using the sort and length functions:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# define a hash
my %hash = (car => 1, apple => 1, carrot => 1, furniture => 1);
 
# foreach loop
foreach my $key (sort { length $b <=> length $a } keys %hash) {
  print "$key: $hash{$key}\n";
}

The output:
 
furniture: 1
carrot: 1
apple: 1
car: 1
 
Note that the length function is called without parentheses to make the code more readable.

The following example shows you how to use the length function to sort the values of a hash by length:

#!/usr/local/bin/perl
 
use warnings;
use strict;
 
# populate a hash
my %names = qw(m1 Minie n1 Nona a1 Ashley a2 Ashlie
               m2 Milly b1 Belynda s1 Sheena b2 Bernice
               s2 Shania s3 Sparrow k1 Kris);
 
my @names = sort { length $names{$a} <=> length $names{$b} }
              keys %names;
        
foreach my $i (0 .. scalar @names - 1) {
  print "\n" unless $i % 5;
  print "$names{$names[$i]}\t";
}
print "\n";

Just a few words about this code:
 
  • the qw operator is used to populate the %names hash
  • the keys function returns the keys of the hash as a list
  • the Perl sort function sorts this list in ascending numeric order by the length of the values associated with the keys, and the sorted keys are stored in the @names array; please note that Perl allows us to use the same identifier (names in our case) for different variable types (scalars, arrays or hashes)
  • to print the array, a foreach loop over the indices is used, where scalar @names returns the number of elements in the @names array. To print 5 strings per line, the unless modifier and the % operator are used
  • $names[$i] is the current element of the @names array, which is a key of the %names hash; $names{$names[$i]} is the hash value associated with the $names[$i] key

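If you run this code, you’ll get output like the following. Names of the same length may appear in a different order from run to run, because the order in which keys returns the hash keys is not fixed: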

Kris    Nona    Milly   Minie   Sheena
Shania  Ashlie  Ashley  Belynda Sparrow
Bernice
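If you need the same output on every run, one possible approach is to add a second comparison for values of equal length. This is a sketch of that variation, not the original script; @keys_in_order and @values_in_order are names we introduced.

```perl
use strict;
use warnings;

my %names = qw(m1 Minie n1 Nona a1 Ashley a2 Ashlie
               m2 Milly b1 Belynda s1 Sheena b2 Bernice
               s2 Shania s3 Sparrow k1 Kris);

# sort the keys by value length; for values of equal length,
# fall back to comparing the values themselves as strings
my @keys_in_order = sort {
    length($names{$a}) <=> length($names{$b})
        or $names{$a} cmp $names{$b}
} keys %names;

# look up the value for each sorted key
my @values_in_order = map { $names{$_} } @keys_in_order;

print "@values_in_order\n";
# Kris Nona Milly Minie Ashley Ashlie Shania Sheena Belynda Bernice Sparrow
```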

Next, we want to sort a multi-line string in ascending order by the last field of each line, using the Perl sort function and the technique known as the Schwartzian Transform.

Keep in mind that the algorithm takes longer to describe than it is complicated. At first sight it looks a bit involved, but if you take the time to read through it, you’ll see how simple and clear it is. As a bonus, you’ll get plenty of practice with the map function.

A sample string consisting of 4 lines is as follows:

three 13  3  1  91 3
one   11  5  1  45 1
two   12  7  1  33 2
four  14  1  1  26 4
 
and we want to order these lines by the last column (the numbers 3, 1, 2, 4).

Using the Schwartzian Transform, we execute the following steps:

  1. The string is split into a list whose elements are the lines of the string, without the newlines; we do this by using the split function
  2. Using the map function, the above list is turned into a list of references; each reference points to an anonymous array consisting of two elements: the original line and the value of the last field of this line
  3. Using the Perl sort function, the list of references is sorted in ascending order by the second element of each anonymous array
  4. The map function is used again to get back the original lines, but this time in the desired order
  5. Using the join function, the above list is converted back into a single string

Here’s the code:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my $str = "three 13  3  1  91 3
one   11  5  1  45 1
two   12  7  1  33 2
four  14  1  1  26 4";
 
$str = join "",
      map { $_->[0]."\n" }
      sort { $a->[1] <=> $b->[1] }
      map { [$_, (split)[-1]] }
      split /\n/, $str;
 
print "$str\n";

and the output:
 
one   11  5  1  45 1
two   12  7  1  33 2
three 13  3  1  91 3
four  14  1  1  26 4
 
The join, map, sort and split calls are chained, so the statement is evaluated from right to left: each function takes the list produced on its right and hands its result to the function on its left. In our code, the split function returns a list to the second map, the second map returns a list to the Perl sort function, and so on, until finally the join function returns a string that is assigned to the $str variable, as you can see in the following diagram:
 
join <- map <- sort <- map <- split
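If the chained form is hard to follow, the same pipeline can be written with one named variable per stage. This is only an illustrative unrolling of the statement above; the intermediate array names (@lines, @decorated, @ordered, @restored) are ours.

```perl
use strict;
use warnings;

my $str = "three 13  3  1  91 3
one   11  5  1  45 1
two   12  7  1  33 2
four  14  1  1  26 4";

my @lines      = split /\n/, $str;                          # split
my @decorated  = map { [ $_, (split)[-1] ] } @lines;        # map (second)
my @ordered    = sort { $a->[1] <=> $b->[1] } @decorated;   # sort
my @restored   = map { $_->[0] . "\n" } @ordered;           # map (first)
my $sorted_str = join "", @restored;                        # join

print $sorted_str;
```

The chained version avoids the temporary arrays, but both forms do the same work in the same order.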
 
Below, we’ll describe the main steps of the algorithm, allowing you to see the intermediate results, to better understand what happens at each step of the algorithm.

The script begins by assigning the sample string to the $str variable. The string consists of 4 lines separated by newline characters (the last line has no trailing newline).

The script continues with a compound statement which, as we mentioned before, must be read from right to left. To make the code more readable, we split this compound statement across several lines.

Step 1.

split /\n/, $str;

The split function converts $str into a list using the newline as delimiter; every line of the string, without its newline, becomes an element of this list, so we get a list with 4 elements. For instance, the first element of the list is:
 
three 13  3  1  91 3
 

Step 2.

map { [$_, (split)[-1]] }

The map function takes as its argument the list returned by the split function and returns a list of references, where each reference points to an anonymous array consisting of two elements: the original line and the last field of the line.

Each element of the argument list is assigned in turn to $_, and inside the map block the [$_, (split)[-1]] construct returns a reference to a new anonymous array that consists of two elements:

  • $_
  • (split)[-1], where split is called without any argument, which means that it splits the line stored in $_ into a list on whitespace; [-1] is the index of the element of this list that is returned (an index of -1 means the last element of the list)
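The (split)[-1] construct is a list slice: the parentheses hold the list and the square brackets pick elements from it. Here is a small sketch on a single line of our sample; the variable names are ours.

```perl
use strict;
use warnings;

my $line = 'three 13  3  1  91 3';

# split on whitespace and keep only the last field
my $last  = (split ' ', $line)[-1];

# index 0 gives the first field
my $first = (split ' ', $line)[0];

# a slice can also pick several elements at once
my ($name, $final) = (split ' ', $line)[0, -1];

# with no arguments, split works on $_ and splits on whitespace
my $last_via_default;
for ($line) {
  $last_via_default = (split)[-1];
}

print "$first $last $name $final $last_via_default\n";   # three 3 three 3 3
```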

To see how the map function works, I used the following code:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my @list = ("three 13  3  1  91 3", "one   11  5  1  45 1",
            "two   12  7  1  33 2", "four  14  1  1  26 4");
 
my @refList = map { [$_, (split)[-1]] } @list;
 
# see what it is in @refList
use Data::Dumper;
print Dumper(@refList);

Here the @list array variable contains the list returned by the split function in Step 1. The list of references returned by map is stored in the @refList array variable. To see the content of the @refList array, the Data::Dumper module is used.

The output of the above script is as follows:

$VAR1 = [
          'three 13  3  1  91 3',
          '3'
        ];
$VAR2 = [
          'one   11  5  1  45 1',
          '1'
        ];
$VAR3 = [
          'two   12  7  1  33 2',
          '2'
        ];
$VAR4 = [
          'four  14  1  1  26 4',
          '4'
        ];
 
Step 3.

sort { $a->[1] <=> $b->[1] }

The Perl sort function takes as its argument the list of references described above and orders this list by the second element of the sub-arrays. Here [1] is the index of that element within each sub-array and <=> is the numerical comparison operator.

Because $a->[1] appears on the left side of the <=> operator, the Perl sort function orders the list of references in ascending order of the second element of the sub-arrays. To print the list returned by the Perl sort function, we again use the Data::Dumper module and run the following code:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my $str = "three 13  3  1  91 3
one   11  5  1  45 1
two   12  7  1  33 2
four  14  1  1  26 4";
 
my @refList = sort { $a->[1] <=> $b->[1] }
              map { [$_, (split)[-1]] }
              split /\n/, $str;
 
use Data::Dumper;
print Dumper(@refList);

It will produce the following output:
 
$VAR1 = [
          'one   11  5  1  45 1',
          1
        ];
$VAR2 = [
          'two   12  7  1  33 2',
          2
        ];
$VAR3 = [
          'three 13  3  1  91 3',
          3
        ];
$VAR4 = [
          'four  14  1  1  26 4',
          4
        ];
 
As you can notice, this time the list of references is sorted in the numerical ascending order by the second element of the sub-arrays.
 

Step 4.

map { $_->[0]."\n" }

Now we must get rid of the second element of the sub-arrays and turn the list of references back into a simple list whose elements are the lines of the initial string.

The list of references returned by the Perl sort function is the argument of this map function. Each element of the list of references, which as you know is a scalar, is assigned in turn to $_. Here $_->[0] is the first element (index 0) of the current sub-array, and it is concatenated with the newline character. The result is an ordered list of the lines of the initial string.

See the code:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my $str = "three 13  3  1  91 3
one   11  5  1  45 1
two   12  7  1  33 2
four  14  1  1  26 4";
 
my @list = map { $_->[0]."\n" }
           sort { $a->[1] <=> $b->[1] }
           map { [$_, (split)[-1]] }
           split /\n/, $str;
 
use Data::Dumper;
print Dumper(@list);

The output is as follows:
 
$VAR1 = 'one   11  5  1  45 1
';
$VAR2 = 'two   12  7  1  33 2
';
$VAR3 = 'three 13  3  1  91 3
';
$VAR4 = 'four  14  1  1  26 4
';
 
You can see the presence of the newline after the last field of each element of this list.

Step 5.

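$str = join "",

The second argument of the join function is the list described in the previous step. Using the empty string "" as separator, the elements of the list are concatenated into a single string that contains the lines of our initial string, now ordered by the last field. Because Step 4 appended a newline to every line, this string also ends with a newline.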

Finally, we’ll get as output:

one   11  5  1  45 1
two   12  7  1  33 2
three 13  3  1  91 3
four  14  1  1  26 4
 
i.e., the lines of the string are sorted in ascending order by their last field.
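Why go through the extra map steps instead of extracting the last field directly inside the sort block? The sketch below counts how often the key is extracted in each approach. The last_field helper and the %calls counter are hypothetical names introduced only for this comparison.

```perl
use strict;
use warnings;

my @lines = ('three 13  3  1  91 3', 'one   11  5  1  45 1',
             'two   12  7  1  33 2', 'four  14  1  1  26 4');

my %calls = (naive => 0, transform => 0);

# hypothetical helper: returns the last field and counts each call
sub last_field {
  my ($line, $counter) = @_;
  $calls{$counter}++;
  my $key = (split ' ', $line)[-1];
  return $key;
}

# naive sort: the key is extracted twice for every comparison
my @naive = sort { last_field($a, 'naive') <=> last_field($b, 'naive') } @lines;

# Schwartzian Transform: the key is extracted once per line
my @transformed = map  { $_->[0] }
                  sort { $a->[1] <=> $b->[1] }
                  map  { [ $_, last_field($_, 'transform') ] } @lines;

print "naive: $calls{naive} extractions\n";
print "transform: $calls{transform} extractions\n";
```

Both approaches return the lines in the same order. With only four lines the difference is small, but it grows with the length of the list and the cost of computing the key.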

Next, we give you an example of how to use the Perl sort function to sort a matrix by its columns, giving priority to the first column, then to the second, and so on. Please note that the following algorithm doesn't change the order of the items within a row, only the order of the rows.

Let’s try to sort the following matrix, whose elements are either numbers or strings (though we can’t mix numbers and strings in the same column):

5,   'aaa',  33,  'bbb',  12
11,  'asd',  121, 'bnm',  16
5,   'aaa',  22,  'ewq',  13
5,   'abde', 123, 'aqq',  15
5,   'aaa',  33,  'ccc',  11  
5,   'abde', 78,  'azxx', 14
 
First we’ll store this matrix in an array of arrays (@AoA) and then we’ll sort this array by the columns of the inner arrays. The elements of the @AoA array are references to the rows of the matrix, each inner array holding the items contained in a row. See the code first:
 
#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# define an array of anonymous arrays
my @AoA = (
 [5,  'aaa',  33,  'bbb',  12],
 [11, 'asd',  121, 'bnm',  16],
 [5,  'aaa',  22,  'ewq',  13],
 [5,  'abde', 123, 'aqq',  15],
 [5,  'aaa',  33,  'ccc',  11],
 [5,  'abde', 78,  'azxx', 14]
);
 
# using Perl sort function to sort the @AoA array
@AoA = sort {
 $b->[0] <=> $a->[0] ||
 $a->[1] cmp $b->[1] ||
 $a->[2] <=> $b->[2] ||
 $b->[3] cmp $a->[3];
} @AoA;                  
 
# print the @AoA array
foreach my $item1 (@AoA){
  foreach my $item2 (@{$item1}){
    print "$item2\t"; 
  }
  print "\n";
}

As you can see, we use the sort function to sort the array:
 
  • numerically descending by the first column (index 0) of the inner arrays, next
  • ASCIIbetically ascending by the second column (index 1), next
  • numerically ascending by the third column (index 2), next
  • ASCIIbetically descending by the fourth column (index 3)

You don’t need to sort the matrix by all its columns; for instance, we don’t use the last column. We used the || operator to indicate, from left to right, the priority of the columns in the sort: a later comparison is evaluated only when all the earlier ones return 0.

Finally, @AoA is printed using two nested foreach loops.

Here is the output:

11      asd     121     bnm     16
5       aaa     22      ewq     13
5       aaa     33      ccc     11
5       aaa     33      bbb     12
5       abde    78      azxx    14
5       abde    123     aqq     15
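The same kind of priority chain also works with the first syntax form, where the comparison lives in a named subroutine. The following sketch uses a reduced three-row matrix and sorts by two columns only; the subroutine name by_first_desc_then_third and the @rows data are ours.

```perl
use strict;
use warnings;

# a reduced, hypothetical matrix with three columns
my @rows = (
  [5,  'aaa', 33],
  [11, 'asd', 121],
  [5,  'aaa', 22],
);

# first column descending; if equal, third column ascending
sub by_first_desc_then_third {
  $b->[0] <=> $a->[0] or $a->[2] <=> $b->[2]
}

my @sorted = sort by_first_desc_then_third @rows;

# turn each row into a comma-separated string for printing
my @summary = map { join ',', @$_ } @sorted;
print "$_\n" foreach @summary;
```

A named subroutine is handy when the same ordering is needed in several places in a script.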

The following example shows you how to sort a list of IP addresses, using the Perl sort function, the Schwartzian Transform and the pack function.

I intend to sort the following list of IPv4 addresses in ascending numerical order:

192.168.100.1
192.168.50.77
20.20.10.100
45.56.67.20
 
The following code does the job:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my @ipAddr =
   ('192.168.100.1', 
    '192.168.50.77',
    '20.20.10.100',
    '45.56.67.20'
   );
 
@ipAddr = map { $_->[0] }
         sort { $a->[1] cmp $b->[1] }
         map {[ $_, pack( 'C*', split /\./ ) ]} @ipAddr;
 
print "$_\n" foreach (@ipAddr);

After running this code, you’ll get the following output:

20.20.10.100
45.56.67.20
192.168.50.77
192.168.100.1
 
I populated the @ipAddr array with the IP addresses that we want to sort in ascending numerical order. This means that we want the list sorted by the numeric value of the first number, then by the numeric value of the second number, and so on.

The second map function (the one at the bottom, which runs first) is used with a block and takes the @ipAddr array as its argument. At each iteration step, the current element of the array is assigned to $_. Inside the map block, the [$_, pack('C*', split /\./)] construct returns a reference to a new anonymous array that consists of two elements:

  • $_
  • a 4-byte string returned by the pack function (the split function returns a list with the numbers corresponding to an IP address and pack converts this list into a 4-byte string, for example if the IP address is 20.20.10.100, the split function will return the (20,20,10,100) list and the pack function will return a string having as value \x14\x14\x0a\x64)
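To make the packed key visible, the following sketch prints it in hexadecimal with unpack and compares two of our addresses both as plain strings and as packed strings. The variable names are ours.

```perl
use strict;
use warnings;

# pack the four numbers of an address into a 4-byte string
my $packed = pack 'C*', split /\./, '20.20.10.100';
my $hex    = unpack 'H*', $packed;    # hexadecimal view of the bytes
my $bytes  = length $packed;

print "$hex ($bytes bytes)\n";        # 14140a64 (4 bytes)

# as plain strings, 192.168.50.77 sorts after 192.168.100.1 ('5' gt '1')
my $plain_cmp  = '192.168.50.77' cmp '192.168.100.1';

# as packed strings, byte 50 is less than byte 100, so the order is correct
my $packed_cmp = pack('C*', split /\./, '192.168.50.77')
             cmp pack('C*', split /\./, '192.168.100.1');

print "plain: $plain_cmp, packed: $packed_cmp\n";   # plain: 1, packed: -1
```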

The map function returns a list of references to the two-element anonymous arrays discussed above. The Perl sort function orders this list by the second element of each array ($a->[1]). Please note the use of the string comparison operator cmp: comparing the packed strings byte by byte gives the correct numerical order.

There is another way to sort the IP list, using the numeric <=> comparison operator and without the pack function. To do this, you can replace the following two lines:

          sort { $a->[1] cmp $b->[1] }
          map {[ $_, pack( 'C*', split /\./ ) ]} @ipAddr;

with the next lines:
 
          sort { $a->[1] <=> $b->[1] ||
                 $a->[2] <=> $b->[2] ||
                 $a->[3] <=> $b->[3] ||
                 $a->[4] <=> $b->[4] }
          map {[ $_, split /\./]} @ipAddr;

Well, I can’t say this code is shorter! The first solution is a better option.
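For completeness, here is the second variant assembled into a script you can run as is. It is the same code as above with the replacement applied; the only change is that the result is stored in a separate @sorted array (our name) so that the input stays available.

```perl
use strict;
use warnings;

my @ipAddr =
   ('192.168.100.1',
    '192.168.50.77',
    '20.20.10.100',
    '45.56.67.20'
   );

# each anonymous array holds the address followed by its four numbers,
# which are compared numerically from left to right
my @sorted = map  { $_->[0] }
             sort { $a->[1] <=> $b->[1] ||
                    $a->[2] <=> $b->[2] ||
                    $a->[3] <=> $b->[3] ||
                    $a->[4] <=> $b->[4] }
             map  { [ $_, split /\./ ] } @ipAddr;

print "$_\n" foreach (@sorted);
```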