How to deal with arrays of arrays - Part 3

In particular, a matrix can be represented in memory as a Perl array of arrays (@AoA).

To remove the last row of the matrix, you need to remove the last element of the array of arrays, which is a reference.

To remove the last element of an array you can use the pop function.

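But in our case the last element of the array is a reference to an inner array. pop removes that reference from @AoA and returns it; the inner array itself stays in memory for as long as some reference to it exists (for example, a copy you saved in a scalar variable).

Perl frees the inner array automatically once the last reference to it goes away, so clearing it by hand is optional. Still, if you keep the reference around and want to release the contents of the row early, it’s not a big chunk of code to manage. In the example below I show you how to do this.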

See the next code snippet:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my @AoA = (
  [ 'a1', 'b1', 1 ],
  [ 'a2', 'b2', 2 ],
  [ 'a3', 'b3', 3 ],
);
 
# remove the last element of @AoA and save it in $ref
my $ref = pop @AoA;
 
# print the @AoA
print map "@$_\n", @AoA;
 
# print the inner array referenced by $ref
print "\$ref = @$ref\n";

This code outputs:

a1 b1 1
a2 b2 2
$ref = a3 b3 3

To empty the inner array referenced by $ref, you can assign an empty list to it:

@$ref = ();

Or, to remove the last element of the @AoA array and empty the inner array it refers to in one step, you can use a one-line statement:

@{pop @AoA} = ();
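
If you are curious whether the inner array really goes away when nothing refers to it any more, the following minimal sketch may help. It relies on weaken from the core Scalar::Util module; the variable names $watch and $status are my own and are used only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my @AoA = (
  [ 'a1', 'b1', 1 ],
  [ 'a2', 'b2', 2 ],
);

# keep a weak reference to the last row; a weak reference
# does not keep the inner array alive on its own
my $watch = $AoA[-1];
weaken($watch);

# discard the last row without saving the returned reference
pop @AoA;

# a weak reference becomes undef once its target has been freed
my $status = defined $watch ? "still in memory" : "freed";
print "last row: $status\n";
```

On my understanding of Perl's reference counting, this reports that the row was freed, because the only strong reference to it was the one stored in @AoA.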

To entirely clear a Perl array of arrays (@AoA), you can use the following code:

@$_ = () foreach (@AoA);
@AoA = ();

The first statement clears all inner arrays (in a foreach loop) and after that the second statement clears the main array.

To clear an array, instead of assigning an empty list to it you can use undef:

undef @array;

(but not @array = undef, which leaves the array with a single element whose value is undef)

So the previous code can be rewritten as:

undef @$_ foreach (@AoA);
undef @AoA;
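
To see the difference between undef @array, @array = () and @array = undef for yourself, you can count the elements left after each statement. This is only a small sketch; the three array names are made up for the comparison.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @cleared  = (1, 2, 3);
my @emptied  = (1, 2, 3);
my @mistaken = (1, 2, 3);

undef @cleared;       # no elements left
@emptied  = ();       # no elements left
@mistaken = undef;    # one element, whose value is undef

printf "%d %d %d\n", scalar @cleared, scalar @emptied, scalar @mistaken;
# prints: 0 0 1
```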

If you have a matrix represented as a Perl array of arrays (@AoA) and you want to insert a column at the beginning of your matrix, you can do this by using the unshift function as in the following example:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my @AoA = (
 [12, 02, 34],
 [29, 22, 89],
 [11, 35, -5],
);
 
my @column = (1, 2, 3);
 
for (my $i=0; $i <= $#AoA; $i++) {
 unshift @{$AoA[$i]}, $column[$i];
}
 
# print the @AoA
print map "@$_\n", @AoA;

The output is as follows:

1 12 2 34
2 29 22 89
3 11 35 -5

As you can see, the inserted column became the first column of your matrix.

This time I used a C-style for loop to traverse the Perl array of arrays. Here $AoA[$i] is a reference to matrix row number $i (counting from 0). Because it is a reference to an anonymous array and unshift expects an array as its first argument, you need to dereference it by enclosing it in braces and prefixing it with the @ sign.

Finally, the array of arrays is printed by using print and map.
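
unshift always inserts at the beginning of each row. If you need the new column somewhere else, one possible variation is to use splice with a length of 0, which inserts without removing anything. In the sketch below, $pos is a hypothetical variable holding the target position.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @AoA = (
  [12, 2, 34],
  [29, 22, 89],
  [11, 35, -5],
);

my @column = (1, 2, 3);
my $pos    = 1;   # hypothetical target position (0 = first column)

for my $i (0 .. $#AoA) {
  # remove 0 elements at $pos and insert one new value there
  splice @{ $AoA[$i] }, $pos, 0, $column[$i];
}

print map "@$_\n", @AoA;
# prints:
# 12 1 2 34
# 29 2 22 89
# 11 3 35 -5
```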

If you have a matrix represented as a Perl array of arrays (@AoA) and you want to insert a row at the beginning of your matrix, you can use the unshift function as shown in the following example:

#!/usr/local/bin/perl

use strict;
use warnings;

my @AoA = (
  [12, 02, 34],
  [29, 22, 89],
  [11, 35, -5],
);

my @row = (1, 2, 3);
unshift @AoA, \@row;

# print the @AoA
print map "@$_\n", @AoA;

The output of this snippet is as follows:

1 2 3
12 2 34
29 22 89
11 35 -5

As you can see, the inserted row became the first row of your matrix.

Here \@row is a reference to the @row array. This reference is inserted at the beginning of the @AoA array using unshift.

If you don’t want to use an explicit array variable for your new row, you can use the [] array constructor instead:

unshift @AoA,[1, 2, 3];
 
As you know, the array constructor returns a reference to the anonymous array whose elements are enclosed between square brackets.

Finally the array of arrays is printed by using print and map.
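
One detail worth knowing is that \@row does not copy the row: the matrix and the @row variable share the same array. If you want the matrix to own an independent copy, you can write [ @row ] instead. The following sketch shows both behaviours side by side.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @AoA = ( [12, 2, 34] );
my @row = (1, 2, 3);

unshift @AoA, \@row;       # the matrix now shares @row
push    @AoA, [ @row ];    # the matrix gets its own copy of @row

$row[0] = 99;              # change the original variable

print map "@$_\n", @AoA;
# prints:
# 99 2 3
# 12 2 34
# 1 2 3
```

The first row follows the change made to @row, while the copied row at the end keeps its original values.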

If you have a matrix represented as a Perl array of arrays (@AoA) and you want to append a column at the end of your matrix, you can see an example here:

#!/usr/local/bin/perl

use strict;
use warnings;

my @AoA = (
 [12, 02, 34],
 [29, 22, 89],
 [11, 35, -5],
);
my @column = (1, 2, 3);

for (my $i=0; $i <= $#AoA; $i++) {
 push @{$AoA[$i]}, $column[$i];
}

# print the @AoA
print map "@$_\n", @AoA;

The output is as follows:

12 2 34 1
29 22 89 2
11 35 -5 3

This time I used a C-style for loop to traverse the array of arrays. Here $AoA[$i] is a reference to the corresponding inner array. Because it is a reference to an anonymous array and push expects an array as its first argument, you need to dereference it by enclosing it in braces and prefixing it with the @ sign.

Finally the array of arrays is printed by using print and map.
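
The loop above quietly assumes that @column has exactly one value per row. If you are not sure about that, you could wrap the loop in a small subroutine that checks the lengths first. The helper name append_column is my own invention; treat this as a sketch rather than a finished routine.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical helper: append the values of @$column to the rows of @$matrix
sub append_column {
  my ($matrix, $column) = @_;
  die "column has " . @$column . " values, matrix has " . @$matrix . " rows\n"
    unless @$column == @$matrix;
  push @{ $matrix->[$_] }, $column->[$_] for 0 .. $#$matrix;
  return $matrix;
}

my @AoA = (
  [12, 2, 34],
  [29, 22, 89],
  [11, 35, -5],
);

append_column(\@AoA, [1, 2, 3]);
print map "@$_\n", @AoA;

# a column of the wrong length is rejected and the matrix is left alone
my $ok = eval { append_column(\@AoA, [1, 2]); 1 };
print "rejected: $@" unless $ok;
```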

If you have a matrix represented as a Perl array of arrays (@AoA) and you want to append a row to your matrix, you can see an example here:

#!/usr/local/bin/perl

use strict;
use warnings;

my @AoA = (
 [12, 02, 34],
 [29, 22, 89],
 [11, 35, -5],
);

my @row = (1, 2, 3);
push @AoA, \@row;

# print the @AoA
print map "@$_\n", @AoA;

The output of this snippet is as follows:

12 2 34
29 22 89
11 35 -5
1 2 3

Here \@row is a reference to the @row array. This reference is appended to the @AoA array using push.

If you don’t want to use an explicit array variable for your new row, you can use the [] array constructor instead:

push @AoA,[1, 2, 3];

As you know, the array constructor returns a reference to the anonymous array whose elements are enclosed between square brackets.

Finally the array of arrays is printed by using print and map.

The above example is for a matrix but you can use the code in a similar way to append a new reference to any array of arrays.

The syntax form you can use is the following:

push @AoA, [ list ];

where list represents the elements of the new anonymous array. If you want, you can see the snippet below for an example:

#!/usr/bin/perl
 
use strict;
use warnings;
 
my @AoA = (
 [ qw(blue red yellow green) ],
 [ qw(circle triangle rectangle) ],
);
push @AoA, [ qw(one two three four) ];
 
# print the @AoA array
print map "@$_\n", @AoA;

The output produced by this code is the following:

blue red yellow green
circle triangle rectangle
one two three four

The qw operator was used to quote the elements of each list.
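
Since push accepts a list, you are not limited to one new inner array per call. Here is a short sketch that appends several references at once; the @more array is only there for the demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @AoA  = ( [ qw(blue red) ] );
my @more = ( [ qw(circle square) ], [ qw(one two) ] );

push @AoA, [ qw(x y) ], [ qw(p q) ];   # two anonymous arrays at once
push @AoA, @more;                      # every reference stored in @more

print scalar(@AoA), " rows\n";
print map "@$_\n", @AoA;
# prints:
# 5 rows
# blue red
# x y
# p q
# circle square
# one two
```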

Let’s say you have a flat file named persons.txt that has the following record structure:

firstname,lastname,salary

We’ll test this script on a file with three records only:

John,Silva,2520
Mary,Brown,3200
Anne,Williams,5280

We intend to read this file and create a Perl array of arrays (@AoA) holding the records of the file.

Each element of this array is a reference to a @columns array whose elements are the fields of one row of the file:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
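my @AoA;
open my $fh, '<', 'persons.txt'
  or die "Cannot open persons.txt: $!";
 
# read the file line by line; each line is placed in $_
while (<$fh>) {
 # chomp the trailing newline from $_
 chomp;
 # a fresh @columns is created on every iteration,
 # so each row gets its own inner array
 my @columns = split /,/, $_;
 push @AoA, \@columns;
}
close $fh;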
 
# print the array
print map "@$_\n", @AoA;

The output of this script is as follows:

John Silva 2520
Mary Brown 3200
Anne Williams 5280

Please note the way the split function was used to split the record of the file into columns.
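
If you want to experiment with split without creating a file, you can read from a string through an in-memory filehandle. The sketch below also passes -1 as the third argument of split, which keeps trailing empty fields; by default split drops them. The sample data, with a missing salary for Mary, is invented for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# sample records held in a string instead of persons.txt
# (an assumption made only to keep the sketch self-contained)
my $data = "John,Silva,2520\nMary,Brown,\n";

open my $fh, '<', \$data
  or die "Cannot open in-memory file: $!";

my @AoA;
while (my $line = <$fh>) {
  chomp $line;
  # the -1 limit keeps trailing empty fields such as the missing salary
  push @AoA, [ split /,/, $line, -1 ];
}
close $fh;

print scalar(@$_), " fields: @$_\n" for @AoA;
```

Both rows end up with three fields, so every row of the resulting matrix has the same number of columns.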

You can represent a matrix with n rows and m columns as a Perl array of arrays (@AoA). This array has n scalar elements: the first element is a reference to an anonymous array whose elements are the elements of the first matrix row, the second element is a reference to an anonymous array whose elements are the elements of the second matrix row, and so on, each inner array having m elements.
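
As a quick illustration of this layout, the sketch below reads the dimensions n and m back from a small matrix and accesses a single element. It assumes that all rows have the same length, so the first row is used to count the columns.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @AoA = (
  [12, 2, 34],
  [29, 22, 89],
);

my $n = @AoA;            # number of rows
my $m = @{ $AoA[0] };    # number of columns, taken from the first row

print "rows: $n, columns: $m\n";
print "row 1, column 2: $AoA[1][2]\n";      # indexes start at 0
print "same element:    $AoA[1]->[2]\n";    # explicit arrow form
```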

You can use the splice function to replace a contiguous slice of columns with other columns taken from another matrix. In this example we assume that both matrices have the same number of rows.

First, see the code:

#!/usr/local/bin/perl

use strict;
use warnings;

my @AoA = (
 ['a11', 'a14', 'a15', 'a15'],
 ['a21', 'a24', 'a25', 'a25'],
 ['a31', 'a34', 'a35', 'a35'],
);

my @tempAoA = (
 ['a12', 'a13', 'a14'],
 ['a22', 'a23', 'a24'],
 ['a32', 'a33', 'a34'],
);

my $i;
for ($i=0; $i <= $#AoA; $i++) {
 splice @{$AoA[$i]}, 1, 2, @{$tempAoA[$i]};
}

# clear the @tempAoA array
@$_ = () foreach(@tempAoA); # the inner arrays first
@tempAoA = ();

# print the @AoA
print map "@$_\n", @AoA;

The output is as follows:

a11 a12 a13 a14 a15
a21 a22 a23 a24 a25
a31 a32 a33 a34 a35
 
Notice that we replaced the second and third columns of the AoA matrix with the columns of the tempAoA matrix.

We used a for loop to step through the elements of the Perl @AoA array. The $i scalar variable is used to cycle through the matrix rows. The splice function operates on the inner arrays of @AoA, taking the replacement values from the inner arrays of @tempAoA.

Because splice copies the replacement values into the inner arrays of @AoA, you can empty the @tempAoA array afterwards without affecting the content of the @AoA array.

If you want to delete columns only, you can use splice as in the following example:

 splice @{$AoA[$i]}, 1, 2;

Executed inside the same for loop, this line removes two columns from the AoA matrix, beginning with the second column (which has the index 1 in each inner array). The corresponding output is as follows:

    a11 a15
    a21 a25
    a31 a35
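
For completeness, here is a runnable sketch of the column deletion. It also shows that splice returns the removed elements, which you can keep if you need them; the @removed variable is my own addition.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @AoA = (
  ['a11', 'a14', 'a15', 'a15'],
  ['a21', 'a24', 'a25', 'a25'],
  ['a31', 'a34', 'a35', 'a35'],
);

for my $row (@AoA) {
  # remove two elements starting at index 1 and keep what was removed
  my @removed = splice @$row, 1, 2;
  print "removed: @removed\n";
}

print map "@$_\n", @AoA;
# prints:
# removed: a14 a15
# removed: a24 a25
# removed: a34 a35
# a11 a15
# a21 a25
# a31 a35
```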

As you know, a Perl array of arrays (@AoA) is an array whose elements are references to other arrays (we say that these references point to inner arrays).

The first example shows you how to get a slice from a given inner array. Have a look at the following code:

#!/usr/local/bin/perl

use warnings;
use strict;

my @AoA = (
 [12, 35, 73, 11, 98],
 [10, 32, 76, 43],
 [21, 19, 56, 84, 17, 71]
);

my @slice = ();
my $i = 2; # the index of the last inner array

for( my $j=2; $j<=4; $j++ ) {
 @slice = (@slice, $AoA[$i][$j]);
}

print "@slice\n";
# it prints: 56 84 17

I used for to loop through the elements of the last inner array. An alternative inside the for loop is to use the push function to add elements to our @slice array:

 push @slice, $AoA[$i][$j];

You can even drop the for loop and replace it with a slice operation:

@slice = @{ $AoA[2] } [ 2..4 ];

Here $AoA[2] is a reference to the third inner array, and the @{} notation is used to dereference it.
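
A slice like the one above works along a row. To pick values across rows, that is to extract a column, one common approach is map. The sketch below also shows that negative indexes in a slice count from the end of the inner array; the variable names $j, @column and @tail are chosen for this example only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @AoA = (
  [12, 35, 73, 11, 98],
  [10, 32, 76, 43],
  [21, 19, 56, 84, 17, 71]
);

# take element number $j from every inner array
my $j = 1;
my @column = map { $_->[$j] } @AoA;
print "@column\n";
# prints: 35 32 19

# negative indexes count from the end of the inner array
my @tail = @{ $AoA[2] }[-2, -1];
print "@tail\n";
# prints: 17 71
```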
 
The next example shows you how to get a slice from the @AoA array of arrays; as a result you’ll get a Perl array of arrays (@AoAslice) too. I’ll show you an example by using the same array of arrays from the previous example:

#!/usr/local/bin/perl

use warnings;
use strict;

my @AoA = (
 [12, 35, 73, 11, 98],
 [10, 32, 76, 43],
 [21, 19, 56, 84, 17, 71]
);

my @AoAslice = ();

for(my $i=1; $i<=2; $i++) {
 push @AoAslice, [ @{$AoA[$i]} [1..3] ];
}

# print the @AoAslice array 

foreach my $item1 ( @AoAslice ){
 foreach my $item2 ( @{ $item1 } ){
   print "$item2\t"; 
 }
 print "\n";
}

This snippet produces the following result:

32 76 43
19 56 84

We use the for loop and the $i variable to cycle through the @AoA array; the $i variable takes values from 1 to 2 (the second and third element of the @AoA array).

Inside the block of the for loop we use the push function to append to the @AoAslice array a reference to a slice of the corresponding inner array (more exactly, the elements having indexes from 1 to 3).

Finally, the resulting array is printed using two nested foreach.
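
If you prefer a more compact form, the same rectangular slice can be built with a single map over a slice of the outer array. This is just an alternative sketch of the loop above, not a replacement for it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @AoA = (
  [12, 35, 73, 11, 98],
  [10, 32, 76, 43],
  [21, 19, 56, 84, 17, 71]
);

# rows 1..2 are selected by the outer slice, columns 1..3 by the inner one
my @AoAslice = map { [ @{$_}[1..3] ] } @AoA[1..2];

print join("\t", @$_), "\n" for @AoAslice;
```

Each pass through map creates a new anonymous array, so @AoAslice does not share its inner arrays with @AoA.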

Instead of push you can assign to the @AoAslice array a list of two elements:

  • the array itself
  • the element to be appended to it

@AoAslice = (@AoAslice, [ @{$AoA[$i]} [1..3] ]);

As you can see, the resulting Perl array of arrays (@AoAslice) corresponds to a rectangular matrix. If you need to apply this mechanism more than once, you can write your own subroutine that does this. To give you an idea, here is a sample of such a subroutine:
 
#!/usr/local/bin/perl
 
use warnings;
use strict;
 
my @AoA = (
 [12, 35, 73, 11, 98],
 [10, 32, 76, 43],
 [21, 19, 56, 84, 17, 71]
);
 
# invoke the subroutine
my $ref = MatrixSlice(\@AoA, 1, 2, 1, 3);
 
sub MatrixSlice {
 # a reference to the matrix, then the row and column bounds
 my ($AoARef, $rMin, $rMax, $cMin, $cMax) = @_;
 my @slice = ();
 for (my $i = $rMin; $i <= $rMax; $i++) {
   # copy the requested columns of row $i into a new anonymous array
   push @slice, [ @{ $AoARef->[$i] }[$cMin..$cMax] ];
 }
 return \@slice;
}
 
# print the resulting slice array 
 
foreach my $item1 ( @$ref ){
 foreach my $item2 ( @{ $item1 } ){
   print "$item2\t"; 
 }
 print "\n";
}

Here $rMin and $rMax are the minimum and maximum row indexes, and $cMin and $cMax are the minimum and maximum column indexes (starting with 0).

This subroutine returns a reference to the rectangular matrix we are after. As an exercise, you can try to validate the values provided for rows and columns.
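
One possible way to approach that exercise is sketched below. The subroutine name matrix_slice_checked and the wording of the error messages are my own; the checks simply refuse any range that falls outside the rows or outside a particular inner array.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical variant of MatrixSlice that validates its arguments
sub matrix_slice_checked {
  my ($AoARef, $rMin, $rMax, $cMin, $cMax) = @_;

  die "row range $rMin..$rMax is out of bounds\n"
    if $rMin < 0 || $rMax > $#$AoARef || $rMin > $rMax;

  my @slice;
  for my $i ($rMin .. $rMax) {
    my $row = $AoARef->[$i];
    # rows may have different lengths, so every row is checked
    die "column range $cMin..$cMax is out of bounds in row $i\n"
      if $cMin < 0 || $cMax > $#$row || $cMin > $cMax;
    push @slice, [ @{$row}[$cMin .. $cMax] ];
  }
  return \@slice;
}

my @AoA = (
  [12, 35, 73, 11, 98],
  [10, 32, 76, 43],
  [21, 19, 56, 84, 17, 71]
);

my $ref = matrix_slice_checked(\@AoA, 1, 2, 1, 3);
print map "@$_\n", @$ref;

# row 1 has only four elements, so column index 4 is rejected
my $ok = eval { matrix_slice_checked(\@AoA, 1, 2, 1, 4); 1 };
print "error: $@" unless $ok;
```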

A Perl array of arrays (@AoA) is an array that has as elements references to other arrays. There are two ways to copy an array of arrays into a new one:

A shallow copy: it copies only the top-level content of the array of arrays (that is, the references) into a new array. You can do this by a simple assignment, as you can see below:

my @newAoA = @AoA;

Please note that by using this method you just copy the references from @AoA into @newAoA. The two arrays will share the inner arrays, in such a way that if you change an inner array, both @AoA and @newAoA are changed as they both point to the same anonymous array.
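
The sharing described above is easy to check. In the following sketch, a change made to an inner array through @AoA is visible through @newAoA, while a row added to @AoA itself is not.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @AoA = (
  [12, 2, 34],
  [29, 22, 89],
);

my @newAoA = @AoA;      # shallow copy: only the references are copied

$AoA[1][0] = 100;       # change an element of a shared inner array
push @AoA, [7, 8, 9];   # change the outer array only

# comparing two references with == compares the addresses they hold
print "same inner array\n" if $AoA[1] == $newAoA[1];
print "newAoA[1][0] = $newAoA[1][0]\n";
print "rows: ", scalar(@AoA), " vs ", scalar(@newAoA), "\n";
# prints:
# same inner array
# newAoA[1][0] = 100
# rows: 3 vs 2
```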

A deep copy: it copies the contents of the inner arrays too, so the new structure gets its own inner arrays. The following example shows you how to use a recursive subroutine to copy every piece of data into the new array of arrays:

#!/usr/local/bin/perl

use strict;
use warnings;

my @AoA = (
 [12, 02, 34],
 [29, 22, 89],
 [11, 35, -5],
);

my @newAoA = clone(@AoA);

# alter the first element of the second inner array
$AoA[1][0] = 100;

sub clone {
 map { ! ref() ? $_ : [clone(@$_)] } @_;
}

# print the @AoA
print "\@AoA:\n";
print map "@$_\n", @AoA;

# print the @newAoA
print "\n\@newAoA:\n";
print map "@$_\n", @newAoA;

The output is as follows:
 
    @AoA:
    12 2 34
    100 22 89
    11 35 -5

    @newAoA:
    12 2 34
    29 22 89
    11 35 -5

In this example we use the clone() subroutine to copy all the elements of our array of arrays.

Inside the body of the subroutine we use the map function to loop through @_ (the special @_ array holds the values passed to the subroutine). At each iteration step the current element of the @_ array is assigned in turn to $_.

Inside the map block the ? ternary operator and the ref function are used to test if an element of @_ array is a reference.

The map function will return:

  • the value stored in $_ if this value is not a reference, otherwise
  • a reference to a new independent anonymous array created by the [] array constructor; at the same time we call the subroutine again to copy the elements of the array referenced by $_

After the array was duplicated, the first element of the second inner array of @AoA was modified.

As you can see from the output, the contents of the two arrays are different: the @newAoA array has not been affected by this modification.

For more complicated structures you can use the Storable module, which provides the dclone function that makes recursive copies too (see perlfaq4).
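
As a brief sketch of the Storable approach: dclone takes a reference and returns a reference to the copy, so the array of arrays is passed as \@AoA and the result is dereferenced again.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(dclone);

my @AoA = (
  [12, 2, 34],
  [29, 22, 89],
  [11, 35, -5],
);

# dclone works on references, so pass \@AoA and dereference the result
my @newAoA = @{ dclone(\@AoA) };

# alter the original after the copy was made
$AoA[1][0] = 100;

print "@{ $AoA[1] }\n";      # prints: 100 22 89
print "@{ $newAoA[1] }\n";   # prints: 29 22 89
```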

Let’s say you have a Perl array of arrays (@AoA) and you want to use it within a subroutine.

A common way to do this is by passing the array of arrays by reference.

See a simple example below:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my @AoA = (
 ['a11', 'a12'],
 ['a21', 'a22'],
);
 
# invoke the subroutine
myPrint(\@AoA);
 
sub myPrint{
 my $arrayRef = shift;
 
 foreach my $item1 (@$arrayRef){
   foreach my $item2 (@$item1){
     print "$item2 "; 
   }
   print "\n";
 }
}

This script produces the following output:
 
    a11 a12
    a21 a22

First we populate the @AoA array of arrays with a few entries.

The [] is the array constructor and returns a reference to the anonymous array whose elements are included between square brackets.

The myPrint subroutine is used to print the array of arrays. It takes as its argument a reference to an array of arrays.

Inside the subroutine body we use the shift function to retrieve the argument and assign it to the $arrayRef scalar variable. So in $arrayRef we have a reference to our array of arrays. To dereference the array references, we prefix them with an @ sign.

To print the array of arrays, two nested foreach loops are used.

In a similar way, you can modify the subroutine to perform whatever processing you need inside its body.
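
Because the subroutine receives a reference, it can also change the caller's data in place. The sketch below multiplies every element of the matrix by a factor; the subroutine name scale is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical helper: multiply every element by $factor, in place
sub scale {
  my ($arrayRef, $factor) = @_;
  for my $row (@$arrayRef) {
    # inside the statement modifier, $_ is an alias to each element
    $_ *= $factor for @$row;
  }
  return;
}

my @AoA = (
  [1, 2],
  [3, 4],
);

scale(\@AoA, 10);
print map "@$_\n", @AoA;
# prints:
# 10 20
# 30 40
```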

In a Perl array of arrays (@AoA), you can use the exists function to avoid autovivification when you don’t intend to use it.

See the following example:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my @AoA = (
 ['a11'],
 ['a21', 'a22'],
);
 
# autovivification
$AoA[2][2] = 'a33'; 
defined $AoA[3][5] || print "not found\n";
 
use Data::Dumper;
print Dumper \@AoA;

First we populate an array of arrays with a few entries. An array of arrays is an array whose elements are references to other arrays. To get references to anonymous arrays, the [] array constructor was used.

Now let’s pay a bit of attention to this code.

The first assignment statement:

$AoA[2][2] = 'a33';

adds an entry to our array of arrays. Because $AoA[2] doesn’t exist, Perl creates it with an appropriate value (a reference to a new anonymous array), so you don’t need to create the inner array yourself ($AoA[2] = []). This process is called autovivification and it is very useful when you have to deal with this kind of assignment. The expression can be arbitrarily complicated and Perl will create all the structures it needs to make the assignment.

But if you look at the next statement:

defined $AoA[3][5] || print "not found\n";

Perl first evaluates the defined $AoA[3][5] expression, and because the result is false, the print function is executed. But in order to evaluate that expression, Perl needs to create the $AoA[3] element, which remains in the array as a reference to an empty inner array (see the output). This time the process of autovivification enlarged our @AoA structure with an unnecessary entry.

Please note the use of the || short-circuit operator, which evaluates its second operand only if the first operand evaluates to false.

To see what is happening, I printed the array of arrays using the Data::Dumper module. The output of this script is as follows:

not found
$VAR1 = [
          [
            'a11'
          ],
          [
            'a21',
            'a22'
          ],
          [
            undef,
            undef,
            'a33'
          ],
          []
        ];

As you can see our array was enlarged with an empty array reference.

As I mentioned at the beginning of this section, to avoid autovivification in this last case, you can use the exists function:

if(exists $AoA[3] && defined $AoA[3][5]) {
 print "found\n"; 
}

Here we use the && short-circuit operator, which evaluates its second operand only if the first operand evaluates to true: first we test whether exists $AoA[3] is true, and only in that case do we check whether $AoA[3][5] is defined.
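
The Perl documentation discourages calling exists on array elements, so you may prefer a check based on the array bounds instead. The sketch below wraps such a check in a small subroutine; the name safe_get is my own, and the row count printed at the end confirms that no inner array was created by the lookup.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical helper: return $AoA->[$i][$j], or undef when row $i is missing,
# without creating any new inner array
sub safe_get {
  my ($AoA, $i, $j) = @_;
  return undef unless $i >= 0 && $i < @$AoA;        # is there a row $i?
  return undef unless ref($AoA->[$i]) eq 'ARRAY';   # is it an inner array?
  return $AoA->[$i][$j];                            # plain read, nothing is created
}

my @AoA = (
  ['a11'],
  ['a21', 'a22'],
);

my $before = @AoA;
print "not found\n" unless defined safe_get(\@AoA, 3, 5);
print "a22 found\n" if defined safe_get(\@AoA, 1, 1);
my $after = @AoA;

print "rows before: $before, after: $after\n";
```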