Perl chomp Function

The Perl chomp function removes a trailing newline (more precisely, a trailing copy of the input record separator $/) from a string.

 

There are three syntax forms for this function:

chomp VARIABLE
chomp (LIST)
chomp

chomp VARIABLE

If you use the first syntax form, you call this function with a scalar variable as its argument. In this case the function removes the trailing record separator from the end of the variable, if there is one.

The Perl chomp function removes only a trailing string that matches the current value of the special variable $/ (a newline by default). If the variable does not end with that string, the variable remains unchanged. It returns the total number of characters removed.
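
As a small illustration of the rules above, the following sketch chomps one string that ends in the default separator, one that does not, and one that ends in a custom separator. The variable names and the ";" separator are made up for this example.

```perl
#!/usr/local/bin/perl

use strict;
use warnings;

# $/ is "\n" by default, so only a trailing newline is removed
my $with_newline    = "blue\n";
my $without_newline = "blue";

my $removed_first  = chomp $with_newline;      # one character removed
my $removed_second = chomp $without_newline;   # nothing to remove

# with a different separator, chomp removes that string instead
my $record = "blue;";
my $removed_third;
{
  local $/ = ";";                  # hypothetical separator, undone at the end of the block
  $removed_third = chomp $record;
}

print "$with_newline $without_newline $record\n";
# it displays: blue blue blue
print "$removed_first $removed_second $removed_third\n";
# it displays: 1 0 1
```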

chomp (LIST)

If the first syntax form was for scalar variables, the second is for lists. You can chomp any list, array or hash (associative array). It doesn’t matter if you want to chomp an array or a hash, but it is safer to put the argument inside the parentheses: without them chomp takes only one variable, so chomp $a, $b is interpreted as chomp($a), $b and $b stays unchanged.
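
To see why the parentheses matter, here is a minimal sketch with made-up variable names. The "void" warning is switched off only for the first call, because Perl would otherwise point out that the second variable is not used there.

```perl
#!/usr/local/bin/perl

use strict;
use warnings;

my ($first, $second) = ("one\n", "two\n");
{
  # without parentheses chomp takes only $first as its argument;
  # $second is just the next item of a comma expression
  no warnings 'void';
  chomp $first, $second;
}

my ($third, $fourth) = ("three\n", "four\n");
my $nr = chomp($third, $fourth);    # both variables are chomped

print "\$second still ends in a newline\n" if $second =~ /\n\z/;
print "Characters removed with parentheses: $nr\n";
# it displays: Characters removed with parentheses: 2
```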

 

If you chomp an array, the trailing string that matches $/ is removed from every element that ends with it. It returns the total number of characters removed.

 

In the case of hashes, only the values will be chomped, whereas the keys will remain unchanged. It returns the total number of characters removed.

 

chomp

 

The last syntax form of the Perl chomp function is the one where the argument is omitted. In this case, it will chomp the special scalar variable $_.

 

The chomp function is most frequently used when $/ has the default value "\n". In this case, the Perl chomp function avoids uncertainty about whether a line of input has an ending newline character or not: if it finds a newline as the rightmost character of the line it removes it, otherwise it does nothing.

 

If you use the <STDIN> handle to input some data, you should normally chomp the line of input immediately after reading it.
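
A short sketch of what goes wrong otherwise: the string below stands in for a line typed by the user (the word "yes" followed by the Enter key), and it is compared with the expected answer before and after chomp. The variable names are only illustrative.

```perl
#!/usr/local/bin/perl

use strict;
use warnings;

# stand-in for a line returned by <STDIN>
my $answer = "yes\n";

# the trailing newline spoils the comparison
my $matched_before = ($answer eq "yes") ? 1 : 0;

chomp $answer;
my $matched_after = ($answer eq "yes") ? 1 : 0;

print "Before chomp: $matched_before, after chomp: $matched_after\n";
# it displays: Before chomp: 0, after chomp: 1
```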

 

 

The chomp function returns the total number of characters removed from all its arguments, whether you use it in scalar or list context. Please note that this function modifies its arguments in place and does not return the chomped string, so the following two calling examples of the Perl chomp function are wrong:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my $line;
$line = chomp ($line = <STDIN>);  # wrong!
print "line = $line\n";           # it expects: line = 1
 
$line = "1234\n";
$line = chomp $line;              # wrong!
print "line = $line\n";           # it expects: line = 1
In the first example, the script will read from STDIN a line and it will store it in the scalar variable $line. Next chomp function will remove the ending newline from $line and it will return the number of characters removed in the $line variable. So the $line variable will be set to 1 (the number of characters removed by chomp).

In the second example, the $line variable will be assigned to "1234\n". The Perl chomp function will remove the trailing newline from $line. Finally, the $line will be assigned with the value returned by chomp, i.e. 1 (the number of characters removed).

The following example shows how to use the return value of the chomp function when you chomp a scalar, an array or a hash.

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# scalar example
my $nr = chomp (my $color = "blue\n");
print "Newlines removed: $nr\n";
# it expects: Newlines removed: 1
 
# array example
my @array = ("blue\n", 1, 3.14, "12\n", "Perl");
$nr = chomp(@array);
print "Newlines removed: $nr\n";
# it expects: Newlines removed: 2
 
# hash example
my %ages = ("John\n", 45, "Paul\n", "25\n", "Marie\n", "22\n");
$nr = chomp %ages;
# it removes only the newlines from the hash values!
print "Newlines removed: $nr\n";
# it expects: Newlines removed: 2

In the next code I assigned the string value "blue\n" to the $color variable. The Perl chomp function will remove the newline character from the end of the $color variable.

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my $nr = chomp (my $color = "blue\n");
 
print "The color is $color (the newline was removed)\n";
# it displays: The color is blue (the newline was removed)
 
print "Characters removed: $nr\n";
# it displays: Characters removed: 1
The next example will read from keyboard the color chosen by user and it will store it in the variable $color. This variable will have at the end a newline character, inserted after the user hit the return key. The Perl chomp function will take it off.
#!/usr/local/bin/perl
 
use strict;
use warnings;
 
print "Which color: ";
my $color;
my $nr = chomp ($color = <STDIN>);
 
print "Your colour is : $color\n";
print "Characters removed: $nr\n";
Please note that when it is assigned to a scalar variable, the input operator reads one line only.

 

To illustrate how to use the Perl chomp function with an array, take a moment to examine the next code snippet:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# initialize an array
my @numbers = ("one\n", "two\n", "three\n", "four\n");
 
my $nr = chomp(@numbers);
 
print "\@numbers = @numbers\n";
print "Newlines removed: $nr\n";
Output:
 
@numbers = one two three four
Newlines removed: 4
 
The Perl chomp function removed the trailing newline from all the array elements and returned the number of removed newlines (in our example 4).

 

I want to make a note about the first print statement. As you can see, the string that I intended to print was enclosed by double quotes. That means that it will be interpolated, i.e. any variable or escaped character found between the quotes will be replaced with their values. 

If an array (in our case @numbers) is inside the double quotes, the array will be interpolated and its elements will be printed separated by space or whatever you have in $". The first @ character was prepended by the backslash character \ in order to be printed as a normal character and not be interpolated.

In the second print the $nr scalar variable will be interpolated too – that is, it will be replaced by its value: 4.
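
The role of the $" variable can be tried out with the following small sketch. The ", " separator is an arbitrary choice for this example, and local keeps the change inside the block.

```perl
#!/usr/local/bin/perl

use strict;
use warnings;

my @numbers = ("one", "two", "three");

# by default $" is a single space
my $default = "@numbers";

my $custom;
{
  local $" = ", ";      # temporary separator, only inside this block
  $custom = "@numbers";
}

print "$default\n";
# it displays: one two three
print "$custom\n";
# it displays: one, two, three
```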

 

 

The usual way to deal with arrays and chomp is when you read a file and need to remove the trailing newline, if there is one, from the end of each line. I will illustrate this with the __DATA__ pseudo file handle. Take a moment to examine the next code snippet:

 

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
chomp(my @colors = <DATA>);
print "\@colors = @colors\n";
 
__DATA__
red
yellow
White

This code will read from <DATA> handle all the lines up to EOF and it will create the @colors array - the content of each line will be stored as an element of the array. Next, the Perl chomp function will remove the trailing newline from each line of the array and it will return the total number of characters removed. If White is the last line of the script, the output is as follows:

@colors = red yellow White
However, keep in mind that this approach is not suitable for large amounts of data, because the entire file is read into the array and it can use a big chunk of memory.

 

 

One approach to chomp the elements of an array is to use the Perl chomp function inside a while loop:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# initialize the array
my @fruits = ("lemon\n", "orange\n", "plum\n", "nut\n");
my $nr;
 
my $count = scalar @fruits;
while ($count--) {
  $nr += chomp $fruits[$count];
  # some other statements here
}
 
print "Characters removed: $nr\n";
print "The array content: @fruits\n";
The scalar function returns the size of the @fruits array (the number of elements), which is stored in the $count scalar variable. The while loop executes the block as long as $count is not zero when the condition is tested. At each iteration step:
 
  • $count is decremented
  • the current element of the array is chomped and the $nr variable is incremented with the number of removed characters

Finally, when the loop ends, the content of the array and the number of characters removed will be printed. Please note that the elements of the array will be chomped from right to left (because the index used to access the elements of the array is decremented).

 

To see how the Perl chomp function treats a hash, look at the following code sample:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# initialize a hash
my %ages = ("John\n", "43\n", "Paul\n", "25\n", "Marie\n", "22\n");
my $nr = chomp %ages;
print "Newlines removed: $nr\n";
 
# well, I use Data::Dumper module to see
# what it is in the hash
use Data::Dumper;
print Dumper(\%ages);
If you call the Perl chomp function against a hash variable, only the hash values will be chomped while the keys will remain unchanged. After processing all the elements of a hash, the chomp function will return the number of removed characters.

The above code will produce the following output:

Newlines removed: 3
$VAR1 = {
          'Marie
' => '22',
          'John
' => '43',
          'Paul
' => '25'
        };
Here I used Data::Dumper module in order to print the hash. As you can see, the Perl chomp function removed only the ending newline from the values, but not from the keys.

 

If you look at the output, you can see the newline character after each key (Marie, John, Paul): the closing quote of each key is printed on the next line. The three newlines removed correspond to the hash values.

 

If you want to read the elements of a hash from a file, be careful about how you organize your input file.

 

See the next code snippet:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# read the pair elements of the hash from a file
chomp(my %fruitsColors = <DATA>);
 
__DATA__
apricot
yellow
cherry
red
plum
darkblue
In this example I used __DATA__ marker as a pseudo datafile. The keys will be: apricot, cherry and plum. The Perl chomp function will globally chomp the hash (after the hash is completely read from the input file), but the problem is that only the values will be chomped. Each key will have a newline attached as its rightmost character.  

One way to bypass this is to reorganize the format of the input file and read the file one record at a time, using the while statement:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# read the file one record at the time
my %fruitsColors = ();
while(<DATA>) {
  chomp;
  my @line = split(/,/);
  $fruitsColors{$line[0]} = $line[1];
}
 
use Data::Dumper;
print Dumper(\%fruitsColors);
 
__DATA__
apricot,yellow
cherry,red
plum,darkblue
The while loop will read the file one record at a time. At each iteration step:

 

  • the current line of the file is assigned to the special variable $_
  • the chomp function removes the newline from $_ (this is the case when the Perl chomp function has no argument and it is called against $_ - the third syntax form of the chomp function)
  • the line from $_ will be split into the @line array, using the comma separator
  • the new current pair element is added to %fruitsColors hash, where the key is set to $line[0] (the first element of the @line array), while the value is set to $line[1] (the second element of the same array). Please note that the hash members are accessed with {} and the array members are accessed with [].

The output of this code snippet is as follows:

$VAR1 = {
          'cherry' => 'red',
          'plum' => 'darkblue',
          'apricot' => 'yellow'
        };

Please note that the hash is not ordered and you couldn’t expect any order when you display its elements. If you want to process the elements of the hash in the order they have been inserted, you can use the Tie::IxHash module (search CPAN for it).
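
If the insertion order is not really needed and a predictable order is enough, one simple option is to sort the keys before printing. The sketch below uses only core Perl; the hash is filled in directly instead of being read from a file, and the @report array is just a helper for this example.

```perl
#!/usr/local/bin/perl

use strict;
use warnings;

my %fruitsColors = (apricot => "yellow", cherry => "red", plum => "darkblue");

# sort the keys to get the same order on every run
my @report;
foreach my $fruit (sort keys %fruitsColors) {
  push @report, "$fruit=$fruitsColors{$fruit}";
}

print "@report\n";
# it displays: apricot=yellow cherry=red plum=darkblue
```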

 

Next, I’ll show you another example that reads a set of hash keys from a pseudo-datafile. See in the next code snippet how you can get rid of the trailing newlines:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# initialize a hash
my %fruits = map {chomp; $_ => 0} <DATA>;
 
# I’ll print the hash keys using the foreach statement
# as a modifier of a statement
print "$_ " foreach (keys %fruits);
print "\n";
 
__DATA__
apricot
cherry
plum

I want to make some comments about this code. If you remember one of the syntax forms of the map function, it runs a block of statements on each element of a list and returns a new list:

@RESULT = map BLOCK LIST

In our example, the Perl map function expects a list as its second argument, so <DATA> is evaluated in list context and returns all the remaining lines at once. Next, the map function will run the block {chomp; $_ => 0} on each element of this list:

 

  • each element of the array will be assigned in turn to $_
  • chomp will remove the trailing newline from $_
  • the pair element ($_,0) will be added to %fruits hash

 

Finally, I print the keys of the %fruits hash using the foreach statement and the keys function. The keys function first builds a list with all the keys of the hash, and only afterwards the foreach statement loops through this list.

 

Please note that this approach to print a hash using foreach and keys is not very suitable for hashes with large amount of data.
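
For a big hash, one possible alternative is the each function, which hands back one key/value pair per call instead of building the whole list of keys first. This is only a sketch with a tiny hash; the $count variable is there just to show that every element is visited. Do not add or delete elements of the hash inside such a loop.

```perl
#!/usr/local/bin/perl

use strict;
use warnings;

my %fruits = (apricot => 0, cherry => 0, plum => 0);

# walk the hash one pair at a time, in no particular order
my $count = 0;
while (my ($name, $value) = each %fruits) {
  print "$name ";
  $count++;
}
print "\n";

print "Keys visited: $count\n";
# it displays: Keys visited: 3
```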

 

The output is as follows:

cherry plum apricot

 

The special variable $_ is a scalar variable, so in this case Perl chomp works on a single scalar. As described for the third syntax form of the Perl chomp function, calling chomp without an argument chomps the content of the $_ variable. See the next short example for this:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
$_ = "123\n";
my $nr = chomp;      # or my $nr = chomp $_;
print "Characters removed: $nr\n";
# it expects: Characters removed: 1
You can use this short form of Perl chomp (where the argument by default is $_) within a block of a statement or with a function that uses $_ (like foreach, map, etc).

The next example chomps the elements of an array one by one, using the foreach statement:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# initialize an array
my @numbers = ("one", "two\n", "three\n", "four");
my $nr;
 
$nr += chomp foreach(@numbers);
 
print "Characters removed: $nr\n";
print "The array content: @numbers\n";
The output:
 
Characters removed: 2
The array content: one two three four
 
The Perl foreach statement goes through the @numbers array and chomps the array elements one by one. If you have more things to do inside the foreach loop, you can use the following code lines:
 
foreach(@numbers) {
  $nr += chomp;
  # some other statements here
}
instead of:
 
$nr += chomp foreach(@numbers);

If you want to deal with Perl chomp and scalar references, you can look at the following code sample:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
print "Type a word: ";
my $scalarVar = <STDIN>;
 
# define a scalar reference
my $scalarVarRef = \$scalarVar;
 
# chomp the reference
my $nr = chomp $$scalarVarRef;
 
print "Characters removed: $nr\n";
# it outputs: Characters removed: 1
This code reads a line from STDIN and stores it in $scalarVar. We define a reference to this variable by putting a \ in front of the variable, and next we chomp the variable through the reference.

In order to chomp the reference, you need to dereference it by putting the appropriate symbol in front of the reference – in our case the $ symbol, because we have a scalar data type here.

 

The next code shows you an example about how to chomp an array reference:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# initialize an anonymous array and define a reference to it
my $arrayRef = ["John\n", "Peter", "Alice\n"];
 
# chomp the array reference
my $nr = chomp @$arrayRef;
 
print "Characters removed: $nr\n";
print "The array content: @$arrayRef\n";
The output:

Characters removed: 2
The array content: John Peter Alice
 
By enclosing a list of items between square brackets we create a new anonymous array and get back a reference to it.

To chomp the array reference, you must dereference it by putting a @ symbol in front of the $arrayRef reference (or you can put the array reference in curly braces: @{$arrayRef}).

Finally we print how many characters were removed and the content of the array.

Please note that not all the elements of the array end in newline: Perl chomp will check if the last character is a newline and only then it removes it, otherwise the array element remains unchanged.

 

The following examples show you some ways to chomp a hash reference. Let’s start by showing you the first code sample:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# initialize an anonymous hash and defined a reference to it
my $hashRef = {"John\n", "23", "Peter", "45\n", "Alice\n", "32"};
 
# chomp the hash reference
my $nr = chomp %$hashRef;
print "Characters removed: $nr\n";
The output:

Characters removed: 1

By enclosing a list of key/value pairs between braces we create a new anonymous hash and get back a reference to it. To chomp the hash reference, you must dereference it by putting a % symbol in front of the $hashRef reference (or you can put the hash reference in curly braces: %{$hashRef}).

But if you look at the output, you’ll see that the Perl chomp function removed only 1 newline, because the hash keys are not chomped.

You can loop through the keys of the hash using the foreach statement with the keys function and chomp the keys one by one. The loop walks the list returned by keys rather than using the each function, because inserting elements into a hash while each is iterating over it can skip or repeat entries:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# initialize an anonymous hash and define a reference to it
my $hashRef = {"John\n", "23", "Peter", "45\n", "Alice\n", "32"};
# chomp the values of the hash reference
my $nr = chomp %{$hashRef}; # or my $nr = chomp %$hashRef;
 
# chomp the keys of the hash reference; keys returns a copy of the
# key list, so the hash can be changed safely inside the loop
foreach my $key (keys %{$hashRef}) {
  my $val = delete ${$hashRef}{$key};
  $nr += chomp $key;
  ${$hashRef}{$key} = $val;
}
 
print "Total characters removed: $nr\n";
# next I use Data::Dumper module to see
# what is in the hash
use Data::Dumper;
print Dumper($hashRef);
Like in the previous example, first we chomp the hash values globally and then we chomp the keys of the hash one by one; at each iteration step of the Perl foreach loop:
 
  • $key holds a copy of the next key of the hash, in no particular order
  • the element with the current $key is deleted from the hash and its value is saved in $val
  • $key is chomped and $nr is incremented with the number of characters removed
  • the pair ($key, $val) is added to the hash

At the end of the script the number of characters removed and the content of the hash will be printed. Please note that first I deleted the key and only afterwards I chomped it.

The output of this script will be as follows:

Total characters removed: 3
$VAR1 = {
          'John' => '23',
          'Alice' => '32',
          'Peter' => '45'
        };

Another approach is to copy the hash into another hash and chomp the keys and the values one by one with the Perl chomp function inside a map block, like in the following example:
#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my $hashRef = {"John\n", "23", "Peter", "45\n", "Alice\n", "32"};
my $nr;
 
my %newHash = map {
  # save and chomp the current hash value
  my $val = ${$hashRef}{$_};
 
  # or my $val = $$hashRef{$_};   # or my $val = $hashRef->{$_};
  $nr += chomp $val;
  # chomp the key stored in $_
  $nr += chomp;
 
  # add the new element to %newHash
  $_ => $val;
} keys %$hashRef;
 
print "Total characters removed: $nr\n";
# see what is in the new hash
use Data::Dumper;
print Dumper(\%newHash);
In this code the keys function will return a temporary list with all the keys of the hash referenced by $hashRef. Next, the map function will loop through this list, running the block on each key.

At each iteration step of the Perl map function, the current key of the temporary array will be assigned to the special variable $_ and then the block will be executed.

Finally, we’ll get the new copy of our hash with all the keys and values chomped. You can use this method in more general cases where you need to modify the key and the value of the hash’s elements and get a copy of the original hash with the new keys and values.

The output is identical to the one in the previous example.

 

This function is very useful when you read data from the keyboard through the special file handle STDIN. Each line read from the standard input stream ends with a newline character (\n), and the Perl chomp function will take it off, like in the following example:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
print "User name: ";
 
my $name = <STDIN>;
chomp ($name);
print "$name is your user name\n";

This code will read from STDIN the user name and it will store it in the variable $name – when assigned to a scalar variable, the input operator reads one line only.

This variable will have at the end a newline character, inserted after the user hit the return key. The Perl chomp function will remove it. (If you want to test this code, after typing the user name, hit the Enter key).

You can omit the file handle, and then the input operator will read either from the files specified on the program’s command line (which are stored in the special array variable @ARGV) or from STDIN (usually the keyboard) if none are specified. For this you may replace the line:

my $name = <STDIN>;

with:

my $name = <>;

So, if you use <> and no argument is supplied to the script in command line, the input operator will read from STDIN (usually the keyboard).

You can play with the command line arguments inside a Perl script, without specifying them in the command line. See the following code snippet:
#!/usr/local/bin/perl
 
use strict;
use warnings;
 
push @ARGV, "psw.txt";
my $psw = <>;
chomp ($psw);
print "Your password is : $psw\n";
The push function will append the filename "psw.txt" to the @ARGV array. The next statement will read the first line of the psw.txt file into the scalar variable $psw (because in scalar context the input operator <> reads only one line). Next, the Perl chomp function will remove the trailing newline.

If you want to test this script, you need to create in the current directory with a text editor the file "psw.txt", here is an example of the content of the file:

qwert
user##~1
user^&*2
user+=&^a
 
If you need to read the entire file and then print it, you can do it by reading the file into an array and then chomping the entire array at once:
 
#!/usr/local/bin/perl
 
use strict;
use warnings;
 
push @ARGV, "psw.txt";
my @psw = <>;
chomp (@psw);
print "Your passwords are : @psw\n";
If you need to know the number of characters removed, you can use the following short script:
 
#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my $count = chomp(my @array = <>);
 
print "Characters removed: $count\n";

The parentheses force chomp to act on the result of what is between them.

First the diamond operator <> is evaluated and all the lines read from STDIN (by default) will be stored as elements in @array.

Next the entire @array will be chomped and the total number of newlines removed will be stored in the scalar variable $count.

If you read a single line from STDIN you must chomp it in order to remove the ending newline, and you can do this on a single line of code like in the following short script:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my $nr = chomp(my $line = <STDIN>);
 
print "Number of characters removed: $nr\n";
print "\$line = $line\n";
Please note that if you use a scalar variable to read from STDIN, Perl will read only one line.

To print the dollar sign without making Perl think it is a scalar, you need to precede it by the backslash (\) escaping character.

If you read more lines from STDIN and intend to store them in an array, you can use the following code snippet:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my $nr = chomp(my @array = <STDIN>);
 
print "Number of characters removed: $nr\n";
print "\@array = @array \n";
As you can see, I wrote this on a single code line, too. After executing the code, the @array will contain as elements the lines read from STDIN but without the ending newline.

If you want to test and run this code, after the last input line you must type Ctrl/d in Linux (Ctrl/z in Windows, followed by the Enter key) to tell Perl that you have finished entering the data.

If you want to store the lines read from STDIN as pair elements in a hash (first line for key, the next for value and so on), similarly with the previous snippet code, you can write:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
chomp(my %hash = <STDIN>);
One problem with this code is that in the case of hashes, only the values are chomped, but not the keys (see how to chomp a hash for more details).

 

Generally, the Perl chomp function removes only the last newline character but what if there are more? In this case you can play with the special variable $/ (the input record separator), using the paragraph mode by setting $/ to "", as in the next example:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# save the current value of $/
my $crtValue = $/;
$/ = "";
my $v = "\n\nsome text here\n\n\n\n";
my $nr = chomp $v;
print "The last $nr newlines have been removed\n";
 
# restore the $/ current value
$/ = $crtValue;
 
# some more statements here
The input record separator ($/) is used to delimit the records read from input. The most usual case is when this special variable has the value \n, which delimits the lines read from STDIN.

After running this code, the first two newlines will remain unchanged, but the last four will be removed. The variable $nr will be set to 4 – the number of newline characters removed.

Be careful, however, whenever you alter the content of the special variable $/: the change affects every later read and chomp in the program, so restore the previous value as soon as you no longer need the new one (especially if your code is longer than the above snippet).
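
One common way to keep such a change under control is to use local inside a block, so that Perl puts the old value back by itself when the block ends. The following sketch repeats the paragraph mode example in this style.

```perl
#!/usr/local/bin/perl

use strict;
use warnings;

my $v = "\n\nsome text here\n\n\n\n";
my $nr;
{
  local $/ = "";        # paragraph mode, only inside this block
  $nr = chomp $v;
}

# here $/ has its previous value again
print "The last $nr newlines have been removed\n";
# it displays: The last 4 newlines have been removed
print "The separator is a newline again\n" if $/ eq "\n";
```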

 

 

One approach to chomp the elements of an array is to use the Perl chomp function inside a foreach loop:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# initialize the array
my @fruits = ("lemon\n", "orange\n", "plum\n", "nut\n");
my $nr;
 
foreach my $item (@fruits) {
  $nr += chomp $item;
  # some other statements here
}
 
print "Characters removed: $nr\n";
print "The array content: @fruits\n";
The foreach loop will iterate through the @fruits array using the $item iterator.

At each iteration step the current element of the array will be assigned to $item, the Perl chomp function will chomp the value stored in $item and will increment the scalar $nr variable with the number of characters removed.

Please note that in a foreach loop the iterator variable is rather an alias of the current element of the array, so if you change the iterator variable content, the current element content of the array will be changed.

Practically, by chomping the iterator we really chomp the current element of the array.

Finally, when the loop ends, the content of the array and the number of characters removed will be printed.
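
The difference between the alias and a copy can be checked with a small sketch like the one below. The $copy, $untouched and $changed variables are only helpers for this illustration.

```perl
#!/usr/local/bin/perl

use strict;
use warnings;

my @fruits = ("lemon\n", "plum\n");

# chomping a copy leaves the array element as it was
foreach my $item (@fruits) {
  my $copy = $item;
  chomp $copy;
}
my $untouched = ($fruits[0] eq "lemon\n") ? 1 : 0;

# chomping the loop variable changes the array element itself
foreach my $item (@fruits) {
  chomp $item;
}
my $changed = ($fruits[0] eq "lemon") ? 1 : 0;

print "Array unchanged after chomping copies: $untouched\n";
# it displays: Array unchanged after chomping copies: 1
print "Array changed after chomping the iterator: $changed\n";
# it displays: Array changed after chomping the iterator: 1
```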

 

If you have an array and you want to chomp all the elements of the array using a foreach statement, you can use the following code:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my @numbers = ("1\n", "2\n", "3\n", "4\n");
# lets loop through arrays elements using foreach
foreach (@numbers) {
  chomp;
}
 
print "@numbers\n";
# it outputs 1 2 3 4

The loop iterator variable of the foreach statement is missing, which means that foreach will use the special variable $_ for iteration, going through the elements of the @numbers array one by one. At each iteration step:

  • $_ becomes an alias of the current element of the array
  • the ending newline is removed from $_ (and therefore from the array element); because I used Perl chomp without any argument, this function acts on $_

Finally, I used the print function with double quotes in order to display the array elements separated by space.

In the next example, the foreach statement is used with an iterator variable and in connection with STDIN filehandle:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# initialize an array
my @colors = ();
 
foreach my $line(<STDIN>)
{
  chomp $line; 
  # append the line to the array
  push @colors, $line;
}  
print "@colors\n";
foreach statement is designed to work with lists, so Perl will create an array from the lines read from STDIN; next, foreach loop goes through all the elements of the array, and at each iteration step:

  • the current element of the array is assigned to the $line variable
  • the Perl chomp function removes the ending newline from the $line scalar variable
  • the push function appends the content of $line to @colors array

Finally, the @colors array is printed. A possible output for the above sample code is as follows:

blue
white
Ctrl/z in Windows   (Ctrl/d in Linux)
blue white

You can use the while loop to read some lines from STDIN, chomp them one by one and next append them to an array.

This approach (using while and not foreach) is preferable if you need to read a large amount of data from STDIN or another input file handle, because while reads one line at a time, whereas foreach reads all the lines into a list first.

See the following code:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my @colors = ();
 
chomp, push @colors, $_ while(<STDIN>);
 
# or if you want to use an iterator scalar variable instead of $_
# you can replace the above line with the next two lines
# my $line;
# chomp $line, push @colors, $line while($line = <STDIN>);
 
# or if you have some more things to do in the
# while loop you can use the next 5 code lines instead:
# while(<STDIN>)
# {
#   chomp;
#   push @colors, $_;
# }
 
print "\@colors = @colors\n";

The while loop will read from standard input some lines of text:

  • each line is assigned in turn to the special variable $_
  • the Perl chomp function will remove the last newline from $_ (because the chomp function is called without any argument, it will remove the newline from $_)
  • the content of each line is then added to @colors array by using the push function.

An example of output could be the following:

blue
yellow
Ctrl/Z in Windows   (Ctrl/d in Linux)
@colors = blue yellow

To indicate the end of the input (EOF), after the input lines, you must type Ctrl/d in Linux (Ctrl/z in Windows and then press the Enter key) as the last line.

When you want to read from STDIN using the while statement, you can use either of the following two code lines, too:

while(defined(my $input = <STDIN>)) {   # the long form
while($_ = <STDIN>) {                   # using $_ explicitly

 

In order to override the normal loop behavior, the Perl language provides three loop control operators: last, next and redo. The following code snippet shows you how you can use them in conjunction with the Perl chomp function.

Please take a look at the following code snippet:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my $done = 0;
 
while(!$done) {
  my $input = <>;
  last unless defined $input;      # stop at the end of input
  chomp (my $line = lc $input);
  last if $line eq 'last';
  next if $line eq 'next';
  $done = 1, redo if $line eq 'redo';
  print "$line\n";
}
The while loop will iterate as long as the $done scalar variable is false (that is, equal to 0). At each iteration step:
 
  • Perl will read a line with the diamond operator <> (if no file handle is used with the diamond operator, the interpreter examines the @ARGV special array variable and, if this array is empty, the operator reads from STDIN); because the line is read in scalar context, Perl reads only one line, converts it to lowercase and then stores it in the $line variable
  • chomp will remove the ending newline from $line
  • if $line is equal to the word last, the while loop will end entirely, skipping the remaining statements in the block
  • if $line is equal to the word next, the while loop will skip the rest of the code and it will move to the next iteration
  • if $line is equal to the word redo, the while loop will repeat the same iteration but without reevaluating the condition (before executing redo I assigned 1 to $done, to show you that the condition statement is not evaluated in this case)
  • if $line is some other word, it will be printed by the last statement of the block and the while loop will reiterate

 

If you have a file with a large amount of data (let’s say it is named "fn.txt"), you shouldn’t slurp the file into an array and globally chomp it as in the next sample code:

open (FILE, "fn.txt") || die "Can't open fn.txt: $!";
chomp(my @array = <FILE>);
This approach holds the whole file in memory at once, so with a large file you could run into memory issues. It does make searching easier, because you can access any line by its index in the array, but it is not the way to go for large files.

In this case you need to read the file one record at a time: read a record, process it, then read the next one, and so on. There are many ways to do it; I’ll show you an example that uses the while statement.

In the next code snippet we use the Perl defined function to test whether the line-input operator <FILE> has reached the end of the file. In scalar context this operator returns a line of text, but if there are no more input lines, it returns undef.

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
open (FILE, "fn.txt") || die "Can't open fn.txt: $!";
 
while(defined(my $line = <FILE>)) {
  chomp $line;
  # some other statements here to process the line
}
close FILE;

The while loop ends when <FILE> returns undef, i.e. when the end of the file is reached. The Perl chomp function gets rid of the newline that may be attached to the end of the $line variable.
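
The same line-by-line pattern can also be written with a lexical filehandle and the three-argument form of open. This is only a sketch: the helper name longest_line_length is hypothetical, and a temporary file created with the core File::Temp module stands in for "fn.txt":

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: returns the length of the longest line in a file,
# reading one line at a time so that memory use stays small.
sub longest_line_length {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Can't open $path: $!";
    my $max = 0;
    while (defined(my $line = <$fh>)) {
        chomp $line;                          # do not count the newline
        $max = length($line) if length($line) > $max;
    }
    close $fh;
    return $max;
}

# Demo with a temporary file standing in for "fn.txt".
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print $tmp_fh "short\na much longer line\nmid size\n";
close $tmp_fh;
print "Longest line: ", longest_line_length($tmp_name), " characters\n";
```

Only one line is held in memory at any moment, whatever the size of the file.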

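Anyway, using the defined function is the longest way to read from a filehandle in a while statement. Either of the two lines below is equivalent, because Perl adds the defined test automatically when the loop condition is just an assignment from the line-input operator:

while($_ = <FILE>) {          # using $_ explicitly
while(my $line = <FILE>) {    # the short form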

 

One approach to chomp the elements of an array is to use the Perl chomp function inside a map block:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# initialize the array
my @fruits = ("lemon\n", "orange\n", "plum\n", "nut\n");
my $nr;
 
map {
  $nr += chomp;
  # some other statements here
} @fruits;
 
print "Characters removed: $nr\n";
print "The array content: @fruits\n";
In this example I use the block form of the map function. The map function iterates through the @fruits array; at each iteration step it aliases $_ to the current element of the array and then executes the block.

Inside the map block the Perl chomp function removes from the element aliased to $_ the trailing characters that match $/ (in our case only a newline) and the $nr variable is increased by the number of chomped characters.
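
When no other statements are needed inside the block, the list form of chomp described at the beginning of this article gives the same result without map. A minimal sketch, reusing the @fruits array from above:

```perl
use strict;
use warnings;

my @fruits = ("lemon\n", "orange\n", "plum\n", "nut\n");

# chomp accepts a list, modifies every element in place
# and returns the total number of characters removed.
my $nr = chomp(@fruits);

print "Characters removed: $nr\n";
print "The array content: @fruits\n";
```

Here $nr is 4, one newline per element.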

 

You can use the defined function to test whether the line-input operator <STDIN> has reached the end of the file. Used in scalar context this operator returns a line of text, but if there are no more input lines, it returns undef. This is very useful when you want to distinguish between an empty line and the end of file. See the following code snippet for an example:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
while(defined(my $input = <STDIN>)) {
  chomp $input;
  print "input line: $input\n";
}
The while loop ends when <STDIN> returns undef, i.e. when the end of file is reached (whether the input comes from a redirected file or from the keyboard). The Perl chomp function gets rid of the trailing newline, if there is one.

Anyway, using the defined function is the longest way to read from STDIN in a while statement. Either of the two lines below is equivalent:

while($_ = <STDIN>) {          # using $_ explicitly
while(my $input = <STDIN>) {   # the short form

 

The Perl chomp function removes any trailing string that corresponds to the special variable $/ (the input record separator). By default, $/ is set to the newline character, i.e. "\n". You can set this variable to any string you like in order to perform different tasks.

I think the best way to deal with this is to perform the respective tasks inside a block where you declare the $/ special variable with local (another idea is to save the value of $/ in a scalar variable and restore it later). It is very important not to leave the value of this variable altered for the rest of your script, otherwise you can expect file reading issues elsewhere.

You can set the $/ special variable to the empty string (""), which is called "paragraph mode". In this mode you can read a text file paragraph by paragraph, and chomp removes all the newlines from the end of each paragraph.

An illustrative example here:

#!/usr/local/bin/perl
 
use warnings;
use strict;
 
# initialize an array with an empty list
my @array = ();
{
  local $/ = "";
  foreach (<DATA>) {
    chomp;
    push @array, $_;
  }
}
 
print $_, "\n" foreach (@array);
 
__DATA__
The if statement is very often used in Perl.
 
This is Perl.
Some Perl functions here: push, substr, index.
 
 
This is Perl too.
This script again uses the __DATA__ pseudo-datafile marker instead of a real file, to make the script easier to test.

I’ll examine below how this script works:

  • the @array is initialized with an empty list (actually my @array; is enough, because an array declared with my starts out empty)
  • the file is read in an anonymous block because $/ is declared there with local; after exiting from the block, Perl automatically restores the previous value of $/ (an alternative to the block is to store the current value of $/ in a scalar variable and restore it after reading the file)
  • to read the file the foreach statement is used:
    • each paragraph of the file is assigned in turn to $_
    • chomp removes all the trailing newlines from $_
    • push appends the paragraph stored in $_ to @array
  • finally, the @array is printed, each element on a new line (a paragraph made of several lines keeps its inner newlines).

This script produces the following result:

The if statement is very often used in Perl.
This is Perl.
Some Perl functions here: push, substr, index.
This is Perl too.
 
You can imagine a lot of other ways to use and set $/ in Perl. For example, if you need to read a file with fixed-length records, you can set $/ to a reference to an integer holding the record length (for instance \80), and so on.

But finally, I can’t stress enough that you should restore the value of $/ at the end of your processing if you altered it, especially if your script is a more elaborate one.
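
As a rough sketch of the fixed-length case, the following code reads records of a given size and relies on local to restore $/ automatically. The helper name read_fixed_records and the sample data are assumptions made for this illustration, and an in-memory filehandle is used instead of a real file. Note that in this mode chomp removes nothing, because there is no separator string to remove:

```perl
use strict;
use warnings;

# Hypothetical helper: splits the data behind a filehandle into
# fixed-length records of $size bytes each.
sub read_fixed_records {
    my ($fh, $size) = @_;
    local $/ = \$size;        # a reference to an integer selects record mode
    my @records = <$fh>;      # each read returns at most $size bytes
    return @records;          # $/ is restored when the sub returns
}

my $data = "AAAABBBBCCCCDD";
open(my $fh, '<', \$data) or die "Can't open in-memory file: $!";
my @records = read_fixed_records($fh, 4);
close $fh;

print "$_\n" for @records;
print "Separator restored\n" if $/ eq "\n";
```

The last record is shorter than the others when the data length is not a multiple of the record size.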

 

If you have some strings split over two or more lines, you can use a continuation character to tell Perl that a continuation line follows. See the next code snippet:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my @array = ();
my $line = "";
 
while($line .= <DATA>) {
  chomp $line;
  if($line =~ s/\\\s*$//) {
    next;
  }
  push @array, $line;
  $line = "";
}
 
foreach (@array) {
   print "$_\n";
}
 
__DATA__
This \
is a \
multiline string
This is \
another one
This code will produce the following output:
 
This is a multiline string
This is another one

In the above code we used the backslash character to indicate that a line will be continued and the __DATA__ pseudo-datafile marker instead of a real file, to simplify the test of this script.

The while loop reads the lines one at a time and appends the line just read from the <DATA> pseudo-filehandle to the $line variable. At the beginning of the script, $line is initialized to the empty string.

Inside the while block we do the following:

  • the Perl chomp function removes the trailing newline from $line
  • we use the substitution operator (s///) to remove the trailing backslash from $line (the =~ operator binds the substitution operator to the $line variable); in the pattern, the first backslash escapes the second one, so together they match a literal backslash; \s matches any whitespace character (space, \t, \f, \r, \n) that may follow the backslash we used as a continuation character; the * quantifier causes the preceding item (in our case a whitespace character) to be matched 0 or more times
  • if the substitution succeeds, which means we have a continuation line, the next operator skips the rest of the block and the script goes on to read another line
  • if the substitution operator returns false, $line remains unchanged and is appended to @array using the push function; then $line is reset to the empty string and the script continues with a new iteration of the while loop

Finally, we used the foreach statement to print the array.
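
The same idea can be sketched with an explicit defined test, which keeps the end-of-file check separate from the concatenation. The helper name join_continued and the in-memory filehandle are assumptions made for this illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: joins each physical line ending in a backslash
# with the line that follows it and returns the logical lines.
sub join_continued {
    my ($fh) = @_;
    my @logical;
    my $buffer = '';
    while (defined(my $line = <$fh>)) {
        chomp $line;
        $buffer .= $line;
        next if $buffer =~ s/\\\s*$//;   # continuation: keep accumulating
        push @logical, $buffer;
        $buffer = '';
    }
    # Keep a dangling continuation left over at end of file.
    push @logical, $buffer if length $buffer;
    return @logical;
}

my $text = "This \\\nis a \\\nmultiline string\nThis is \\\nanother one\n";
open(my $in, '<', \$text) or die "Can't open in-memory file: $!";
print "$_\n" for join_continued($in);
close $in;
```

With the sample data above it prints the same two logical lines as the original script.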

 

Please recall that the Perl chomp function removes the input record separator stored in $/ special variable. We will examine below the case when $/ is set to LF ("\n").

You don’t need to worry about the specific platform on which you are running your script. When you read a line from a file, you need to remove the trailing newline from it. The chomp function does the right thing and removes the record separator from the end of your string.

On a Unix platform, a line read from a file is expected to end in a single newline (LF), and the Perl chomp function removes that newline.

On a Windows platform, a line of a text file ends in both a CR ("\r") and an LF ("\n"), but when the file is read in text mode Perl automatically converts CRLF to LF. The chomp function then removes the trailing newline (LF) as in the previous case.

However, problems can arise when you read a file written on one platform from another platform. For example, when a file with CRLF line endings is read on Unix, chomp removes only the LF and the CR stays at the end of the line.
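
The cross-platform problem can be reproduced with a small sketch. The string below is an assumption that stands in for a line read on Unix from a file written with CRLF line endings:

```perl
use strict;
use warnings;

# A line as it would arrive on Unix from a file with CRLF line endings.
my $line = "first line\r\n";

my $removed = chomp $line;     # $/ is "\n", so only the LF is removed

print "Characters removed: $removed\n";
print "Trailing CR still present\n" if $line =~ /\r\z/;
```

Only one character is removed, and the carriage return remains part of the string.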

An alternative to the Perl chomp function is to use the regular expressions. Here is an example:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my @array = ();
 
foreach my $line (<DATA>) {
  $line =~ s/\r?\n$//;
  push @array, $line;
}
 
# print the array
foreach (@array) {
  print "$_\n";
}
 
__DATA__
first line
second line
And the output:

first line
second line

In the above script we used the __DATA__ pseudo-datafile marker instead of a real file, to simplify the test of this script.

The script reads the entire file into a temporary list and foreach goes through the elements of this list, removing the CR and LF characters with the substitution operator (s///).

The pattern binding operator (=~) allows us to make the substitution on the $line string variable.

The question mark (?) causes the preceding character (\r) to be matched either 0 times or once, and the dollar sign ($) anchors the match at the end of the $line string.

After removing the optional CR and the LF, the current line is appended to @array.

Finally, we print the @array using the foreach statement. As you can see in the output, the newlines were removed.

 

In the following example we create a binary file from an array, writing each string element of the array as a record of the file.

Each string element of the array is converted into a string of hexadecimal digits, and the hex strings are then written to the binary file.

After each record we write the "\0" character as a delimiter.

Next, the array used to create the file is reset to an empty list.

Before reading the file back, we set $/ (the input record separator) to "\0"; its previous value is restored after the file has been read and processed.

We then read the file one record at a time, removing the trailing "\0" character with the Perl chomp function and converting the hex string read from the record back into its character string.

After that, the text string is appended to our initial array.

Finally we print the array, each element on a line.

In short, we have the following steps:

  • initialize an array with a few string elements
  • create from the array a binary file whose records are delimited by the "\0" separator
  • empty the initial array
  • set $/ to "\0"
  • read the binary file record by record, chomp the trailing "\0" character from the current record, convert the record back and append it to our array
  • print the content of the array

Notice that we get back the initial content of our array.

Now, here is the script code:

#!/usr/local/bin/perl
 
use warnings;
use strict;
 
# initialize an array
my @array = ("Perl statements", "Perl functions",
             "Perl arrays", "Perl hashes");
 
# create a binary file
 
my $file = "file.bin";
 
open(my $fh, "> $file") or die "error opening file '$file': $!";
 
binmode($fh);
 
foreach (@array) {
  # convert character string into a hex string
  s/(.)/sprintf("%x",ord($1))/eg;
  print $fh $_,"\0";
}
 
close $fh or die "error closing file '$file': $!";
 
# read the binary file
 
# empty the array
@array = ();
 
open ($fh, "$file") or die "error opening file '$file': $!";
 
# use local in an anonymous block to save and restore $/
{
  local $/ = "\0";
  while(<$fh>) {
    chomp;
    # convert hex string into a character string
    s/([a-fA-F0-9][a-fA-F0-9])/chr(hex($1))/eg;
    push @array, $_;
  }
}
 
close $fh or die "error closing file '$file': $!";
 
# print the array each element on a line
print $_,"\n" foreach (@array);
And the output:
 
Perl statements
Perl functions
Perl arrays
Perl hashes
 
As usual, I’ll explain below in detail how this script works:
 
  • the @array is initialized with a few strings
  • the binary file named "file.bin" is created from the array elements:
    • we store the file name in the $file scalar variable
    • we open the file for writing through the $fh lexical filehandle and call die if we can’t open the file
    • the binmode function tells Perl to treat the file as binary, i.e. to write the bytes without any newline translation
    • the foreach loop iterates through the @array; because foreach is used without a loop variable, the special variable $_ is used instead; at each iteration step:
      • $_ becomes an alias for the current element of @array
      • the string value stored in $_ is converted to its hexadecimal representation using the s/searchpattern/replacement/modifiers substitution operator (please note that if you don’t use =~, the substitution operator works on $_ by default):

-       (.) is the search pattern: the dot matches any single character except a newline, and the parentheses store the matched character in the special variable named $1 (if you have more parentheses, the text matched by the second pair is assigned to $2, and so on)

-       sprintf("%x",ord($1)) is the replacement part of the substitution operator and it is pure Perl code; the ord function returns the numeric (ASCII) code of the character stored in $1; next, sprintf converts this number into its hexadecimal form (you can use %X if you want the hex digits in uppercase); note that %x yields a single digit for character codes below 16, so %02x is the safer format if the data may contain control characters

-       e is a modifier that tells the regex engine to treat the replacement field as Perl code (see above)

-       g is a modifier that tells the regex engine to apply the substitution repeatedly to all the characters of the string, starting with the first one

    • we close the binary file
  • the @array is reset to an empty list
  • the "file.bin" file is reopened for reading through the same $fh variable; if an error arises, die prints the appropriate error message and the script ends; please note that the binmode call made earlier applied only to the handle while it was open for writing, so if you need binary mode for reading you have to call binmode($fh) again after this open (for this file, which contains only hex digits and "\0" characters, it should make no practical difference)
  • in order to preserve the content of $/, we read the binary file inside an anonymous block; when we exit from the block, the initial content of $/ is restored:
    • the special variable $/ is declared with local and set to "\0", the delimiter for the records of our binary file
    • a while loop allows us to read the file one record at a time; for each record read from the file, the following steps are performed:
      • the current record is assigned to $_
      • the chomp function removes the "\0" character from the end of $_
      • we need to convert the hex string stored in $_ to a character string and we use the s/// substitution operator:

-       ([a-fA-F0-9][a-fA-F0-9]) matches any two hexadecimal digits and stores them in $1

-       chr(hex($1)) is the replacement part of the substitution operator; the Perl hex function converts the hexadecimal string stored in $1 into its numeric value and returns this value; the chr function returns the character represented by the number returned by the hex function

-       the e and g modifiers have the same meaning as in the above code, where we converted a character string into a hex string

      • we append the content of $_ to the @array
  • we close the binary file
  • finally, the array is printed, each element on a line, using a foreach loop.
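
As an alternative to the two substitutions, the built-in pack and unpack functions can do the same conversion with the "H*" template. This is a swapped-in technique shown only as a sketch, and it assumes the string contains plain single-byte characters:

```perl
use strict;
use warnings;

my $text = "Perl arrays";

# "H*" means: a hex string, high nybble first, as long as the data.
my $hex  = unpack("H*", $text);   # character string -> hex string
my $back = pack("H*", $hex);      # hex string -> character string

print "$hex\n";
print "$back\n";
```

With this template every byte always produces exactly two hex digits, so the single-digit problem mentioned for %x cannot occur.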

Well, it’s quite a long story; if it’s boring for you, set it aside and look at the code. It works!

 

I’ll give you a simple example of how to chomp an input line read from STDIN and perform some actions depending on the content of the input line.

Take a look at the following sample code:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
chomp (my $choice = lc <STDIN>);
if ($choice eq 'yes') {
  print "Choice is Yes\n";
} elsif ($choice eq 'no') {
  print "Choice is No\n";
} else {
  print "Error: Choice $choice\n";
}
Let’s see how it works:
 
  • the lc function supplies scalar context to its argument, so the <STDIN> operator reads only one line from standard input; this line is converted to lowercase and then assigned to the $choice scalar variable; next, the Perl chomp function removes the trailing newline from $choice
  • we use the if-elsif-else statement to check the content of the variable read from STDIN and perform different actions depending on the value stored in $choice

In the above example we could use the uc (uppercase) function too, modifying the comparisons accordingly.
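
If the user may type extra spaces around the answer, the comparison can be made more tolerant. The following is only a sketch: the helper name classify_choice is hypothetical, and the trimming step is an addition that the example above does not perform:

```perl
use strict;
use warnings;

# Hypothetical helper: normalizes one raw input line and classifies it.
sub classify_choice {
    my ($raw) = @_;
    chomp(my $choice = lc $raw);     # lowercase, then drop the newline
    $choice =~ s/^\s+|\s+$//g;       # also drop surrounding whitespace
    return $choice eq 'yes' ? 'Yes'
         : $choice eq 'no'  ? 'No'
         :                    'Error';
}

print "Choice is ", classify_choice("  YES \n"), "\n";
print "Choice is ", classify_choice("maybe\n"), "\n";
```

Putting the test in a sub makes it easy to try different inputs without typing them at the keyboard.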

  

Our task is to print from a flat file database the names of cities located in Europe. We’ll name our file "cities.txt" and we suppose its records have the following structure:

continent:country:city

As you can see, the fields of a record are delimited by the ":" separator. A sample of this file is as follows:

Asia:China:Beijing
Europe:France:Paris
Europe:Austria:Vienna
Africa:Kenya:Nairobi
Asia:Japan:Tokyo
Europe:Greece:Athens
 
If you want to work with the following example, create a new text file called "cities.txt", paste the above lines of text into it and save it in the same directory as the script file. The content of the script file is as follows:
#!/usr/local/bin/perl
 
use strict;
use warnings;
 
open (FN, "cities.txt") || die "Can't open cities.txt: $!";
my @cities;
while (<FN>) {
  chomp;
  next if index(uc $_, "EUROPE") == -1;
  # or next if index(lc $_, "europe") == -1;
 
  my @tmp = split(/:/);
  push @cities, $tmp[2];
  # or unshift @cities, $tmp[2];
}
print "Cities from Europe: @cities\n";
First, the script opens "cities.txt" for reading through the FN filehandle. If the file is not found or cannot be read for any other reason, the die function ends the script and prints our message together with the system error text stored in the $! special variable, as in the following output:
 
Can't open cities.txt: No such file or directory at 1.pl line 6.
 
To get this message, I renamed "cities.txt" to "cities" and of course Perl didn’t find the file. OK, let’s rename our file back to its correct name.

If everything is OK, the while loop automatically reads our file one line at a time and assigns the current line to the special variable $_. Next, let’s see what happens inside the while loop at each iteration step:

  • the Perl chomp function is called with no argument, so it removes the trailing newline from the value assigned to $_
  • we check whether the word "Europe" is found in the line assigned to $_
    • we need to make our search case-insensitive, so we use the uc function to return the value of $_ converted to uppercase
    • the index function searches for the text "EUROPE" in the uppercased string and, if it doesn’t find it, the next operator moves on to the next iteration of the loop
  • if the string is found, the split function creates the @tmp array from the fields of the current record
  • the push function appends the third field of the record (the city, stored in $tmp[2]) to the @cities array

Finally, the script will print the cities we searched for:

Cities from Europe: Paris Vienna Athens

You can use either uc or lc to search for our substring case-insensitively. Please note that we used the push function to append the city name to the @cities array, and because of this we get the cities listed in the same order as they are in the file. But we can use the unshift function too, and in this case we get the cities listed in reverse order.

If you want the cities listed in ascending alphabetical order, you need to sort the @cities array, i.e. insert the next line before printing the array:

@cities = sort {$a cmp $b} @cities;
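
Because index looks at the whole line, a record would also match if the word "Europe" appeared in the country or city field. As a sketch of a stricter variant, the following code splits the record first and compares only the continent field; the helper name cities_in is an assumption, and the records are passed as a list instead of being read from "cities.txt":

```perl
use strict;
use warnings;

# Hypothetical helper: returns the cities whose continent field matches
# $continent, comparing case-insensitively.
sub cities_in {
    my ($continent, @records) = @_;
    my @cities;
    foreach my $record (@records) {
        chomp(my $line = $record);            # work on a copy of the record
        my ($cont, $country, $city) = split /:/, $line;
        next unless defined $city;            # skip malformed records
        push @cities, $city if lc($cont) eq lc($continent);
    }
    return @cities;
}

my @sample = ("Asia:China:Beijing\n", "Europe:France:Paris\n",
              "Europe:Austria:Vienna\n", "Africa:Kenya:Nairobi\n");
my @found = cities_in('Europe', @sample);
print "Cities from Europe: @found\n";
```

The sample records keep their trailing newlines, as they would when read from a file, so chomp is still needed before split.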

 

In the next example we read a few hexadecimal numbers from STDIN and translate them into characters. The hexadecimal numbers are separated by the ', ' delimiter.

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my $line;
chomp ($line = <STDIN>);
 
print map { chr hex } split /, /, $line;
print "\n";
A few words about how it works:
 
  • the <STDIN> operator works in scalar context, so only one line is read from standard input; the line is assigned to the $line scalar variable and the Perl chomp function removes the trailing newline from it
  • $line is split on the ', ' separator into a temporary list; the map function runs its block on each element (hexadecimal number) of the list, and the elements are assigned in turn to $_; for every element stored in $_:
    • the hex function converts the hexadecimal number into its numeric value; please note that hex is called without an argument, so it uses the value stored in $_ by default
    • the chr function converts the numeric value returned by the hex function into the corresponding character
  • the print function prints the list returned by the map function

A sample run (the first line is the input, the second is the output):

54, 68, 69, 73, 20, 69, 73, 20, 50, 65, 72, 6c, 21
This is Perl!
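
The conversion can also be run in the opposite direction. The following sketch wraps both directions in two subs whose names (to_hex_list and from_hex_list) are assumptions made for this illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: turns a string into a ', '-separated hex list.
sub to_hex_list {
    my ($text) = @_;
    return join ', ', map { sprintf('%x', ord($_)) } split //, $text;
}

# Hypothetical helper: the inverse, using the same chr/hex pair as above.
sub from_hex_list {
    my ($list) = @_;
    return join '', map { chr(hex($_)) } split /, /, $list;
}

my $list = to_hex_list("This is Perl!");
print "$list\n";
print from_hex_list($list), "\n";
```

For the string "This is Perl!" the first sub produces the same list that was typed as input in the sample run above.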

 

You can use the Perl push function if you need to read some lines one by one from STDIN, chomp them and store them in an array. The next code snippet is an example of how you can do it.

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
# declare and initialize an array
my @lines = ();
 
while (<STDIN>) {
    chomp;
    push @lines, $_;
}
 
print "@lines\n";
The while statement reads lines from STDIN, one by one, until the end of file (EOF) is met (if STDIN is connected to the keyboard, you signal EOF by typing Ctrl+Z and Enter in Windows, or Ctrl+D in Linux). Inside the block of the while statement, after a line is read from standard input, the following steps are performed:
 
  • each line is assigned to the special variable $_
  • calling the Perl chomp function without an argument removes the trailing newline from $_
  • the value assigned to $_ is appended to the @lines array

Finally, the array is printed.

You can perform other actions inside the while loop if necessary (such as using a regexp to validate the input line, using split to break the line into individual units, making comparisons, and so on). But if you don’t need more than in the above example, you can shorten the code by writing the while statement on a single line:

chomp, push @lines, $_ while (<STDIN>);
If you have nested statements or blocks and you don’t want to use $_, so as not to alter its value, you can read from STDIN into a temporary scalar variable and replace the above line with the next two:
 
my $line;
chomp $line, push @lines, $line while ($line = <STDIN>);
 

You can use the Perl chomp and split functions to read a csv file line by line, chomp each line and then split the content of the line into different variables. A csv file is a comma-delimited text data file that usually has the extension ".csv".

Many applications, such as Excel and MS Access, allow you to save data in csv format. It’s also very easy to create a simple csv file yourself, by delimiting the data with commas and ending each line with a newline character.

In the following example I’ll use the __DATA__ pseudo-datafile marker instead of a real file, to make the script easier to test:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
foreach my $line (<DATA>) {
  chomp $line;
  my ($lastname, $firstname, $country, $age) = split /,/,$line;
  print "Last Name: $lastname, First Name: $firstname,
         Country: $country, Age: $age\n";
}
 
__DATA__
John,Silva,USA,23
Antoine,Chevron,France,45
And the output:
 
Last Name: John, First Name: Silva,
         Country: USA, Age: 23
Last Name: Antoine, First Name: Chevron,
         Country: France, Age: 45
 
A few words about how this code operates. The foreach statement evaluates the <DATA> operator in list context, so the operator returns a temporary list with all the lines of our pseudo-datafile (each line of the file becomes an element of this list).

Next, the foreach statement loops through this temporary list element by element, aliasing the current element to the $line variable. Inside the block of the foreach statement, the following steps are executed for each line:

  • the Perl chomp function removes the trailing newline from the $line scalar variable
  • the split function splits the content of $line on the comma separator and the resulting fields are assigned to the $lastname, $firstname, $country and $age variables
  • the content of the four variables is printed
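
One detail worth knowing when splitting csv lines: by default, split drops empty fields at the end of the line. The following sketch uses a hypothetical record with two empty trailing fields to show how a negative limit keeps them:

```perl
use strict;
use warnings;

my $line = "Maria,Lopez,,\n";    # hypothetical record with two empty fields
chomp $line;

my @default = split /,/, $line;       # trailing empty fields are dropped
my @all     = split /,/, $line, -1;   # a negative limit keeps them

print "Default split: ", scalar(@default), " fields\n";
print "With limit -1: ", scalar(@all), " fields\n";
```

Note that a simple split on commas does not handle quoted fields that contain commas themselves.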

 

While the chop function deletes the last character of a string regardless of what it is, the Perl chomp function checks whether the end of the string matches the input record separator ($/) and only then deletes it.

If the last character of a string is a newline (and $/ has its default value "\n"), both functions do the same thing, i.e. remove the newline character.

Please look at the following example:

#!/usr/local/bin/perl
 
use strict;
use warnings;
 
my $string = "Chomp and chop functions";
chomp($string);
print "$string\n";
# The $string remains unchanged (no ending newline)
 
chop($string);
print "$string\n";
# chop will delete the last character of the string
 
$string = "Chomp and chop functions\n";
chomp($string);
print "$string: the newline was removed \n";
# chomp will remove the ending newline
Output:
 
Chomp and chop functions
Chomp and chop function
Chomp and chop functions: the newline was removed
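
The two functions also differ in what they return: chomp returns the number of characters removed, while chop returns the character it removed. A minimal sketch, reusing the string from the example above:

```perl
use strict;
use warnings;

my $string = "Chomp and chop functions\n";

my $count = chomp($string);   # number of characters removed
my $char  = chop($string);    # the character that was removed

print "chomp returned: $count\n";
print "chop returned: $char\n";
print "$string\n";
```

Here chomp returns 1 and chop returns the letter s, leaving "Chomp and chop function" in $string.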