Perl chomp Function
The Perl chomp function removes the trailing newline from a string.
There are three syntax forms for this function: chomp VARIABLE, chomp (LIST) and chomp with no argument.
chomp VARIABLE
If you use the first syntax form, you call the function with a scalar variable as its argument. In this case the function removes the trailing record separator, if any, from the end of the variable.
The Perl chomp function removes only a trailing string that matches the special variable $/ (the input record separator). If the variable does not end with that string, it remains unchanged. The function returns the total number of characters removed.
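As a minimal sketch of the scalar form, assuming $/ still has its default value "\n" (the variable names are only illustrative):

```perl
use strict;
use warnings;

my $text    = "hello\n";
my $removed = chomp($text);     # strips the newline, returns 1
print "text=$text removed=$removed\n";

my $plain = "hello";
my $none  = chomp($plain);      # nothing to strip, returns 0
print "plain=$plain removed=$none\n";
```

The second call shows that chomp is safe to use on a string that has no trailing newline.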
chomp (LIST)
If the first syntax form is for scalar variables, the second is for lists. You can chomp any list, array or hash (associative array). Whether you chomp an array or a hash, it is safer to put it inside the parentheses; otherwise the result may not be what you expect.
If you chomp an array, the trailing string that matches $/ is removed from every element. The function returns the total number of characters removed.
In the case of hashes, only the values are chomped, whereas the keys remain unchanged. Again, the return value is the total number of characters removed.
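The list form can be sketched as follows; the sample data is hypothetical and $/ is assumed to be the default "\n":

```perl
use strict;
use warnings;

my @lines = ("one\n", "two\n", "three");
my $from_array = chomp(@lines);     # 2, because "three" had no newline

my %age = ("John\n" => "43\n", "Paul\n" => "25\n");
my $from_hash = chomp(%age);        # 2, because only the values are chomped

print "array: $from_array, hash: $from_hash\n";
print "keys still end in a newline\n" if grep { /\n\z/ } keys %age;
```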
chomp
The last syntax form of the Perl chomp function omits the argument. In this case, it chomps the special scalar variable $_.
The chomp function is most frequently used when $/ has its default value "\n". In this case, chomp removes the uncertainty about whether a line of input ends with a newline character: if it finds a newline as the rightmost character of the line it removes it, otherwise it does nothing.
If you use the <STDIN> handle to read input, you should generally chomp each line immediately after reading it.
The chomp function returns the total number of characters removed from all its arguments, whether you use it in scalar or list context. Please note that this function modifies its arguments in place and does not return the modified string, so the following two calling examples of the Perl chomp function are wrong:
In the second example, the $line variable is first assigned "1234\n". The Perl chomp function then removes the trailing newline from $line. Finally, $line is assigned the value returned by chomp, i.e. 1 (the number of characters removed).
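The original listings are not reproduced here, so the following is only a sketch of the kind of mistake described above, next to the intended usage. The variable names are illustrative.

```perl
use strict;
use warnings;

# Mistake: the result of chomp is a count, not the cleaned string.
my $wrong = "1234\n";
$wrong = chomp($wrong);          # $wrong is now 1

# Intended usage: chomp modifies its argument in place.
my $right = "1234\n";
chomp($right);                   # $right is now "1234"

# Common idiom: assign and chomp in one statement.
chomp(my $line = "1234\n");      # $line is now "1234"

print "wrong=$wrong right=$right line=$line\n";
```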
The following example shows how to use the return value of the chomp function when you chomp a scalar, an array or a hash.
In the next code I assign the string value "blue\n" to the $color variable. The Perl chomp function removes the newline character from the end of the $color variable.
To illustrate how to use the Perl chomp function with an array, take a moment to examine the next code snippet:
Newlines removed: 4
I want to make a note about the first print statement. As you can see, the string I intended to print is enclosed in double quotes. That means it is interpolated, i.e. any variable or escape sequence found between the quotes is replaced with its value.
If an array (in our case @numbers) is inside the double quotes, the array is interpolated and its elements are printed separated by a space or whatever you have in $". The first @ character is preceded by the backslash character \ so that it is printed as a normal character and not interpolated.
In the second print the $nr scalar variable is interpolated too; that is, it is replaced by its value: 4.
The usual way to use chomp with arrays is when you read a file and need to remove the trailing newline, if any, from the end of each line. I will illustrate this with the DATA pseudo-filehandle, which reads the text that follows the __DATA__ marker. Take a moment to examine the next code snippet:
This code reads all the lines from the <DATA> handle up to EOF and creates the @colors array, storing the content of each line as an element of the array. Next, the Perl chomp function removes the trailing newline from each element of the array and returns the total number of characters removed. If White is the last line of the script, the output is as follows:
One approach to chomp the elements of an array is to use the Perl chomp function inside a while loop:
- $count is decremented
- the current element of the array is chomped and the $nr variable is incremented with the number of removed characters
Finally, when the loop ends, the content of the array and the number of characters removed are printed. Please note that the elements of the array are chomped from right to left (because the index used to access the elements of the array is decremented).
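The loop steps above can be sketched like this; the array content is a hypothetical example and the variable names follow the description:

```perl
use strict;
use warnings;

my @numbers = ("one\n", "two\n", "three\n", "four\n");
my $count   = @numbers;    # number of elements
my $nr      = 0;           # characters removed so far

while ($count > 0) {
    $count--;                          # step to the previous index
    $nr += chomp($numbers[$count]);    # chomp one element in place
}

print "The array content: @numbers\n";
print "Newlines removed: $nr\n";
```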
See the following code sample :
The above code will produce the following output:
$VAR1 = {
'Marie
' => '22',
'John
' => '43',
'Paul
' => '25'
};
If you look at the output, you’ll notice the newline character after each key (Marie, John, Paul). The three newlines removed belong to the hash values.
If you want to read the elements of a hash from a file, be careful about how you organize your input file.
See the next code snippet:
One way to bypass this is to reorganize the format of the input file and read the file one record at a time, using the while statement:
- the current line of the file is assigned to the special variable $_
- the chomp function removes the newline from $_ (this is the case when the Perl chomp function has no argument and it is called against $_ - the third syntax form of the chomp function)
- the line from $_ will be split into the @line array, using the comma separator
- the new current pair element is added to %fruitsColors hash, where the key is set to $line[0] (the first element of the @line array), while the value is set to $line[1] (the second element of the same array). Please note that the hash members are accessed with {} and the array members are accessed with [].
The output of this code snippet is as follows:
$VAR1 = {
'cherry' => 'red',
'plum' => 'darkblue',
'apricot' => 'yellow'
};
Please note that the hash is not ordered and you couldn’t expect any order when you display its elements. If you want to process the elements of the hash in the order they have been inserted, you can use the Tie::IxHash module (search CPAN for it).
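The record-at-a-time reading described above can be sketched as follows. An in-memory filehandle stands in for the input file, which is an assumption of this sketch, and the keys are sorted only to make the printed order predictable.

```perl
use strict;
use warnings;

# In-memory stand-in for the input file (one "key,value" record per line).
my $data = "cherry,red\nplum,darkblue\napricot,yellow\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my %fruitsColors;
while (<$fh>) {
    chomp;                                # strip the newline from $_
    my @line = split /,/;                 # split $_ on commas
    $fruitsColors{ $line[0] } = $line[1];
}
close $fh;

print "$_ => $fruitsColors{$_}\n" for sort keys %fruitsColors;
```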
Next, I’ll show you another example that reads a set of hash keys from a pseudo-datafile. See in the next code snippet how you can get rid of the trailing newlines:
I want to make some comments about this code. If you remember one of the syntax forms of the map function, it runs a block of statements on each element of a list and returns a new list:
In our example, the Perl map function expects a list as its second argument, so <DATA> is evaluated in list context and returns all the lines. Next, the map function runs the block {chomp;$_=> 0} on each element of this list:
- each element of the array will be assigned in turn to $_
- chomp will remove the trailing newline from $_
- the pair element ($_,0) will be added to %fruits hash
Finally, I print the keys of the %fruits hash using the foreach statement and the keys function. The keys function builds a list of the hash keys, and only afterwards does the foreach statement loop through that list.
Please note that printing a hash with foreach and keys is not well suited to hashes with a large amount of data, because the whole list of keys is built first.
The output is as follows:
cherry plum apricot
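A minimal sketch of the map technique discussed above. It reads from an in-memory filehandle instead of the data section, which is an assumption of this sketch, and it sorts the keys so the printed order is predictable.

```perl
use strict;
use warnings;

my $data = "cherry\nplum\napricot\n";    # stand-in for the data section
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

# <$fh> in list context returns every line; the block chomps each one
# and emits a (key, value) pair.
my %fruits = map { chomp; ($_ => 0) } <$fh>;
close $fh;

print join(" ", sort keys %fruits), "\n";
```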
The special variable $_ is a scalar variable, so in this case chomp operates on a single scalar. Recall the third syntax form of the Perl chomp function: if you use chomp without an argument, the content of the $_ variable is chomped. See the next short example:
The next example chomps the elements of an array one by one, using the foreach statement:
The array content: one two three four
If you want to deal with Perl chomp and scalar references, you can look at the following code sample:
In order to chomp the reference, you need to dereference it by putting the appropriate symbol in front of the reference – in our case the $ symbol, because we have a scalar data type here.
The next code shows you an example about how to chomp an array reference:
Characters removed: 2
The array content: John Peter Alice
To chomp the array reference, you must dereference it by putting a @ symbol in front of the $arrayRef reference (or you can put the array reference in curly braces: @{$arrayRef}).
Finally we print how many characters were removed and the content of the array.
Please note that not all the elements of the array end in newline: Perl chomp will check if the last character is a newline and only then it removes it, otherwise the array element remains unchanged.
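The dereferencing rules for scalar and array references can be sketched as below. The data is hypothetical, chosen so that only two array elements end in a newline.

```perl
use strict;
use warnings;

# Scalar reference: dereference with $ before chomping.
my $color     = "blue\n";
my $scalarRef = \$color;
chomp($$scalarRef);                  # $color is now "blue"

# Array reference: dereference with @ before chomping.
my $arrayRef = ["John\n", "Peter", "Alice\n"];
my $nr = chomp(@$arrayRef);          # same as chomp(@{$arrayRef})

print "Characters removed: $nr\n";
print "The array content: @$arrayRef\n";
```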
The following examples show you some ways to chomp a hash reference. Let’s start by showing you the first code sample:
Characters removed: 1
Enclosing a list of key/value pairs in braces creates a new anonymous hash and returns a reference to it. To chomp the hash reference, you must dereference it by putting a % symbol in front of the $hashRef reference (or you can put the hash reference in curly braces: %{$hashRef}). But if you look at the output, you’ll see that the Perl chomp function removed only 1 newline, because the hash keys are not chomped.
You can loop through the hash using the while statement with the each function and chomp the keys one by one:
- the each function will return the key and the value for the next element of the hash, apparently in a random order
- the current $key is deleted from the hash
- $key is chomped and $nr is incremented with the number of characters removed
- the pair ($key, $val) is added to the hash
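At the end of the script the number of characters removed and the content of the hash are printed. Please note that I first deleted the key and only afterwards chomped it. Be aware that the Perl documentation warns that adding entries to a hash while iterating over it with each can cause entries to be skipped or duplicated, so looping over a copy of the keys is the safer variant.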
The output of this script will be as follows:
$VAR1 = {
'John' => '23',
'Alice' => '32',
'Peter' => '45'
};
At each iteration step of the Perl map function, the current key of the temporary array will be assigned to the special variable $_ and then the block will be executed.
Finally, we’ll get the new copy of our hash with all the keys and values chomped. You can use this method in more general cases where you need to modify the key and the value of the hash’s elements and get a copy of the original hash with the new keys and values.
The output is identical to the one in the previous example.
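A minimal sketch of the map-based copy described above. The hash content is hypothetical and %clean is an illustrative name; the original hash is left untouched because the block works on copies.

```perl
use strict;
use warnings;

my $hashRef = { "John\n" => "23\n", "Alice\n" => "32\n", "Peter\n" => "45\n" };

# Build a new hash whose keys and values are both chomped.
my %clean = map {
    my ($key, $val) = ($_, $hashRef->{$_});   # copies of key and value
    chomp($key, $val);
    ($key => $val);
} keys %$hashRef;

print "$_ => $clean{$_}\n" for sort keys %clean;
```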
This function is very useful when you read data from the keyboard through the special filehandle STDIN. Each line read from the standard input stream ends with a newline character (\n), and the Perl chomp function takes it off, as in the following example:
This code reads the user name from STDIN and stores it in the variable $name; when assigned to a scalar variable, the input operator reads one line only.
This variable ends with a newline character, inserted when the user hit the Return key. The Perl chomp function removes it. (If you want to test this code, hit the Enter key after typing the user name.)
You can omit the filehandle, and then the input operator <> reads either from the files named on the program’s command line (which are stored in the special array variable @ARGV) or from STDIN if none are specified. For this you may replace the line:
with:
So, if you use <> and no argument is supplied to the script on the command line, the input operator reads from STDIN (usually the keyboard).
You can also set the command line arguments inside a Perl script, without specifying them on the command line. See the following code snippet. If you want to test this script, you need to create the file "psw.txt" in the current directory with a text editor; here is an example of the content of the file:
user##~1
user^&*2
user+=&^a
The parentheses force chomp to act on the result of what is between them.
First the diamond operator <> is evaluated and all the lines read from STDIN (by default) will be stored as elements in @array.
Next the entire @array is chomped and the total number of newlines removed is stored in the scalar variable $count.
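The evaluation order described above can be sketched as follows. To keep the example self-contained it reads from an in-memory filehandle rather than from <>, which is an assumption of this sketch.

```perl
use strict;
use warnings;

my $input = "alpha\nbeta\ngamma\n";    # stand-in for STDIN or a file
open my $in, '<', \$input or die "Cannot open in-memory file: $!";

# The assignment runs first (reading every line), then chomp is applied
# to the whole array and its return value lands in $count.
my $count = chomp(my @array = <$in>);
close $in;

print "Lines: @array\n";
print "Newlines removed: $count\n";
```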
If you read a single line from STDIN you must chomp it in order to remove the ending newline, and you can do this on a single line of code like in the following short script:
To print the dollar sign without making Perl think it is a scalar, you need to precede it by the backslash (\) escaping character.
If you read more lines from STDIN and intend to store them in an array, you can use the following snippet code:
If you want to test and run this code, after the last input line you must type Ctrl/d in Linux (Ctrl/z in Windows, followed by the Enter key) to tell Perl that you have finished entering the data.
If you want to store the lines read from STDIN as pair elements in a hash (first line for key, the next for value and so on), similarly with the previous snippet code, you can write:
Generally, the Perl chomp function removes only the last newline character, but what if there are more? In this case you can use the special variable $/ (the input record separator) and switch to paragraph mode by setting $/ to "", as in the next example:
After running this code, the first two newlines will remain unchanged, but the last four will be removed. The variable $nr will be set to 4, the number of newline characters removed.
Be careful, however, when you alter the content of the special variable $/: restore it to its previous value when you are done (especially if your code is longer than the above snippet).
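A sketch of paragraph-mode chomping under the assumptions above: two newlines inside the string and four at the end (the text itself is hypothetical). Using local inside a bare block restores $/ automatically.

```perl
use strict;
use warnings;

my $text = "line one\n\nline two\n\n\n\n";
my $nr;
{
    local $/ = "";         # paragraph mode, only inside this block
    $nr = chomp($text);    # strips every trailing newline
}
# $/ has its previous value again here.

print "Removed: $nr\n";
```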
One approach to chomp the elements of an array is to use the Perl chomp function inside a foreach loop:
At each iteration step the current element of the array will be assigned to $item, the Perl chomp function will chomp the value stored in $item and will increment the scalar $nr variable with the number of characters removed.
Please note that in a foreach loop the iterator variable is rather an alias of the current element of the array, so if you change the iterator variable content, the current element content of the array will be changed.
Practically, by chomping the iterator we really chomp the current element of the array.
Finally, when the loop ends, the content of the array and the number of characters removed will be printed.
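The aliasing behavior described above can be sketched like this, with a hypothetical @colors array:

```perl
use strict;
use warnings;

my @colors = ("red\n", "green\n", "blue\n");
my $nr = 0;

foreach my $item (@colors) {
    $nr += chomp($item);    # $item aliases the array element itself
}

print "The array content: @colors\n";
print "Characters removed: $nr\n";
```

Because $item is an alias rather than a copy, the array itself is modified.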
If you have an array and you want to chomp all the elements of the array using a foreach statement, you can use the following code:
The foreach statement loop iterator variable is missing and that means that foreach will use for iteration the special variable $_, reading the elements of the @numbers array one by one. At each step of iteration:
- the current element of the array will be assigned to $_
- the value assigned to $_ has its ending newline removed: because I used Perl chomp without any argument, the function removes the trailing newline from $_
- finally, I used print function with double quotes in order to display the array elements separated by space.
In the next example, the foreach statement is used with an iterator variable and in connection with STDIN filehandle:
- the current element of the array is assigned to the $line variable
- the Perl chomp function removes the ending newline from the $line scalar variable
- the push function appends the content of $line to @colors array
Finally, the @colors array is printed. A possible output for the above sample code is as follows:
blue
white
Ctrl/z in Windows (Ctrl/d in Linux)
blue white
You can use the while loop to read some lines from STDIN, chomp them one by one and next append them to an array.
This approach (using while and not foreach) is preferable if you need to read large amount of data from STDIN or other input file handle.
See the following code:
The while loop will read from standard input some lines of text:
- each line is assigned in turn to the special variable $_
- the Perl chomp function will remove the last newline from $_ (because the chomp function is called without any argument, it will remove the newline from $_)
- the content of each line is then added to @colors array by using the push function.
An example of output could be the following:
yellow
Ctrl/Z in Windows (Ctrl/d in Linux)
@colors = blue yellow
To indicate the end of the input, after the input lines you must type Ctrl/d in Linux (Ctrl/z in Windows, followed by the Enter key) as the last line.
When you want to read from STDIN using the while statement, you can use either of the following two code lines, too:
To override the normal behavior of a loop, the Perl language provides three loop control operators: last, next and redo. The following code snippet shows how you can use them in conjunction with the Perl chomp function.
Please take a look at the following code snippet:
- Perl reads a line from STDIN (if no filehandle is used with the diamond operator <>, the interpreter examines the @ARGV special array variable and, if this array is empty, the operator reads from STDIN); because $line is a scalar variable, Perl reads only one line from STDIN, converts it to lowercase and then stores it in the $line variable;
- chomp removes the ending newline from $line
- if $line is equal to the word last, the while loop ends entirely, skipping the remaining statements in the block
- if $line is equal to the word next, the while loop skips the rest of the code and moves on to the next iteration
- if $line is equal to the word redo, the while loop repeats the same iteration without reevaluating the condition (before executing redo I assigned 1 to $done, to show you that the condition is not evaluated in this case)
- if $line is some other word, it will be printed by the last statement of the block and the while loop will reiterate
If you have a file with a large amount of data, let’s say it is named "fn.txt", you should not slurp the file into an array and chomp it globally as in the next sample code, because the whole file would be held in memory:
In this case you need to read the records of the file one at a time, process the record and then read the next record and so on. There are many ways to do it; I’ll show you an example using the while statement.
In the next code snippet we use the Perl defined function to test whether the line-input operator <FILE> has reached the end of the file. In scalar context this operator returns a line of text, but if there are no more input lines, it returns undef.
The while loop ends when <FILE> returns undef, i.e. when the end of the file is reached. The Perl chomp function is used to get rid of the newline that may be attached to the end of the $line variable.
Anyway, using the defined function explicitly is the longest way to read from a filehandle with the while statement. The two code lines below are equivalent:
One approach to chomp the elements of an array is to use the Perl chomp function inside a map block:
Inside the map block the Perl chomp function will remove from the element assigned to $_ the ending characters that match $/ (in our case it is only a newline) and it will increment the $nr variable with the number of chomped characters.
You can use the defined function to test whether the line-input operator <STDIN> has reached the end of the file. In scalar context this operator returns a line of text, but if there are no more input lines, it returns undef. This matters when a line could evaluate to false, for instance a final line that contains only 0 and has no newline. See the following code snippet for an example:
Anyway, using the defined function explicitly is the longest way to read from STDIN with the while statement. The two code lines below are equivalent:
The Perl chomp function removes any trailing string that corresponds to the special variable $/ (the input record separator). By default, $/ is set to the newline character, i.e. "\n". You can change the input record separator to perform different tasks; it can be set to any string you like.
I think the best way to deal with this is to perform the respective tasks inside a block where you declare the $/ special variable with local (another idea is to save the value of $/ in a scalar variable and restore it later). It is very important not to leave this variable altered for the rest of your script, otherwise you can expect file reading issues.
You can set the $/ special variable to "", which is called "paragraph mode". In this mode you can read a text file paragraph by paragraph and chomp all the newlines from the end of each paragraph.
An illustrative example here:
I’ll examine below how this script works:
- the @array is initialized with an empty list (actually my @array; is enough, because an array declared with my starts out empty)
- the file is read in an anonymous block because $/ is declared there with local, so after exiting from the block Perl automatically restores the previous value of $/ (an alternative to the block is to store the current value of $/ in a scalar variable and restore it after reading the file).
- to read the file the foreach statement is used:
- each paragraph of the file is assigned in turn to $_;
- chomp removes the trailing newlines from $_
- push appends the paragraph stored in $_ to @array
- finally, the @array is printed, each element on a new line.
This script produces the following result:
This is Perl.
Some Perl functions here: push, substr, index.
This is Perl too.
Finally, I can’t stress enough that you should restore the $/ value at the end of your processing if you altered it, especially if your script is a more elaborate one.
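The paragraph-mode reading described above might look like the sketch below. It differs from the description in two ways, both assumptions of this sketch: the text comes from an in-memory filehandle rather than a real file, and a while loop replaces foreach so that only one paragraph is held at a time.

```perl
use strict;
use warnings;

my $text = "This is Perl.\n\n\n"
         . "Some Perl functions here: push, substr, index.\n\n"
         . "This is Perl too.\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @array;
{
    local $/ = "";                    # paragraph mode inside this block only
    while (my $para = <$fh>) {
        chomp $para;                  # removes all trailing newlines
        push @array, $para;
    }
}
close $fh;

print "$_\n" for @array;
```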
If you have some strings split over two or more lines, you can use a continuation character to tell Perl that a continuation line follows. See the next code snippet:
This is another one
In the above code we used the backslash character to indicate that a line will be continued and the __DATA__ pseudo-datafile marker instead of a real file, to simplify the test of this script.
The while loop will read the lines one at a time and will concatenate the $line variable with the current line read from the <DATA> pseudo-filehandle. At the beginning of the script the $line variable will be initialized with the null string.
Inside the while block we do the following:
- the Perl chomp function will remove the trailing newline from $line
- we use the substitution operator (s///) to remove the ending backslash from $line (the =~ operator binds the substitution operator to the $line variable); the first backslash is the escape character for the next special character, which in our case is the backslash character too; \s matches any whitespace character (space, \t, \f, \r, \n) that follows the backslash we used as a continuation character; the * character causes the preceding item (in our case a whitespace character) to be matched 0 or more times
- if we succeed - that means we have a continuation line - the next control will skip the rest of the block, the script continuing with the reading of another line
- if the substitution operator returns false, the $line remains unchanged and it will be appended to @array using the push function, the $line variable will be initialized to the null string and the script will continue with a new iteration of the while loop
Finally, we used the foreach statement to print the array.
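The continuation-line handling walked through above can be sketched as follows. The input text is hypothetical and an in-memory filehandle replaces the data section; both are assumptions of this sketch.

```perl
use strict;
use warnings;

# Two logical lines; the first is split with a trailing backslash.
my $data = "This is a long \\\nsentence.\nThis is another one\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @array;
my $line = "";
while (<$fh>) {
    $line .= $_;                     # append the physical line
    chomp $line;
    next if $line =~ s/\\\s*$//;     # trailing backslash: keep accumulating
    push @array, $line;              # a complete logical line
    $line = "";
}
close $fh;

print "$_\n" for @array;
```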
Please recall that the Perl chomp function removes the input record separator stored in the $/ special variable. We will examine below the case when $/ is set to LF ("\n").
In most cases you don’t need to worry about the specific platform your script runs on. When you read a line from a file, you need to remove the trailing newline from your line. The chomp function will do the right thing and remove the record separator at the end of your string.
On a Unix platform a line read from a file is expected to end in only a newline (LF), and the Perl chomp function removes the newline found at the end of the line.
On a Windows platform, a line in a text file is expected to end in both a CR ("\r") and an LF ("\n"), but when reading in text mode Perl automatically converts CRLF to LF. The chomp function removes the ending newline (LF) as in the previous case.
However, problems can occur if you read documents written on one platform from another platform, for example a file with CRLF line endings read on Unix, where chomp leaves the "\r" in place.
An alternative to the Perl chomp function is to use the regular expressions. Here is an example:
first line
second line
In the above script we used the __DATA__ pseudo-datafile marker instead of a real file, to simplify the test of this script.
The script reads the entire file into a temporary list and foreach goes through the elements of this list, removing the CR and LF characters with the substitution operator (s///).
The pattern binding operator (=~) allows us to make the substitution on the $line string variable.
The question mark (?) causes the preceding character (\r) to be matched either zero times or once, and the dollar sign ($) anchors the match at the end of the $line string.
After removing the CR, if present, and the LF, the current line is appended to @array.
Finally, we print the @array using the foreach statement. As you can see in the output, the newlines were removed.
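A sketch of the regular-expression alternative described above. The input mixes a CRLF and an LF line ending and is read from an in-memory filehandle; both are assumptions of this sketch.

```perl
use strict;
use warnings;

my $data = "first line\r\nsecond line\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @array;
foreach my $line (<$fh>) {
    $line =~ s/\r?\n$//;     # drop an optional CR and the final LF
    push @array, $line;
}
close $fh;

print "$_\n" for @array;
```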
In the following example we create a binary file from an array, writing each string element of the array as a record of the file.
The string elements of the array are converted into hexadecimal strings, and the hex strings are then written to the binary file.
After each record we write the "\0" character in order to delimit the records.
Next, the array used to create the file is reset to an empty list.
First we need to set $/ (the input record separator) to "\0", and restore its previous value after reading and processing the file.
We then read the file one record at a time, removing the trailing "\0" character with the Perl chomp function and converting the hex string read from the file record back into its character string.
After that the text string is appended to our initial array.
Finally we print the array, each element on a line.
In short, we have the following steps:
- initialize an array with a few string elements
- create from the array a binary file with the records delimited by the "\0" separator
- empty the initial array
- set the $/ with "\0"
- read the binary file line by line, chomp the "\0" rightmost character from the current line and append this line to our array
- print the content of the array
Notice that we’ll get back the initial content of our array.
Now, here is the script code:
Perl functions
Perl arrays
Perl hashes
- the @array is initialized with some few strings
- the binary file named "file.bin" is created from the array elements:
- we create the $file filehandle to be used with this file
- we open the file in writing mode and we use the die operator if we can’t open the file
- the binmode function is used to tell Perl that we want to create a binary file
- the foreach loop iterates through the @array; because foreach is used without any iterator, the special variable $_ will be used instead; at each iteration step:
- the current element of @array will be assigned to $_
- the string value stored in $_ will be converted to its hexadecimal corresponding value using the s/searchpattern/replacement/modifiers substitution regexp operator (please note that if you don’t use =~, the substitution operator will search by default $_):
- (.)This is the search pattern: the dot means we match any single character and the using of parentheses allow us to store the matched character in the special variable named $1 (if you have more parentheses, the expression included in the second parenthesis will be assigned to $2, and so on)
- sprintf("%x",ord($1))is the replacement argument of the substitution operator and it is pure Perl code; the ord function returns the ASCII numeric value of the character stored in $1; next sprintf will convert this numeric value into its hex corresponding value (you can use %X if you want the hex values in uppercase)
- e is a modifier and it tells the regex engine to treat the replacement field as Perl code (see above)
- g is a modifier and tells the regex engine to repeatedly apply the substitution for all the characters of the string, starting with the first one
- we close the binary file
- the @array is initialized with an empty list
- the $fh filehandle is associated with "file.bin" and we open this file in reading mode; if an error arises, the die operator prints the appropriate error message and the script ends; please note that binmode applies to a single filehandle, so the call made on $file earlier does not carry over to $fh
- in order to save the content of $/, we read the binary file inside an anonymous block; when we exit from the block, the initial content of $/ will be restored:
- the special variable $/ is declared local and it is assigned with the "\0" value that is the delimiter for the records of our binary file
- a while loop allows us to read the file one record at a time; for each line (or record) read from the file, the following steps are processed:
- the current line is assigned to $_
- the chomp function will remove the "\0" character from the end of $_
- we need to convert the hex string stored in $_ to a character string and we use the s/// substitution operator:
- ([a-fA-F0-9][a-fA-F0-9]) match any two hexadecimal digits and store them in $1
- chr(hex($1)) is the replacement argument of the substitution operator; the Perl hex function will convert the hexadecimal string stored in $1 in its decimal corresponding value and it will return this value; the chr function will return the character represented by the numeric value returned by the hex function
- e and g modifiers have the same meaning as in the above code where we converted a character string into a hex string
- we append the content of $_ to the @array
- we close the binary file
- finally, the array will be printed each element on a line, using a foreach loop.
Well, it’s quite a long story, if it’s boring for you let it aside and look at the code – it works!
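The round trip described above can be sketched compactly. This is not the original script: it keeps the encoded records in a string and reads them back through an in-memory filehandle instead of "file.bin", and it uses "%02x" (two hex digits per byte) so that bytes below 0x10 also decode correctly. Both choices are assumptions of this sketch.

```perl
use strict;
use warnings;

my @array = ("Perl functions", "Perl arrays", "Perl hashes");

# Encode each string as hex digits and terminate every record with "\0".
my $encoded = "";
foreach my $item (@array) {
    (my $hex = $item) =~ s/(.)/sprintf("%02x", ord($1))/seg;
    $encoded .= "$hex\0";
}

my @decoded;
open my $fh, '<', \$encoded or die "Cannot open in-memory file: $!";
binmode $fh;
{
    local $/ = "\0";                  # records end in a NUL byte
    while (my $record = <$fh>) {
        chomp $record;                # removes the trailing "\0"
        $record =~ s/([0-9a-fA-F]{2})/chr(hex($1))/eg;
        push @decoded, $record;
    }
}
close $fh;

print "$_\n" for @decoded;
```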
I’ll give you a simple example of how to chomp an input line read from STDIN and perform some actions depending on the content of the input line.
Take a look at the following sample code:
- the lc function expects a scalar argument, so the <STDIN> operator reads only one line from standard input; this line is converted to lowercase and then assigned to the $choice scalar variable; next, the Perl chomp function removes the trailing newline from the $choice variable
- we use the if-elsif-else statement to check the content of the variable read from STDIN and perform different actions depending on the value stored in $choice;
In the above example we could use the uc (uppercase) function too, modifying the comparisons accordingly.
Our task is to print from a flat file database the names of cities located in Europe. We’ll name our file "cities.txt" and suppose each record has three fields: continent, country and city.
The fields of a record are delimited by the ":" separator. A sample of this file is as follows:
Europe:France:Paris
Europe:Austria:Vienna
Africa:Kenya:Nairobi
Asia:Japan:Tokyo
Europe:Greece:Athens
If everything is OK, the while loop reads our file one line at a time and assigns the current line to the special variable $_. Next, let’s see what happens inside the while loop at each iteration step:
- the Perl chomp function is called with no argument, so it removes the ending newline from the value assigned to $_
- we check whether the word "Europe" is found in the line assigned to $_;
- we want a case-insensitive search, so we use the uc function to return the value of $_ converted to uppercase
- the index function searches for the text "EUROPE" in the uppercased string and, if it doesn’t find it, the next control moves on to the next iteration of the loop
- if the string is found, the split function will create the @tmp array with the fields of the current record
- the push function will append the last element of the @tmp array to the @cities array
Finally, the script will print the cities we searched for:
Cities from Europe: Paris Vienna Athens
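A minimal sketch of the loop described above. It uses the sample records from the text; as an assumption made to keep the example self-contained, an in-memory filehandle stands in for the real "cities.txt" file.

```perl
use strict;
use warnings;

# Sample records from the text; with a real file you would instead write
# open my $fh, '<', 'cities.txt' or die ...
my $data = <<'END';
Europe:France:Paris
Europe:Austria:Vienna
Africa:Kenya:Nairobi
Asia:Japan:Tokyo
Europe:Greece:Athens
END
open my $fh, '<', \$data or die "Cannot open data: $!";

my @cities;
while (<$fh>) {
    chomp;                                    # strip the newline from $_
    next if index(uc($_), 'EUROPE') == -1;    # skip non-European records
    my @tmp = split /:/;                      # continent, country, city
    push @cities, $tmp[-1];                   # keep the last field
}
close $fh;

print "Cities from Europe: @cities\n";
```

If the chomp were omitted, every city name would keep its newline and the printed list would be broken across several lines.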
You can use either uc or lc to make the substring search case-insensitive. Note that we used the push function to append each city name to the @cities array, so the cities are listed in the same order as in the file. If we used the unshift function instead, the cities would be listed in reverse order.
If you want the cities listed in ascending alphabetical order, you need to sort the @cities array before printing it.
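As a hedged illustration, the sorting step could look like the sketch below; the @cities array is filled by hand here so that the snippet runs on its own.

```perl
use strict;
use warnings;

# The cities in file order, as collected by the loop.
my @cities = qw(Paris Vienna Athens);

# sort with no comparison block orders the strings in ascending order.
@cities = sort @cities;

print "Cities from Europe: @cities\n";
```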
In the next example we read a few hexadecimal numbers from STDIN and translate them into characters. The hexadecimal numbers are separated by the ', ' delimiter.
- the <STDIN> operator works in scalar context, so only one line is read from standard input; the line is assigned to the $line scalar variable and the Perl chomp function removes the trailing newline from it
- $line is split on the ', ' separator into a temporary list; the map function runs its block on each element (hexadecimal number) of the list, assigning the elements in turn to $_
- for each element, the hex function converts the hexadecimal string into the corresponding number, which here is an ASCII code; note that the argument of hex is omitted, so hex uses the value stored in $_ by default
- for each element, the chr function converts the code returned by hex into its equivalent character
- the print function prints the list returned by the map function
A sample of output:
This is Perl!
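The steps above can be sketched as follows. The input line is a hypothetical one chosen to produce the sample output, and an in-memory filehandle stands in for STDIN so the snippet is self-contained.

```perl
use strict;
use warnings;

# Hypothetical input line, as it might be typed on STDIN.
my $typed = "54, 68, 69, 73, 20, 69, 73, 20, 50, 65, 72, 6c, 21\n";
open my $in, '<', \$typed or die "Cannot open in-memory handle: $!";

my $line = <$in>;    # scalar context: exactly one line
chomp $line;         # without this, the last field would be "21\n"
close $in;

# hex defaults to $_; chr turns each code into a character.
my @chars = map { chr(hex) } split /, /, $line;
print @chars, "\n";
```

The chomp matters here: with the newline still attached, hex would receive "21\n" and warn about an illegal hexadecimal digit.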
You can use the Perl push function if you need to read lines one by one from STDIN, chomp them and store them in an array. The next snippet shows how to do it:
- each line is assigned to the special variable $_
- calling the Perl chomp function without an argument removes the trailing newline from $_
- the value assigned to $_ is appended to the @lines array
Finally, the array is printed.
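A minimal sketch of this loop, assuming three hypothetical input lines; an in-memory filehandle replaces STDIN so the code can be run as is.

```perl
use strict;
use warnings;

# Hypothetical input standing in for lines typed on STDIN.
my $typed = "first line\nsecond line\nthird line\n";
open my $in, '<', \$typed or die "Cannot open in-memory handle: $!";

my @lines;
while (<$in>) {          # each line is assigned to $_
    chomp;               # remove the trailing newline from $_
    push @lines, $_;     # append the cleaned line to @lines
}
close $in;

# Print the array, one element per line.
print "$_\n" for @lines;
```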
You can perform other actions inside the while loop if necessary, such as validating the input line with a regular expression, splitting it into smaller units with split, or making comparisons. But if you don't need more than in the above example, you can shorten the code by writing the while statement on a single line.
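One possible single-line form is sketched below; it is an assumption about what such a shortened loop might look like, again with an in-memory filehandle standing in for STDIN.

```perl
use strict;
use warnings;

# Hypothetical input standing in for lines typed on STDIN.
my $typed = "alpha\nbeta\n";
open my $in, '<', \$typed or die "Cannot open in-memory handle: $!";

my @lines;
while (<$in>) { chomp; push @lines, $_ }   # the whole loop on one line
close $in;

print "$_\n" for @lines;
```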
You can use the Perl chomp and split functions to read a CSV file line by line, chomp each line and then split its content into separate variables. A CSV file is a comma-delimited text data file that usually has the extension ".csv".
Many applications, such as Excel and MS Access, allow you to save data in CSV format. It is also easy to create a CSV file yourself, by delimiting the data with commas and ending each line with a newline character.
In the following example I’ll use the __DATA__ pseudo-datafile marker instead of a real file, to simplify the test of this script.
Please have a look at it:
Country: USA, Age: 23
Last Name: Antoine, First Name: Chevron,
Country: France, Age: 45
Next, the foreach statement loops through this temporary array element by element, loading the current element into the $line variable. Inside the block of the foreach statement, the following steps are executed for each line:
- the Perl chomp function removes the trailing newline from the $line scalar variable
- the split function splits the content of $line on the comma separator into the $lastname, $firstname, $country and $age variables
- the content of the four variables is printed
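The steps above can be sketched as follows. The records are illustrative only (the first name pair is hypothetical), the @report array exists just to make the result easy to inspect, and an in-memory filehandle is used in place of the __DATA__ section so the sketch stays self-contained.

```perl
use strict;
use warnings;

# Illustrative CSV records: last name, first name, country, age.
my $csv = <<'END';
Smith,John,USA,23
Antoine,Chevron,France,45
END
open my $fh, '<', \$csv or die "Cannot open CSV data: $!";

my @report;
# In list context <$fh> returns all lines at once as a temporary list.
foreach my $line (<$fh>) {
    chomp $line;    # otherwise $age would keep the newline
    my ($lastname, $firstname, $country, $age) = split /,/, $line;
    push @report, "Last Name: $lastname, First Name: $firstname,\n"
                . "Country: $country, Age: $age\n";
}
close $fh;

print @report;
```

A plain split on commas is enough for simple data like this, but it does not handle quoted fields that themselves contain commas.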
While the chop function deletes the last character of a string whatever it is, the Perl chomp function checks whether the end of the string matches the input record separator ($/) and deletes it only in that case.
If the last character of a string is a newline (and $/ has its default value "\n"), both functions do the same thing, i.e. they remove the newline character.
Please look at the following example:
Chomp and chop function
Chomp and chop functions: the newline was removed
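A minimal sketch of the difference, assuming the default value of $/; the variable names are chosen for this illustration only.

```perl
use strict;
use warnings;

# No trailing newline: chomp changes nothing, chop removes the final "s".
my $text = "Chomp and chop functions";
my ($chomped, $chopped) = ($text, $text);
chomp $chomped;
chop $chopped;
print "$chomped\n";    # string is unchanged
print "$chopped\n";    # last character is gone

# Trailing newline: both functions remove it and give the same result.
my $with_newline = "Chomp and chop functions\n";
my ($chomped_nl, $chopped_nl) = ($with_newline, $with_newline);
chomp $chomped_nl;
chop $chopped_nl;
print "$chomped_nl: the newline was removed\n"
    if $chomped_nl eq $chopped_nl;
```

This is why chomp is the safer choice for cleaning input lines: it never removes a character that is part of the data.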