Variables
awk supports user-defined variables as well as predefined ones. Variables do not need to be declared before use; an uninitialized variable is treated as an empty string or zero, depending on context. There are three types of variables:
1. System Variables
2. Scalars
3. Arrays
System or Built-in Variables
System variables are upper case and case sensitive.
NR: number of input lines
NR holds the number of records (by default, lines) read so far. It is incremented by 1 each time a record is read, as you see in the example.
awk '{print NR}' processes
1
2
3
—cut—
72
73
74
NF: number of fields
Each record is split into fields, which are separated by whitespace by default. NF holds the number of fields in the current record, so its value can vary from record to record.
awk '{print NF}' processes
11
12
In this example the first and second fields are printed with the total number of fields available for each record.
awk '{print $1,$2, NF }' /var/log/messages.1
Jul 24 14
Jul 24 11
Jul 24 8
Jul 24 12
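Because NF holds the field count, $NF refers to the last field of the current record. The following is a small sketch of that idea; the two input lines are made-up sample data, not taken from the log file above.

```shell
# Hypothetical two-line input: the first record has 3 fields, the second has 2.
nf_out=$(printf 'a b c\nd e\n' | awk '{print NF, $NF}')

# Each output line shows the field count followed by the last field.
echo "$nf_out"
```

This is handy when records have a varying number of fields and you only care about the final one.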
FILENAME: name of input file
FILENAME holds the name of the file currently being processed. You cannot reliably print FILENAME in the BEGIN section of an awk script, because no input file has been opened at that point; it is set once the BODY starts reading input.
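A minimal sketch of FILENAME in use is shown below. It assumes mktemp is available, and the temporary file and its two lines are invented for the illustration.

```shell
# Create a throwaway input file (hypothetical sample data).
tmp=$(mktemp)
printf 'first\nsecond\n' > "$tmp"

# In BEGIN no file has been opened yet. Most implementations print an
# empty name here; POSIX leaves the value undefined.
awk 'BEGIN {print "in BEGIN: [" FILENAME "]"}' "$tmp"

# In the main body FILENAME is set, so each line is prefixed with the file name.
body_out=$(awk '{print FILENAME ": " $0}' "$tmp")
echo "$body_out"

rm -f "$tmp"
```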
FNR: record number in the current input file
FNR is the per-file record number. It resets to 1 at the start of each input file, while NR keeps counting across all files.
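The difference between NR and FNR only shows up when awk reads more than one file. Here is a small sketch, assuming two hypothetical files named one.txt and two.txt created in a temporary directory.

```shell
# Two made-up input files in a scratch directory.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\nd\ne\n' > "$dir/two.txt"

# Print the file name, the overall record number, and the per-file record number.
fnr_out=$(cd "$dir" && awk '{print FILENAME, NR, FNR}' one.txt two.txt)
echo "$fnr_out"

rm -rf "$dir"
```

NR runs from 1 to 5 across both files, while FNR starts again at 1 when two.txt begins.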
FS: field separator character
The field separator is any run of spaces or tabs by default, but it can be changed with "-F" followed by the separator, in this example ":". Here awk searches for lines containing the string jane, prints the first field "$1", and then prints her Group ID, which is the fourth field. The fields in /etc/passwd are separated by ":".
tail /etc/passwd | awk -F: '/jane/{print $1, "Group: "$4}'
jane Group: 502
If you want to allow for multiple field separators, you can use a regular expression that matches a space, a colon, or a tab. Note that it is enclosed in single quotes.
tail /etc/passwd | awk -F'[ :\t]' '/jane/{print $1, "Group: "$4}'
jane Group: 502
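The separator can also be set by assigning to FS in a BEGIN block instead of using -F. The sketch below feeds awk a single made-up passwd-style line for jane so that it does not depend on the contents of a real /etc/passwd.

```shell
# Hypothetical passwd entry, chosen to match the output shown above.
line='jane:x:502:502::/home/jane:/bin/bash'

# FS is assigned in BEGIN, before the first record is read and split.
fs_out=$(printf '%s\n' "$line" | awk 'BEGIN {FS=":"} /jane/ {print $1, "Group: " $4}')
echo "$fs_out"
```

Setting FS in BEGIN is convenient inside longer awk scripts, where the separator belongs with the rest of the program rather than on the command line.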
OFS: output field separator
OFS is the string placed between output fields, a single space by default. It is inserted wherever you place a comma between the fields in a print statement, as you see in the example below.
tail /etc/passwd | awk -F'[ :\t]' '/jane/{print $1,$2,$3,$4,$6,$7}'
jane x 502 502 /home/jane /bin/bash
If you take out the commas, the fields are concatenated and printed without spaces, as you see in the example.
tail /etc/passwd | awk -F'[ :\t]' '/jane/{print $1$2$3$4$6$7}'
janex502502/home/jane/bin/bash
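OFS can also be set to something other than a space. The following sketch assigns it in a BEGIN block and uses a made-up passwd-style line as input.

```shell
# Hypothetical passwd entry used as sample input.
line='jane:x:502:502::/home/jane:/bin/bash'

# Each comma in the print statement is replaced by the value of OFS.
ofs_out=$(printf '%s\n' "$line" | awk -F: 'BEGIN {OFS="-"} {print $1, $3, $4}')
echo "$ofs_out"
```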
ORS: output record separator
ORS is the string that print appends after each output record. It defaults to a newline, so each record is printed on its own line.
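As a rough illustration, changing ORS joins the output records with a different terminator. The three input lines here are invented sample data.

```shell
# Replace the default newline terminator with a semicolon.
ors_out=$(printf 'a\nb\nc\n' | awk 'BEGIN {ORS=";"} {print $0}')

# All three records end up on one line, each followed by ";".
echo "$ors_out"
```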
OFMT: format for numeric output
This variable controls how print formats non-integer numbers. The default format is "%.6g", which means up to 6 significant digits are printed in total, not 6 digits to the right of the decimal point.
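A short sketch of OFMT follows. The number is an arbitrary sample value, and LC_ALL=C is set only as a precaution so the decimal point is a period regardless of the local settings.

```shell
# Default OFMT ("%.6g") keeps six significant digits.
default_out=$(LC_ALL=C awk 'BEGIN {print 3.14159265}')

# A custom OFMT with two digits after the decimal point.
custom_out=$(LC_ALL=C awk 'BEGIN {OFMT="%.2f"; print 3.14159265}')

# Integer values are printed as integers and are not affected by OFMT.
int_out=$(LC_ALL=C awk 'BEGIN {OFMT="%.2f"; print 42}')

echo "$default_out $custom_out $int_out"
```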
RS: record separator
RS defines where one input record ends and the next begins. It defaults to a newline, so each line of input is one record.
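To see RS at work, the sketch below splits a single line of made-up input into records at each semicolon instead of at each newline.

```shell
# With RS=";" the input "a b;c d;e f" is read as three records.
rs_out=$(printf 'a b;c d;e f' | awk 'BEGIN {RS=";"} {print NR ": " $1}')

# Each output line shows the record number and the first field of that record.
echo "$rs_out"
```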
$0 Variable
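$0 references the entire current record, which by default is the current line. Because the action runs once for every record, printing $0 outputs the whole file line by line.
awk '{print $0}' processes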
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 10348 720 ? Ss 22:01 0:01 init [3]
root 2 0.0 0.0 0 0 ? S< 22:01 0:00 [migration/0]
root 3 0.0 0.0 0 0 ? SN 22:01 0:00 [ksoftirqd/0]
root 4 0.0 0.0 0 0 ? S< 22:01 0:00 [watchdog/0]
Scalar Variables
Scalar variables can be numeric or text.
var=3
or
var = "test_string"
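Here is a minimal sketch of scalar variables in a complete command. The names label and sum are hypothetical, and the input numbers are sample data.

```shell
# "label" is a text scalar passed in with -v; "sum" is a numeric scalar
# that starts out as zero without being declared.
scalar_out=$(printf '10\n20\n30\n' | awk -v label="total" '{sum += $1} END {print label, sum}')
echo "$scalar_out"
```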
Array Variables
Array variables have a name followed by a subscript in brackets. awk arrays are associative, so the subscript can be a number or a string.
variable_name[0]
variable_name[1]
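A common use of arrays is counting how often each value occurs. The sketch below uses a hypothetical array named count, indexed by the first field of each record, with made-up user names as input.

```shell
# Each distinct value of $1 becomes a string subscript in the count array.
array_out=$(printf 'root\njane\nroot\nroot\njane\n' |
  awk '{count[$1]++} END {print "root", count["root"]; print "jane", count["jane"]}')
echo "$array_out"
```

The elements are looked up by name in the END block rather than with a for-in loop, because the order in which for-in visits array elements is not specified.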


{ 5 comments }
Nice intro to AWK — but the description below can be somewhat misleading:
Quoting:
$0 Variable
The entire file is referenced by $0.
awk '{print $0}' processes
In reality, $0 is the current line of the input stream (a file or data flowing through awk), and the '{print $0}' action is executed for every line, so the result looks like a dump of the entire file.
The data flowing through awk in your example is processed on a line-by-line basis. To illustrate this:
awk '{print $0}
{ print NR }' processes
And you will see every line followed by its record number.
Awk is a power-saw for developers and administrators alike. Not a day goes by that I don’t use it in my quest to manage enterprise systems.
It is SCALAR variable NOT scaler variable.
Thanks…it is corrected.
A good article, thank you.
I didn’t know about the FILENAME, FNR or OFMT.
Just a few corrections:
It’s scalar, not scaler. http://en.wikipedia.org/wiki/Scalar_(computing)
The FNR is the per file line number.
The ORS defaults to a new line, not a carriage return on U*X. On Windows it depends on the port you are using: with cygwin in the bash shell it is new line, with UnxTools it is carriage return followed by new line.
Cheers
Dan
Thanks for the input. I have reflected the corrections in the file.