Exercise #4: Starting with awk Scripts

by mike on February 15, 2011

An awk script file stores awk patterns and actions so they can be reused instead of being retyped on the command line each time.  Use the “-f” option to tell awk which script file to read.

awk -f awk_file
In this example a file was created with ps aux > processes so that awk could search it.  The goal is to list the lines that contain “ssh” or “apache”.  The awk file holds the searches you want to perform.
Note that the awk file contains neither the awk command nor any single quotes.
Here is the awk file named daemons.
/apache/{print}
/ssh/{print}

Here the awk file daemons is run against processes to search for the two strings.
awk -f daemons processes
root      1886  0.0  0.3  62608  1212 ?        Ss   22:03   0:00 /usr/sbin/sshd
root      2150  0.0  0.9  90108  3360 ?        Ss   22:14   0:00 sshd: root@pts/0
apache    2206  0.0  1.0 207376  3972 ?        S    22:20   0:00 /usr/sbin/httpd
apache    2207  0.0  1.0 207376  3972 ?        S    22:20   0:00 /usr/sbin/httpd

In this example “daemons” is the name of the awk file.  It is a good idea to give each awk file a name that describes what it does, since you may end up with a number of them.  The text file that was searched is called “processes” in the example.
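Each line in an awk file is an independent rule, so a record that matches both patterns is printed once per rule.  The sketch below illustrates this with a made-up four-line sample in place of real ps aux output; the sample lines, the ssh-agent process and the daemons_once file are assumptions for illustration only.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Hypothetical sample standing in for the output of: ps aux > processes
cat > processes <<'EOF'
root    1886 0.0 0.3  62608 1212 ? Ss 22:03 0:00 /usr/sbin/sshd
apache  2206 0.0 1.0 207376 3972 ? S  22:20 0:00 /usr/sbin/httpd
apache  2300 0.0 0.1   4000  900 ? S  22:21 0:00 ssh-agent
root    2301 0.0 0.1   4492 1104 ? Ss 22:25 0:00 crond
EOF

# The two rules from the daemons file.
cat > daemons <<'EOF'
/apache/{print}
/ssh/{print}
EOF

# Hypothetical single-rule variant using regex alternation.
cat > daemons_once <<'EOF'
/apache|ssh/{print}
EOF

two_rules=$(awk -f daemons processes)       # the ssh-agent line matches both rules
one_rule=$(awk -f daemons_once processes)   # every matching line appears once
printf '%s\n' "$two_rules"
```

If duplicate lines are not wanted, a single rule with alternation such as the one in daemons_once is one way to avoid them.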
The same constructions you used in the previous commands may be used in an awk file, but without the single quotes.  In this example the fourth field in “processes” must be greater than or equal to 1.  This field is the %MEM column of ps aux, the percentage of memory used by a process.
awk file:  highmem
$4 >= 1

Here the awk file is used to search for the desired information.
awk -f highmem processes
root     14022  0.0  1.0  10012  2840 ?        Ss   Feb08   0:00 /usr/sbin/httpd
root     15857  0.0  1.1  10032  2900 ?        Ss   12:02   0:00 sshd: root@pts/0
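A pattern with no action prints the whole record, but an action can be added to print only selected fields.  The following is a minimal sketch on hypothetical sample lines; the NR > 1 guard that skips the ps aux header and the file name highmem_fields are additions for illustration, not part of the highmem file above.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Hypothetical sample, including the header line that ps aux writes first.
cat > processes <<'EOF'
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.2 2160 660 ? Ss Feb08 0:00 init [3]
root 14022 0.0 1.0 10012 2840 ? Ss Feb08 0:00 /usr/sbin/httpd
root 15857 0.0 1.1 10032 2900 ? Ss 12:02 0:00 sshd: root@pts/0
EOF

# Skip record 1 (the header), keep records whose %MEM is at least 1,
# and print only the process id and the memory percentage.
cat > highmem_fields <<'EOF'
NR > 1 && $4 >= 1 {print $2, $4}
EOF

result=$(awk -f highmem_fields processes)
printf '%s\n' "$result"
```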

awk file: cclass
In this example the awk file is called “cclass”.  It uses a character class to select the records whose field number 11, the command, starts with either “s” or “c”.
$11 ~ /^[sc]/
awk -f cclass processes
root     13880  0.0  0.2   1816   612 ?        Ss   Feb08   0:00 syslogd -m 0
root     13977  0.0  0.6   9300  1680 ?        Ss   Feb08   0:00 sendmail: accepting connections
root     14031  0.0  0.4   4492  1104 ?        Ss   Feb08   0:00 crond
root     15857  0.0  1.1  10032  2900 ?        Ss   12:02   0:00 sshd: root@pts/0
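The “$11 ~” part matters: without it the regular expression is tested against the whole record rather than the command.  The sketch below compares the two on hypothetical sample lines; the user name sam and the file name cclass_record are made up for illustration.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Hypothetical sample; "sam" is an invented user whose name starts with "s".
cat > processes <<'EOF'
root 13880 0.0 0.2 1816 612 ? Ss Feb08 0:00 syslogd -m 0
sam 14022 0.0 1.0 10012 2840 ? Ss Feb08 0:00 /usr/sbin/httpd
root 14031 0.0 0.4 4492 1104 ? Ss Feb08 0:00 crond
EOF

# Tested against field 11 only: the command must start with s or c.
cat > cclass <<'EOF'
$11 ~ /^[sc]/ {print $11}
EOF

# Tested against the whole record: the line itself must start with s or c.
cat > cclass_record <<'EOF'
/^[sc]/ {print $11}
EOF

by_field=$(awk -f cclass processes)
by_record=$(awk -f cclass_record processes)
printf 'field 11 match:\n%s\n' "$by_field"
printf 'whole record match:\n%s\n' "$by_record"
```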

awk file: range
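This file uses the comma as a range separator.  The range starts at the first record whose field number two matches “14022” and ends at the next record whose field number two matches “14040”, including both of those records.  Because ~ is a regular expression match, a longer process id that contains those digits would match as well.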
$2 ~ 14022 , $2 ~ 14040 {print $1,$11}
The output prints only field numbers one and eleven for each record in the range.
awk -f range processes
root /usr/sbin/httpd
apache /usr/sbin/httpd
root crond
root /usr/sbin/saslauthd
root /usr/sbin/saslauthd
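To see where a range opens and closes, it helps to run it on a small file with records before and after the range.  This is a sketch on hypothetical sample lines; only the range file itself comes from the text above.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Hypothetical sample: one record before the range and one after it.
cat > processes <<'EOF'
root 13901 0.0 0.4 7196 1056 ? Ss Feb08 0:00 /usr/sbin/sshd
root 14022 0.0 1.0 10012 2840 ? Ss Feb08 0:00 /usr/sbin/httpd
root 14031 0.0 0.4 4492 1104 ? Ss Feb08 0:00 crond
root 14040 0.0 0.2 5680 712 ? Ss Feb08 0:00 /usr/sbin/saslauthd
root 15857 0.0 1.1 10032 2900 ? Ss 12:02 0:00 sshd: root@pts/0
EOF

# The range opens on the record matching 14022 and closes on the
# record matching 14040; both end records are included.
cat > range <<'EOF'
$2 ~ 14022 , $2 ~ 14040 {print $1,$11}
EOF

result=$(awk -f range processes)
printf '%s\n' "$result"
```

If the closing pattern never matched, the range would stay open through the last record of the file.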

awk file: range_nr
This awk file prints records 2 through 5 using a range separator.  Record 1 is the ps aux header line, so it is not printed.
NR == 2 , NR ==5

awk -f range_nr processes
root         1  0.0  0.2   2160   660 ?        Ss   Feb08   0:00 init [3]
root     13638  0.0  0.2   2252   552 ?        S<s  Feb08   0:00 /sbin/udevd -d
root     13880  0.0  0.2   1816   612 ?        Ss   Feb08   0:00 syslogd -m 0
root     13901  0.0  0.4   7196  1056 ?        Ss   Feb08   0:00 /usr/sbin/sshd


awk file: range_nr
This modification selects records 2 through 5 and prints only the eleventh field of each record.
NR == 2 , NR ==5 {print $11}

awk -f range_nr processes
init
/sbin/udevd
syslogd
/usr/sbin/sshd
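The same selection can also be written as a single condition on NR instead of a range.  The sketch below, on hypothetical sample lines, runs both forms and shows that they print the same records; the file name range_nr_cond is made up for illustration.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Hypothetical sample: header line plus five processes.
cat > processes <<'EOF'
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.2 2160 660 ? Ss Feb08 0:00 init [3]
root 13638 0.0 0.2 2252 552 ? S<s Feb08 0:00 /sbin/udevd -d
root 13880 0.0 0.2 1816 612 ? Ss Feb08 0:00 syslogd -m 0
root 13901 0.0 0.4 7196 1056 ? Ss Feb08 0:00 /usr/sbin/sshd
root 14031 0.0 0.4 4492 1104 ? Ss Feb08 0:00 crond
EOF

# Range form, as in range_nr above.
cat > range_nr <<'EOF'
NR == 2 , NR == 5 {print $11}
EOF

# Equivalent condition form (hypothetical file name).
cat > range_nr_cond <<'EOF'
NR >= 2 && NR <= 5 {print $11}
EOF

by_range=$(awk -f range_nr processes)
by_cond=$(awk -f range_nr_cond processes)
printf '%s\n' "$by_range"
```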

awk file: lowmem
This search finds all processes whose memory use, field number four, is exactly 0.2%.
$4 == .2
awk -f lowmem processes
root         1  0.0  0.2   2160   660 ?        Ss   Feb08   0:00 init [3]
root     13638  0.0  0.2   2252   552 ?        S<s  Feb08   0:00 /sbin/udevd -d
root     13880  0.0  0.2   1816   612 ?        Ss   Feb08   0:00 syslogd -m 0
root     14039  0.0  0.2   5680   712 ?        Ss   Feb08   0:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam -n 2

awk file: lowmem
In this example the same awk file is used but it is modified to print three fields.
$4 == .2 {print $2,$4,$11}
awk -f lowmem processes
1 0.2 init
13638 0.2 /sbin/udevd
13880 0.2 syslogd
14039 0.2 /usr/sbin/saslauthd

awk file: lowmem
In this example the output is modified to add the text “PID” and to place the information inside “| |”.  The “%-5s” format prints the process id left-justified in a field five characters wide.
$4 == .2 {printf "|PID %-5s|\n",$2}
awk -f lowmem processes
|PID 1    |
|PID 13638|
|PID 13880|
|PID 14039|

awk file: lowmem
Now a second field is added to the printf format.  The first field is the text string “PID” followed by the process id, left-justified in a field 8 characters wide.  The second field printed, field eleven, is also a text string, so “%s” is used for it.
$4 == .2 {printf ("|PID %-8s| %s\n",$2,$11)}
awk -f lowmem processes
|PID 1       | init
|PID 13638   | /sbin/udevd
|PID 13880   | syslogd
|PID 14039   | /usr/sbin/saslauthd
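The minus sign in “%-8s” is what left-justifies the value; without it the value is right-justified within the same width.  This sketch shows both on hypothetical sample lines; the file name lowmem_right is made up for illustration.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Hypothetical sample: two processes at 0.2% memory and one above it.
cat > processes <<'EOF'
root 1 0.0 0.2 2160 660 ? Ss Feb08 0:00 init [3]
root 13638 0.0 0.2 2252 552 ? S<s Feb08 0:00 /sbin/udevd -d
root 14022 0.0 1.0 10012 2840 ? Ss Feb08 0:00 /usr/sbin/httpd
EOF

# Left-justified process id in a field 8 characters wide.
cat > lowmem <<'EOF'
$4 == .2 {printf "|PID %-8s| %s\n", $2, $11}
EOF

# Right-justified variant: same width, no minus sign.
cat > lowmem_right <<'EOF'
$4 == .2 {printf "|PID %8s| %s\n", $2, $11}
EOF

left=$(awk -f lowmem processes)
right=$(awk -f lowmem_right processes)
printf '%s\n' "$left" "$right"
```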

Download the awk one liners PDF

{ 2 comments }

George Anderson February 15, 2011 at 4:26 pm

Great tutorial, keep it going.

Congrats!
George.

Pieter February 15, 2011 at 6:16 pm

Thanks for the great tutorials. As a total n00b I really like them. I would love to see one too for sed and regex/pcre :-)

One question: in awk_oneliners under “Substitution” there is this example: awk '{gsub(/dog|cat|bird,"pet");print}' filename
Isn’t that one missing a “/” after bird? This one works for me:
awk '{gsub(/dog|cat|bird/,"pet");print}' filename

Thanks again. I look forward to the next exercise.
