Exercise #1: Learning awk Basics

by mike on February 12, 2011

One of the best ways to learn awk is to run a series of commands that show how the basics work, then build on them.  This is the first in a series of 10 lessons on using awk.  Start by creating a file to work with: at the command line, redirect the output of the ps command to a file called processes.  The processes on your server or desktop will likely be different, but the principles are the same.

ps aux > processes

To view every line of the file you just created, use awk with no pattern and an action that prints each line.

awk '{ print }' processes

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.2   2160   660 ?        Ss   Feb08   0:00 init [3]
root     13638  0.0  0.2   2252   552 ?        S<s  Feb08   0:00 /sbin/udevd -d
root     13880  0.0  0.2   1816   612 ?        Ss   Feb08   0:00 syslogd -m 0
root     13901  0.0  0.4   7196  1056 ?        Ss   Feb08   0:00 /usr/sbin/sshd
root     13910  0.0  0.3   2836   872 ?        Ss   Feb08   0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
root     13977  0.0  0.6   9300  1680 ?        Ss   Feb08   0:00 sendmail: accepting connections
root     14022  0.0  1.0  10012  2840 ?        Ss   Feb08   0:00 /usr/sbin/httpd
apache   14023  0.0  0.7  10012  2056 ?        S    Feb08   0:00 /usr/sbin/httpd
root     14031  0.0  0.4   4492  1104 ?        Ss   Feb08   0:00 crond
root     14039  0.0  0.2   5680   712 ?        Ss   Feb08   0:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam -n 2
root     14040  0.0  0.1   5680   444 ?        S    Feb08   0:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam -n 2
root     15857  0.0  1.1  10032  2900 ?        Ss   12:02   0:00 sshd: root@pts/0
root     15864  0.0  0.5   3716  1500 pts/0    Ss   12:02   0:00 -bash
root     15969  0.0  0.3   2536   908 pts/0    R+   12:07   0:00 ps aux
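Because your own process list will differ, it can help to follow along with a fixed copy of a few lines from the listing above. The sketch below is one way to do that; the file name processes.sample is an assumption made for illustration and is not part of the original lesson. It also shows that print with no arguments behaves the same as print $0, where $0 stands for the whole current line.

```shell
# Build a small, fixed sample from four lines of the listing above.
# The name "processes.sample" is hypothetical; any file name works.
cat > processes.sample <<'EOF'
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     13977  0.0  0.6   9300  1680 ?        Ss   Feb08   0:00 sendmail: accepting connections
root     14022  0.0  1.0  10012  2840 ?        Ss   Feb08   0:00 /usr/sbin/httpd
apache   14023  0.0  0.7  10012  2056 ?        S    Feb08   0:00 /usr/sbin/httpd
EOF

# With no pattern, the action runs on every line.
all_lines=$(awk '{ print }' processes.sample)

# $0 holds the whole current line, so this prints the same thing.
all_lines_explicit=$(awk '{ print $0 }' processes.sample)

printf '%s\n' "$all_lines"
```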

Giving awk a pattern but no action prints only the lines that match the pattern.
awk '/httpd/' processes
root     14022  0.0  1.0  10012  2840 ?        Ss   Feb08   0:00 /usr/sbin/httpd
apache   14023  0.0  0.7  10012  2056 ?        S    Feb08   0:00 /usr/sbin/httpd
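A bare pattern prints something because awk supplies { print } as the default action when none is given. A minimal sketch, using three lines copied from the listing above as inline input (the variable names are only for illustration):

```shell
# Three lines taken from the listing above, fed to awk on standard input.
sample='root     13901  0.0  0.4   7196  1056 ?        Ss   Feb08   0:00 /usr/sbin/sshd
root     14022  0.0  1.0  10012  2840 ?        Ss   Feb08   0:00 /usr/sbin/httpd
apache   14023  0.0  0.7  10012  2056 ?        S    Feb08   0:00 /usr/sbin/httpd'

# Pattern only: the default action prints each matching line.
pattern_only=$(printf '%s\n' "$sample" | awk '/httpd/')

# The same command with the action written out.
with_action=$(printf '%s\n' "$sample" | awk '/httpd/ { print }')

printf '%s\n' "$pattern_only"
```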

View only the USER, PID, %CPU and %MEM fields by listing the desired fields after print.

awk '{print $1,$2,$3,$4}'  processes
USER PID %CPU %MEM
root 1 0.0 0.2
root 13638 0.0 0.2
root 13880 0.0 0.2
root 13901 0.0 0.4
root 13910 0.0 0.3
root 13977 0.0 0.6
root 14022 0.0 1.0
apache 14023 0.0 0.7
root 14031 0.0 0.4
root 14039 0.0 0.2
root 14040 0.0 0.1
root 15857 0.0 1.1
root 15864 0.0 0.5
root 15969 0.0 0.3

Repeat the command without commas and you will quickly realize the value of separating the field output.

awk '{print $1$2$3$4}'  processes
USERPID%CPU%MEM
root10.00.2
root136380.00.2
root138800.00.2
root139010.00.4
root139100.00.3
root139770.00.6
root140220.01.0
apache140230.00.7
root140310.00.4
root140390.00.2
root140400.00.1
root158570.01.1
root158640.00.5
root159690.00.3
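What the comma does is insert awk's output field separator, held in the built-in variable OFS, which defaults to a single space. As a small sketch (the comma chosen as a separator here is only an example), OFS can be set with -v to get a different separator:

```shell
# One line taken from the listing above.
line='apache   14023  0.0  0.7  10012  2056 ?        S    Feb08   0:00 /usr/sbin/httpd'

# Commas between fields: print joins them with OFS (a space by default).
spaced=$(printf '%s\n' "$line" | awk '{ print $1, $2, $3, $4 }')

# No commas: the fields are concatenated with nothing in between.
joined=$(printf '%s\n' "$line" | awk '{ print $1 $2 $3 $4 }')

# Setting OFS changes what the commas insert.
csv=$(printf '%s\n' "$line" | awk -v OFS=',' '{ print $1, $2, $3, $4 }')

printf '%s\n%s\n%s\n' "$spaced" "$joined" "$csv"
```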

Use both a pattern, “httpd”, and an action that lists four fields.
awk '/httpd/ {print $1,$2,$3,$4}'  processes
root 14022 0.0 1.0
apache 14023 0.0 0.7

Here the pattern is a single character, and the action lists the first and eleventh fields.  The eleventh field is the first word of the COMMAND column.
awk '/h/ {print $1,$11}'  processes
root /usr/sbin/sshd
root /usr/sbin/httpd
apache /usr/sbin/httpd
root /usr/sbin/saslauthd
root /usr/sbin/saslauthd
root sshd:
root -bash
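Note that $11 is the first word of the COMMAND column, so it is the last field only when the command has no arguments. For the true last field, awk provides the built-in variable NF (the number of fields on the current line), so $NF always refers to the final field. A brief sketch using two lines from the listing:

```shell
# Two lines taken from the listing above: one command with arguments, one without.
sample='root     13977  0.0  0.6   9300  1680 ?        Ss   Feb08   0:00 sendmail: accepting connections
root     15864  0.0  0.5   3716  1500 pts/0    Ss   12:02   0:00 -bash'

# $11 is the eleventh field; $NF is whatever field comes last on the line.
result=$(printf '%s\n' "$sample" | awk '{ print $11, $NF }')

printf '%s\n' "$result"
```

On the sendmail line the two fields differ; on the -bash line they are the same field.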

Use the “~” operator to list the lines whose eleventh field matches “sendmail”.
awk '$11 ~ /sendmail/'  processes
root     13977  0.0  0.6   9300  1680 ?        Ss   Feb08   0:00 sendmail: accepting connections
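The difference between a bare pattern and a field match shows up when the text appears in more than one column. In the sketch below, built from three lines of the listing, /pts/ is tested against the whole line, while $7 ~ /pts/ tests only the TTY column:

```shell
# Three lines taken from the listing above.
sample='root     14031  0.0  0.4   4492  1104 ?        Ss   Feb08   0:00 crond
root     15857  0.0  1.1  10032  2900 ?        Ss   12:02   0:00 sshd: root@pts/0
root     15864  0.0  0.5   3716  1500 pts/0    Ss   12:02   0:00 -bash'

# Bare pattern: matches anywhere on the line, so the sshd line is included.
anywhere=$(printf '%s\n' "$sample" | awk '/pts/ { print $2 }')

# Field match: only lines whose seventh field (TTY) contains "pts".
tty_only=$(printf '%s\n' "$sample" | awk '$7 ~ /pts/ { print $2 }')

printf 'anywhere:\n%s\n' "$anywhere"
printf 'TTY only:\n%s\n' "$tty_only"
```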

To match fields that start with a pattern, anchor the pattern with “^”.  Here awk lists all lines whose eleventh field starts with “c”.

awk '$11 ~ /^c/'  processes
root     14031  0.0  0.4   4492  1104 ?        Ss   Feb08   0:00 crond

Once you have seen that option you may want to list lines whose field starts with any one of several letters.  To do that, create a character class with “[ ]”.
awk '$11 ~ /^[sc]/'  processes
root     13880  0.0  0.2   1816   612 ?        Ss   Feb08   0:00 syslogd -m 0
root     13977  0.0  0.6   9300  1680 ?        Ss   Feb08   0:00 sendmail: accepting connections
root     14031  0.0  0.4   4492  1104 ?        Ss   Feb08   0:00 crond
root     15857  0.0  1.1  10032  2900 ?        Ss   12:02   0:00 sshd: root@pts/0
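A field match can be paired with an action in the same way as the earlier /httpd/ example. As a sketch on three lines from the listing, this prints only the PID and the command name for commands that start with s or c:

```shell
# Three lines taken from the listing above.
sample='root     13638  0.0  0.2   2252   552 ?        S<s  Feb08   0:00 /sbin/udevd -d
root     13880  0.0  0.2   1816   612 ?        Ss   Feb08   0:00 syslogd -m 0
root     14031  0.0  0.4   4492  1104 ?        Ss   Feb08   0:00 crond'

# [sc] matches one character that is either "s" or "c"; ^ anchors it
# to the start of the eleventh field.
result=$(printf '%s\n' "$sample" | awk '$11 ~ /^[sc]/ { print $2, $11 }')

printf '%s\n' "$result"
```

The udevd line is skipped because its eleventh field starts with “/”.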

{ 24 comments }

Robert February 12, 2011 at 3:40 pm

Very clear and concise. Can’t wait for the rest.

Disat February 12, 2011 at 4:23 pm

I’ve been waiting for something like that!!! thankz!

p February 12, 2011 at 5:05 pm

Thanks, I like it, although you should use some copy-paste friendly format (code tags, etc.) because your blog software changes apostrophes and all.

Andrew February 12, 2011 at 6:14 pm

Will use the code tag in tomorrow’s post. Thanks for the tip!

Brad Langhorst February 12, 2011 at 6:08 pm

The curly single quotes don’t work in iTerm (and maybe others too).

awk ‘{ print }’ processes != awk '{ print }' processes

Bashguy February 12, 2011 at 8:39 pm

iTerm (assuming this is Mac OS X terminal emulator) is just a terminal emulator, the shell is bash and it should work fine; did you output ps aux to a file called processes as shown in the first part of the tutorial?

anon February 12, 2011 at 6:08 pm

I’d like to see some examples of awk networking.

Peter February 12, 2011 at 7:06 pm

Any chance you could properly format your output in a fixed-width font?

Guru February 12, 2011 at 7:19 pm

Very effective and concise. Have a feeling that most of the commands are known but you have built it from basics and transitioned to the most important command options. Love to see what is in stock..

Good work, keep it up!

buzu February 12, 2011 at 8:16 pm

I think I will follow this series of tutorials. It seems interesting.

Husain AlKhamis February 12, 2011 at 10:19 pm

Thanks for sharing this.
An alternative way to run AWK script is to write the script in a file and let AWK command run that script. I always prefer to do this especially when the script is complex…

Stealth February 12, 2011 at 10:21 pm

Are you going to have one tutorial everyday of the week for 10 days?

Andrew February 12, 2011 at 11:11 pm

at least 1 every day

Khaja Minhajuddin February 12, 2011 at 10:50 pm

Just want to say this: Thank you *very* much for posting this. I’ve been hearing about how powerful awk is for years, but nobody explained it this clearly. Thanks a bunch.

Andrew February 12, 2011 at 11:11 pm

you’re welcome, better stuff coming up.

jignesh February 13, 2011 at 5:32 am

Nice tutorial. Short and powerful. Waiting for more. Thank you.

Kunal February 13, 2011 at 6:14 am

Thank you very much. For the next 10 days I’ll be eyeing this space.

Dennis February 13, 2011 at 7:22 pm

Looking forward to these tutorials. I’ve been wanting to learn awk for a while but never got around to it.

When I look at the man page, it shows gawk (GNU’s awk). So whenever I call awk, is it actually gawk that’s being executed?

mike February 13, 2011 at 7:40 pm

Awk is a pattern-scanning and text-processing utility that can capture information from text files and create reports from it, convert files from one format to another, create databases and perform mathematical operations on data. The term “awk” comes from the names of its authors: Aho, Weinberger and Kernighan. Nawk is the newer version of awk and gawk is the GNU version. Often awk is a symbolic link to gawk, as you can see on this CentOS machine.

awk --version
GNU Awk 3.1.5
Copyright (C) 1989, 1991-2005 Free Software Foundation.

ls -l /bin/awk
lrwxrwxrwx 1 root root 4 May 22 07:43 /bin/awk -> gawk

SEO services February 14, 2011 at 4:45 am

Extremely good writeup, should do a world of good for most of the people out there, thanks!!..

Marco February 14, 2011 at 6:09 am

Thanks a lot, very useful.
Now off to read #2 ;)

rasta_freak February 14, 2011 at 7:01 am

Nice tutorial. Awk has variables, functions, condition testing, looping, a rich set of math & string functions and control over I/O; it can be tweaked (with special variables and command-line options) to any text processing purpose, and it is very, very fast.
Swiss army chainsaw, so to speak…

John N. February 14, 2011 at 4:24 pm

It may be obvious to many, but for the newbie it should be mentioned that ‘awk’ takes each line of input and matches it to actions using the patterns (or any actions without a pattern). These examples work by reading each line of the output file from ‘ps’ and deciding what to do with them one at a time.

George Anderson February 14, 2011 at 8:24 pm

Congratulations on making awk look so straightforward. I am a seasoned awk user but could not teach it as well as you just did. I will definitely come back!
