One of the best ways to learn awk is to run a series of commands that show how the basics work, and then build on them. This is the first in a series of 10 lessons on using awk. Start by creating a file to work with: at the command line, redirect the output of the ps command to a file called processes. The processes on your server or desktop will likely be different, but the principles are the same.
ps aux > processes
To view every line of the file you just created, use awk with no pattern and an action that prints each line.
awk '{ print }' processes
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.2 2160 660 ? Ss Feb08 0:00 init [3]
root 13638 0.0 0.2 2252 552 ? S<s Feb08 0:00 /sbin/udevd -d
root 13880 0.0 0.2 1816 612 ? Ss Feb08 0:00 syslogd -m 0
root 13901 0.0 0.4 7196 1056 ? Ss Feb08 0:00 /usr/sbin/sshd
root 13910 0.0 0.3 2836 872 ? Ss Feb08 0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
root 13977 0.0 0.6 9300 1680 ? Ss Feb08 0:00 sendmail: accepting connections
root 14022 0.0 1.0 10012 2840 ? Ss Feb08 0:00 /usr/sbin/httpd
apache 14023 0.0 0.7 10012 2056 ? S Feb08 0:00 /usr/sbin/httpd
root 14031 0.0 0.4 4492 1104 ? Ss Feb08 0:00 crond
root 14039 0.0 0.2 5680 712 ? Ss Feb08 0:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam -n 2
root 14040 0.0 0.1 5680 444 ? S Feb08 0:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam -n 2
root 15857 0.0 1.1 10032 2900 ? Ss 12:02 0:00 sshd: root@pts/0
root 15864 0.0 0.5 3716 1500 pts/0 Ss 12:02 0:00 -bash
root 15969 0.0 0.3 2536 908 pts/0 R+ 12:07 0:00 ps aux
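As a small sketch of what the action-only form is doing: print with no arguments prints the whole current line, which awk keeps in $0, and the action runs once for every line of input. The three sample lines are copied from the listing above; the temporary file and the variable name sample are only there for illustration, so the result does not depend on your own machine.

```shell
# Hypothetical three-line sample, copied from the ps output above.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.2 2160 660 ? Ss Feb08 0:00 init [3]
root 14031 0.0 0.4 4492 1104 ? Ss Feb08 0:00 crond
EOF

# No pattern: the action runs on every line. $0 holds the whole line,
# so this prints the same thing as '{ print }'.
awk '{ print $0 }' "$sample"

# NR is the number of the line awk is currently working on.
awk '{ print NR ": " $0 }' "$sample"
```

The second command makes the line-by-line processing visible by prefixing each line with its number.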
Using awk with a pattern but no action prints only the lines that match the pattern.
awk '/httpd/' processes
root 14022 0.0 1.0 10012 2840 ? Ss Feb08 0:00 /usr/sbin/httpd
apache 14023 0.0 0.7 10012 2056 ? S Feb08 0:00 /usr/sbin/httpd
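When the action is left out, awk falls back to its default action, which is to print the matching line. Here is a minimal sketch of that, along with the negated form of the pattern. The sample lines come from the listing above; the temporary file and the name sample are assumptions made for the example.

```shell
# Hypothetical four-line sample, copied from the ps output above.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 14022 0.0 1.0 10012 2840 ? Ss Feb08 0:00 /usr/sbin/httpd
apache 14023 0.0 0.7 10012 2056 ? S Feb08 0:00 /usr/sbin/httpd
root 14031 0.0 0.4 4492 1104 ? Ss Feb08 0:00 crond
EOF

# Pattern only: awk supplies the default action, { print }.
awk '/httpd/' "$sample"

# The same command with the action written out.
awk '/httpd/ { print }' "$sample"

# A leading ! selects the lines that do NOT match the pattern.
awk '!/httpd/' "$sample"
```

The first two commands print the two httpd lines; the third prints the header and the crond line.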
View only the USER, PID, %CPU and %MEM fields by passing the desired fields to print.
awk '{print $1,$2,$3,$4}' processes
USER PID %CPU %MEM
root 1 0.0 0.2
root 13638 0.0 0.2
root 13880 0.0 0.2
root 13901 0.0 0.4
root 13910 0.0 0.3
root 13977 0.0 0.6
root 14022 0.0 1.0
apache 14023 0.0 0.7
root 14031 0.0 0.4
root 14039 0.0 0.2
root 14040 0.0 0.1
root 15857 0.0 1.1
root 15864 0.0 0.5
root 15969 0.0 0.3
Repeat the command without commas and you will quickly realize the value of separating the field output.
awk '{print $1$2$3$4}' processes
USERPID%CPU%MEM
root10.00.2
root136380.00.2
root138800.00.2
root139010.00.4
root139100.00.3
root139770.00.6
root140220.01.0
apache140230.00.7
root140310.00.4
root140390.00.2
root140400.00.1
root158570.01.1
root158640.00.5
root159690.00.3
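The reason the commas matter is that each comma in print inserts the output field separator, OFS, which is a single space by default. Without commas, awk simply joins the fields together. A short sketch, using one line from the listing above (the variable name line is just for illustration):

```shell
# One line copied from the ps output above.
line='root 14022 0.0 1.0 10012 2840 ? Ss Feb08 0:00 /usr/sbin/httpd'

# A comma inserts OFS, a single space by default.
echo "$line" | awk '{ print $1, $2, $3, $4 }'              # root 14022 0.0 1.0

# No commas: the fields are concatenated.
echo "$line" | awk '{ print $1 $2 $3 $4 }'                 # root140220.01.0

# Setting OFS with -v changes what each comma inserts.
echo "$line" | awk -v OFS=':' '{ print $1, $2, $3, $4 }'   # root:14022:0.0:1.0
```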
Use both a pattern, “httpd”, and an action that lists four fields.
awk '/httpd/ {print $1,$2,$3,$4}' processes
root 14022 0.0 1.0
apache 14023 0.0 0.7
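Here you can use a pattern that is limited to one character and list the first and eleventh fields. The pattern is tested against the whole line, and the eleventh field is the first word of the COMMAND column.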
awk '/h/ {print $1,$11}' processes
root /usr/sbin/sshd
root /usr/sbin/httpd
apache /usr/sbin/httpd
root /usr/sbin/saslauthd
root /usr/sbin/saslauthd
root sshd:
root -bash
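Because awk splits on whitespace, $11 is only the first word of the command, which is why the output shows "sshd:" without its argument. As a small sketch, the built-in variable NF holds the number of fields on the current line, so $NF is always the true last field. The line is copied from the listing above; the variable name line is only for illustration.

```shell
# One line copied from the ps output above; its command has several arguments.
line='root 14039 0.0 0.2 5680 712 ? Ss Feb08 0:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam -n 2'

# NF is the number of whitespace-separated fields on the line.
echo "$line" | awk '{ print NF }'        # 17

# $11 is the first word of the command.
echo "$line" | awk '{ print $1, $11 }'   # root /usr/sbin/saslauthd

# $NF is the last field, whatever the number of fields is.
echo "$line" | awk '{ print $1, $NF }'   # root 2
```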
Use the operator “~” to list the lines whose eleventh field matches “sendmail”.
awk '$11 ~ /sendmail/' processes
root 13977 0.0 0.6 9300 1680 ? Ss Feb08 0:00 sendmail: accepting connections
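To see how a field match differs from a plain pattern, here is a minimal sketch that compares the two on the TTY column (field 7). A bare /pts/ matches anywhere on the line, while $7 ~ /pts/ only looks inside field 7, and !~ selects the lines where the field does not match. The sample lines come from the listing above; the temporary file and the name sample are assumptions made for the example.

```shell
# Hypothetical four-line sample, copied from the ps output above.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
root 14031 0.0 0.4 4492 1104 ? Ss Feb08 0:00 crond
root 15857 0.0 1.1 10032 2900 ? Ss 12:02 0:00 sshd: root@pts/0
root 15864 0.0 0.5 3716 1500 pts/0 Ss 12:02 0:00 -bash
root 15969 0.0 0.3 2536 908 pts/0 R+ 12:07 0:00 ps aux
EOF

# Whole-line match: also catches "root@pts/0" in the command of PID 15857.
awk '/pts/ { print $2 }' "$sample"

# Field match: only lines whose TTY field contains "pts".
awk '$7 ~ /pts/ { print $2 }' "$sample"

# Negated field match: lines whose TTY field does not contain "pts".
awk '$7 !~ /pts/ { print $2 }' "$sample"
```

The first command prints three PIDs, the second prints 15864 and 15969, and the third prints 14031 and 15857.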
If you want to match fields that start with a pattern, awk is happy to perform that task. Here awk uses the “^” anchor to list all lines whose eleventh field starts with “c”.
awk '$11 ~ /^c/' processes
root 14031 0.0 0.4 4492 1104 ? Ss Feb08 0:00 crond
Once you have seen that option, you may want to list lines where the field starts with any one of several letters. To do that, create a character class with “[ ]”.
awk '$11 ~ /^[sc]/' processes
root 13880 0.0 0.2 1816 612 ? Ss Feb08 0:00 syslogd -m 0
root 13977 0.0 0.6 9300 1680 ? Ss Feb08 0:00 sendmail: accepting connections
root 14031 0.0 0.4 4492 1104 ? Ss Feb08 0:00 crond
root 15857 0.0 1.1 10032 2900 ? Ss 12:02 0:00 sshd: root@pts/0
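A character class matches exactly one character from the set between the brackets. As a short sketch, the class can also be written as an alternation, and a “^” placed first inside the brackets negates the class instead of anchoring the match. The sample lines come from the listing above; the temporary file and the name sample are assumptions made for the example.

```shell
# Hypothetical five-line sample, copied from the ps output above.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
root 13880 0.0 0.2 1816 612 ? Ss Feb08 0:00 syslogd -m 0
root 13901 0.0 0.4 7196 1056 ? Ss Feb08 0:00 /usr/sbin/sshd
root 13977 0.0 0.6 9300 1680 ? Ss Feb08 0:00 sendmail: accepting connections
root 14031 0.0 0.4 4492 1104 ? Ss Feb08 0:00 crond
root 15864 0.0 0.5 3716 1500 pts/0 Ss 12:02 0:00 -bash
EOF

# Field 11 starts with "s" or "c".
awk '$11 ~ /^[sc]/ { print $11 }' "$sample"

# The same test written as an alternation.
awk '$11 ~ /^(s|c)/ { print $11 }' "$sample"

# ^ inside the brackets negates the class: field 11 starts with
# any character other than "s" or "c".
awk '$11 ~ /^[^sc]/ { print $11 }' "$sample"
```

The first two commands print syslogd, sendmail: and crond; the third prints /usr/sbin/sshd and -bash.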


{ 24 comments }
Very clear and concise. Can’t wait for the rest.
I’ve been waiting for something like that!!! thankz!
Thanks, I like it, although you should use some copy-paste friendly format (code tags, etc.) because your blog software changes apostrophes and all.
Will use the code tag in tomorrow’s post. Thanks for the tip!
The curly single quotes don’t work in iTerm (and maybe others too).
awk ‘{ print }’ processes != awk '{ print }' processes
iTerm (assuming this is Mac OS X terminal emulator) is just a terminal emulator, the shell is bash and it should work fine; did you output ps aux to a file called processes as shown in the first part of the tutorial?
I’d like to see some examples of awk networking.
Any chance you could properly format your output in a fixed-width font?
Very effective and concise. Have a feeling that most of the commands are known but you have built it from basics and transitioned to the most important command options. Love to see what is in stock..
Good work, keep it up!
I think I will follow this series of tutorials. It seems interesting.
Thanks for sharing this.
An alternative way to run AWK script is to write the script in a file and let AWK command run that script. I always prefer to do this especially when the script is complex…
Are you going to have one tutorial every day of the week for 10 days?
at least 1 every day
Just want to say this: Thank you *very* much for posting this. I’ve been hearing about how powerful awk is for years, but nobody explained it this clearly. Thanks a bunch.
you’re welcome, better stuff coming up.
Nice tutorial. Short and powerful. Waiting for more. Thank you.
Thank you very much. The next 10 days i’d be eying this space
Looking forward to these tutorials. I’ve been wanting to learn awk for a while but never got around to it.
When I look at the man page, it shows gawk (GNU’s awk). So whenever I call awk, is it actually gawk that’s being executed?
Awk is a pattern-scanning and text-processing utility. It can extract information from text files to create reports, convert files from one format to another, create databases and perform mathematical operations on data. The name “awk” comes from the names of its authors: Aho, Weinberger and Kernighan. Nawk is the newer version of awk and gawk is the GNU version. Often awk is a symbolic link to gawk, as you can see on this CentOS machine.
awk --version
GNU Awk 3.1.5
Copyright (C) 1989, 1991-2005 Free Software Foundation.
ls -l /bin/awk
lrwxrwxrwx 1 root root 4 May 22 07:43 /bin/awk -> gawk
Extremely good writeup, should do a world of good for most of the people out there, thanks!!..
Thanks a lot, very useful.
Now off to read #2
Nice tutorial. Awk has variables, functions, condition testing, looping, rich set of math & string functions, control over I/O, can be tweaked (with special variable and command line prompt) to any text processing purpose, is very very fast.
Swiss army chainsaw, so to speak…
It may be obvious to many, but for the newbie it should be mentioned that ‘awk’ takes each line of input and matches it to actions using the patterns (or any actions without a pattern). These examples work by reading each line of the output file from ‘ps’ and deciding what to do with them one at a time.
Congratulations for making awk look so straightforward. I am a seasoned awk user but would not teach it so well like you just did. I will definitely come back!