How To Search Using The Awk Utility

by mike on November 11, 2010

You can search with awk by enclosing the search pattern, a regular expression, within forward slashes. Note that awk is case sensitive.

If you wanted to search for the strings “debian” and “Debian” you could use a regular expression.
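
One way to write that search, as a minimal sketch: a bracket expression such as [dD] accepts either letter in that position. The sample file and its contents are made up for illustration and stand in for access_log.

```shell
# Hypothetical sample data standing in for access_log
sample=$(mktemp)
printf '%s\n' 'debian stable' 'Debian testing' 'DEBIAN caps' 'fedora' > "$sample"

# [dD] matches a lower or upper case "d", so both spellings are found
matches=$(awk '/[dD]ebian/' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

The all-caps line is not returned, because only the first letter is allowed to vary.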


You can anchor those searches to the start of the line with “^” or to the end of the line with “$” like this:
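
A short sketch of both anchors, assuming a small made-up file rather than a real access_log:

```shell
# Hypothetical sample data
sample=$(mktemp)
printf '%s\n' 'Debian first' 'uses Debian' 'Debian' 'not debian here' > "$sample"

# ^ anchors the match to the start of the line
starts=$(awk '/^Debian/' "$sample")
# $ anchors the match to the end of the line
ends=$(awk '/Debian$/' "$sample")

printf 'start of line:\n%s\n' "$starts"
printf 'end of line:\n%s\n' "$ends"

rm -f "$sample"
```

The line that contains only the word satisfies both anchors, so it shows up in both results.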


If you were looking for lower case letters d-m your search would look like this:
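
A possible form of that search uses a range inside a bracket expression. This is a sketch on made-up data. Setting LC_ALL=C is an assumption added here so that the range follows plain ASCII order on every system.

```shell
# Hypothetical sample data
sample=$(mktemp)
printf '%s\n' 'abc' 'dog' 'ABC' 'zoo' 'milk' > "$sample"

# [d-m] matches any single lower case letter from d through m;
# a line is returned if it contains at least one such letter
matches=$(LC_ALL=C awk '/[d-m]/' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```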


If you were searching for any digit:
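
As a sketch, the range [0-9] matches any single digit, so the search returns every line that contains at least one. The sample lines are invented for the example.

```shell
# Hypothetical sample data
sample=$(mktemp)
printf '%s\n' 'no digits' 'port 8080' '404' 'seven' > "$sample"

# [0-9] matches one digit anywhere in the line
matches=$(awk '/[0-9]/' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```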


If you wanted to do searches that matched one text string or another text string you can use a “|” to separate them.  Note the use of the single quotes.

awk '/Iceweasel|Epiphany/' access_log

This returns every line that matches either string.
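
To see the alternation at work without a real log, here is a small sketch. The browser version strings other than Iceweasel/3.0.6 are made up.

```shell
# Hypothetical sample data standing in for access_log
sample=$(mktemp)
printf '%s\n' 'agent Iceweasel/3.0.6' 'agent Epiphany/2.22' 'agent Firefox/3.6' > "$sample"

# The | inside the slashes means "this string or that string"
matches=$(awk '/Iceweasel|Epiphany/' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```
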
To search the first field in a file you can use “$1” to refer to that field and “~” to test whether the field matches the pattern that follows. Be sure to use the single quotes as illustrated.

awk '$1 ~ /^$/' access_log

You can negate the match by using “!~” instead, which selects the lines whose first field does not match.

awk '$1 !~ /^$/' access_log
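
As written, the pattern /^$/ only matches an empty first field. The sketch below assumes the goal is to match one specific address in the first field. The addresses are hypothetical, taken from the ranges reserved for documentation.

```shell
# Hypothetical sample data; the addresses are placeholders
sample=$(mktemp)
printf '%s\n' \
    '192.0.2.1 GET /index.html' \
    '192.0.2.10 GET /about.html' \
    '198.51.100.7 GET /index.html' > "$sample"

# ~ selects lines whose first field matches; the dots are escaped and the
# pattern is anchored so 192.0.2.10 is not mistaken for 192.0.2.1
matched=$(awk '$1 ~ /^192\.0\.2\.1$/' "$sample")

# !~ selects the lines whose first field does not match
others=$(awk '$1 !~ /^192\.0\.2\.1$/' "$sample")

printf 'matched:\n%s\n' "$matched"
printf 'others:\n%s\n' "$others"

rm -f "$sample"
```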

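To select a series of lines you can use a range pattern: two patterns separated by a comma. awk prints every line from the one that matches the first pattern through the one that matches the second. In this example lines 10 through 15 are printed.

awk 'NR == 10, NR == 15' access_log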
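
A quick way to check the range, sketched on a generated 20-line file (the file contents are made up). The second command is an equivalent spelling using a single condition.

```shell
# Build a hypothetical 20-line file: "line 1" through "line 20"
sample=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 20; i++) print "line " i }' > "$sample"

# Range pattern: start at record 10, stop after record 15
range=$(awk 'NR == 10, NR == 15' "$sample")

# The same selection written as one condition
same=$(awk 'NR >= 10 && NR <= 15' "$sample")

printf '%s\n' "$range"

rm -f "$sample"
```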

Here is a list of some of the comparison operators you could use.
<      Less than.
<=     Less than or equal to.
==     Equal to.
!=     Not equal to.
>=     Greater than or equal to.
>      Greater than.

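You can combine searches using “&&”, a logical AND that requires both conditions to be true, and “||”, a logical OR that requires at least one of them to be true.

awk '((NR >= 10) && ($1 == "")) || ($1 == "")' access_log

As the parentheses are placed, this example prints lines numbered 10 or higher whose first field equals the first IP address, plus any line, whatever its number, whose first field equals the second IP address. To require line 10 or later for both addresses, group the two comparisons together instead: (NR >= 10) && (($1 == "") || ($1 == "")).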
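
The effect of the parentheses is easier to see on data. This sketch uses three hypothetical addresses from the documentation ranges and a generated 12-line file, and prints the matching line numbers for each grouping.

```shell
# Build a hypothetical 12-line log; the addresses are placeholders
sample=$(mktemp)
awk 'BEGIN {
    ip[0] = "192.0.2.1"; ip[1] = "198.51.100.7"; ip[2] = "203.0.113.9"
    for (i = 1; i <= 12; i++) print ip[i % 3], "GET /page" i
}' > "$sample"

# Line limit applies to the first address only
first_only=$(awk '((NR >= 10) && ($1 == "192.0.2.1")) || ($1 == "198.51.100.7") { print NR }' "$sample")

# Line limit applies to both addresses
both=$(awk '(NR >= 10) && (($1 == "192.0.2.1") || ($1 == "198.51.100.7")) { print NR }' "$sample")

printf 'limit on first address only:\n%s\n' "$first_only"
printf 'limit on both addresses:\n%s\n' "$both"

rm -f "$sample"
```

In the first form the second address is reported from line 1 onward. In the second form nothing before line 10 is reported.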

Various escape sequences for “special” characters can be embedded in strings.

\n     Newline (line feed).
\t     Horizontal tab.
\b     Backspace.
\r     Carriage return.
\f     Form feed.
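
As a small illustration of the list above, an escape sequence can be placed in a string between fields when printing. The input line is made up.

```shell
# Join the first two fields with a tab
tabbed=$(printf '%s\n' 'alpha beta' | awk '{ print $1 "\t" $2 }')

# Put the first two fields on separate lines with a newline
split=$(printf '%s\n' 'alpha beta' | awk '{ print $1 "\n" $2 }')

printf '%s\n' "$tabbed"
printf '%s\n' "$split"
```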

If you want to perform simple operations on a text file, access_log for example, awk is a useful and simple tool. The following example shows awk searching for the text string “Debian” in the access_log of an Apache web server. It returns the lines that contain the text string. The single quotes, rather than double quotes, prevent the shell from interpreting characters within the program as special shell characters.

awk '/Debian/' access_log

- - [16/Jul/2010:23:30:06 -0600] "GET /icons/apache_pb.gif HTTP/1.1" 304 - "" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv: Gecko/2010033100 Iceweasel/3.0.6 (Debian-3.0.6-3)"
- - [16/Jul/2010:23:30:06 -0600] "GET /icons/powered_by_rh.png HTTP/1.1" 304 - "" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv: Gecko/2010033100 Iceweasel/3.0.6 (Debian-3.0.6-3)"
- - [16/Jul/2010:23:30:10 -0600] "GET /favicon.ico HTTP/1.1" 404 287 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv: Gecko/2010033100 Iceweasel/3.0.6 (Debian-3.0.6-3)"

Note that:

awk '/Debian/' access_log

is the same as:

awk '/Debian/ {print}' access_log

This is because the default action for a line that matches the search is to print it.

The basic way to use awk is to employ this format:

awk '<search pattern> { <program actions> }' <file>

awk uses the search pattern to select lines from a file and then performs the “program actions” on them, such as printing fields as in this example:

awk '/Debian/' access_log | awk '{ print $1, $20 }'

Here awk searches for the text string “Debian” in access_log, and the result is piped to awk again to print two fields: the IP address, which is the first field, and the web browser used by the Debian system, which is field number 20.

Iceweasel/3.0.6
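
Since a pattern and an action can sit in the same program, the pipe to a second awk is optional. The sketch below does the search and the printing in one command. The log lines are made up: the address is a documentation placeholder and the rv value is a dummy, arranged so that the browser still lands in field 20.

```shell
# Hypothetical log lines; address and rv value are placeholders
sample=$(mktemp)
cat > "$sample" <<'EOF'
192.0.2.1 - - [16/Jul/2010:23:30:06 -0600] "GET /icons/apache_pb.gif HTTP/1.1" 304 - "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.0) Gecko/2010033100 Iceweasel/3.0.6 (Debian-3.0.6-3)"
192.0.2.2 - - [16/Jul/2010:23:31:00 -0600] "GET /index.html HTTP/1.1" 200 512 "-" "ExampleBrowser/1.0"
EOF

# Pattern selects the lines, action prints fields 1 and 20 of each
result=$(awk '/Debian/ { print $1, $20 }' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```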

{ 1 comment }

jonny November 13, 2010 at 12:17 am

Good article but I would also be interested in comparing this awk usage with grep and sed as your usage seems to overlap a bit. Which is more efficient and what are the grep/sed equivalents to the commands above?
