::--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--::
:: .ooO A Guide to playing with gawk by Wyzewun Ooo. ::
::--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--::
:: ::
:: I was shocked at the number of people who don't know how to use (g)awk ::
:: properly, so I decided to write up a guide to getting started with gawk ::
:: for text formatting or whatever. Oh, I generally refer to gawk, but on an ::
:: ancient *nix you may have another version instead. On boxes that do have ::
:: gawk, awk will probably be a symlink to it anyway. Here's a little chart ::
:: of the evolution of the awk utility... ::
:: ::
:: awk ------> nawk ------> POSIX awk ------> gawk ::
:: ::
:: Right, so let's try some simple stuff with awk first. Probably the most ::
:: commonly known thing that one can do with awk is format columns. For ::
:: example, a command like host -l gov.za would give output that looks ::
:: like this... ::
:: ::
:: <stuff cut out> ::
:: gp.gov.za has address 196.254.66.6 ::
:: <stuff cut out> ::
:: ::
:: Now, we want to format the output of our host command and save the IP ::
:: addresses to a file called lame. We would type something to the effect ::
:: of host -l gov.za | gawk '{print $4}' > lame ::
:: ::
:: We are telling awk to print the fourth column only, thus the $4, and so ::
:: we will end up with a list of all the IPs with .gov.za hostnames. ;) ::
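:: ::
:: Here's a minimal sketch of that column trick which you can run without ::
:: doing a zone transfer. The two sample lines are made up (the second ::
:: hostname and IP are placeholders), and I call plain awk on the ::
:: assumption that it's gawk or some other POSIX awk. ::
:: ::
```shell
# Made-up sample of `host -l` style output; the second line is a placeholder.
sample='gp.gov.za has address 196.254.66.6
www.example.gov.za has address 192.0.2.10'

# awk splits each line on whitespace, so $4 is the fourth column (the IP).
ips=$(printf '%s\n' "$sample" | awk '{print $4}')

printf '%s\n' "$ips"
```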
:: ::
:: Obviously, the above is used by script kiddies a helluva lot, so they ::
:: can use their l33t0 mscan across a third of the internet, in the hope ::
:: that they'll find some lame .edu host that they can root and feel elite. ::
:: *Sigh* So let's look at some more useful stuff, shall we? It won't help ::
:: you pointlessly compromise machines, but it may help you become a ::
:: proficient Unix user (imagine that). ::
:: ::
:: Okey Dokey, awk can count the number of columns as well. We could've ::
:: done this with the previous example by typing something like ::
:: host -l gov.za | gawk '{print NF ": " $0}' ::
:: ::
:: We are telling awk to print the number of fields (print NF), followed by ::
:: a colon and a space (": "), right at the beginning of each line of text ::
:: ($0), so we get an output that will look like... ::
:: ::
:: 4: gp.gov.za has address 196.254.66.6 ::
:: ::
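:: You can use *awk for numbering lines as well, by using NR instead of NF. ::
:: To just count them, instead of wc -l, print NR once all the input has ::
:: been read, like gawk 'END {print NR}' file ::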
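:: ::
:: A quick sketch of NR at work, again on made-up lines and assuming any ::
:: POSIX awk. The first command numbers every line, the second prints only ::
:: the total, which is the number wc -l would give you. ::
:: ::
```shell
# Made-up input: three lines with different numbers of fields.
sample='gp.gov.za has address 196.254.66.6
short line
one two three'

# NR is the number of the current record, so this numbers each line.
numbered=$(printf '%s\n' "$sample" | awk '{print NR ": " $0}')

# Inside END, NR still holds the number of the last record read: the total.
total=$(printf '%s\n' "$sample" | awk 'END {print NR}')

printf '%s\n' "$numbered"
echo "total lines: $total"
```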
:: ::
:: I also find gawk useful for finding strings in files, when grep can't ::
:: quite cut it. I could do something like gawk '/wyze1/' /etc/passwd and ::
:: I would get an output like this... ::
:: ::
:: wyze1:x:2005:12:wyze1:/home/wyze1:/bin/tcsh ::
:: drew:x:2006:13:wyze1:/home/drew:/bin/tcsh ::
:: ::
:: So, I hear you saying "So What? I can do that with grep!" Sure. You can. ::
:: But say you were only looking for the username wyze1 and not that drew ::
:: account which has wyze1 as the real name and not the username. With grep ::
:: you'd have to mess with anchors, but awk can test a single field. So, we ::
:: do something like gawk -F: '$1 ~ /wyze1/' /etc/passwd and then I will ::
:: only get the wyze1 account. (Use $1 == "wyze1" if the username has to ::
:: match exactly and not just contain wyze1.) Easy, huh? =) ::
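:: ::
:: To see the difference between matching the whole line and matching one ::
:: field, here is a small sketch using the two passwd lines from above as ::
:: inline data (so it never touches your real /etc/passwd). Note that I ::
:: swapped in == for an exact username match; that is my choice for the ::
:: sketch, and ~ /wyze1/ gives the same result on this data. ::
:: ::
```shell
# The two sample passwd entries from the text, held in a variable.
passwd_sample='wyze1:x:2005:12:wyze1:/home/wyze1:/bin/tcsh
drew:x:2006:13:wyze1:/home/drew:/bin/tcsh'

# Whole-line match: counts every line containing wyze1 in any field.
anywhere=$(printf '%s\n' "$passwd_sample" | awk '/wyze1/ {n++} END {print n + 0}')

# Field match: -F: splits on colons, and only field 1 (the username) is tested.
by_user=$(printf '%s\n' "$passwd_sample" | awk -F: '$1 == "wyze1" {print $1}')

echo "lines matching anywhere: $anywhere"
echo "usernames matching: $by_user"
```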
:: ::
:: Say I have given myself 500 pointless accounts on my box, and have ::
:: specified "Wyzewun" as the Real Name for some & "Wyze1" for others. Now, ::
:: to make things more difficult, the Real Name for some other accounts ::
:: which I DON'T want has been set to "NotSoWyze1" and "AnythingButWyze1", ::
:: so a plain grep for Wyze will find all sorts of accounts I don't want. ::
:: So, I decide to do something like gawk -F: '$5 ~ /^Wyze/' /etc/passwd ::
:: and I only find the accounts that I want, because the ^ says that the ::
:: field must begin with "Wyze", and it can end with anything. (Note that ::
:: /Wyze*/ would NOT do this. In a regex, * means "zero or more of the ::
:: previous character", so that matches "Wyz" anywhere in the field.) ::
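:: ::
:: A sketch of what anchoring with ^ buys you, compared with an unanchored ::
:: /Wyze*/. The four accounts (a1 to a4) are invented for the demo, and any ::
:: POSIX awk is assumed. ::
:: ::
```shell
# Hypothetical accounts: a1 and a2 are wanted, a3 and a4 are not.
passwd_sample='a1:x:3001:12:Wyzewun:/home/a1:/bin/tcsh
a2:x:3002:12:Wyze1:/home/a2:/bin/tcsh
a3:x:3003:12:NotSoWyze1:/home/a3:/bin/tcsh
a4:x:3004:12:AnythingButWyze1:/home/a4:/bin/tcsh'

# ^ anchors the match to the start of field 5, so only a1 and a2 qualify.
wanted=$(printf '%s\n' "$passwd_sample" | awk -F: '$5 ~ /^Wyze/ {print $1}')

# Unanchored, "Wyz" plus zero or more "e" is found in all four real names.
unanchored=$(printf '%s\n' "$passwd_sample" | awk -F: '$5 ~ /Wyze*/ {n++} END {print n + 0}')

printf '%s\n' "$wanted"
echo "unanchored matches: $unanchored"
```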
:: ::
:: Now, you can also write *awk programs using BEGIN and END blocks, and in ::
:: many places it becomes much like a proper programming language. BEGIN ::
:: blocks run before any input is read and are used for initializing ::
:: variables. END blocks run after all input is read and are used for ::
:: things that are input dependent, like totals. Let's make an example ::
:: program to find all users on our machine with "drew" in the username ::
:: or real name... ::
:: ::
:: BEGIN { ::
::     FS = ":"     # /etc/passwd separates stuff with colons, remember? ::
::     OFS = "\t"   # set output separator to a tab ::
::     print "Username", "Real Name" ::
:: } ::
:: $1 ~ /drew/ || $5 ~ /drew/ {print $1, $5} ::
:: ::
:: We then save this file as fk_is_lame.awk and then invoke it by typing ::
:: gawk -f fk_is_lame.awk /etc/passwd and get an output like... ::
:: ::
:: Username Real Name ::
:: wizdumb drew ::
:: drew wyze1 ::
:: ::
:: Easy enough. :) If we wanted to do something with an END block we could ::
:: rewrite the program like this... ::
:: ::
:: BEGIN { ::
::     FS = ":"     # /etc/passwd separates stuff with colons, remember? ::
::     OFS = "\t"   # set output separator to a tab ::
::     print "Username", "Real Name" ::
:: } ::
:: $1 ~ /drew/ || $5 ~ /drew/ {print $1, $5 ; counts++} ::
:: END {print counts " accounts found."}   # { must be on the END line ::
:: ::
:: So our output will then look something like... ::
:: ::
:: Username Real Name ::
:: wizdumb drew ::
:: drew wyze1 ::
:: 2 accounts found. ::
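:: ::
:: If you'd rather not save a file first, the same kind of program can be ::
:: passed inline. This is a sketch under a few assumptions: the passwd ::
:: lines are made up to resemble the accounts above (the root line is just ::
:: filler), the pattern checks fields 1 and 5 explicitly, and I use printf ::
:: with %d so that zero matches still prints a 0. ::
:: ::
```shell
# Made-up passwd entries; only the first two mention drew in field 1 or 5.
passwd_sample='wizdumb:x:2004:12:drew:/home/wizdumb:/bin/tcsh
drew:x:2006:13:wyze1:/home/drew:/bin/tcsh
root:x:0:0:root:/root:/bin/sh'

report=$(printf '%s\n' "$passwd_sample" | awk '
BEGIN {
    FS = ":"       # input fields are separated by colons
    OFS = "\t"     # output fields are separated by a tab
    print "Username", "Real Name"
}
# Runs once per input line whose username or real name contains drew.
$1 ~ /drew/ || $5 ~ /drew/ { print $1, $5; counts++ }
# Runs once after the last line; the brace sits on the same line as END.
END { printf "%d accounts found.\n", counts }
')

printf '%s\n' "$report"
```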
:: ::
:: You can also do comparisons in awk, with the same operators you use in ::
:: C, C++, Java, whatever (==, <, >, <=, >=, !=), plus ~ and !~. The only ::
:: unfamiliar stuff there should be ~ and !~ which represent matched by and ::
:: not matched by respectively. And if that other stuff isn't familiar, I ::
:: highly recommend that you start learning to code. Not only is it an ::
:: extremely rewarding experience, but it is damn useful, whether you're ::
:: involved in the computer underground or not. ::
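:: ::
:: A small sketch of two of those operators on made-up passwd lines. In awk ::
:: the "not matched by" operator is spelled !~ (bang first). Field 7 being ::
:: the login shell and field 3 the UID is just the usual passwd layout. ::
:: ::
```shell
# Made-up passwd entries with a mix of shells and UIDs.
passwd_sample='wyze1:x:2005:12:wyze1:/home/wyze1:/bin/tcsh
drew:x:2006:13:wyze1:/home/drew:/bin/sh
root:x:0:0:root:/root:/bin/sh'

# !~ selects lines whose shell (field 7) is NOT matched by /tcsh/.
no_tcsh=$(printf '%s\n' "$passwd_sample" | awk -F: '$7 !~ /tcsh/ {print $1}')

# >= compares numerically because field 3 looks like a number.
high_uid=$(printf '%s\n' "$passwd_sample" | awk -F: '$3 >= 2000 {print $1}')

echo "not using tcsh:"; printf '%s\n' "$no_tcsh"
echo "uid 2000 or more:"; printf '%s\n' "$high_uid"
```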
:: ::
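:: Another really powerful feature of awk is Range Patterns, written as ::
:: pat1, pat2. They select every line from the first one that matches pat1 ::
:: through the next one that matches pat2, so they work on position in the ::
:: file and not on values. To pick out a range of VALUES, combine two ::
:: comparisons with && instead. Say I have access to an employee record ::
:: sheet which follows a pattern something like Name:Employee ID:Salary ::
:: that looks like... ::
:: ::
:: Drew:666000:14000 ::
:: Koos:231876:100 ::
:: John:967123:18000 ::
:: Marc:000666:16000 ::
:: ::
:: I want to view all employees with a salary between 13000 and 17000 per ::
:: month, so I type... ::
:: ::
:: gawk -F: '$3 >= 13000 && $3 <= 17000 {print $1, $3}' list ::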
:: ::
:: And my result is... ::
:: ::
:: Drew 14000 ::
:: Marc 16000 ::
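:: ::
:: Here's a runnable sketch of the salary filter with the employee list ::
:: held inline, selecting by value with two comparisons joined by &&. For ::
:: contrast, the second command uses a true range pattern (pat1, pat2), ::
:: which selects by position: everything from the Koos line through the ::
:: John line. The /^Koos/ and /^John/ patterns are my own picks. ::
:: ::
```shell
# The employee list from the text: Name:Employee ID:Salary.
list='Drew:666000:14000
Koos:231876:100
John:967123:18000
Marc:000666:16000'

# Select by value: salary between 13000 and 17000 inclusive.
in_range=$(printf '%s\n' "$list" | awk -F: '$3 >= 13000 && $3 <= 17000 {print $1, $3}')

# Select by position: from the first line matching /^Koos/ through the
# next line matching /^John/, whatever the salaries are.
span=$(printf '%s\n' "$list" | awk -F: '/^Koos/, /^John/ {print $1}')

printf '%s\n' "$in_range"
printf '%s\n' "$span"
```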
:: ::
:: I could also do something simpler, like printing all people with a ::
:: salary less than R1000 using a standard operator. A pattern of ::
:: $3 < 1000 would only print Koos's details. ::
:: ::
:: We could do the same sort of test using an if statement (the condition ::
:: has to go in parentheses), like so... ::
:: ::
:: { if ($3 < 1000) ::
::     print $1 " is such a loser" ::
::   else ::
::     print $1 " is such a pimp" } ::
:: ::
:: Run over the employee list with -F: that prints... ::
:: ::
:: Drew is such a pimp ::
:: Koos is such a loser ::
:: John is such a pimp ::
:: Marc is such a pimp ::
:: ::
:: You can also use the shorthand ? : style if then else statement as used ::
:: in C/C++ and Java, which I personally prefer. ::
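:: ::
:: A sketch of that shorthand applied to the same employee list (inline ::
:: again, any POSIX awk assumed). I wrap the ?: expression in parentheses ::
:: so it can't get tangled up with the string concatenation around it. ::
:: ::
```shell
# The employee list from the text: Name:Employee ID:Salary.
list='Drew:666000:14000
Koos:231876:100
John:967123:18000
Marc:000666:16000'

# cond ? a : b yields a when cond is true and b otherwise, as in C.
verdicts=$(printf '%s\n' "$list" |
    awk -F: '{print $1 " is such a " ($3 < 1000 ? "loser" : "pimp")}')

printf '%s\n' "$verdicts"
```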
:: ::
:: Errr... I really don't have time to finish this article and there's a ::
:: whole bunch of stuff that I haven't covered. Hrmm. I'll make a sequel ::
:: some time, okay? ;) ::
:: ::
:: --=====-- ::
:: <WGM> Don't code Java man!!! ::
:: <WGM> Total MS-run Crap!! ::
:: <WGM> Code Delphi instead, less MS-based ::
:: --=====-- ::
:: ::
::--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--::