::--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--::
::            .ooO A Guide to playing with gawk by Wyzewun Ooo.             ::
::--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--::

I was shocked at the number of people who don't know how to use (g)awk
properly, so I decided to write up a guide to getting started with gawk
for text formatting or whatever. Oh, I generally refer to gawk, but if
you have an ancient *nix then you may have another version; awk will
probably symlink to it anyway. Here's a little chart of the evolution of
the awk utility...

    awk ------> nawk ------> POSIXawk ------> gawk

Right, so let's try some simple stuff with awk first. Probably the most
commonly known thing that one can do with awk is format columns. For
example, a command like host -l gov.za would give output that looks
like this...

    <stuff cut out>
    gp.gov.za has address 196.254.66.6
    <stuff cut out>

Now, we want to format the output of our host command and save the IP
addresses to a file called lame. We would type something to the effect
of

    host -l gov.za | gawk '{print $4}' > lame

We are telling awk to print the fourth column only, thus the $4, and so
we will end up with a list of all the IPs with .gov.za hostnames. ;)

Obviously, the above is used by script kiddies a helluva lot, so they
can use their l33t0 mscan across a third of the internet, in the hope
that they'll find some lame .edu host that they can root and feel elite.
*Sigh* So let's look at some more useful stuff, shall we? It won't help
you pointlessly compromise machines, but it may help you become a
proficient Unix user (imagine that).

Okey Dokey, awk can count the number of columns as well.
We could've done this with the previous example by typing something like

    host -l gov.za | gawk '{print NF ": " $0}'

We are telling awk to print the number of fields (print NF), followed by
a colon and a space (": "), in front of each line of text ($0), so we
get output that will look like...

    4: gp.gov.za has address 196.254.66.6

You can use *awk for counting lines as well, instead of wc -l, by using
NR instead of NF. NR holds the number of the current line, so printing
it once all input has been read, as in gawk 'END {print NR}' file,
gives the total.

I also find gawk useful for finding strings in files, when grep can't
quite cut it. I could do something like gawk '/wyze1/' /etc/passwd and
I would get output like this...

    wyze1:x:2005:12:wyze1:/home/wyze1:/bin/tcsh
    drew:x:2006:13:wyze1:/home/drew:/bin/tcsh

So, I hear you saying "So What? I can do that with grep!" Sure. You can.
But say you were only looking for the username wyze1 and not that drew
account which has wyze1 as the real name and not the username. With grep
you would have to build the field position into the regex yourself. With
awk we just name the field and do something like

    gawk -F: '$1 ~ /wyze1/' /etc/passwd

and I will only get the wyze1 account. (That still matches any username
that contains wyze1; use $1 == "wyze1" if you want an exact match.)
Easy, huh? =)

Say I have given myself 500 pointless accounts on my box, and have
specified "Wyzewun" as the Real Name for some & "Wyze1" for others. Now,
to make things more difficult, the Real Name for some other accounts
which I DON'T want have been set as "NotSoWyze1" and "AnythingButWyze1",
so grep will find all sorts of accounts I don't want. So, I decided to
do something like

    gawk -F: '$5 ~ /^Wyze/' /etc/passwd

and I only find the accounts that I want, because the ^ anchors the
match: the field must begin with "Wyze" and can end with anything. (A
shell-style /Wyze*/ would not work here. In a regex, * means "zero or
more of the previous character", and without the ^ the match can start
anywhere in the field.)

Now, you can also write *awk programs using BEGIN and END blocks, and it
becomes in many places much like a proper programming language.
BEGIN blocks are used for initializing variables and END blocks are used
for things that are input dependent, like totals. Let's make an example
program to find all users on the system with the username or real name
"drew" on our machine...

    BEGIN {
        FS = ":"     # /etc/passwd separates stuff with colons, remember?
        OFS = "\t"   # tab
        print "Username", "Real Name"
    }
    $1 ~ /drew/ || $5 ~ /drew/ {print $1, $5}

We then save this file as fk_is_lame.awk and then invoke it by typing
gawk -f fk_is_lame.awk /etc/passwd and get output like...

    Username        Real Name
    wizdumb drew
    drew    wyze1

Easy enough. :) If we wanted to do something with an END block we could
rewrite the program like this...

    BEGIN {
        FS = ":"     # /etc/passwd separates stuff with colons, remember?
        OFS = "\t"   # set output separator to a tab
        print "Username", "Real Name"
    }
    $1 ~ /drew/ || $5 ~ /drew/ {print $1, $5 ; counts++}
    END {print counts " accounts found."}   # { must be on the END line

So our output will then look something like...

    Username        Real Name
    wizdumb drew
    drew    wyze1
    2 accounts found.

You can also do comparisons in awk, with the same operators you use in
C, C++, Java, whatever. (==, <, >, <=, >=, !=, ~, !~). The only
unfamiliar stuff there should be ~ and !~ which represent "matched by"
and "not matched by" respectively. And if that other stuff isn't
familiar, I highly recommend that you start learning to code. Not only
is it an extremely rewarding experience, but it is damn useful, whether
you're involved in the computer underground or not.

Another really powerful feature of awk is Range Patterns. Say I have
access to an employee record sheet which follows a pattern something
like Name:Employee ID:Salary that looks like...

    Drew:666000:14000
    Koos:231876:100
    John:967123:18000
    Marc:000666:16000

I want to view all employees with a salary between 13000 and 17000 per
month. Careful here: a range pattern (pattern1, pattern2) selects a
block of consecutive lines, from a line matching the first pattern
through the next line matching the second. It does not compare values,
so '$3 == 13000, $3 == 17000' would print nothing for this list because
no salary is exactly 13000. For a numeric range we join two comparisons
with && instead, so I type...

    gawk -F: '$3 >= 13000 && $3 <= 17000 {print $1, $3}' list

And my result is...

    Drew 14000
    Marc 16000

A real range pattern looks more like
gawk -F: '$1 == "Drew", $1 == "John"' list, which prints the Drew line,
the John line and everything in between.

I could also do something simpler like printing all people with a salary
less than R1000 with standard operators: gawk -F: '$3 < 1000' list would
only print Koos's details.

We could do that using an if statement, like so...

    { if ($3 < 1000)
          print $1 " is such a loser"
      else
          print $1 " is such a pimp" }

Run with -F: as before, this gives...

    Drew is such a pimp
    Koos is such a loser
    John is such a pimp
    Marc is such a pimp

You can also use the shorthand ? : style if-then-else expression as used
in C/C++ and Java, which I personally prefer.

Errr... I really don't have time to finish this article and there's a
whole bunch of stuff that I haven't covered. Hrmm. I'll make a sequel
some time, okay? ;)

                               --=====--
                  <WGM> Don't code Java man!!!
                  <WGM> Total MS-run Crap!!
                  <WGM> Code Delphi instead, less MS-based
                               --=====--

::--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--::
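
A footnote on the ? : shorthand mentioned above, since the article never
shows it. This is only a minimal sketch, not the author's own code: the
sample data is the employee list from the article, while the temporary
file and the shell variable names (list, result) are made up for the
example. It calls plain awk so that it also runs where gawk isn't
installed; the one-liner itself should behave the same under gawk.

```shell
# Hypothetical sample file in the Name:Employee ID:Salary layout from the article.
list=$(mktemp)
printf '%s\n' 'Drew:666000:14000' 'Koos:231876:100' \
              'John:967123:18000' 'Marc:000666:16000' > "$list"

# cond ? a : b yields a when cond is true, otherwise b.
# The parentheses keep the ?: separate from the string concatenation.
result=$(awk -F: '{ print $1 " is such a " ($3 < 1000 ? "loser" : "pimp") }' "$list")

printf '%s\n' "$result"
rm -f "$list"
```

This should print the same four lines as the if/else version. The ?: is
an expression rather than a statement, so it can sit right inside the
print.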