Gawk

Revision as of 01:21, 19 July 2021 by Will

AWK is a scripting language that was originally designed for manipulating tabular text and presenting it in human-readable formats. It is useful on the command line, but for anything beyond a one-liner I'd rather use a more powerful language than write scripts in awk.


WARNING:

There are two main variations of the Unix core utilities (GNU and BSD), and each has different parameters, features, etc. If you expect to use awk across different platforms, make no assumptions.

Notes

gawk usage
gawk variables
gawk datatypes
gawk conditionals
gawk loops

Syntax

print, printf, sprintf

print

Your basic print, like echo.

Multiple items can be printed by separating them with commas. They are output with a space between them by default. $0 holds the entire input line.

echo "hello" | awk '{print $0,"goodbye"}'
> hello goodbye


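$1, $2, $3 etc. refer to the individual fields of the line. By default, fields are separated by runs of spaces or tabs. Note the comma: without it, the two fields are concatenated with no space between them.

echo "one two three four" | awk '{print $1, $3}'
> one three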
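
As a small sketch of the point above, the following compares the comma with plain juxtaposition, and shows the built-in OFS variable that controls the separator. The shell variable names (spaced, joined, custom) are only illustrative.

```shell
# A comma separates output items with OFS (a single space by default).
spaced=$(echo "one two three four" | awk '{ print $1, $3 }')

# Without a comma the two fields are concatenated.
joined=$(echo "one two three four" | awk '{ print $1 $3 }')

# Setting OFS in a BEGIN block changes the separator used by the comma.
custom=$(echo "one two three four" | awk 'BEGIN { OFS = "-" } { print $1, $3 }')

echo "$spaced"   # one three
echo "$joined"   # onethree
echo "$custom"   # one-three
```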

printf

By default, print converts every value to a string. printf lets you control the output type and the spacing of elements.

%s    # string
%f    # floating point
%i    # integer
%E    # scientific notation
%4s   # string (padded to 4 characters)
%.4s  # string (max 4 characters, truncated if necessary)
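
echo "one two three" | awk '{ printf("%10s %5s %-10s %s", $1, $2, $3, "\n") }'
>        one   two three
The parentheses around printf's arguments are optional. The first argument is the format string, and the remaining arguments fill its placeholders in order. Unlike print, printf does not add a newline on its own.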

--right alignment
%10s means that the first entry ($1 in this case) is a string placed in a
'column' 10 characters wide, padded on the left so the word ends at the right edge.
ex: {.......one}

--left alignment
%-10s means that the entry ($3) gets a column 10 characters wide, padded on the right so the word starts at the left edge.
ex: {three.....}

--integer
You can just as easily print a number as an integer; the fractional part is truncated, not rounded:

echo "10.459" | awk '{ printf( "%i", $1) }'
> 10
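
To make the padding rules above visible, here is a minimal sketch that wraps each formatted value in brackets. It uses %d and %.2f rather than %i so that it should behave the same on awk implementations other than gawk; the shell variable names are only illustrative.

```shell
# Brackets make the padding visible.
right=$(echo "one" | awk '{ printf "[%10s]", $1 }')      # padded on the left
left=$(echo "one" | awk '{ printf "[%-10s]", $1 }')      # padded on the right
cut=$(echo "abcdefgh" | awk '{ printf "[%.4s]", $1 }')   # truncated to 4 characters
whole=$(echo "10.459" | awk '{ printf "%d", $1 }')       # fractional part dropped
fixed=$(echo "10.459" | awk '{ printf "%.2f", $1 }')     # rounded to 2 decimals

echo "$right"   # [       one]
echo "$left"    # [one       ]
echo "$cut"     # [abcd]
echo "$whole"   # 10
echo "$fixed"   # 10.46
```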

sprintf

sprintf takes the same arguments as printf, but instead of printing to standard output it returns the formatted string, which can then be assigned to a variable.

echo "8.234" | awk '{var = sprintf("%f", $1); print var}'
> 8.234000
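
Because sprintf returns a string, its result can be fed into another format. The sketch below is one possible use, with made-up input ("8.234 widget") and illustrative variable names (price, line, label).

```shell
# sprintf returns the formatted string instead of printing it,
# so the result can be stored and reused in a second format.
label=$(echo "8.234 widget" | awk '{
    price = sprintf("%.2f", $1)               # "8.23"
    line  = sprintf("%-8s|%6s", $2, price)    # name left-aligned, price right-aligned
    print line
}')

echo "$label"   # widget  |  8.23
```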

split

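-F is a command-line option, so you can't call it from within an awk script (the field separator itself can be changed there by assigning to FS). To tokenize an arbitrary string within a script, use split.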

split("this is my string", a, " ")
# a[1] = this
# a[2] = is

echo "aaa bbb cc/11/22" \
    | awk '{ split($3, a, "/"); print(a[2]); }'
> 11

NOTE:

gawk can report the size of an array with length(a), but not every awk implementation supports this. The portable approach is to use the return value of split, which is the number of tokens it produced.

WARNING:

Arrays filled by split are indexed starting at 1, not 0.
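
A minimal sketch of both notes: the return value of split is used as the array length, and the loop walks the indexes from 1 to that count. The names n, parts and result are only illustrative.

```shell
# split returns the number of pieces it produced, which doubles as
# the array length; the pieces are stored at indexes 1..n.
result=$(echo "aaa bbb cc/11/22" | awk '{
    n = split($3, parts, "/")
    printf "%d:", n
    for (i = 1; i <= n; i++) printf " %s", parts[i]
    print ""
}')

echo "$result"   # 3: cc 11 22
```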

match

match searches a string for a regular expression. It returns the character position (counting from 1) where the match starts, or 0 if there is no match.

match($0, "searchterm")
echo "abcdefg" | awk '{var=match($0, "cd"); print var}'
#> 3

echo "abcdefg" | awk '{var=match($0, "zef"); print var}'
#> 0

echo "abcdefg" | awk '{ 
   if(match($0, "cd")) { 
       print "match found"; 
   } 
}'
#> match found
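
Besides its return value, match also sets the built-in variables RSTART and RLENGTH, which can be passed to substr to extract the matched text. A small sketch, using a regular expression literal instead of a string:

```shell
found=$(echo "abcdefg" | awk '{
    if (match($0, /c.e/)) {
        # RSTART is the 1-based start, RLENGTH the length of the match.
        print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
    }
}')

# On failure match returns 0, RSTART is 0 and RLENGTH is -1.
missing=$(echo "abcdefg" | awk '{ print match($0, /zef/), RSTART, RLENGTH }')

echo "$found"     # 3 3 cde
echo "$missing"   # 0 0 -1
```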

system

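Executes a command in the shell (or cmd on Windows) from an awk script. The command must be passed as a string. Assigning the result to a variable gives only the command's exit status (0 on success), not its output.

system("ls -la");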
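
The sketch below shows the exit status returned by system, using the standard shell commands true and false as stand-ins for real work. To capture a command's output rather than its status, awk's "command" | getline form can be used instead; the variable names are only illustrative.

```shell
status=$(awk 'BEGIN {
    ok  = system("true")      # command succeeded: exit status 0
    bad = system("false")     # command failed: exit status 1

    # getline reads the first line of the command output into a variable.
    "echo hello" | getline out
    close("echo hello")

    print ok, bad, out
}')

echo "$status"   # 0 1 hello
```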

math

var+= 1;
var=( (100/2) * 3 );
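
For reference, a short sketch of how these expressions evaluate. awk numbers are floating point, so division is not truncated; int() drops the fractional part, % is the remainder and ^ is exponentiation.

```shell
math=$(awk 'BEGIN {
    var = 1
    var += 1                  # 2
    var = ((100 / 2) * 3)     # 150
    print var, 7 / 2, int(7 / 2), 7 % 2, 2 ^ 3
}')

echo "$math"   # 150 3.5 3 1 8
```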

References

http://www.funtoo.org/wiki/Awk_by_Example,_Part_1
http://www.math.utah.edu/docs/info/gawk_7.html