            r+           matches r repeated one or more times.

            r?           matches r zero or once.
                         (repetition).

            (r)          matches r
                         (grouping).

            r{n}         matches r exactly n times.

            r{n,}        matches r repeated n or more times.

            r{n,m}       matches r repeated n to m (inclusive) times.

            r{,m}        matches r repeated 0 to m times (a non‐standard option).

       The increasing precedence of operators is:

       alternation, concatenation, repetition, grouping

       For example,

            /^[_a-zA-Z][_a-zA-Z0-9]*$/  and
            /^[-+]?([0-9]+\.?|\.[0-9])[0-9]*([eE][-+]?[0-9]+)?$/

       are matched by AWK identifiers and AWK numeric constants respectively.  Note that “.” has to be escaped to be recognized as a decimal point, and that metacharacters are not special inside character classes.
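
       The two patterns above can be tried out with a small filter.  This is only a sketch: the function name classify and the sample inputs are made up for illustration, and awk is assumed to be any POSIX awk on the PATH (mawk can be substituted).

```shell
# Label each input line using the two patterns from the text.
classify() {
    awk '
        /^[_a-zA-Z][_a-zA-Z0-9]*$/                            { print $0 ": identifier"; next }
        /^[-+]?([0-9]+\.?|\.[0-9])[0-9]*([eE][-+]?[0-9]+)?$/  { print $0 ": number"; next }
                                                              { print $0 ": neither" }
    '
}

printf '%s\n' foo_1 9lives 3.14 .5e-3 1e | classify
# foo_1: identifier
# 9lives: neither
# 3.14: number
# .5e-3: number
# 1e: neither
```

       The input 1e is rejected because the exponent part requires at least one digit after the e.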

       Any expression can be used on the right hand side of the ~ or !~ operators or passed to a built‐in that expects a regular expression.  If needed, it is converted to string, and then interpreted as a regular expression.  For example,

            BEGIN { identifier = "[_a-zA-Z][_a-zA-Z0-9]*" }

            $0 ~ "^" identifier

       prints all lines that start with an AWK identifier.

       mawk recognizes the empty regular expression, //, which matches the empty string and hence is matched by any string at the front, back and between every character.  For example,

            echo abc | mawk '{ gsub(//, "X") ; print }'
            XaXbXcX

   4. Records and fields
       Records  are  read in one at a time, and stored in the field variable $0.  The record is split into fields which are stored in $1, $2, ..., $NF.  The built‐in variable NF is set to the number of fields, and NR and FNR
       are incremented by 1.  Fields above $NF are set to "".

       Assignment to $0 causes the fields and NF to be recomputed.  Assignment to NF or to a field causes $0 to be reconstructed by concatenating the $i’s separated by OFS.  Assignment to a field with index greater than NF
       increases NF and causes $0 to be reconstructed.
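
       The rules above can be observed directly.  The following is a minimal sketch, assuming a POSIX awk named awk (mawk can be substituted); the shell variable names are illustrative only.

```shell
# Assigning to any field rebuilds $0 with OFS between the fields.
rebuilt=$(echo 'a b c' | awk 'BEGIN { OFS = "-" } { $1 = $1; print }')

# Assigning past NF raises NF; the skipped field $4 is "".
extended=$(echo 'a b c' | awk 'BEGIN { OFS = "-" } { $5 = "e"; print NF ": " $0 }')

# Assigning to NF rebuilds $0 from the first NF fields.
truncated=$(echo 'a b c' | awk '{ NF = 2; print }')

# Assigning to $0 splits the new record again and resets NF.
resplit=$(echo 'a b c' | awk '{ $0 = "x y"; print NF, $2 }')

printf '%s\n' "$rebuilt" "$extended" "$truncated" "$resplit"
# a-b-c
# 5: a-b-c--e
# a b
# 2 y
```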

       Input data stored in fields has string type, unless the entire field has numeric form, in which case the type is number and string.  For example,

            echo 24 24E |
            mawk '{ print($1>100, $1>"100", $2>100, $2>"100") }'
            0 1 1 1

       $0 and $2 are string and $1 is number and string.  The first comparison is numeric, the second is string, the third is string (100 is converted to "100"), and the last is string.
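
       When the default choice of comparison is not the one wanted, a common awk idiom (not specific to mawk) is to force the type: appending "" yields a string, and adding 0 yields a number.  A small sketch, assuming a POSIX awk named awk; the input line and variable names are illustrative.

```shell
# $1 and $2 have numeric form (number and string); $3 does not (string).
forced=$(echo '10 9 9x' | awk '{
    a = ($1 > $2)                # both number and string: numeric, 10 > 9
    b = (($1 "") > ($2 ""))      # forced to string: "10" sorts before "9"
    c = ($1 > $3)                # $3 is string, so string comparison
    d = (($1 + 0) > ($3 + 0))    # forced to number: "9x" converts to 9
    print a, b, c, d
}')

echo "$forced"
# 1 0 0 1
```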

   5. Expressions and operators
       The expression syntax is similar to C.  Primary expressions are numeric constants, string constants, variables, fields, arrays and function calls.  The identifier for a variable, array or function is a sequence of
       letters, digits and underscores that does not start with a digit.  Variables are not declared; they exist when first referenced and are initialized to null.
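
       A variable that is initialized to null acts as "" in string context and as 0 in numeric context.  A short sketch of this, assuming a POSIX awk named awk; the variable names are illustrative.

```shell
nulls=$(awk 'BEGIN {
    a = (x == 0)     # x was never assigned; it equals the number 0
    b = (x == "")    # and also equals the empty string
    n = length(x)    # its string value has length 0
    x++              # arithmetic starts from 0
    print a, b, n, x
}')

echo "$nulls"
# 1 1 0 1
```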

       New expressions are composed with the following operators in order of increasing precedence.

            assignment          =  +=  -=  *=  /=  %=  ^=
            conditional         ?  :
            logical or          ||
            logical and         &&
            array membership    in
            matching            ~   !~
            relational          <  >   <=  >=  ==  !=
            concatenation       (no explicit operator)
            add ops             +  -
            mul ops             *  /  %
            unary               +  -
            logical not         !
            exponentiation      ^
            inc and dec         ++ -- (both post and pre)
            field               $

       Assignment, conditional and exponentiation associate right to left; the other operators associate left to right.  Any expression can be parenthesized.
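
       A few consequences of the table, as a sketch only (a POSIX awk named awk is assumed, and the variable names are illustrative):

```shell
arith=$(awk 'BEGIN {
    a = 2 ^ 3 ^ 2    # right to left: 2 ^ (3 ^ 2)
    b = -2 ^ 2       # ^ binds tighter than unary minus: -(2 ^ 2)
    c = 2 - 3 - 4    # left to right: (2 - 3) - 4
    print a, b, c
}')

# Addition binds tighter than concatenation: 1 " " (2 + 3).
concat=$(awk 'BEGIN { s = 1 " " 2 + 3; print s }')

# $ binds tightest: $NF - 1 is ($NF) - 1, which differs from $(NF - 1).
field=$(echo 'a b 5' | awk '{ print $NF - 1, $(NF - 1) }')

printf '%s\n' "$arith" "$concat" "$field"
# 512 -4 -5
# 1 5
# 4 b
```

       Parenthesizing, as in $(NF - 1), is the simplest way to avoid surprises from these rules.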

   6. Arrays
       Awk  provides one‐dimensional arrays.  Array elements
