 compatibility with “traditional awk”).

       •   Interval expressions were introduced into awk regular expressions in IEEE 1003.1‐2001 (also known as Unix 03), along with some internationalization features.

       •   Apple modified its copy of the original awk in April 2006, making this version of awk support interval expressions.

           The updated source provides for compatibility with older “legacy” versions using an environment variable, making this “Unix 2003” feature (perhaps meant as Unix 03) the default.

       •   NetBSD developers copied this change in January 2018, omitting the compatibility option, and then applied it to BWK awk.

       •   The interval expression implementation in mawk is based on changes proposed by James Parkinson in April 2016.

       Mawk also recognizes a few gawk‐specific command line options for script compatibility:

            --help, --posix, -r, --re-interval, --traditional, --version

   Subtle Differences not in POSIX or the AWK Book
       Finally, here is how mawk handles exceptional cases not discussed in the AWK book or the POSIX draft.  It is unsafe to assume consistency across awks and safe to skip to the next section.

          •   substr(s, i, n) returns the characters of s in the intersection of the closed interval [1, length(s)] and the half‐open interval [i, i+n).  When this intersection is empty, the empty string is returned; so
              substr("ABC", 1, 0) = "" and substr("ABC", -4, 6) = "A".

          •   Every string, including the empty string, matches the empty string at the front, so s ~ // and s ~ "" are always 1, as are match(s, //) and match(s, "").  The last two set RLENGTH to 0.

          •   index(s, t) is always the same as match(s, t1) where t1 is the same as t with metacharacters escaped.  Hence consistency with match requires that index(s, "") always returns 1.  Also the condition, index(s,t)
              != 0 if and only if t is a substring of s, requires index("","") = 1.

          •   If getline encounters end of file, getline var leaves var unchanged.  Similarly, on entry to the END actions, $0, the fields and NF have their value unaltered from the last record.
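
       The substr and index rules above can be sketched as a small model.  This is a Python illustration, not mawk's source: the names mawk_substr and mawk_index are hypothetical, and i and n are assumed to be integers (the handling of non‐integer arguments is not covered here).

```python
def mawk_substr(s: str, i: int, n: int) -> str:
    """Model substr(s, i, n) for integer i and n, using 1-based positions."""
    # Intersect the closed interval [1, len(s)] with the half-open [i, i + n).
    first = max(i, 1)
    last = min(i + n - 1, len(s))  # inclusive upper bound of the intersection
    if first > last:
        return ""  # empty intersection yields the empty string
    return s[first - 1:last]


def mawk_index(s: str, t: str) -> int:
    """Model index(s, t): 1-based position of t in s, or 0 when t is absent."""
    # str.find returns 0 for an empty t and -1 when t is absent,
    # so an empty t always gives 1 and a missing t gives 0.
    return s.find(t) + 1


if __name__ == "__main__":
    print(repr(mawk_substr("ABC", 1, 0)))   # empty interval
    print(repr(mawk_substr("ABC", -4, 6)))  # only position 1 survives
    print(mawk_index("", ""))               # empty string found at the front
```

       In this model, mawk_substr("ABC", -4, 6) keeps only position 1 because [-4, 2) meets [1, 3] in that single position, and mawk_index returns a nonzero value exactly when t is a substring of s, including the case where both strings are empty.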

ENVIRONMENT VARIABLES
       Mawk recognizes these variables:

          MAWKBINMODE
             (see COMPATIBILITY ISSUES)

          MAWK_LONG_OPTIONS
             If this

Title: MAWK's Handling of Interval Expressions, Command-Line Options, and Subtle Differences
Summary
This section details the history of interval expression support in various AWK implementations, including GAWK, IEEE 1003.1-2001, Apple's modified AWK, NetBSD, and finally, MAWK. It notes the command line options that MAWK recognizes to emulate GAWK, and discusses subtle differences in MAWK's behavior that are not covered in either the AWK book or the POSIX standard, focusing on edge cases for functions like substr, regular expression matching, and index. Finally, it mentions getline's behavior at end of file and the preservation of variables in END actions. It concludes by listing environment variables MAWK utilizes.