gawk uses extended regular expressions (ERE) by default.
First, create the text file foo.
cat <<EOF > foo
a1,a2,a3
b1,b2,b3
EOF
gawk -F, '/1,a/{ print $1 }' foo
a1
When no field is specified, the regex is matched against the entire line, i.e. $0.
$2 ~ matches the regex against the second field instead.
gawk 'BEGIN{ FS="," } $2 ~ /^[ab]2/{ print $2 }' foo
a2
b2
Substitution: sub replaces only the first match.
sub(regex, replacement [, target])
regex: The regular expression to match.
replacement: The string that replaces the match.
target: Optional. The string to operate on; defaults to $0.
If target is omitted, the whole line is searched.
echo "aa bb aa" | gawk '{ sub(/aa/, "cc"); print }'
cc bb aa
Pass $3 as the target to replace only within the third field.
echo "aa bb aa" | gawk '{ sub(/aa/, "cc", $3); print }'
aa bb cc
In the replacement string, & stands for the matched text.
echo "app cat" | gawk '{ sub(/\w+/, "[&]"); print }'
[app] cat
Global substitution: gsub replaces every match.
gsub(regex, replacement [, target])
regex: The regular expression to match.
replacement: The string that replaces each match.
target: Optional. The string to operate on; defaults to $0.
If target is omitted, the whole line is searched.
echo 'aa bb aa' | gawk '{ gsub("aa", "cc"); print }'
cc bb cc
Pass $3 as the target to replace only within the third field.
echo 'aa bb aa' | gawk '{ gsub("aa", "cc", $3); print }'
aa bb cc
echo "app cat" | gawk '{ gsub(/\w+/, "[&]"); print }'
[app] [cat]
\< matches at the start of a word.
echo 'app cat' | gawk '{ gsub(/\<[a-z]/, "[&]"); print }'
[a]pp [c]at
\> matches at the end of a word.
echo 'app cat' | gawk '{ gsub(/[a-z]\>/, "[&]"); print }'
ap[p] ca[t]
General substitution: gensub is a more general alternative to sub and gsub.
sub and gsub do not support capture groups; gensub does. Unlike them, gensub returns the modified string and leaves the target unchanged.
gensub(regex, replacement, how [, target])
regex: The regular expression to match.
replacement: The replacement string; it may reference capture groups.
how: "g" (or "G") for global replacement, or a number N to replace only the Nth match.
target: Optional. The string to operate on; defaults to $0.
Use "g" for global replacement.
echo "aa aa aa" | gawk '{ print gensub(/aa/, "bb", "g") }'
bb bb bb
Replace the second match.
echo "aa aa aa" | gawk '{ print gensub(/aa/, "bb", "2") }'
aa bb aa
\1 refers to the text matched by the first capture group; inside a string literal it is written "\\1".
echo "aa-bb" | gawk '{ print gensub(/(\w+)-(\w+)/, "\\2:\\1", "g")}'
bb:aa