summaryrefslogtreecommitdiff
path: root/static/netbsd/man1/awk.1
diff options
context:
space:
mode:
Diffstat (limited to 'static/netbsd/man1/awk.1')
-rw-r--r--static/netbsd/man1/awk.1696
1 files changed, 696 insertions, 0 deletions
diff --git a/static/netbsd/man1/awk.1 b/static/netbsd/man1/awk.1
new file mode 100644
index 00000000..496a2a65
--- /dev/null
+++ b/static/netbsd/man1/awk.1
@@ -0,0 +1,696 @@
+.de EX
+.nf
+.ft CW
+..
+.de EE
+.br
+.fi
+.ft 1
+..
+.de TF
+.IP "" "\w'\fB\\$1\ \ \fP'u"
+.PD 0
+..
+.TH AWK 1
+.CT 1 files prog_other
+.SH NAME
+awk \- pattern-directed scanning and processing language
+.SH SYNOPSIS
+.B awk
+[
+.BI \-F
+.I fs
+|
+.B \-\^\-csv
+]
+[
+.BI \-v
+.I var=value
+]
+[
+.I 'prog'
+|
+.BI \-f
+.I progfile
+]
+[
+.I file ...
+]
+.SH DESCRIPTION
+.I Awk
+scans each input
+.I file
+for lines that match any of a set of patterns specified literally in
+.I prog
+or in one or more files
+specified as
+.B \-f
+.IR progfile .
+With each pattern
+there can be an associated action that will be performed
+when a line of a
+.I file
+matches the pattern.
+Each line is matched against the
+pattern portion of every pattern-action statement;
+the associated action is performed for each matched pattern.
+The file name
+.B \-
+means the standard input.
+Any
+.I file
+of the form
+.I var=value
+is treated as an assignment, not a filename,
+and is executed at the time it would have been opened if it were a filename.
+The option
+.B \-v
+followed by
+.I var=value
+is an assignment to be done before
+.I prog
+is executed;
+any number of
+.B \-v
+options may be present.
+The
+.B \-F
+.I fs
+option defines the input field separator to be the regular expression
+.IR fs .
+The
+.B \-\^\-csv
+option causes
+.I awk
+to process records using (more or less) standard comma-separated values
+(CSV) format.
+.PP
+An input line is normally made up of fields separated by white space,
+or by the regular expression
+.BR FS .
+The fields are denoted
+.BR $1 ,
+.BR $2 ,
+\&..., while
+.B $0
+refers to the entire line.
+If
+.BR FS
+is null, the input line is split into one field per character.
+.PP
+A pattern-action statement has the form:
+.IP
+.IB pattern " { " action " }
+.PP
+A missing
+.BI { " action " }
+means print the line;
+a missing pattern always matches.
+Pattern-action statements are separated by newlines or semicolons.
+.PP
+An action is a sequence of statements.
+A statement can be one of the following:
+.PP
+.EX
+.ta \w'\f(CWdelete array[expression]\fR'u
+.RS
+.nf
+.ft CW
+if(\fI expression \fP)\fI statement \fP\fR[ \fPelse\fI statement \fP\fR]\fP
+while(\fI expression \fP)\fI statement\fP
+for(\fI expression \fP;\fI expression \fP;\fI expression \fP)\fI statement\fP
+for(\fI var \fPin\fI array \fP)\fI statement\fP
+do\fI statement \fPwhile(\fI expression \fP)
+break
+continue
+{\fR [\fP\fI statement ... \fP\fR] \fP}
+\fIexpression\fP #\fR commonly\fP\fI var = expression\fP
+print\fR [ \fP\fIexpression-list \fP\fR] \fP\fR[ \fP>\fI expression \fP\fR]\fP
+printf\fI format \fP\fR[ \fP,\fI expression-list \fP\fR] \fP\fR[ \fP>\fI expression \fP\fR]\fP
+return\fR [ \fP\fIexpression \fP\fR]\fP
+next #\fR skip remaining patterns on this input line\fP
+nextfile #\fR skip rest of this file, open next, start at top\fP
+delete\fI array\fP[\fI expression \fP] #\fR delete an array element\fP
+delete\fI array\fP #\fR delete all elements of array\fP
+exit\fR [ \fP\fIexpression \fP\fR]\fP #\fR exit immediately; status is \fP\fIexpression\fP
+.fi
+.RE
+.EE
+.DT
+.PP
+Statements are terminated by
+semicolons, newlines or right braces.
+An empty
+.I expression-list
+stands for
+.BR $0 .
+String constants are quoted \&\f(CW"\ "\fR,
+with the usual C escapes recognized within.
+Expressions take on string or numeric values as appropriate,
+and are built using the operators
+.B + \- * / % ^
+(exponentiation), and concatenation (indicated by white space).
+The operators
+.B
+! ++ \-\- += \-= *= /= %= ^= > >= < <= == != ?:
+are also available in expressions.
+Variables may be scalars, array elements
+(denoted
+.IB x [ i ] \fR)
+or fields.
+Variables are initialized to the null string.
+Array subscripts may be any string,
+not necessarily numeric;
+this allows for a form of associative memory.
+Multiple subscripts such as
+.B [i,j,k]
+are permitted; the constituents are concatenated,
+separated by the value of
+.BR SUBSEP .
+.PP
+The
+.B print
+statement prints its arguments on the standard output
+(or on a file if
+.BI > " file
+or
+.BI >> " file
+is present or on a pipe if
+.BI | " cmd
+is present), separated by the current output field separator,
+and terminated by the output record separator.
+.I file
+and
+.I cmd
+may be literal names or parenthesized expressions;
+identical string values in different statements denote
+the same open file.
+The
+.B printf
+statement formats its expression list according to the
+.I format
+(see
+.IR printf (3)).
+The built-in function
+.BI close( expr )
+closes the file or pipe
+.IR expr .
+The built-in function
+.BI fflush( expr )
+flushes any buffered output for the file or pipe
+.IR expr .
+.PP
+The mathematical functions
+.BR atan2 ,
+.BR cos ,
+.BR exp ,
+.BR log ,
+.BR sin ,
+and
+.B sqrt
+are built in.
+Other built-in functions:
+.TF "\fBlength(\fR[\fIv\^\fR]\fB)\fR"
+.TP
+\fBlength(\fR[\fIv\^\fR]\fB)\fR
+the length of its argument
+taken as a string,
+number of elements in an array for an array argument,
+or length of
+.B $0
+if no argument.
+.TP
+.B rand()
+random number on [0,1).
+.TP
+\fBsrand(\fR[\fIs\^\fR]\fB)\fR
+sets seed for
+.B rand
+and returns the previous seed.
+.TP
+.BI int( x\^ )
+truncates to an integer value.
+.TP
+\fBsubstr(\fIs\fB, \fIm\fR [\fB, \fIn\^\fR]\fB)\fR
+the
+.IR n -character
+substring of
+.I s
+that begins at position
+.I m
+counted from 1.
+If no
+.IR n ,
+use the rest of the string.
+.TP
+.BI index( s , " t" )
+the position in
+.I s
+where the string
+.I t
+occurs, or 0 if it does not.
+.TP
+.BI match( s , " r" )
+the position in
+.I s
+where the regular expression
+.I r
+occurs, or 0 if it does not.
+The variables
+.B RSTART
+and
+.B RLENGTH
+are set to the position and length of the matched string.
+.TP
+\fBsplit(\fIs\fB, \fIa \fR[\fB, \fIfs\^\fR]\fB)\fR
+splits the string
+.I s
+into array elements
+.IB a [1] \fR,
+.IB a [2] \fR,
+\&...,
+.IB a [ n ] \fR,
+and returns
+.IR n .
+The separation is done with the regular expression
+.I fs
+or with the field separator
+.B FS
+if
+.I fs
+is not given.
+An empty string as field separator splits the string
+into one array element per character.
+.TP
+\fBsub(\fIr\fB, \fIt \fR[, \fIs\^\fR]\fB)
+substitutes
+.I t
+for the first occurrence of the regular expression
+.I r
+in the string
+.IR s .
+If
+.I s
+is not given,
+.B $0
+is used.
+.TP
+\fBgsub(\fIr\fB, \fIt \fR[, \fIs\^\fR]\fB)
+same as
+.B sub
+except that all occurrences of the regular expression
+are replaced;
+.B sub
+and
+.B gsub
+return the number of replacements.
+.TP
+\fBgensub(\fIpat\fB, \fIrepl\fB, \fIhow\fR [\fB, \fItarget\fR]\fB)\fR
+replaces instances of
+.I pat
+in
+.I target
+with
+.IR repl .
+If
+.I how
+is \fB"g"\fR or \fB"G"\fR, do so globally. Otherwise,
+.I how
+is a number indicating which occurrence to replace. If no
+.IR target ,
+use
+.BR $0 .
+Return the resulting string;
+.I target
+is not modified.
+.TP
+.BI sprintf( fmt , " expr" , " ...\fB)
+the string resulting from formatting
+.I expr ...
+according to the
+.IR printf (3)
+format
+.IR fmt .
+.TP
+.B systime()
+returns the current date and time as a standard
+``seconds since the epoch'' value.
+.TP
+.BI strftime( fmt ", " timestamp\^ )
+formats
+.I timestamp
+(a value in seconds since the epoch)
+according to
+.IR fmt ,
+which is a format string as supported by
+.IR strftime (3).
+Both
+.I timestamp
+and
+.I fmt
+may be omitted; if no
+.IR timestamp ,
+the current time of day is used, and if no
+.IR fmt ,
+a default format of \fB"%a %b %e %H:%M:%S %Z %Y"\fR is used.
+.TP
+.BI system( cmd )
+executes
+.I cmd
+and returns its exit status. This will be \-1 upon error,
+.IR cmd 's
+exit status upon a normal exit,
+256 +
+.I sig
+upon death-by-signal, where
+.I sig
+is the number of the murdering signal,
+or 512 +
+.I sig
+if there was a core dump.
+.TP
+.BI tolower( str )
+returns a copy of
+.I str
+with all upper-case characters translated to their
+corresponding lower-case equivalents.
+.TP
+.BI toupper( str )
+returns a copy of
+.I str
+with all lower-case characters translated to their
+corresponding upper-case equivalents.
+.PD
+.PP
+The ``function''
+.B getline
+sets
+.B $0
+to the next input record from the current input file;
+.B getline
+.BI < " file
+sets
+.B $0
+to the next record from
+.IR file .
+.B getline
+.I x
+sets variable
+.I x
+instead.
+Finally,
+.IB cmd " | getline
+pipes the output of
+.I cmd
+into
+.BR getline ;
+each call of
+.B getline
+returns the next line of output from
+.IR cmd .
+In all cases,
+.B getline
+returns 1 for a successful input,
+0 for end of file, and \-1 for an error.
+.PP
+The functions
+.BR compl ,
+.BR and ,
+.BR or ,
+.BR xor ,
+.BR lshift ,
+and
+.B rshift
+peform the corresponding bitwise operations on their
+operands, which are first truncated to integer.
+.PP
+Patterns are arbitrary Boolean combinations
+(with
+.BR "! || &&" )
+of regular expressions and
+relational expressions.
+Regular expressions are as in
+.IR egrep ;
+see
+.IR grep (1).
+Isolated regular expressions
+in a pattern apply to the entire line.
+Regular expressions may also occur in
+relational expressions, using the operators
+.B ~
+and
+.BR !~ .
+.BI / re /
+is a constant regular expression;
+any string (constant or variable) may be used
+as a regular expression, except in the position of an isolated regular expression
+in a pattern.
+.PP
+A pattern may consist of two patterns separated by a comma;
+in this case, the action is performed for all lines
+from an occurrence of the first pattern
+through an occurrence of the second, inclusive.
+.PP
+A relational expression is one of the following:
+.IP
+.I expression matchop regular-expression
+.br
+.I expression relop expression
+.br
+.IB expression " in " array-name
+.br
+.BI ( expr ,\| expr ,\| ... ") in " array-name
+.PP
+where a
+.I relop
+is any of the six relational operators in C,
+and a
+.I matchop
+is either
+.B ~
+(matches)
+or
+.B !~
+(does not match).
+A conditional is an arithmetic expression,
+a relational expression,
+or a Boolean combination
+of these.
+.PP
+The special patterns
+.B BEGIN
+and
+.B END
+may be used to capture control before the first input line is read
+and after the last.
+.B BEGIN
+and
+.B END
+do not combine with other patterns.
+They may appear multiple times in a program and execute
+in the order they are read by
+.IR awk .
+.PP
+Variable names with special meanings:
+.TF FILENAME
+.TP
+.B ARGC
+argument count, assignable.
+.TP
+.B ARGV
+argument array, assignable;
+non-null members are taken as filenames.
+.TP
+.B CONVFMT
+conversion format used when converting numbers
+(default
+.BR "%.6g" ).
+.TP
+.B ENVIRON
+array of environment variables; subscripts are names.
+.TP
+.B FILENAME
+the name of the current input file.
+.TP
+.B FNR
+ordinal number of the current record in the current file.
+.TP
+.B FS
+regular expression used to separate fields; also settable
+by option
+.BI \-F fs\fR.
+.TP
+.BR NF
+number of fields in the current record.
+.TP
+.B NR
+ordinal number of the current record.
+.TP
+.B OFMT
+output format for numbers (default
+.BR "%.6g" ).
+.TP
+.B OFS
+output field separator (default space).
+.TP
+.B ORS
+output record separator (default newline).
+.TP
+.B RLENGTH
+the length of a string matched by
+.BR match .
+.TP
+.B RS
+input record separator (default newline).
+If empty, blank lines separate records.
+If more than one character long,
+.B RS
+is treated as a regular expression, and records are
+separated by text matching the expression.
+.TP
+.B RSTART
+the start position of a string matched by
+.BR match .
+.TP
+.B SUBSEP
+separates multiple subscripts (default 034).
+.PD
+.PP
+Functions may be defined (at the position of a pattern-action statement) thus:
+.IP
+.B
+function foo(a, b, c) { ... }
+.PP
+Parameters are passed by value if scalar and by reference if array name;
+functions may be called recursively.
+Parameters are local to the function; all other variables are global.
+Thus local variables may be created by providing excess parameters in
+the function definition.
+.SH ENVIRONMENT VARIABLES
+If
+.B POSIXLY_CORRECT
+is set in the environment, then
+.I awk
+follows the POSIX rules for
+.B sub
+and
+.B gsub
+with respect to consecutive backslashes and ampersands.
+.SH EXAMPLES
+.TP
+.EX
+length($0) > 72
+.EE
+Print lines longer than 72 characters.
+.TP
+.EX
+{ print $2, $1 }
+.EE
+Print first two fields in opposite order.
+.PP
+.EX
+BEGIN { FS = ",[ \et]*|[ \et]+" }
+ { print $2, $1 }
+.EE
+.ns
+.IP
+Same, with input fields separated by comma and/or spaces and tabs.
+.PP
+.EX
+.nf
+ { s += $1 }
+END { print "sum is", s, " average is", s/NR }
+.fi
+.EE
+.ns
+.IP
+Add up first column, print sum and average.
+.TP
+.EX
+/start/, /stop/
+.EE
+Print all lines between start/stop pairs.
+.PP
+.EX
+.nf
+BEGIN { # Simulate echo(1)
+ for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
+ printf "\en"
+ exit }
+.fi
+.EE
+.SH SEE ALSO
+.IR grep (1),
+.IR lex (1),
+.IR sed (1)
+.br
+A. V. Aho, B. W. Kernighan, P. J. Weinberger,
+.IR "The AWK Programming Language, Second Edition" ,
+Addison-Wesley, 2024. ISBN 978-0-13-826972-2, 0-13-826972-6.
+.SH BUGS
+There are no explicit conversions between numbers and strings.
+To force an expression to be treated as a number add 0 to it;
+to force it to be treated as a string concatenate
+\&\f(CW""\fP to it.
+.PP
+The scope rules for variables in functions are a botch;
+the syntax is worse.
+.PP
+Input is expected to be UTF-8 encoded. Other multibyte
+character sets are not handled.
+However, in eight-bit locales,
+.I awk
+treats each input byte as a separate character.
+.SH UNUSUAL FLOATING-POINT VALUES
+.I Awk
+was designed before IEEE 754 arithmetic defined Not-A-Number (NaN)
+and Infinity values, which are supported by all modern floating-point
+hardware.
+.PP
+Because
+.I awk
+uses
+.IR strtod (3)
+and
+.IR atof (3)
+to convert string values to double-precision floating-point values,
+modern C libraries also convert strings starting with
+.B inf
+and
+.B nan
+into infinity and NaN values respectively. This led to strange results,
+with something like this:
+.PP
+.EX
+.nf
+echo nancy | awk '{ print $1 + 0 }'
+.fi
+.EE
+.PP
+printing
+.B nan
+instead of zero.
+.PP
+.I Awk
+now follows GNU AWK, and prefilters string values before attempting
+to convert them to numbers, as follows:
+.TP
+.I "Hexadecimal values"
+Hexadecimal values (allowed since C99) convert to zero, as they did
+prior to C99.
+.TP
+.I "NaN values"
+The two strings
+.B +nan
+and
+.B \-nan
+(case independent) convert to NaN. No others do.
+(NaNs can have signs.)
+.TP
+.I "Infinity values"
+The two strings
+.B +inf
+and
+.B \-inf
+(case independent) convert to positive and negative infinity, respectively.
+No others do.