Merge branch 'master' into feature/cmake

author: Arnold D. Robbins <arnold@skeeve.com> 2015-06-19 12:42:37 +0300
committer: Arnold D. Robbins <arnold@skeeve.com> 2015-06-19 12:42:37 +0300
commit: ec58524cb5a671c18c4af1b893e599eb04c7760a (patch)
tree: 1d1c3d298ec82caa03c0cf5caeb0dd14b08ce247 /doc/gawk.texi
parent: 76e1f5bfee032dbcb5c19b3e4e92f96aa05731c3 (diff)
parent: f7cd8a03c09a00c4cb520f881bbe838cf76e718f (diff)
1 files changed, 151 insertions, 5 deletions
diff --git a/doc/gawk.texi b/doc/gawk.texi
index d61a47de..7552f164 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -562,6 +562,7 @@ particular records in a file and perform operations upon them.
 * Computed Regexps::                    Using Dynamic Regexps.
 * GNU Regexp Operators::                Operators specific to GNU software.
 * Case-sensitivity::                    How to do case-insensitive matching.
+* Strong Regexp Constants::             Strongly typed regexp constants.
 * Regexp Summary::                      Regular expressions summary.
 * Records::                             Controlling how data is split into
                                         records.
@@ -5013,6 +5014,7 @@ regular expressions work, we present more complicated instances.
 * Computed Regexps::            Using Dynamic Regexps.
 * GNU Regexp Operators::        Operators specific to GNU software.
 * Case-sensitivity::            How to do case-insensitive matching.
+* Strong Regexp Constants::     Strongly typed regexp constants.
 * Regexp Summary::              Regular expressions summary.
 @end menu
 
@@ -6260,6 +6262,89 @@ The value of @code{IGNORECASE} has no effect if @command{gawk} is in
 compatibility mode (@pxref{Options}).
 Case is always significant in compatibility mode.
 
+@node Strong Regexp Constants
+@section Strongly Typed Regexp Constants
+
+This @value{SECTION} describes a @command{gawk}-specific feature.
+
+Regexp constants (@code{/@dots{}/}) hold a strange position in the
+@command{awk} language. In most contexts, they act like an expression:
+@samp{$0 ~ /@dots{}/}. In other contexts, they denote only a regexp to
+be matched. In no case are they really a ``first class citizen'' of the
+language. That is, you cannot define a scalar variable whose type is
+``regexp'' in the same sense that you can define a variable to be a
+number or a string:
+
+@example
+num = 42        @ii{Numeric variable}
+str = "hi"      @ii{String variable}
+re = /foo/      @ii{Wrong!} re @ii{is the result of} $0 ~ /foo/
+@end example
+
+For a number of more advanced use cases (described later on in this
+@value{DOCUMENT}), it would be nice to have regexp constants that
+are @dfn{strongly typed}; in other words, that denote a regexp useful
+for matching, and not an expression.
+
+@command{gawk} provides this feature.  A strongly typed regexp constant
+looks almost like a regular regexp constant, except that it is preceded
+by an @samp{@@} sign:
+
+@example
+re = @@/foo/     @ii{Regexp variable}
+@end example
+
+Strongly typed regexp constants @emph{cannot} be used everywhere that a
+regular regexp constant can, because this would make the language even more
+confusing.  Instead, you may use them only in certain contexts:
+
+@itemize @bullet
+@item
+On the righthand side of the @samp{~} and @samp{!~} operators: @samp{some_var ~ @@/foo/}
+(@pxref{Regexp Usage}).
+
+@item
+In the @code{case} part of a @code{switch} statement
+(@pxref{Switch Statement}).
+
+@item
+As an argument to one of the built-in functions that accept regexp constants:
+@code{gensub()},
+@code{gsub()},
+@code{match()},
+@code{patsplit()},
+@code{split()},
+and
+@code{sub()}
+(@pxref{String Functions}).
+
+@item
+As a parameter in a call to a user-defined function
+(@pxref{User-defined}).
+
+@item
+On the righthand side of an assignment to a variable: @samp{some_var = @@/foo/}.
+In this case, the type of @code{some_var} is regexp. Additionally, @code{some_var}
+can be used with @samp{~} and @samp{!~}, passed to one of the built-in functions
+listed above, or passed as a parameter to a user-defined function.
+@end itemize
+
+You may use the @code{typeof()} built-in function
+(@pxref{Type Functions})
+to determine if a variable or function parameter is
+a regexp variable.
+
+The true power of this feature comes from the ability to create variables that
+have regexp type. Such variables can be passed on to user-defined functions,
+without the confusing aspects of computed regular expressions created from
+strings or string constants. They may also be passed through indirect function
+calls (@pxref{Indirect Calls})
+onto the built-in functions that accept regexp constants.
+
+When used in numeric conversions, strongly typed regexp variables convert
+to zero. When used in string conversions, they convert to the string
+value of the original regexp text.
+
 @node Regexp Summary
 @section Summary
 
@@ -6303,6 +6388,11 @@ treated as regular expressions).
 case sensitivity of regexp matching.  In other @command{awk}
 versions, use @code{tolower()} or @code{toupper()}.
 
+@item
+Strongly typed regexp constants (@code{@@/.../}) enable
+certain advanced use cases to be described later on in the
+@value{DOCUMENT}.
+
 @end itemize
 
 
@@ -19387,16 +19477,41 @@ results of the @code{compl()}, @code{lshift()}, and @code{rshift()} functions.
 @node Type Functions
 @subsection Getting Type Information
 
-@command{gawk} provides a single function that lets you distinguish
-an array from a scalar variable.  This is necessary for writing code
+@command{gawk} provides two functions that let you determine
+the type of a variable.
+This is necessary for writing code
 that traverses every element of an array of arrays
-(@pxref{Arrays of Arrays}).
+(@pxref{Arrays of Arrays}), and in other contexts.
 
 @table @code
 @cindexgawkfunc{isarray}
 @cindex scalar or array
 @item isarray(@var{x})
 Return a true value if @var{x} is an array. Otherwise, return false.
+
+@cindexgawkfunc{typeof}
+@cindex variable type
+@cindex type, of variable
+@item typeof(@var{x})
+Return one of the following strings, depending upon the type of @var{x}:
+
+@c nested table
+@table @code
+@item "array"
+@var{x} is an array.
+
+@item "regexp"
+@var{x} is a strongly typed regexp (@pxref{Strong Regexp Constants}).
+
+@item "scalar_n"
+@var{x} is a number.
+
+@item "scalar_s"
+@var{x} is a string.
+
+@item "untyped"
+@var{x} has not yet been given a type.
+@end table
 @end table
 
 @code{isarray()} is meant for use in two circumstances. The first is when
@@ -19414,6 +19529,14 @@ that has not been previously used to @code{isarray()}, @command{gawk}
 ends up turning it into a scalar.
 @end quotation
 
+The @code{typeof()} function is general; it allows you to determine
+if a variable or function parameter is a scalar, an array, or a strongly
+typed regexp.
+
+@code{isarray()} is deprecated; you should use @code{typeof()} instead.
+You should replace any existing uses of @samp{isarray(var)} in your
+code with @samp{typeof(var) == "array"}.
+
 @node I18N Functions
 @subsection String-Translation Functions
 @cindex @command{gawk}, string-translation functions
@@ -34975,17 +35098,31 @@ properly:
 # Please set INPLACE_SUFFIX to make a backup copy.  For example, you may
 # want to set INPLACE_SUFFIX to .bak on the command line or in a BEGIN rule.
 
+# By default, each filename on the command line will be edited inplace.
+# But you can selectively disable this by adding an inplace=0 argument
+# prior to files that you do not want to process this way.  You can then
+# reenable it later on the command line by putting inplace=1 before files
+# that you wish to be subject to inplace editing.
+
 # N.B. We call inplace_end() in the BEGINFILE and END rules so that any
 # actions in an ENDFILE rule will be redirected as expected.
 
+BEGIN @{
+    inplace = 1		# enabled by default
+@}
+
 BEGINFILE @{
     if (_inplace_filename != "")
         inplace_end(_inplace_filename, INPLACE_SUFFIX)
-    inplace_begin(_inplace_filename = FILENAME, INPLACE_SUFFIX)
+    if (inplace)
+        inplace_begin(_inplace_filename = FILENAME, INPLACE_SUFFIX)
+    else
+        _inplace_filename = ""
 @}
 
 END @{
-    inplace_end(FILENAME, INPLACE_SUFFIX)
+    if (_inplace_filename != "")
+        inplace_end(_inplace_filename, INPLACE_SUFFIX)
 @}
 @end group
 @c endfile
@@ -34999,6 +35136,11 @@ If @code{INPLACE_SUFFIX} is not an empty string, the original file is
 linked to a backup @value{FN} created by appending that suffix.  Finally,
 the temporary file is renamed to the original @value{FN}.
 
+Note that the use of this feature can be controlled by placing @samp{inplace=0}
+on the command line prior to listing files that should not be processed this
+way.  You can reenable inplace editing by adding an @samp{inplace=1} argument
+prior to files that should be subject to inplace editing.
+
 The @code{_inplace_filename} variable serves to keep track of the
 current filename so as to not invoke @code{inplace_end()} before
 processing the first file.
@@ -35019,6 +35161,10 @@ $ @kbd{gawk -i inplace -v INPLACE_SUFFIX=.bak '@{ gsub(/foo/, "bar") @}}
 > @kbd{@{ print @}' file1 file2 file3}
 @end example
 
+Please note that, while the extension does attempt to preserve ownership and permissions, it makes no attempt to copy the ACLs from the original file.
+
+If the program dies prematurely, as might happen if an unhandled signal is received, a temporary file may be left behind.
+
 @node Extension Sample Ord
 @subsection Character and Numeric values: @code{ord()} and @code{chr()}