.\" cppawk: C preprocessor wrapper around awk
.\" Copyright 2022 Kaz Kylheku <kaz@kylheku.com>
.\"
.\" BSD-2 License
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions are met:
.\"
.\" 1. Redistributions of source code must retain the above copyright notice,
.\"    this list of conditions and the following disclaimer.
.\"
.\" 2. Redistributions in binary form must reproduce the above copyright notice,
.\"    this list of conditions and the following disclaimer in the documentation
.\"    and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
.\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
.\" LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.TH CPPAWK-NARG 1 "19 April 2022" "cppawk Libraries" "Variadic Macros"

.SH NAME
.I narg
\- macros for writing variable argument macros

.SH SYNOPSIS
.ft B
  #include <narg.h>

  #define narg(...)
  #define splice(\fIargs\fP)
  #define varexpand(\fIfirst_mac\fP, \fIrest_mac\fP, ...)
  #define variexpand(\fIfirst_mac\fP, \fIrest_mac\fP, ...)
  #define variaexpand(\fIfirst_mac\fP, \fIrest_mac\fP, \fIarg\fP, ...)
  #define revarg(...)
.ft R

.SH DESCRIPTION
The
.I <narg.h>
header provides several macros which are useful to macro writers.
In particular, these macros make it easy to develop variable argument
macros which take one or more arguments and have complex expansions.

In this manual, the
.I ->
(arrow) notation means "expands to". For instance

.ft B
  foo(bar) \fI->\fP 42  // the macro call foo(bar) expands to 42
.ft R

A description of each macro follows:

.IP \fBnarg\fR
This macro takes one or more arguments, and expands to a decimal integer which
indicates how many arguments there are.

.ft B
  narg(\fIx\fP) \fI->\fP 1
  narg(\fIx\fP, \fIy\fP) \fI->\fP 2
  narg(\fIx\fP, \fIy\fP, \fIz\fP) \fI->\fP 3
.ft R

The
.B narg
macro can be called with up to 32 (thirty-two) arguments. If it is called
with between 33 and 48 arguments, it expands to an unspecified token
sequence which generates a syntax error in Awk. The token sequence begins
with an identifier and therefore may appear as the right operand of the
token-pasting
.B ##
operator, opposite to an identifier token.

If more than 48 arguments are given, the behavior is unspecified.
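
One common preprocessor technique for counting arguments can be illustrated
with a reduced sketch in plain C. This is not the implementation in
.IR <narg.h> ,
which may work differently; the names
.B demo_pick
and
.B demo_narg
are hypothetical, and the sketch handles only one to three arguments, with no
overflow detection. A C11 compiler is assumed for
.BR _Static_assert .

.nf
```c
/* Hypothetical sketch, not taken from <narg.h>: count 1 to 3 arguments. */

/* Selects the fourth argument, ignoring everything else. */
#define demo_pick(a, b, c, n, ...) n

/* The caller's arguments push the descending list 3, 2, 1 to the right,
   so that the argument count lands in the fourth position. The trailing
   comma ensures that the variadic part of demo_pick is never absent. */
#define demo_narg(...) demo_pick(__VA_ARGS__, 3, 2, 1, )

_Static_assert(demo_narg(x) == 1, "one argument");
_Static_assert(demo_narg(x, y) == 2, "two arguments");
_Static_assert(demo_narg(x, y, z) == 3, "three arguments");
```
.fi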

.IP \fBsplice\fR
The
.B splice
macro provides a shim for inserting a parenthesized argument into
a macro expansion, such that the argument turns into individual
arguments. Suppose we have a macro like this:

.ft B
   #define vmac(\fIa\fP, \fIb\fP, ...) ...
.ft R

Suppose that, for some reason, we need to write a fixed-argument macro like this:

.ft B
   #define fmac(\fIx\fP, \fIy\fP, \fIargs\fP) vmac(\fIx\fP, \fIy\fP, \fI???\fP)
.ft R

where the
.B args
argument is a parenthesized list of arguments that must become the
.I ...
argument of the
.B vmac
macro. That is to say,
.B fmac
is to be invoked like this, with the indicated expansion:

.ft B
   fmac(\fI1\fP, \fI2\fP, (\fI3\fP, \fI4\fP, \fI5\fP)) \fI->\fP vmac(\fI1\fP, \fI2\fP, \fI3\fP, \fI4\fP, \fI5\fP)
.ft R

The
.B splice
macro solves the question of what to write into the position
indicated by the ??? question marks to achieve this:

.ft B
   #define fmac(\fIx\fP, \fIy\fP, \fIargs\fP) vmac(\fIx\fP, \fIy\fP, splice(\fIargs\fP))
.ft R

Example: produce the following macro:

.ft B
  csum((\fIa\fP, \fIb\fP, \fIc\fP), (\fIx\fP, \fIy\fP))   \fI->\fP (sqrt(sumsq(\fIa\fP, \fIb\fP, \fIc\fP)) +
                                sqrt(sumsq(\fIx\fP, \fIy\fP)))
.ft R

This is a trick example:
.B splice
is not required at all:

.ft B
  #define csum(\fIleft\fP, \fIright\fP) (sqrt(sumsq \fIleft\fP) + \e
                             sqrt(sumsq \fIright\fP))
.ft R

The
.B splice
macro is not required because the parenthesized arguments constitute
the entire argument list of
.BR sumsq .
However, suppose the requirement is this, requiring the parenthesized
arguments to be inserted into an argument list containing other arguments:

.ft B
  csum(\fIt\fP, (\fIa\fP, \fIb\fP, \fIc\fP), (\fIx\fP, \fIy\fP))  \fI->\fP (sqrt(sumsq(\fIt\fP, \fIa\fP, \fIb\fP, \fIc\fP)) +
                                  sqrt(sumsq(\fIt\fP, \fIx\fP, \fIy\fP)))
.ft R

Now we need:

.ft B
  #define csum(\fIparm\fP, \fIleft\fP, \fIright\fP) (sqrt(sumsq(\fIparm\fP, \e
                                              splice(\fIleft\fP))) + \e
                                   sqrt(sumsq(\fIparm\fP, \e
                                              splice(\fIright\fP))))
.ft R
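
The effect of
.B splice
can be reproduced in a reduced sketch in plain C. This is not the definition
from
.IR <narg.h> ;
all of the
.B demo_
names are hypothetical. Macro arguments are delimited before they are
expanded, so in this sketch the spliced tokens only turn into separate
arguments because
.B demo_sum3
is variadic and forwards its
.B __VA_ARGS__
to another macro call. A C11 compiler is assumed for
.BR _Static_assert .

.nf
```c
/* Hypothetical sketch, not taken from <narg.h>. */

/* Applying demo_unwrap to a parenthesized list strips the parentheses. */
#define demo_unwrap(...) __VA_ARGS__
#define demo_splice(args) demo_unwrap args

/* Stand-in for a variadic macro that forwards its arguments. */
#define demo_sum3_impl(a, b, c) ((a) + (b) + (c))
#define demo_sum3(...) demo_sum3_impl(__VA_ARGS__)

/* Fixed-argument wrapper: the list (2, 3) becomes two separate arguments. */
#define demo_fixed(x, args) demo_sum3(x, demo_splice(args))

_Static_assert(demo_fixed(1, (2, 3)) == 6, "expands to ((1) + (2) + (3))");
```
.fi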

.IP \fBrevarg\fR
This macro expands to a comma-separated list of its arguments, which
appear in reverse.

.ft B
   revarg(\fI1\fP) \fI->\fP \fI1\fP
   revarg(\fI1\fP, \fI2\fP) \fI->\fP \fI2\fP, \fI1\fP
   revarg(\fI1\fP, \fI2\fP, \fI3\fP) \fI->\fP \fI3\fP, \fI2\fP, \fI1\fP
.ft R

Like
.BR narg ,
the
.B revarg
macro can be called with up to 32 arguments, beyond which there is
some overflow detection up to 48 arguments, followed by unspecified behavior
for 49 or more arguments.
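
Argument reversal can be sketched in plain C by counting the arguments and
dispatching to a fixed-arity helper. This is only an illustration under that
assumption, not the implementation in
.IR <narg.h> ;
all of the
.B demo_
names are hypothetical, and only one to three arguments are handled.
A C11 compiler is assumed for
.BR _Static_assert .

.nf
```c
/* Hypothetical sketch, not taken from <narg.h>: reverse 1 to 3 arguments. */

/* One fixed-arity reverser per supported argument count. */
#define demo_rev1(a) a
#define demo_rev2(a, b) b, a
#define demo_rev3(a, b, c) c, b, a

/* Count the arguments, then paste the count onto the name demo_rev. */
#define demo_pick(a, b, c, n, ...) n
#define demo_cat(a, b) a ## b
#define demo_xcat(a, b) demo_cat(a, b)
#define demo_revarg(...) demo_xcat(demo_rev, demo_pick(__VA_ARGS__, 3, 2, 1, ))(__VA_ARGS__)

/* Helper for checking: subtracts its second argument from its first. */
#define demo_sub_impl(a, b) ((a) - (b))
#define demo_sub(...) demo_sub_impl(__VA_ARGS__)

_Static_assert(demo_sub(demo_revarg(1, 5)) == 4, "demo_revarg(1, 5) gives 5, 1");
```
.fi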

.IP \fBvarexpand\fR
The most complex macros in the
.I <narg.h>
header are
.B varexpand
and
.BR variexpand .

These macros are used for writing variadic macros with complex expansions,
using a compact specification.

The
.B varexpand
macro uses "higher order macro" programming: it has arguments which are
themselves macros.
The
.B variexpand
macro is a variation on this, explained after a complete description of
.B varexpand
is given.

To understand
.B varexpand
it helps to understand the Lisp
.B reduce
function, or the similar
.B fold
function found in functional languages. Recall that the prototype of the
.B varexpand
macro is this:

.ft B
  #define varexpand(first_mac, rest_mac, ...)
.ft R

To use
.B varexpand
you must first write two macros: a one-argument macro whose name is passed
as the
.B first_mac
argument, and a two-argument macro whose name is passed as the
.B rest_mac
argument.

Most variadic macros written with
.B varexpand
will pass through their
.B __VA_ARGS__
list as the
.I ...
parameter; however, the
.B splice
macro can also be used to place parenthesized argument lists
into that position.

Up to 32 variadic arguments are accepted by
.BR varexpand ,
beyond which there is overflow detection up to 48 arguments,
followed by unspecified behavior for 49 or more arguments.

Example: suppose we want to write a macro with an expansion like this:

.ft B
  add(\fI1\fP) \fI->\fP \fI1\fP
  add(\fI1\fP, \fI2\fP) \fI->\fP \fI1\fP + \fI2\fP
  add(\fI1\fP, \fI2\fP, \fI3\fP) \fI->\fP \fI1\fP + \fI2\fP + \fI3\fP
.ft R

First, we must write a macro for handling the base case of the induction, which
is used for the leftmost argument. The expansion is trivial:

.ft B
  #define add_first(\fIx\fP) \fIx\fP
.ft R

The second macro is more complex. It takes two arguments. The left argument is
the accumulated expansion so far, of all the arguments previous to that
argument. The right argument is the next argument to be added to the expansion.

.ft B
  #define add_next(\fIacc\fP, \fIx\fP) \fIacc\fP + \fIx\fP
.ft R

For instance, if the arguments 1, 2 have already been expanded to 1 + 2
and the next argument is 3, then
.B acc
takes on the tokens 1 + 2, and
.B x
takes on 3. Thus the expansion is:

.ft B
  add_next(\fI1\fP + \fI2\fP, \fI3\fP) \fI->\fP \fI1\fP + \fI2\fP + \fI3\fP
.ft R

With these two macros, we can then write
.B add
like this:

.ft B
  #define add(...) varexpand(\fIadd_first\fP, \fIadd_next\fP, __VA_ARGS__)
.ft R
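
The way the two helper macros are applied can be traced with a fold that is
unrolled by hand, written in plain C. This is a sketch of the fold idea only,
not the implementation of
.B varexpand
in
.IR <narg.h> ;
the
.B demo_fold
names are hypothetical, and only one to three arguments are handled.
A C11 compiler is assumed for
.BR _Static_assert .

.nf
```c
/* Hypothetical sketch, not taken from <narg.h>: a fold unrolled by hand
   for one to three arguments. */
#define demo_fold1(first, rest, a) first(a)
#define demo_fold2(first, rest, a, b) rest(demo_fold1(first, rest, a), b)
#define demo_fold3(first, rest, a, b, c) rest(demo_fold2(first, rest, a, b), c)

/* The two helper macros from the add example. */
#define add_first(x) x
#define add_next(acc, x) acc + x

/* demo_fold3(add_first, add_next, 1, 2, 3)
     -> add_next(add_next(add_first(1), 2), 3)
     -> 1 + 2 + 3 */
_Static_assert(demo_fold3(add_first, add_next, 1, 2, 3) == 6, "1 + 2 + 3");
```
.fi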

More complex example: suppose we want an inline sum-of-squares macro
which works like this:

.ft B
  sumsq(\fIx\fP)       \fI->\fP ((\fIx\fP)*(\fIx\fP))
  sumsq(\fIx\fP, \fIy\fP, \fIz\fP) \fI->\fP ((\fIx\fP)*(\fIx\fP) + (\fIy\fP)*(\fIy\fP) + (\fIz\fP)*(\fIz\fP))
.ft R

Note the detail that there are outer parentheses around the entire
expansion, but the individual products are not parenthesized, only
their operands. We write the helper macros like this:

.ft B
  #define sumsq_first(\fIx\fP)   (\fIx\fP)*(\fIx\fP)
  #define sumsq_next(\fIa\fP, \fIx\fP) \fIa\fP + sumsq_first(\fIx\fP)
.ft R

Note that
.B sumsq_next
reuses
.B sumsq_first
to avoid repeating the (x)*(x) term. Then we complete the implementation:

.ft B
  #define sumsq(...) (varexpand(\fIsumsq_first\fP, \e
                                \fIsumsq_next\fP,\e
                                __VA_ARGS__))
.ft R

The outer parentheses are written around the
.B varexpand
call. In general,
.B varexpand
can be just a small component of a larger macro expansion, and
can be used more than one time in a macro expansion.

Example: an
.B rlist
macro which generates a left-associative nested expression, like this:

.ft B
  rlist(\fI1\fP)        \fI->\fP cons(\fI1\fP, nil)
  rlist(\fI1\fP, \fI2\fP)     \fI->\fP cons(\fI2\fP, cons(\fI1\fP, nil))
  rlist(\fI1\fP, \fI2\fP, \fI3\fP)  \fI->\fP cons(\fI3\fP, cons(\fI2\fP, cons(\fI1\fP, nil)))
.ft R

Implementation:

.ft B
  #define rlist_first(\fIx\fP)    cons(\fIx\fP, nil)
  #define rlist_next(\fIa\fP, \fIx\fP)  cons(\fIx\fP, \fIa\fP)

  #define rlist(...)        varexpand(\fIrlist_first\fP, \fIrlist_next\fP, \e
                                      __VA_ARGS__)
.ft R

What if we want the consing to produce the list in order via right
association, rather than in reverse? That is to say:

.ft B
  list(\fI1\fP, \fI2\fP, \fI3\fP)   \fI->\fP cons(\fI1\fP, cons(\fI2\fP, cons(\fI3\fP, nil)))
.ft R

Here we simply take advantage of the
.B revarg
macro to reverse the arguments:

.ft B
  #define list(...)         rlist(revarg(__VA_ARGS__))
.ft R

.IP \fBvariexpand\fR
The
.B variexpand
macro is very similar to
.BR varexpand .
The difference is that
.B variexpand
passes an extra argument to both of the
.B first_mac
and
.BR rest_mac
macros. This argument is a decimal integer token indicating the one-based
position of the argument being expanded.

For instance, suppose we wish to have a macro with the following properties:

.ft B
  series(\fIa\fP) \fI->\fP \fIa1\fP
  series(\fIa\fP, \fIb\fP) \fI->\fP \fIa1\fP + \fIb2\fP
  series(\fIa\fP, \fIb\fP, \fIc\fP) \fI->\fP \fIa1\fP + \fIb2\fP + \fIc3\fP
.ft R

Note that the numbers do not appear as arguments. The
.B variexpand
macro will supply them:

.ft B
  #define series_first(\fIx\fP, \fIi\fP) \fIx\fP ## \fIi\fP
  #define series_next(\fIprev\fP, \fIx\fP, \fIi\fP) \fIprev\fP + \fIx\fP ## \fIi\fP
  #define series(...) variexpand(\fIseries_first\fP, \fIseries_next\fP, \e
                                 __VA_ARGS__)
.ft R

Here,
.B series_first
is always called with
.I i " ="
1, and
.B series_next
is called with
.I i
taking on the values 2, 3, ... .
The value of
.I i
indicates the one-based argument position of
.I x
in the
.B series
macro.
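
How the position argument reaches the helper macros can be traced with an
indexed fold unrolled by hand, written in plain C. This is a sketch only, not
the implementation of
.B variexpand
in
.IR <narg.h> ;
the
.B demo_ifold
names and the enumeration constants are hypothetical, and only one to three
arguments are handled. A C11 compiler is assumed for
.BR _Static_assert .

.nf
```c
/* Hypothetical sketch, not taken from <narg.h>: an indexed fold unrolled
   by hand for one to three arguments. */
#define demo_ifold1(first, rest, a) first(a, 1)
#define demo_ifold2(first, rest, a, b) rest(demo_ifold1(first, rest, a), b, 2)
#define demo_ifold3(first, rest, a, b, c) rest(demo_ifold2(first, rest, a, b), c, 3)

/* The two helper macros from the series example. */
#define series_first(x, i) x ## i
#define series_next(prev, x, i) prev + x ## i

/* Constants named after the pasted identifiers, used only for checking. */
enum { a1 = 1, b2 = 20, c3 = 300 };

/* demo_ifold3(series_first, series_next, a, b, c) -> a1 + b2 + c3 */
_Static_assert(demo_ifold3(series_first, series_next, a, b, c) == 321, "a1 + b2 + c3");
```
.fi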

One use for this is the generation of better temporary variables.
The C preprocessor doesn't have a facility for generating temporary
variable names. An unsatisfactory substitute is the use of some private
namespace prefix like
.B __x
pasted together with the expansion of the
.B __LINE__
macro. However, several macros can occur on the same line of code, or
as arguments of a larger multi-line macro during the expansion of which
.B __LINE__
is pinned to the same value.  If a large, multi-clause macro is based on
.BR variexpand ,
it can pass the argument number to its child clauses, which can combine
it with
.B __LINE__
and a prefix to generate unique variables.

.IP \fBvariaexpand\fR
One more macro in the
.B varexpand
family is
.BR variaexpand .

Like
.BR variexpand ,
.B variaexpand
also passes the argument number to its child clauses.
In addition to the argument number, it passes one more argument:
a fixed argument specified in the
.B variaexpand
invocation.

For instance, suppose we wish to have a macro with the following properties:

.ft B
  series(\fIm\fP, \fIa\fP) \fI->\fP \fIm\fP(\fIa1\fP)
  series(\fIm\fP, \fIa\fP, \fIb\fP) \fI->\fP \fIm\fP(\fIa1\fP) + \fIm\fP(\fIb2\fP)
  series(\fIm\fP, \fIa\fP, \fIb\fP, \fIc\fP) \fI->\fP \fIm\fP(\fIa1\fP) + \fIm\fP(\fIb2\fP) + \fIm\fP(\fIc3\fP)
.ft R

In our expansion, we want the argument numbers to be put into correspondence
with the arguments, and the fixed argument
.I m
to be distributed into the terms:

.ft B
  #define series_first(\fIx\fP, \fIi\fP, \fIa\fP) \fIa\fP(\fIx\fP ## \fIi\fP)
  #define series_next(\fIprev\fP, \fIx\fP, \fIi\fP, \fIa\fP) \fIprev\fP + \fIa\fP(\fIx\fP ## \fIi\fP)
  #define series(\fIa\fP, ...) variaexpand(\fIseries_first\fP, \e
                                     \fIseries_next\fP, \e
                                     \fIa\fP, __VA_ARGS__)
.ft R

.SH BUGS
As noted in the DESCRIPTION, the
.BR narg ,
.B revarg
and
.B varexpand
macros are limited to handling 32 variadic arguments, beyond
which there is a 16 argument safety margin with error detection,
followed by unspecified behavior.

The C preprocessor doesn't support macro recursion, which forbids
some complex uses of
.B varexpand
whereby the
.B first_mac
and
.B rest_mac
macros themselves make use of
.BR varexpand .
A possible workaround is to clone the implementation of
.B varexpand
to produce an identical macro called
.BR varexpand2 .
This then allows for two "recursion" levels, whereby each one uses
the macro under a different name.

Both
.B "narg()"
and
.B "narg(x)"
expand to 1. This is a "feature" of the preprocessor: the empty
argument list is indistinguishable from an empty argument, because
preprocessor arguments are not required to be non-empty sequences
of tokens. For instance if
.B mac
is a macro which may be called with two arguments, then
.B "mac(,)"
is a valid call, which passes two empty arguments. Consequently,
if the comma is deleted from the syntax, then there is one empty argument.
The number of arguments is the number of commas plus one. This is why
.B narg
is specified as taking one or more arguments: it is not possible for
any macro to be given fewer arguments than one.
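
The rule that the number of arguments is the number of commas plus one can be
checked with a small counting macro in plain C. This is a sketch, not the
implementation in
.IR <narg.h> ;
the names
.B demo_pick
and
.B demo_count
are hypothetical, and at most three arguments are handled. A C11 compiler is
assumed for
.BR _Static_assert .

.nf
```c
/* Hypothetical sketch, not taken from <narg.h>: count 1 to 3 arguments. */
#define demo_pick(a, b, c, n, ...) n
#define demo_count(...) demo_pick(__VA_ARGS__, 3, 2, 1, )

/* Empty arguments are still arguments: commas plus one. */
_Static_assert(demo_count() == 1, "one empty argument");
_Static_assert(demo_count(,) == 2, "two empty arguments");
_Static_assert(demo_count(,,) == 3, "three empty arguments");
```
.fi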

.SH AUTHOR
Kaz Kylheku <kaz@kylheku.com>

.SH COPYRIGHT
Copyright 2022, BSD2 License.