diff options
author | Jeff Johnston <jjohnstn@redhat.com> | 2008-10-31 21:03:42 +0000 |
---|---|---|
committer | Jeff Johnston <jjohnstn@redhat.com> | 2008-10-31 21:03:42 +0000 |
commit | d456d606e35a460db59c6609f8c994e658fe3d7c (patch) | |
tree | ad333ffd13bbc4aab45e4f7ff147836da64cf052 /newlib/libc/sys/linux/stdlib/regex.3 | |
parent | 029d147e9488a30456226dd76be4ec25aff350d6 (diff) | |
download | cygnal-d456d606e35a460db59c6609f8c994e658fe3d7c.tar.gz cygnal-d456d606e35a460db59c6609f8c994e658fe3d7c.tar.bz2 cygnal-d456d606e35a460db59c6609f8c994e658fe3d7c.zip |
2008-10-31 Jeff Johnston <jjohnstn@redhat.com>
* libc/include/limits.h: Add ARG_MAX, PATH_MAX, and _POSIX2_RE_DUP_MAX.
* libc/include/envlock.h: New file.
* libc/include/fnmatch.h: Ditto.
* libc/include/glob.h: Ditto.
* libc/include/regex.h: Ditto.
* libc/include/wordexp.h: Ditto.
* libc/posix/Makefile.am: Add new files moved from
libc/sys/linux/stdlib.
* libc/posix/Makefile.in: Regenerated.
* libc/posix/COPYRIGHT: New file moved from libc/sys/linux/stdlib.
* libc/posix/cclass.h: Ditto.
* libc/posix/cname.h: Ditto.
* libc/posix/collate.c: Ditto.
* libc/posix/collate.h: Ditto.
* libc/posix/collcmp.c: Ditto.
* libc/posix/engine.c: Ditto.
* libc/posix/fnmatch.3: Ditto.
* libc/posix/glob.3: Ditto.
* libc/posix/fnmatch.c: Ditto.
* libc/posix/glob.c: Ditto.
* libc/posix/namespace.h: Ditto.
* libc/posix/reallocf.c: Ditto.
* libc/posix/regcomp.c: Ditto.
* libc/posix/regerror.c: Ditto.
* libc/posix/regex.3: Ditto.
* libc/posix/regex2.h: Ditto.
* libc/posix/regexec.c: Ditto.
* libc/posix/regfree.c: Ditto.
* libc/posix/rune.h: Ditto.
* libc/posix/runetype.h: Ditto.
* libc/posix/scandir.c: Remove advertising clause which is not in
effect.
* libc/posix/sysexits.h: Ditto.
* libc/posix/un-namespace.h: Ditto.
* libc/posix/utils.h: Ditto.
* libc/posix/wordexp.c: Ditto.
* libc/posix/wordfree.c: Ditto.
* libc/posix/execl.c: Add !_NO_EXECVE flag check.
* libc/posix/execle.c: Ditto.
* libc/posix/execlp.c: Ditto.
* libc/posix/execv.c: Ditto.
* libc/posix/execve.c: Ditto.
* libc/posix/execvp.c: Ditto.
* libc/posix/popen.c: Add !_NO_POPEN flag check.
* libc/sys/linux/configure: Regenerated.
* libc/sys/linux/configure.in: Remove stdlib.
* libc/sys/linux/include/limits.h: Add include of linux/limits.h.
* libc/sys/linux/stdlib/Makefile.am: Removed.
* libc/sys/linux/stdlib/Makefile.in: Ditto.
* libc/sys/linux/stdlib/COPYRIGHT: Moved to libc/posix.
* libc/sys/linux/stdlib/cclass.h: Ditto.
* libc/sys/linux/stdlib/cname.h: Ditto.
* libc/sys/linux/stdlib/collate.c: Ditto.
* libc/sys/linux/stdlib/collate.h: Ditto.
* libc/sys/linux/stdlib/collcmp.c: Ditto.
* libc/sys/linux/stdlib/engine.c: Ditto.
* libc/sys/linux/stdlib/fnmatch.3: Ditto.
* libc/sys/linux/stdlib/fnmatch.c: Ditto.
* libc/sys/linux/stdlib/glob.3: Ditto.
* libc/sys/linux/stdlib/glob.c: Ditto.
* libc/sys/linux/stdlib/reallocf.c: Ditto.
* libc/sys/linux/stdlib/regcomp.c: Ditto.
* libc/sys/linux/stdlib/regerror.c: Ditto.
* libc/sys/linux/stdlib/regex.3: Ditto.
* libc/sys/linux/stdlib/regex2.h: Ditto.
* libc/sys/linux/stdlib/regexec.c: Ditto.
* libc/sys/linux/stdlib/regfree.c: Ditto.
* libc/sys/linux/stdlib/utils.h: Ditto.
* libc/sys/linux/stdlib/wordexp.c: Ditto.
* libc/sys/linux/stdlib/wordfree.c: Ditto.
Diffstat (limited to 'newlib/libc/sys/linux/stdlib/regex.3')
-rw-r--r-- | newlib/libc/sys/linux/stdlib/regex.3 | 701 |
1 files changed, 0 insertions, 701 deletions
diff --git a/newlib/libc/sys/linux/stdlib/regex.3 b/newlib/libc/sys/linux/stdlib/regex.3 deleted file mode 100644 index d87164177..000000000 --- a/newlib/libc/sys/linux/stdlib/regex.3 +++ /dev/null @@ -1,701 +0,0 @@ -.\" Copyright (c) 1992, 1993, 1994 Henry Spencer. -.\" Copyright (c) 1992, 1993, 1994 -.\" The Regents of the University of California. All rights reserved. -.\" -.\" This code is derived from software contributed to Berkeley by -.\" Henry Spencer. -.\" -.\" Redistribution and use in source and binary forms, with or without -.\" modification, are permitted provided that the following conditions -.\" are met: -.\" 1. Redistributions of source code must retain the above copyright -.\" notice, this list of conditions and the following disclaimer. -.\" 2. Redistributions in binary form must reproduce the above copyright -.\" notice, this list of conditions and the following disclaimer in the -.\" documentation and/or other materials provided with the distribution. -.\" 3. All advertising materials mentioning features or use of this software -.\" must display the following acknowledgement: -.\" This product includes software developed by the University of -.\" California, Berkeley and its contributors. -.\" 4. Neither the name of the University nor the names of its contributors -.\" may be used to endorse or promote products derived from this software -.\" without specific prior written permission. -.\" -.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND -.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE -.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL -.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS -.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) -.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT -.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY -.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF -.\" SUCH DAMAGE. -.\" -.\" @(#)regex.3 8.4 (Berkeley) 3/20/94 -.\" $FreeBSD: src/lib/libc/regex/regex.3,v 1.9 2001/10/01 16:08:58 ru Exp $ -.\" -.Dd March 20, 1994 -.Dt REGEX 3 -.Os -.Sh NAME -.Nm regcomp , -.Nm regexec , -.Nm regerror , -.Nm regfree -.Nd regular-expression library -.Sh LIBRARY -.Lb libc -.Sh SYNOPSIS -.In sys/types.h -.In regex.h -.Ft int -.Fn regcomp "regex_t *preg" "const char *pattern" "int cflags" -.Ft int -.Fo regexec -.Fa "const regex_t *preg" "const char *string" -.Fa "size_t nmatch" "regmatch_t pmatch[]" "int eflags" -.Fc -.Ft size_t -.Fo regerror -.Fa "int errcode" "const regex_t *preg" -.Fa "char *errbuf" "size_t errbuf_size" -.Fc -.Ft void -.Fn regfree "regex_t *preg" -.Sh DESCRIPTION -These routines implement -.St -p1003.2 -regular expressions -.Pq Do RE Dc Ns s ; -see -.Xr re_format 7 . -.Fn Regcomp -compiles an RE written as a string into an internal form, -.Fn regexec -matches that internal form against a string and reports results, -.Fn regerror -transforms error codes from either into human-readable messages, -and -.Fn regfree -frees any dynamically-allocated storage used by the internal form -of an RE. -.Pp -The header -.Aq Pa regex.h -declares two structure types, -.Ft regex_t -and -.Ft regmatch_t , -the former for compiled internal forms and the latter for match reporting. -It also declares the four functions, -a type -.Ft regoff_t , -and a number of constants with names starting with -.Dq Dv REG_ . -.Pp -.Fn Regcomp -compiles the regular expression contained in the -.Fa pattern -string, -subject to the flags in -.Fa cflags , -and places the results in the -.Ft regex_t -structure pointed to by -.Fa preg . -.Fa Cflags -is the bitwise OR of zero or more of the following flags: -.Bl -tag -width REG_EXTENDED -.It Dv REG_EXTENDED -Compile modern -.Pq Dq extended -REs, -rather than the obsolete -.Pq Dq basic -REs that -are the default. -.It Dv REG_BASIC -This is a synonym for 0, -provided as a counterpart to -.Dv REG_EXTENDED -to improve readability. -.It Dv REG_NOSPEC -Compile with recognition of all special characters turned off. -All characters are thus considered ordinary, -so the -.Dq RE -is a literal string. -This is an extension, -compatible with but not specified by -.St -p1003.2 , -and should be used with -caution in software intended to be portable to other systems. -.Dv REG_EXTENDED -and -.Dv REG_NOSPEC -may not be used -in the same call to -.Fn regcomp . -.It Dv REG_ICASE -Compile for matching that ignores upper/lower case distinctions. -See -.Xr re_format 7 . -.It Dv REG_NOSUB -Compile for matching that need only report success or failure, -not what was matched. -.It Dv REG_NEWLINE -Compile for newline-sensitive matching. -By default, newline is a completely ordinary character with no special -meaning in either REs or strings. -With this flag, -.Ql [^ -bracket expressions and -.Ql .\& -never match newline, -a -.Ql ^\& -anchor matches the null string after any newline in the string -in addition to its normal function, -and the -.Ql $\& -anchor matches the null string before any newline in the -string in addition to its normal function. -.It Dv REG_PEND -The regular expression ends, -not at the first NUL, -but just before the character pointed to by the -.Va re_endp -member of the structure pointed to by -.Fa preg . -The -.Va re_endp -member is of type -.Ft "const char *" . -This flag permits inclusion of NULs in the RE; -they are considered ordinary characters. -This is an extension, -compatible with but not specified by -.St -p1003.2 , -and should be used with -caution in software intended to be portable to other systems. -.El -.Pp -When successful, -.Fn regcomp -returns 0 and fills in the structure pointed to by -.Fa preg . -One member of that structure -(other than -.Va re_endp ) -is publicized: -.Va re_nsub , -of type -.Ft size_t , -contains the number of parenthesized subexpressions within the RE -(except that the value of this member is undefined if the -.Dv REG_NOSUB -flag was used). -If -.Fn regcomp -fails, it returns a non-zero error code; -see -.Sx DIAGNOSTICS . -.Pp -.Fn Regexec -matches the compiled RE pointed to by -.Fa preg -against the -.Fa string , -subject to the flags in -.Fa eflags , -and reports results using -.Fa nmatch , -.Fa pmatch , -and the returned value. -The RE must have been compiled by a previous invocation of -.Fn regcomp . -The compiled form is not altered during execution of -.Fn regexec , -so a single compiled RE can be used simultaneously by multiple threads. -.Pp -By default, -the NUL-terminated string pointed to by -.Fa string -is considered to be the text of an entire line, minus any terminating -newline. -The -.Fa eflags -argument is the bitwise OR of zero or more of the following flags: -.Bl -tag -width REG_STARTEND -.It Dv REG_NOTBOL -The first character of -the string -is not the beginning of a line, so the -.Ql ^\& -anchor should not match before it. -This does not affect the behavior of newlines under -.Dv REG_NEWLINE . -.It Dv REG_NOTEOL -The NUL terminating -the string -does not end a line, so the -.Ql $\& -anchor should not match before it. -This does not affect the behavior of newlines under -.Dv REG_NEWLINE . -.It Dv REG_STARTEND -The string is considered to start at -.Fa string -+ -.Fa pmatch Ns [0]. Ns Va rm_so -and to have a terminating NUL located at -.Fa string -+ -.Fa pmatch Ns [0]. Ns Va rm_eo -(there need not actually be a NUL at that location), -regardless of the value of -.Fa nmatch . -See below for the definition of -.Fa pmatch -and -.Fa nmatch . -This is an extension, -compatible with but not specified by -.St -p1003.2 , -and should be used with -caution in software intended to be portable to other systems. -Note that a non-zero -.Va rm_so -does not imply -.Dv REG_NOTBOL ; -.Dv REG_STARTEND -affects only the location of the string, -not how it is matched. -.El -.Pp -See -.Xr re_format 7 -for a discussion of what is matched in situations where an RE or a -portion thereof could match any of several substrings of -.Fa string . -.Pp -Normally, -.Fn regexec -returns 0 for success and the non-zero code -.Dv REG_NOMATCH -for failure. -Other non-zero error codes may be returned in exceptional situations; -see -.Sx DIAGNOSTICS . -.Pp -If -.Dv REG_NOSUB -was specified in the compilation of the RE, -or if -.Fa nmatch -is 0, -.Fn regexec -ignores the -.Fa pmatch -argument (but see below for the case where -.Dv REG_STARTEND -is specified). -Otherwise, -.Fa pmatch -points to an array of -.Fa nmatch -structures of type -.Ft regmatch_t . -Such a structure has at least the members -.Va rm_so -and -.Va rm_eo , -both of type -.Ft regoff_t -(a signed arithmetic type at least as large as an -.Ft off_t -and a -.Ft ssize_t ) , -containing respectively the offset of the first character of a substring -and the offset of the first character after the end of the substring. -Offsets are measured from the beginning of the -.Fa string -argument given to -.Fn regexec . -An empty substring is denoted by equal offsets, -both indicating the character following the empty substring. -.Pp -The 0th member of the -.Fa pmatch -array is filled in to indicate what substring of -.Fa string -was matched by the entire RE. -Remaining members report what substring was matched by parenthesized -subexpressions within the RE; -member -.Va i -reports subexpression -.Va i , -with subexpressions counted (starting at 1) by the order of their opening -parentheses in the RE, left to right. -Unused entries in the array (corresponding either to subexpressions that -did not participate in the match at all, or to subexpressions that do not -exist in the RE (that is, -.Va i -> -.Fa preg Ns -> Ns Va re_nsub ) ) -have both -.Va rm_so -and -.Va rm_eo -set to -1. -If a subexpression participated in the match several times, -the reported substring is the last one it matched. -(Note, as an example in particular, that when the RE -.Ql "(b*)+" -matches -.Ql bbb , -the parenthesized subexpression matches each of the three -.So Li b Sc Ns s -and then -an infinite number of empty strings following the last -.Ql b , -so the reported substring is one of the empties.) -.Pp -If -.Dv REG_STARTEND -is specified, -.Fa pmatch -must point to at least one -.Ft regmatch_t -(even if -.Fa nmatch -is 0 or -.Dv REG_NOSUB -was specified), -to hold the input offsets for -.Dv REG_STARTEND . -Use for output is still entirely controlled by -.Fa nmatch ; -if -.Fa nmatch -is 0 or -.Dv REG_NOSUB -was specified, -the value of -.Fa pmatch Ns [0] -will not be changed by a successful -.Fn regexec . -.Pp -.Fn Regerror -maps a non-zero -.Fa errcode -from either -.Fn regcomp -or -.Fn regexec -to a human-readable, printable message. -If -.Fa preg -is -.No non\- Ns Dv NULL , -the error code should have arisen from use of -the -.Ft regex_t -pointed to by -.Fa preg , -and if the error code came from -.Fn regcomp , -it should have been the result from the most recent -.Fn regcomp -using that -.Ft regex_t . -.No ( Fn Regerror -may be able to supply a more detailed message using information -from the -.Ft regex_t . ) -.Fn Regerror -places the NUL-terminated message into the buffer pointed to by -.Fa errbuf , -limiting the length (including the NUL) to at most -.Fa errbuf_size -bytes. -If the whole message won't fit, -as much of it as will fit before the terminating NUL is supplied. -In any case, -the returned value is the size of buffer needed to hold the whole -message (including terminating NUL). -If -.Fa errbuf_size -is 0, -.Fa errbuf -is ignored but the return value is still correct. -.Pp -If the -.Fa errcode -given to -.Fn regerror -is first ORed with -.Dv REG_ITOA , -the -.Dq message -that results is the printable name of the error code, -e.g.\& -.Dq Dv REG_NOMATCH , -rather than an explanation thereof. -If -.Fa errcode -is -.Dv REG_ATOI , -then -.Fa preg -shall be -.No non\- Ns Dv NULL -and the -.Va re_endp -member of the structure it points to -must point to the printable name of an error code; -in this case, the result in -.Fa errbuf -is the decimal digits of -the numeric value of the error code -(0 if the name is not recognized). -.Dv REG_ITOA -and -.Dv REG_ATOI -are intended primarily as debugging facilities; -they are extensions, -compatible with but not specified by -.St -p1003.2 , -and should be used with -caution in software intended to be portable to other systems. -Be warned also that they are considered experimental and changes are possible. -.Pp -.Fn Regfree -frees any dynamically-allocated storage associated with the compiled RE -pointed to by -.Fa preg . -The remaining -.Ft regex_t -is no longer a valid compiled RE -and the effect of supplying it to -.Fn regexec -or -.Fn regerror -is undefined. -.Pp -None of these functions references global variables except for tables -of constants; -all are safe for use from multiple threads if the arguments are safe. -.Sh IMPLEMENTATION CHOICES -There are a number of decisions that -.St -p1003.2 -leaves up to the implementor, -either by explicitly saying -.Dq undefined -or by virtue of them being -forbidden by the RE grammar. -This implementation treats them as follows. -.Pp -See -.Xr re_format 7 -for a discussion of the definition of case-independent matching. -.Pp -There is no particular limit on the length of REs, -except insofar as memory is limited. -Memory usage is approximately linear in RE size, and largely insensitive -to RE complexity, except for bounded repetitions. -See -.Sx BUGS -for one short RE using them -that will run almost any system out of memory. -.Pp -A backslashed character other than one specifically given a magic meaning -by -.St -p1003.2 -(such magic meanings occur only in obsolete -.Bq Dq basic -REs) -is taken as an ordinary character. -.Pp -Any unmatched -.Ql [\& -is a -.Dv REG_EBRACK -error. -.Pp -Equivalence classes cannot begin or end bracket-expression ranges. -The endpoint of one range cannot begin another. -.Pp -.Dv RE_DUP_MAX , -the limit on repetition counts in bounded repetitions, is 255. -.Pp -A repetition operator -.Ql ( ?\& , -.Ql *\& , -.Ql +\& , -or bounds) -cannot follow another -repetition operator. -A repetition operator cannot begin an expression or subexpression -or follow -.Ql ^\& -or -.Ql |\& . -.Pp -.Ql |\& -cannot appear first or last in a (sub)expression or after another -.Ql |\& , -i.e. an operand of -.Ql |\& -cannot be an empty subexpression. -An empty parenthesized subexpression, -.Ql "()" , -is legal and matches an -empty (sub)string. -An empty string is not a legal RE. -.Pp -A -.Ql {\& -followed by a digit is considered the beginning of bounds for a -bounded repetition, which must then follow the syntax for bounds. -A -.Ql {\& -.Em not -followed by a digit is considered an ordinary character. -.Pp -.Ql ^\& -and -.Ql $\& -beginning and ending subexpressions in obsolete -.Pq Dq basic -REs are anchors, not ordinary characters. -.Sh SEE ALSO -.Xr grep 1 , -.Xr re_format 7 -.Pp -.St -p1003.2 , -sections 2.8 (Regular Expression Notation) -and -B.5 (C Binding for Regular Expression Matching). -.Sh DIAGNOSTICS -Non-zero error codes from -.Fn regcomp -and -.Fn regexec -include the following: -.Pp -.Bl -tag -width REG_ECOLLATE -compact -.It Dv REG_NOMATCH -.Fn regexec -failed to match -.It Dv REG_BADPAT -invalid regular expression -.It Dv REG_ECOLLATE -invalid collating element -.It Dv REG_ECTYPE -invalid character class -.It Dv REG_EESCAPE -.Ql \e -applied to unescapable character -.It Dv REG_ESUBREG -invalid backreference number -.It Dv REG_EBRACK -brackets -.Ql "[ ]" -not balanced -.It Dv REG_EPAREN -parentheses -.Ql "( )" -not balanced -.It Dv REG_EBRACE -braces -.Ql "{ }" -not balanced -.It Dv REG_BADBR -invalid repetition count(s) in -.Ql "{ }" -.It Dv REG_ERANGE -invalid character range in -.Ql "[ ]" -.It Dv REG_ESPACE -ran out of memory -.It Dv REG_BADRPT -.Ql ?\& , -.Ql *\& , -or -.Ql +\& -operand invalid -.It Dv REG_EMPTY -empty (sub)expression -.It Dv REG_ASSERT -can't happen - you found a bug -.It Dv REG_INVARG -invalid argument, e.g. negative-length string -.El -.Sh HISTORY -Originally written by -.An Henry Spencer . -Altered for inclusion in the -.Bx 4.4 -distribution. -.Sh BUGS -This is an alpha release with known defects. -Please report problems. -.Pp -The back-reference code is subtle and doubts linger about its correctness -in complex cases. -.Pp -.Fn Regexec -performance is poor. -This will improve with later releases. -.Fa Nmatch -exceeding 0 is expensive; -.Fa nmatch -exceeding 1 is worse. -.Fn Regexec -is largely insensitive to RE complexity -.Em except -that back -references are massively expensive. -RE length does matter; in particular, there is a strong speed bonus -for keeping RE length under about 30 characters, -with most special characters counting roughly double. -.Pp -.Fn Regcomp -implements bounded repetitions by macro expansion, -which is costly in time and space if counts are large -or bounded repetitions are nested. -An RE like, say, -.Ql "((((a{1,100}){1,100}){1,100}){1,100}){1,100}" -will (eventually) run almost any existing machine out of swap space. -.Pp -There are suspected problems with response to obscure error conditions. -Notably, -certain kinds of internal overflow, -produced only by truly enormous REs or by multiply nested bounded repetitions, -are probably not handled well. -.Pp -Due to a mistake in -.St -p1003.2 , -things like -.Ql "a)b" -are legal REs because -.Ql )\& -is -a special character only in the presence of a previous unmatched -.Ql (\& . -This can't be fixed until the spec is fixed. -.Pp -The standard's definition of back references is vague. -For example, does -.Ql "a\e(\e(b\e)*\e2\e)*d" -match -.Ql "abbbd" ? -Until the standard is clarified, -behavior in such cases should not be relied on. -.Pp -The implementation of word-boundary matching is a bit of a kludge, -and bugs may lurk in combinations of word-boundary matching and anchoring. |