mirror of
https://git.hardenedbsd.org/hardenedbsd/HardenedBSD.git
synced 2024-11-15 23:05:49 +01:00
Fri Apr 2 17:31:59 1993 Jim Blandy (jimb@totoro.cs.oberlin.edu)

* Released version 0.12.

* regex.c (regerror): If errcode is zero, that's not a valid
error code, according to POSIX, but return "Success."

* regex.c (regerror): Remember to actually fetch the message
from re_error_msg.

* regex.c (regex_compile): Don't use the trick for ".*\n" on
".+\n". Since the latter involves laying an extra choice
point, the backward jump isn't adjusted properly.

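For readers unfamiliar with the POSIX interface the regerror entries refer to, here is a minimal sketch of how a caller typically fetches an error message. It is not code from regex.c; the helper name `error_message' is hypothetical, and only the standard regerror call from <regex.h> is assumed.

```c
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper (not part of regex.c): return a malloc'd copy of
   the message regerror produces for ERRCODE, or NULL if the allocation
   fails.  The caller frees the result.  */
char *
error_message (int errcode, const regex_t *preg)
{
  /* With a zero-length buffer, regerror only reports the size needed,
     including the terminating null byte.  */
  size_t needed = regerror (errcode, preg, NULL, 0);
  char *msg = malloc (needed);

  if (msg != NULL)
    regerror (errcode, preg, msg, needed);
  return msg;
}
```

A caller would pass the nonzero value returned by regcomp together with the same regex_t. What an implementation reports for an errcode of zero is not something portable code should rely on.
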
Thu Mar 25 21:35:18 1993 Jim Blandy (jimb@totoro.cs.oberlin.edu)

* regex.c (regex_compile): In the handle_open and handle_close
sections, clear pending_exact to zero.

Tue Mar 9 12:03:07 1993 Jim Blandy (jimb@wookumz.gnu.ai.mit.edu)

* regex.c (re_search_2): In the loop which searches forward
using fastmap, don't forget to cast the character from the
string to an unsigned before using it as an index into the
translate map.

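The cast described in the re_search_2 entry above can be illustrated with a small sketch. This is not the code from regex.c: the table `translate' and both function names are hypothetical stand-ins, and ASCII letter ordering is assumed.

```c
#include <limits.h>

/* Hypothetical stand-in for a translate table: every byte maps to
   itself, except that lowercase ASCII letters fold to uppercase.  */
static unsigned char translate[UCHAR_MAX + 1];

void
init_translate (void)
{
  int i;

  for (i = 0; i <= UCHAR_MAX; i++)
    translate[i] = (unsigned char) i;
  for (i = 'a'; i <= 'z'; i++)
    translate[i] = (unsigned char) (i - 'a' + 'A');
}

/* Look up the character at P.  Where plain char is signed, a byte such
   as 0xE9 read through `const char *' is negative, so it is converted
   to unsigned char before being used as a subscript.  */
unsigned char
translated_char (const char *p)
{
  return translate[(unsigned char) *p];
}
```

Without the cast, a byte above 0x7F would index before the start of the table on platforms where char is signed.
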
Thu Jan 14 15:41:46 1993 David J. MacKenzie (djm@kropotkin.gnu.ai.mit.edu)

* regex.h: Never define const; let the callers do it.
configure.in: Don't define USING_AUTOCONF.

Wed Jan 6 20:49:29 1993 Jim Blandy (jimb@geech.gnu.ai.mit.edu)

* regex.c (regerror): Abort if ERRCODE is out of range.

Sun Dec 20 16:19:10 1992 Jim Blandy (jimb@totoro.cs.oberlin.edu)

* configure.in: Arrange to #define USING_AUTOCONF.
* regex.h: If USING_AUTOCONF is #defined, don't mess with
`const' at all; autoconf has taken care of it.

Mon Dec 14 21:40:39 1992 David J. MacKenzie (djm@kropotkin.gnu.ai.mit.edu)

* regex.h (RE_SYNTAX_AWK): Fix typo. From Arnold Robbins.

Sun Dec 13 20:35:39 1992 Jim Blandy (jimb@totoro.cs.oberlin.edu)

* regex.c (compile_range): Fetch the range start and end by
casting the pattern pointer to an `unsigned char *' before
fetching through it.

Sat Dec 12 09:41:01 1992 Jim Blandy (jimb@totoro.cs.oberlin.edu)

* regex.c: Undo change of 12/7/92; it's better for Emacs to
#define HAVE_CONFIG_H.

Fri Dec 11 22:00:34 1992 Jim Meyering (meyering@hal.gnu.ai.mit.edu)

* regex.c: Define and use isascii-protected ctype.h macros.

Fri Dec 11 05:10:38 1992 Jim Blandy (jimb@totoro.cs.oberlin.edu)

* regex.c (re_match_2): Undo Karl's November 10th change; it
keeps the group in :\(.*\) from matching :/ properly.

Mon Dec 7 19:44:56 1992 Jim Blandy (jimb@wookumz.gnu.ai.mit.edu)

* regex.c: #include config.h if either HAVE_CONFIG_H or emacs
is #defined.

Tue Dec 1 13:33:17 1992 David J. MacKenzie (djm@goldman.gnu.ai.mit.edu)

* regex.c [HAVE_CONFIG_H]: Include config.h.

Wed Nov 25 23:46:02 1992 David J. MacKenzie (djm@goldman.gnu.ai.mit.edu)

* regex.c (regcomp): Add parens around bitwise & for clarity.
Initialize preg->allocated to prevent segv.

Tue Nov 24 09:22:29 1992 David J. MacKenzie (djm@goldman.gnu.ai.mit.edu)

* regex.c: Use HAVE_STRING_H, not USG.
* configure.in: Check for string.h, not USG.

Fri Nov 20 06:33:24 1992 Karl Berry (karl@cs.umb.edu)

* regex.c (SIGN_EXTEND_CHAR) [VMS]: Back out of this change,
since Roland Roberts now says it was a localism.

Mon Nov 16 07:01:36 1992 Karl Berry (karl@cs.umb.edu)

* regex.h (const) [!HAVE_CONST]: Test another cpp symbol (from
Autoconf) before zapping const.

Sun Nov 15 05:36:42 1992 Jim Blandy (jimb@wookumz.gnu.ai.mit.edu)

* regex.c, regex.h: Changes for VMS from Roland B Roberts
<roberts@nsrl31.nsrl.rochester.edu>.

Thu Nov 12 11:31:15 1992 Karl Berry (karl@cs.umb.edu)

* Makefile.in (distfiles): Include INSTALL.

Tue Nov 10 09:29:23 1992 Karl Berry (karl@cs.umb.edu)

* regex.c (re_match_2): At maybe_pop_jump, if at end of string
and pattern, just quit the matching loop.

* regex.c (LETTER_P): Rename to `WORDCHAR_P'.

* regex.c (AT_STRINGS_{BEG,END}): Take `d' as an arg; change
callers.

* regex.c (re_match_2) [!emacs]: In wordchar and notwordchar
cases, advance d.

Wed Nov 4 15:43:58 1992 Karl Berry (karl@hal.gnu.ai.mit.edu)

* regex.h (const) [!__STDC__]: Don't define if it's already defined.

Sat Oct 17 19:28:19 1992 Karl Berry (karl@cs.umb.edu)

* regex.c (bcmp, bcopy, bzero): Only #define if they are not
already #defined.

* configure.in: Use AC_CONST.

Thu Oct 15 08:39:06 1992 Karl Berry (karl@cs.umb.edu)

* regex.h (const) [!const]: Conditionalize.

Fri Oct 2 13:31:42 1992 Karl Berry (karl@cs.umb.edu)

* regex.h (RE_SYNTAX_ED): New definition.

Sun Sep 20 12:53:39 1992 Karl Berry (karl@cs.umb.edu)

* regex.[ch]: remove traces of `longest_p' -- dumb idea to put
this into the pattern buffer, as it means parallelism loses.

* Makefile.in (config.status): use sh to run configure --no-create.

* Makefile.in (realclean): OK, don't remove configure.

Sat Sep 19 09:05:08 1992 Karl Berry (karl@hayley)

* regex.c (PUSH_FAILURE_POINT, POP_FAILURE_POINT) [DEBUG]: keep
track of how many failure points we push and pop.
(re_match_2) [DEBUG]: declare variables for that, and print results.
(DEBUG_PRINT4): new macro.

* regex.h (re_pattern_buffer): new field `longest_p' (to
eliminate backtracking if the user doesn't need it).
* regex.c (re_compile_pattern): initialize it (to 1).
(re_search_2): set it to zero if register information is not needed.
(re_match_2): if it's set, don't backtrack.

* regex.c (re_search_2): update fastmap only after checking that
the pattern is anchored.

* regex.c (re_match_2): do more debugging at maybe_pop_jump.

* regex.c (re_search_2): cast result of TRANSLATE for use in
array subscript.

Thu Sep 17 19:47:16 1992 Karl Berry (karl@geech.gnu.ai.mit.edu)

* Version 0.11.

Wed Sep 16 08:17:10 1992 Karl Berry (karl@hayley)

* regex.c (INIT_FAIL_STACK): rewrite as statements instead of a
complicated comma expr, to avoid compiler warnings (and also
simplify).
(re_compile_fastmap, re_match_2): change callers.

* regex.c (POP_FAILURE_POINT): cast pop of regstart and regend
to avoid compiler warnings.

* regex.h (RE_NEWLINE_ORDINARY): remove this syntax bit, and
remove uses.
* regex.c (at_{beg,end}line_loc_p): go the last mile: remove
the RE_NEWLINE_ORDINARY case which made the ^ in \n^ be an anchor.

Tue Sep 15 09:55:29 1992 Karl Berry (karl@hayley)

* regex.c (at_begline_loc_p): new fn.
(at_endline_loc_p): simplify at_endline_op_p.
(regex_compile): in ^/$ cases, call the above.

* regex.c (POP_FAILURE_POINT): rewrite the fn as a macro again,
as lord's profiling indicates the function is 20% of the time.
(re_match_2): callers changed.

* configure.in (AC_MEMORY_H): remove, since we never use memcpy et al.

Mon Sep 14 17:49:27 1992 Karl Berry (karl@hayley)

* Makefile.in (makeargs): include MFLAGS.

Sun Sep 13 07:41:45 1992 Karl Berry (karl@hayley)

* regex.c (regex_compile): in \1..\9 case, make it always
invalid to use \<digit> if there is no preceding <digit>th subexpr.
* regex.h (RE_NO_MISSING_BK_REF): remove this syntax bit.

* regex.c (regex_compile): remove support for invalid empty groups.
* regex.h (RE_NO_EMPTY_GROUPS): remove this syntax bit.

* regex.c (FREE_VARIABLES) [!REGEX_MALLOC]: define as alloca (0),
to reclaim memory.

* regex.h (RE_SYNTAX_POSIX_SED): don't bother with this.

Sat Sep 12 13:37:21 1992 Karl Berry (karl@hayley)

* README: incorporate emacs.diff.

* regex.h (_RE_ARGS) [!__STDC__]: define as empty parens.

* configure.in: add AC_ALLOCA.

* Put test files in subdir test, documentation in subdir doc.
Adjust Makefile.in and configure.in accordingly.

Thu Sep 10 10:29:11 1992 Karl Berry (karl@hayley)

* regex.h (RE_SYNTAX_{POSIX_,}SED): new definitions.

Wed Sep 9 06:27:09 1992 Karl Berry (karl@hayley)

* Version 0.10.

Tue Sep 8 07:32:30 1992 Karl Berry (karl@hayley)

* xregex.texinfo: put the day of month into the date.

* Makefile.in (realclean): remove Texinfo-generated files.
(distclean): remove empty sorted index files.
(clean): remove dvi files, etc.

* configure.in: test for more Unix variants.

* fileregex.c: new file.
Makefile.in (fileregex): new target.

* iregex.c (main): move variable decls to smallest scope.

* regex.c (FREE_VARIABLES): free reg_{,info_}dummy.
(re_match_2): check that the allocation for those two succeeded.

* regex.c (FREE_VAR): replace FREE_NONNULL with this.
(FREE_VARIABLES): call it.
(re_match_2) [REGEX_MALLOC]: initialize all our vars to NULL.

* tregress.c (do_match): generalize simple_match.
(SIMPLE_NONMATCH): new macro.
(SIMPLE_MATCH): change from routine.

* Makefile.in (regex.texinfo): make file readonly, so we don't
edit it by mistake.

* many files (re_default_syntax): rename to `re_syntax_options';
call re_set_syntax instead of assigning to the variable where
possible.

Mon Sep 7 10:12:16 1992 Karl Berry (karl@hayley)

* syntax.skel: don't use prototypes.

* {configure,Makefile}.in: new files.

* regex.c: include <string.h> `#if USG || STDC_HEADERS'; remove
obsolete test for `POSIX', and test for BSTRING.
Include <strings.h> if we are not USG or STDC_HEADERS.
Do not include <unistd.h>. What did we ever need that for?

* regex.h (RE_NO_EMPTY_ALTS): remove this.
(RE_SYNTAX_AWK): remove from here, too.
* regex.c (regex_compile): remove the check.
* xregex.texinfo (Alternation Operator): update.
* other.c (test_others): remove tests for this.

* regex.h (RE_DUP_MAX): undefine if already defined.

* regex.h (RE_SYNTAX_POSIX*): redo to allow more operators, and
define new syntaxes with the minimal set.

* syntax.skel (main): use sscanf instead of scanf.

* regex.h (RE_SYNTAX_*GREP): new definitions from mike.

* regex.c (regex_compile): initialize the upper bound of
intervals at the beginning of the interval, not the end.
(From pclink@qld.tne.oz.au.)

* regex.c (handle_bar): rename to `handle_alt', for consistency.

* regex.c ({store,insert}_{op1,op2}): new routines (except the last).
({STORE,INSERT}_JUMP{,2}): macros to replace the old routines,
which took arguments in different orders, and were generally weird.

* regex.c (PAT_PUSH*): rename to `BUF_PUSH*' -- we're not
appending info to the pattern!

Sun Sep 6 11:26:49 1992 Karl Berry (karl@hayley)

* regex.c (regex_compile): delete the variable
`following_left_brace', since we never use it.

* regex.c (print_compiled_pattern): don't print the fastmap if
it's null.

* regex.c (re_compile_fastmap): handle
`on_failure_keep_string_jump' like `on_failure_jump'.

* regex.c (re_match_2): in `charset{,_not}' case, cast the bit
count to unsigned, not unsigned char, in case we have a full
32-byte bit list.

* tregress.c (simple_match): remove.
(simple_test): rename as `simple_match'.
(simple_compile): print the error string if the compile failed.

* regex.c (DO_RANGE): rewrite as a function, `compile_range', so
we can debug it. Change pattern characters to unsigned char
*'s, and change the range variable to an unsigned.
(regex_compile): change calls.

Sat Sep 5 17:40:49 1992 Karl Berry (karl@hayley)

* regex.h (_RE_ARGS): new macro to put in argument lists (if
ANSI) or omit them (if K&R); don't declare routines twice.

* many files (obscure_syntax): rename to `re_default_syntax'.

Fri Sep 4 09:06:53 1992 Karl Berry (karl@hayley)

* GNUmakefile (extraclean): new target.
(realclean): delete the info files.

Wed Sep 2 08:14:42 1992 Karl Berry (karl@hayley)

* regex.h: doc fix.

Sun Aug 23 06:53:15 1992 Karl Berry (karl@hayley)

* regex.[ch] (re_comp): no const in the return type (from djm).

Fri Aug 14 07:25:46 1992 Karl Berry (karl@hayley)

* regex.c (DO_RANGE): declare variables as unsigned chars, not
signed chars (from jimb).

Wed Jul 29 18:33:53 1992 Karl Berry (karl@claude.cs.umb.edu)

* Version 0.9.

* GNUmakefile (distclean): do not remove regex.texinfo.
(realclean): remove it here.

* tregress.c (simple_test): initialize buf.buffer.

Sun Jul 26 08:59:38 1992 Karl Berry (karl@hayley)

* regex.c (push_dummy_failure): new opcode and corresponding
case in the various routines. Pushed at the end of
alternatives.

* regex.c (jump_past_next_alt): rename to `jump_past_alt', for
brevity.
(no_pop_jump): rename to `jump'.

* regex.c (regex_compile) [DEBUG]: terminate printing of pattern
with a newline.

* NEWS: new file.

* tregress.c (simple_{compile,match,test}): routines to simplify all
these little tests.

* tregress.c: test for matching as much as possible.

Fri Jul 10 06:53:32 1992 Karl Berry (karl@hayley)

* Version 0.8.

Wed Jul 8 06:39:31 1992 Karl Berry (karl@hayley)

* regex.c (SIGN_EXTEND_CHAR): #undef any previous definition, as
ours should always work properly.

Mon Jul 6 07:10:50 1992 Karl Berry (karl@hayley)

* iregex.c (main) [DEBUG]: conditionalize the call to
print_compiled_pattern.

* iregex.c (main): initialize buf.buffer to NULL.
* tregress.c (test_regress): likewise.

* regex.c (alloca) [sparc]: #if on HAVE_ALLOCA_H instead.

* tregress.c (test_regress): didn't have jla's test quite right.

Sat Jul 4 09:02:12 1992 Karl Berry (karl@hayley)

* regex.c (re_match_2): only REGEX_ALLOCATE all the register
vectors if the pattern actually has registers.
(match_end): new variable to avoid having to use best_regend[0].

* regex.c (IS_IN_FIRST_STRING): rename to FIRST_STRING_P.

* regex.c: doc fixes.

* tregress.c (test_regress): new fastmap test forwarded by rms.

* tregress.c (test_regress): initialize the fastmap field.

* tregress.c (test_regress): new test from jla that aborted
in re_search_2.

Fri Jul 3 09:10:05 1992 Karl Berry (karl@hayley)

* tregress.c (test_regress): add tests for translating charsets,
from kaoru.

* GNUmakefile (common): add alloca.o.
* alloca.c: new file, copied from bison.

* other.c (test_others): remove var `buf', since it's no longer used.

* Below changes from ro@TechFak.Uni-Bielefeld.DE.

* tregress.c (test_regress): initialize buf.allocated.

* regex.c (re_compile_fastmap): initialize `succeed_n_p'.

* GNUmakefile (regex): depend on $(common).

Wed Jul 1 07:12:46 1992 Karl Berry (karl@hayley)

* Version 0.7.

* regex.c: doc fixes.

Mon Jun 29 08:09:47 1992 Karl Berry (karl@fosse)

* regex.c (pop_failure_point): change string vars to
`const char *' from `unsigned char *'.

* regex.c: consolidate debugging stuff.
(print_partial_compiled_pattern): avoid enum clash.

Mon Jun 29 07:50:27 1992 Karl Berry (karl@hayley)

* xmalloc.c: new file.
* GNUmakefile (common): add it.

* iregex.c (print_regs): new routine (from jimb).
(main): call it.

Sat Jun 27 10:50:59 1992 Jim Blandy (jimb@pogo.cs.oberlin.edu)

* xregex.c (re_match_2): When we have accepted a match and
restored d from best_regend[0], we need to set dend
appropriately as well.

Sun Jun 28 08:48:41 1992 Karl Berry (karl@hayley)

* tregress.c: rename from regress.c.

* regex.c (print_compiled_pattern): improve charset case to ease
byte-counting.
Also, don't distinguish between Emacs and non-Emacs
{not,}wordchar opcodes.

* regex.c (print_fastmap): move here.
* test.c: from here.
* regex.c (print_{{partial,}compiled_pattern,double_string}):
rename from ..._printer. Change calls here and in test.c.

* regex.c: create from xregex.c and regexinc.c for once and for
all, and change the debug fns to be extern, instead of static.
* GNUmakefile: remove traces of xregex.c.
* test.c: put in externs, instead of including regexinc.c.

* xregex.c: move interactive main program and scanstring to iregex.c.
* iregex.c: new file.
* upcase.c, printchar.c: new files.

* various doc fixes and other cosmetic changes throughout.

* regexinc.c (compiled_pattern_printer): change variable name,
for consistency.
(partial_compiled_pattern_printer): print other info about the
compiled pattern, besides just the opcodes.
* xregex.c (regex_compile) [DEBUG]: print the compiled pattern
when we're done.

* xregex.c (re_compile_fastmap): in the duplicate case, set
`can_be_null' and return.
Also, set `bufp->can_be_null' according to a new variable,
`path_can_be_null'.
Also, rewrite main while loop to not test `p != NULL', since
we never set it that way.
Also, eliminate special `can_be_null' value for the endline case.
(re_search_2): don't test for the special value.
* regex.h (struct re_pattern_buffer): remove the definition.

Sat Jun 27 15:00:40 1992 Karl Berry (karl@hayley)

* xregex.c (re_compile_fastmap): remove the `RE_' from
`REG_RE_MATCH_NULL_AT_END'.
Also, assert the fastmap in the pattern buffer is non-null.
Also, reset `succeed_n_p' after we've
paid attention to it, instead of every time through the loop.
Also, in the `anychar' case, only clear fastmap['\n'] if the
syntax says to, and don't return prematurely.
Also, rearrange cases in some semblance of a rational order.
* regex.h (REG_RE_MATCH_NULL_AT_END): remove the `RE_' from the name.

* other.c: take bug reports from here.
* regress.c: new file for them.
* GNUmakefile (test): add it.
* main.c (main): new possible test.
* test.h (test_type): new value in enum.

Thu Jun 25 17:37:43 1992 Karl Berry (karl@hayley)

* xregex.c (scanstring) [test]: new function from jimb to allow some
escapes.
(main) [test]: call it (on the string, not the pattern).

* xregex.c (main): make return type `int'.

Wed Jun 24 10:43:03 1992 Karl Berry (karl@hayley)

* xregex.c (pattern_offset_t): change to `int', for the benefit
of patterns which compile to more than 2^15 bytes.

* xregex.c (GET_BUFFER_SPACE): remove spurious braces.

* xregex.texinfo (Using Registers): put in a stub to ``document''
the new function.
* regex.h (re_set_registers) [!__STDC__]: declare.
* xregex.c (re_set_registers): declare K&R style (also move to a
different place in the file).

Mon Jun 8 18:03:28 1992 Jim Blandy (jimb@pogo.cs.oberlin.edu)

* regex.h (RE_NREGS): Doc fix.

* xregex.c (re_set_registers): New function.
* regex.h (re_set_registers): Declaration for new function.

Fri Jun 5 06:55:18 1992 Karl Berry (karl@hayley)

* main.c (main): `return 0' instead of `exit (0)'. (From Paul Eggert)

* regexinc.c (SIGN_EXTEND_CHAR): cast to unsigned char.
(extract_number, EXTRACT_NUMBER): don't bother to cast here.

Tue Jun 2 07:37:53 1992 Karl Berry (karl@hayley)

* Version 0.6.

* Change copyrights to `1985, 89, ...'.

* regex.h (REG_RE_MATCH_NULL_AT_END): new macro.
* xregex.c (re_compile_fastmap): initialize `can_be_null' to
`p==pend', instead of in the test at the top of the loop (as
it was, it was always being set).
Also, set `can_be_null'=1 if we would jump to the end of the
pattern in the `on_failure_jump' cases.
(re_search_2): check if `can_be_null' is 1, not nonzero. This
was the original test in rms' regex; why did we change this?

* xregex.c (re_compile_fastmap): rename `is_a_succeed_n' to
`succeed_n_p'.

Sat May 30 08:09:08 1992 Karl Berry (karl@hayley)

* xregex.c (re_compile_pattern): declare `regnum' as `unsigned',
not `regnum_t', for the benefit of those patterns with more
than 255 groups.

* xregex.c: rename `failure_stack' to `fail_stack', for brevity;
likewise for `match_nothing' to `match_null'.

* regexinc.c (REGEX_REALLOCATE): take both the new and old
sizes, and copy only the old bytes.
* xregex.c (DOUBLE_FAILURE_STACK): pass both old and new.
* This change from Thorsten Ohl.

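The REGEX_REALLOCATE entry above describes growing a block by allocating a new one and copying only the bytes the old block holds. A rough sketch of that idea follows. It is not the macro from regexinc.c: the function name `grow_copy' is hypothetical, and malloc is used here in place of alloca so the example is portable.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not the REGEX_REALLOCATE macro itself: allocate
   a fresh block of NEW_SIZE bytes and copy only the OLD_SIZE bytes that
   the source block actually holds.  The caller still owns the old
   block and must free the returned one.  */
void *
grow_copy (const void *old, size_t old_size, size_t new_size)
{
  void *fresh;

  if (new_size < old_size)
    return NULL;		/* this sketch only grows */
  fresh = malloc (new_size);
  if (fresh != NULL && old_size > 0)
    memcpy (fresh, old, old_size);
  return fresh;
}
```

Copying NEW_SIZE bytes instead would read past the end of the old block, which is presumably why both sizes are passed.
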
Fri May 29 11:45:22 1992 Karl Berry (karl@hayley)

* regexinc.c (SIGN_EXTEND_CHAR): define as `(signed char) c'
instead of relying on __CHAR_UNSIGNED__, to work with
compilers other than GCC. From Per Bothner.

* main.c (main): change return type to `int'.

Mon May 18 06:37:08 1992 Karl Berry (karl@hayley)

* regex.h (RE_SYNTAX_AWK): typo in RE_RE_UNMATCHED...

Fri May 15 10:44:46 1992 Karl Berry (karl@hayley)

* Version 0.5.

Sun May 3 13:54:00 1992 Karl Berry (karl@hayley)

* regex.h (struct re_pattern_buffer): now it's just `regs_allocated'.
(REGS_UNALLOCATED, REGS_REALLOCATE, REGS_FIXED): new constants.
* xregex.c (regexec, re_compile_pattern): set the field appropriately.
(re_match_2): and use it. bufp can't be const any more.

Fri May 1 15:43:09 1992 Karl Berry (karl@hayley)

* regexinc.c: unconditionally include <sys/types.h>, first.

* regex.h (struct re_pattern_buffer): rename
`caller_allocated_regs' to `regs_allocated_p'.
* xregex.c (re_compile_pattern): same change here.
(regexec): and here.
(re_match_2): reallocate registers if necessary.

Fri Apr 10 07:46:50 1992 Karl Berry (karl@hayley)

* regex.h (RE_SYNTAX{_POSIX,}_AWK): new definitions from Arnold.

Sun Mar 15 07:34:30 1992 Karl Berry (karl at hayley)

* GNUmakefile (dist): versionize regex.{c,h,texinfo}.

Tue Mar 10 07:05:38 1992 Karl Berry (karl at hayley)

* Version 0.4.

* xregex.c (PUSH_FAILURE_POINT): always increment the failure id.
(DEBUG_STATEMENT) [DEBUG]: execute the statement even if `debug'==0.

* xregex.c (pop_failure_point): if the saved string location is
null, keep the current value.
(re_match_2): at fail, test for a dummy failure point by
checking the restored pattern value, not string value.
(re_match_2): new case, `on_failure_keep_string_jump'.
(regex_compile): output this opcode in the .*\n case.
* regexinc.c (re_opcode_t): define the opcode.
(partial_compiled_pattern_printer): add the new case.

Mon Mar 9 09:09:27 1992 Karl Berry (karl at hayley)

* xregex.c (regex_compile): optimize .*\n to output an
unconditional jump to the ., instead of pushing failure points
each time through the loop.

* xregex.c (DOUBLE_FAILURE_STACK): compute the maximum size
ourselves (and correctly); change callers.

Sun Mar 8 17:07:46 1992 Karl Berry (karl at hayley)

* xregex.c (failure_stack_elt_t): change to `const char *', to
avoid warnings.

* regex.h (re_set_syntax): declare this.

* xregex.c (pop_failure_point) [DEBUG]: conditionally pass the
original strings and sizes; change callers.

Thu Mar 5 16:35:35 1992 Karl Berry (karl at claude.cs.umb.edu)

* xregex.c (regnum_t): new type for register/group numbers.
(compile_stack_elt_t, regex_compile): use it.

* xregex.c (regexec): declare len as `int' to match re_search.

* xregex.c (re_match_2): don't declare p1 twice.

* xregex.c: change `while (1)' to `for (;;)' to avoid silly
compiler warnings.

* regex.h [__STDC__]: use #if, not #ifdef.

* regexinc.c (REGEX_REALLOCATE): cast the result of alloca to
(char *), to avoid warnings.

* xregex.c (regerror): declare variable as const.

* xregex.c (re_compile_pattern, re_comp): define as returning a const
char *.
* regex.h (re_compile_pattern, re_comp): likewise.

Thu Mar 5 15:57:56 1992 Karl Berry (karl@hal)

* xregex.c (regcomp): declare `syntax' as unsigned.

* xregex.c (re_match_2): try to avoid compiler warnings about
unsigned comparisons.

* GNUmakefile (test-xlc): new target.

* regex.h (reg_errcode_t): remove trailing comma from definition.
* regexinc.c (re_opcode_t): likewise.

Thu Mar 5 06:56:07 1992 Karl Berry (karl at hayley)

* GNUmakefile (dist): add version numbers automatically.
(versionfiles): new variable.
(regex.{c,texinfo}): don't add version numbers here.
* regex.h: put in placeholder instead of the version number.

Fri Feb 28 07:11:33 1992 Karl Berry (karl at hayley)

* xregex.c (re_error_msg): declare const, since it is.

Sun Feb 23 05:41:57 1992 Karl Berry (karl at fosse)

* xregex.c (PAT_PUSH{,_2,_3}, ...): cast args to avoid warnings.
(regex_compile, regexec): return REG_NOERROR, instead
of 0, on success.
(boolean): define as char, and #define false and true.
* regexinc.c (STREQ): cast the result.

Sun Feb 23 07:45:38 1992 Karl Berry (karl at hayley)

* GNUmakefile (test-cc, test-hc, test-pcc): new targets.

* regexinc.c (extract_number, extract_number_and_incr) [DEBUG]:
only define if we are debugging.

* xregex.c [_AIX]: do #pragma alloca first if necessary.
* regexinc.c [_AIX]: remove the #pragma from here.

* regex.h (reg_syntax_t): declare as unsigned, and redo the enum
as #define's again. Some compilers do stupid things with enums.

Thu Feb 20 07:19:47 1992 Karl Berry (karl at hayley)

* Version 0.3.

* xregex.c, regex.h (newline_anchor_match_p): rename to
`newline_anchor'; dumb idea to change the name.

Tue Feb 18 07:09:02 1992 Karl Berry (karl at hayley)

* regexinc.c: go back to original, i.e., don't include
<string.h> or define strchr.
* xregex.c (regexec): don't bother with adding characters after
newlines to the fastmap; instead, just don't use a fastmap.
* xregex.c (regcomp): set the buffer and fastmap fields to zero.

* xregex.texinfo (GNU r.e. compiling): have to initialize more
than two fields.

* regex.h (struct re_pattern_buffer): rename `newline_anchor' to
`newline_anchor_match_p', as we're back to two cases.
* xregex.c (regcomp, re_compile_pattern, re_comp): change
accordingly.
(re_match_2): at begline and endline, POSIX is not a special
case anymore; just check newline_anchor_match_p.

Thu Feb 13 16:29:33 1992 Karl Berry (karl at hayley)

* xregex.c (*empty_string*): rename to *null_string*, for brevity.

Wed Feb 12 06:36:22 1992 Karl Berry (karl at hayley)

* xregex.c (re_compile_fastmap): at endline, don't set fastmap['\n'].
(re_match_2): rewrite the begline/endline cases to take account
of the new field newline_anchor.

Tue Feb 11 14:34:55 1992 Karl Berry (karl at hayley)

* regexinc.c [!USG etc.]: include <strings.h> and define strchr
as index.

* xregex.c (re_search_2): when searching backwards, declare `c'
as a char and use casts when using it as an array subscript.

* xregex.c (regcomp): if REG_NEWLINE, set
RE_HAT_LISTS_NOT_NEWLINE. Set the `newline_anchor' field
appropriately.
(regex_compile): compile [^...] as matching a \n according to
the syntax bit.
(regexec): if doing REG_NEWLINE stuff, compile a fastmap and add
characters after any \n's to the fastmap.
* regex.h (RE_HAT_LISTS_NOT_NEWLINE): new syntax bit.
(struct re_pattern_buffer): rename `posix_newline' to
`newline_anchor', define constants for its values.

Mon Feb 10 07:22:50 1992 Karl Berry (karl at hayley)

* xregex.c (re_compile_fastmap): combine the code at the top and
bottom of the loop, as it's essentially identical.

Sun Feb 9 10:02:19 1992 Karl Berry (karl at hayley)

* xregex.texinfo (POSIX Translate Tables): remove this, as it
doesn't match the spec.

* xregex.c (re_compile_fastmap): if we finish off a path, go
back to the top (to set can_be_null) instead of returning
immediately.

* xregex.texinfo: changes from bob.

Sat Feb 1 07:03:25 1992 Karl Berry (karl at hayley)

* xregex.c (re_search_2): doc fix (from rms).

Fri Jan 31 09:52:04 1992 Karl Berry (karl at hayley)

* xregex.texinfo (GNU Searching): clarify the range arg.

* xregex.c (re_match_2, at_endline_op_p): add extra parens to
get rid of GCC 2's (silly, IMHO) warning about && within ||.

* xregex.c (common_op_match_empty_string_p): use
MATCH_NOTHING_UNSET_VALUE, not -1.

Thu Jan 16 08:43:02 1992 Karl Berry (karl at hayley)

* xregex.c (SET_REGS_MATCHED): only set the registers from
lowest to highest.

* regexinc.c (MIN): new macro.
* xregex.c (re_match_2): only check min (num_regs,
regs->num_regs) when we set the returned regs.

* xregex.c (re_match_2): set registers after the first
num_regs to -1 before we return.

Tue Jan 14 16:01:42 1992 Karl Berry (karl at hayley)

* xregex.c (re_match_2): initialize max (RE_NREGS, re_nsub + 1)
registers (from rms).

* xregex.c, regex.h: don't abbreviate `19xx' to `xx'.

* regexinc.c [!emacs]: include <sys/types.h> before <unistd.h>.
(from ro@thp.Uni-Koeln.DE).

Thu Jan 9 07:23:00 1992 Karl Berry (karl at hayley)

* xregex.c (*unmatchable): rename to `match_empty_string_p'.
(CAN_MATCH_NOTHING): rename to `REG_MATCH_EMPTY_STRING_P'.

* regexinc.c (malloc, realloc): remove prototypes, as they can
cause clashes (from rms).

Mon Jan 6 12:43:24 1992 Karl Berry (karl at claude.cs.umb.edu)

* Version 0.2.

Sun Jan 5 10:50:38 1992 Karl Berry (karl at hayley)

* xregex.texinfo: bring more or less up-to-date.
* GNUmakefile (regex.texinfo): generate from regex.h and
xregex.texinfo.
* include.awk: new file.

* xregex.c: change all calls to the fn extract_number_and_incr
to the macro.

* xregex.c (re_match_2) [emacs]: in at_dot, use PTR_CHAR_POS + 1,
instead of bf_* and sl_*. Cast d to unsigned char *, to match
the declaration in Emacs' buffer.h.
[emacs19]: in before_dot, at_dot, and after_dot, likewise.

* regexinc.c: unconditionally include <sys/types.h>.

* regexinc.c (alloca) [!alloca]: Emacs config files sometimes
define this, so don't define it if it's already defined.

Sun Jan 5 06:06:53 1992 Karl Berry (karl at fosse)

* xregex.c (re_comp): fix type conflicts with regex_compile (we
haven't been compiling this).

* regexinc.c (SIGN_EXTEND_CHAR): use `__CHAR_UNSIGNED__', not
`CHAR_UNSIGNED'.

* regexinc.c (NULL) [!NULL]: define it (as zero).

* regexinc.c (extract_number): remove the temporaries.

Sun Jan 5 07:50:14 1992 Karl Berry (karl at hayley)

* regex.h (regerror) [!__STDC__]: return a size_t, not a size_t *.

* xregex.c (PUSH_FAILURE_POINT, ...): declare `destination' as
`char *' instead of `void *', to match alloca declaration.

* xregex.c (regerror): use `size_t' for the intermediate values
as well as the return type.

* xregex.c (regexec): cast the result of malloc.

* xregex.c (regexec): don't initialize `private_preg' in the
declaration, as old C compilers can't do that.

* xregex.c (main) [test]: declare printchar void.

* xregex.c (assert) [!DEBUG]: define this to do nothing, and
remove #ifdef DEBUG's from around asserts.

* xregex.c (re_match_2): remove error message when not debugging.

Sat Jan 4 09:45:29 1992 Karl Berry (karl at hayley)

* other.c: test the bizarre duplicate case in re_compile_fastmap
that I just noticed.

* test.c (general_test): don't test registers beyond the end of
correct_regs, as well as regs.

* xregex.c (regex_compile): at handle_close, don't assign to
*inner_group_loc if we didn't push a start_memory (because the
group number was too big). In fact, don't push or pop the
inner_group_offset in that case.

* regex.c: rename to xregex.c, since it's not the whole thing.
* regex.texinfo: likewise.
* GNUmakefile: change to match.

* regex.c [DEBUG]: only include <stdio.h> if debugging.

* regexinc.c (SIGN_EXTEND_CHAR) [CHAR_UNSIGNED]: if it's already
defined, don't redefine it.

* regex.c: define _GNU_SOURCE at the beginning.
* regexinc.c (isblank) [!isblank]: define it.
(isgraph) [!isgraph]: change conditional to this, and remove the
sequent stuff.

* regex.c (regex_compile): add `blank' character class.

* regex.c (regex_compile): don't use a uchar variable to loop
through all characters.

* regex.c (regex_compile): at '[', improve logic for checking
that we have enough space for the charset.

* regex.h (struct re_pattern_buffer): declare translate as
`char *' again. We only use it as an array subscript once, I think.

* regex.c (TRANSLATE): new macro to cast the data character
before subscripting.
(num_internal_regs): rename to `num_regs'.

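Note on the TRANSLATE entry above: the cast matters because plain `char' may be signed, so a data byte of 0x80 or more would otherwise become a negative subscript. The following is only a sketch of the idea, not the macro from regex.c; the two-argument form, `init_fold_table', and `check_translate' are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the TRANSLATE macro named in the entry:
   cast the data character to unsigned char before subscripting, so a
   byte >= 0x80 cannot turn into a negative index where char is signed.
   A null table means "no translation".  */
#define TRANSLATE(table, c) \
  ((table) != NULL ? (table)[(unsigned char) (c)] : (c))

/* Build a 256-entry table that folds ASCII upper case to lower case
   and maps every other byte to itself.  The counter is an int: an
   unsigned char counter could never reach 256 and end the loop.  */
static void
init_fold_table (char table[256])
{
  int i;

  for (i = 0; i < 256; i++)
    table[i] = (char) i;
  for (i = 'A'; i <= 'Z'; i++)
    table[i] = (char) (i - 'A' + 'a');
}

/* Exercise the macro with an ASCII letter, a high byte, and no table.  */
static void
check_translate (void)
{
  char table[256];
  const char *fold = table;
  const char *none = NULL;
  char high = (char) 0xE9;

  init_fold_table (table);
  assert (TRANSLATE (fold, 'Q') == 'q');
  assert (TRANSLATE (fold, high) == high);
  assert (TRANSLATE (none, 'Q') == 'Q');
}
```

Calling `check_translate ()' from a test driver runs the three checks; without the cast, the second one would read before the start of the table on machines where char is signed.
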
Fri Jan 3 07:58:01 1992 Karl Berry (karl at hayley)

* regex.h (struct re_pattern_buffer): declare `allocated' and
`used' as unsigned long, since these are never negative.

* regex.c (compile_stack_element): rename to compile_stack_elt_t.
(failure_stack_element): similarly.

* regexinc.c (TALLOC, RETALLOC): new macros to simplify
allocation of arrays.

* regex.h (re_*) [__STDC__]: don't declare string args unsigned
char *; that makes them incompatible with string constants.
(struct re_pattern_buffer): declare the pattern and translate
table as unsigned char *.
* regex.c (most routines): use unsigned char vs. char consistently.

* regex.h (re_compile_pattern): do not declare the length arg as
const.
* regex.c (re_compile_pattern): likewise.

* regex.c (POINTER_TO_REG): rename to `POINTER_TO_OFFSET'.

* regex.h (re_registers): declare `start' and `end' as
`regoff_t', instead of `int'.

* regex.c (regexec): if either of the malloc's for the register
information fail, return failure.

* regex.h (RE_NREGS): define this again, as 30 (from jla).
(RE_ALLOCATE_REGISTERS): remove this.
(RE_SYNTAX_*): remove it from definitions.
(re_pattern_buffer): remove `return_default_num_regs', add
`caller_allocated_regs'.
* regex.c (re_compile_pattern): clear no_sub and
caller_allocated_regs in the pattern.
(regcomp): set caller_allocated_regs.
(re_match_2): do all register allocation at the end of the
match; implement new semantics.

* regex.c (MAX_REGNUM): new macro.
(regex_compile): at handle_open and handle_close, if the group
number is too large, don't push the start/stop memory.

Thu Jan 2 07:56:10 1992 Karl Berry (karl at hayley)

* regex.c (re_match_2): if the back reference is to a group that
never matched, then goto fail, not really_fail. Also, don't
test if the pattern can match the empty string. Why did we
ever do that?
(really_fail): this label no longer needed.

* regexinc.c [STDC_HEADERS]: use only this to test if we should
include <stdlib.h>.

* regex.c (DO_RANGE, regex_compile): translate in all cases
except the single character after a \.

* regex.h (RE_AWK_CLASS_HACK): rename to
RE_BACKSLASH_ESCAPE_IN_LISTS.
* regex.c (regex_compile): change use.

* regex.c (re_compile_fastmap): do not translate the characters
again; we already translated them at compilation. (From ylo@ngs.fi.)

* regex.c (re_match_2): in case for at_dot, invert sense of
comparison and find the character number properly. (From
worley@compass.com.)
(re_match_2) [emacs]: remove the cases for before_dot and
after_dot, since there's no way to specify them, and the code
is wrong (judging from this change).

Wed Jan 1 09:13:38 1992 Karl Berry (karl at hayley)

* psx-{interf,basic,extend}.c, other.c: set `t' as the first
thing, so that if we run them in succession, general_test's
kludge to see if we're doing POSIX tests works.

* test.h (test_type): add `all_test'.
* main.c: add case for `all_test'.

* regexinc.c (partial_compiled_pattern_printer,
double_string_printer): don't print anything if we're passed null.

* regex.c (PUSH_FAILURE_POINT): do not scan for the highest and
lowest active registers.
(re_match_2): compute lowest/highest active regs at start_memory and
stop_memory.
(NO_{LOW,HIGH}EST_ACTIVE_REG): new sentinel values.
(pop_failure_point): return the lowest/highest active reg values
popped; change calls.

* regex.c [DEBUG]: include <assert.h>.
(various routines) [DEBUG]: change conditionals to assertions.

* regex.c (DEBUG_STATEMENT): new macro.
(PUSH_FAILURE_POINT): use it to increment num_regs_pushed.
(re_match_2) [DEBUG]: only declare num_regs_pushed if DEBUG.

* regex.c (*can_match_nothing): rename to *unmatchable.

* regex.c (re_match_2): at stop_memory, adjust argument reading.

* regex.h (re_pattern_buffer): declare `can_be_null' as a 2-bit
bit field.

* regex.h (re_pattern_buffer): declare `buffer' unsigned char *;
no, dumb idea. The pattern can have signed number.

* regex.c (re_match_2): in maybe_pop_jump case, skip over the
right number of args to the group operators, and don't do
anything with endline if posix_newline is not set.

* regex.c, regexinc.c (all the things we just changed): go back
to putting the inner group count after the start_memory,
because we need it in the on_failure_jump case in re_match_2.
But leave it after the stop_memory also, since we need it
there in re_match_2, and we don't have any way of getting back
to the start_memory.

* regexinc.c (partial_compiled_pattern_printer): adjust argument
reading for start/stop_memory.
* regex.c (re_compile_fastmap, group_can_match_nothing): likewise.

Tue Dec 31 10:15:08 1991 Karl Berry (karl at hayley)

* regex.c (bits list routines): remove these.
(re_match_2): get the number of inner groups from the pattern,
instead of keeping track of it at start and stop_memory.
Put the count after the stop_memory, not after the
start_memory.
(compile_stack_element): remove `fixup_inner_group' member,
since we now put it in when we can compute it.
(regex_compile): at handle_open, don't push the inner group
offset, and at handle_close, don't pop it.

* regex.c (level routines): remove these, and their uses in
regex_compile. This was another manifestation of having to find
$'s that were endlines.

* regex.c (regexec): this does searching, not matching (a
well-disguised part of the standard). So rewrite to use
`re_search' instead of `re_match'.
* psx-interf.c (test_regexec): add tests to, uh, match.

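Note on the regexec entry above: the search semantics can be observed through the standard POSIX interface, where an unanchored pattern is found anywhere in the string rather than only at its start. This is a small sketch against the system <regex.h>, not code from this package's test suite; `first_match_offset' is a hypothetical helper.

```c
#include <assert.h>
#include <regex.h>

/* Return the offset at which PATTERN (POSIX basic syntax) first
   matches in TEXT, or -1 if regexec finds no match or the pattern
   does not compile.  */
static int
first_match_offset (const char *pattern, const char *text)
{
  regex_t preg;
  regmatch_t whole[1];
  int offset = -1;

  if (regcomp (&preg, pattern, 0) != 0)
    return -1;

  /* regexec searches: the match may begin anywhere in TEXT.  */
  if (regexec (&preg, text, 1, whole, 0) == 0)
    offset = (int) whole[0].rm_so;

  regfree (&preg);
  return offset;
}
```

With these semantics `first_match_offset ("b", "abc")' finds the match at offset 1, while the anchored pattern "^b" finds nothing in the same string.
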
* regex.h (RE_TIGHT_ALT): remove this; nobody uses it.
* regex.c: remove the code that was supposed to implement it.

* other.c (test_others): ^ and $ never match newline characters;
RE_CONTEXT_INVALID_OPS doesn't affect anchors.

* psx-interf.c (test_regerror): update for new error messages.

* psx-extend.c: it's now ok to have an alternative be just a $,
so remove all the tests which supposed that was invalid.

Wed Dec 25 09:00:05 1991 Karl Berry (karl at hayley)

* regex.c (regex_compile): in handle_open, don't skip over ^ and
$ when checking for an empty group. POSIX has changed the
grammar.
* psx-extend.c (test_posix_extended): thus, move (^$) tests to
valid section.

* regexinc.c (boolean): move from here to test.h and regex.c.
* test files: declare verbose, omit_register_tests, and
test_should_match as boolean.

* psx-interf.c (test_posix_c_interface): remove the `c_'.
* main.c: likewise.

* psx-basic.c (test_posix_basic): ^ ($) is an anchor after
(before) an open (close) group.

* regex.c (re_match_2): in endline, correct precedence of
posix_newline condition.

Tue Dec 24 06:45:11 1991 Karl Berry (karl at hayley)

* test.h: incorporate private-tst.h.
* test files: include test.h, not private-tst.h.

* test.c (general_test): set posix_newline to zero if we are
doing POSIX tests (unfortunately, it's difficult to call
regcomp in this case, which is what we should really be doing).

* regex.h (reg_syntax_t): make this an enumeration type which
defines the syntax bits; renames re_syntax_t.

* regex.c (at_endline_op_p): don't preincrement p; then if it's
not an empty string op, we lose.

* regex.h (reg_errcode_t): new enumeration type of the error
codes.
* regex.c (regex_compile): return that type.

* regex.c (regex_compile): in [, initialize
just_had_a_char_class to false; somehow I had changed this to
true.

* regex.h (RE_NO_CONSECUTIVE_REPEATS): remove this, since we
don't use it, and POSIX doesn't require this behavior anymore.
* regex.c (regex_compile): remove it from here.

* regex.c (regex_compile): remove the no_op insertions for
verify_and_adjust_endlines, since that doesn't exist anymore.

* regex.c (regex_compile) [DEBUG]: use printchar to print the
pattern, so unprintable bytes will print properly.

* regex.c: move re_error_msg back.
* test.c (general_test): print the compile error if the pattern
was invalid.

Mon Dec 23 08:54:53 1991 Karl Berry (karl at hayley)

* regexinc.c: move re_error_msg here.

* regex.c (re_error_msg): the ``message'' for success must be
NULL, to keep the interface to re_compile_pattern the same.
(regerror): if the msg is null, use "Success".

* rename most test files for consistency. Change Makefile
correspondingly.

* test.c (most routines): add casts to (unsigned char *) when we
call re_{match,search}{,_2}.

Sun Dec 22 09:26:06 1991 Karl Berry (karl at hayley)

* regex.c (re_match_2): declare string args as unsigned char *
again; don't declare non-pointer args const; declare the
pattern buffer const.
(re_match): likewise.
(re_search_2, re_search): likewise, except don't declare the
pattern const, since we make a fastmap.
* regex.h [__STDC__]: change prototypes.

* regex.c (regex_compile): return an error code, not a string.
(re_err_list): new table to map from error codes to string.
(re_compile_pattern): return an element of re_err_list.
(regcomp): don't test all the strings.
(regerror): just use the list.
(put_in_buffer): remove this.

* regex.c (equivalent_failure_points): remove this.

* regex.c (re_match_2): don't copy the string arguments into
non-const pointers. We never alter the data.

* regex.c (re_match_2): move assignment to `is_a_jump_n' out of
the main loop. Just initialize it right before we do
something with it.

* regex.[ch] (re_match_2): don't declare the int parameters const.

Sat Dec 21 08:52:20 1991 Karl Berry (karl at hayley)

* regex.h (re_syntax_t): new type; declare to be unsigned
(previously we used int, but since we do bit operations on
this, unsigned is better, according to H&S).
(obscure_syntax, re_pattern_buffer): use that type.
* regex.c (re_set_syntax, regex_compile): likewise.

* regex.h (re_pattern_buffer): new field `posix_newline'.
* regex.c (re_comp, re_compile_pattern): set to zero.
(regcomp): set to REG_NEWLINE.
* regex.h (RE_HAT_LISTS_NOT_NEWLINE): remove this (we can just
check `posix_newline' instead.)

* regex.c (op_list_type, op_list, add_op): remove these.
(verify_and_adjust_endlines): remove this.
(pattern_offset_list_type, *pattern_offset* routines): and these.
These things all implemented the nonleading/nontrailing position
code, which was very long, had a few remaining problems, and
is no longer needed. So...

* regexinc.c (STREQ): new macro to abbreviate strcmp(,)==0, for
brevity. Change various places in regex.c to use it.

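Note on the STREQ entry above: a macro of this shape is the usual way to abbreviate the comparison. The definition below is an assumed form, since the log does not show the actual one, and `is_known_class' is a hypothetical caller invented for illustration.

```c
#include <assert.h>
#include <string.h>

/* Assumed shape of a STREQ-style macro: nonzero when the two
   strings have identical contents.  */
#define STREQ(s1, s2) (strcmp (s1, s2) == 0)

/* Hypothetical caller: recognize a few character class names of the
   kind that appear inside [: and :] in a bracket expression.  */
static int
is_known_class (const char *name)
{
  return STREQ (name, "alpha") || STREQ (name, "digit")
         || STREQ (name, "blank");
}
```

Compared with writing `strcmp (name, "alpha") == 0' at each call site, the macro keeps a chain of such tests readable and avoids the common slip of forgetting the `== 0'.
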
* regex{,inc}.c (enum regexpcode): change to a typedef
re_opcode_t, for brevity.

* regex.h (re_syntax_table) [SYNTAX_TABLE]: remove this; it
should only be in regex.c, I think, since we don't define it
in this case. Maybe it should be conditional on !SYNTAX_TABLE?

* regexinc.c (partial_compiled_pattern_printer): simplify and
distinguish the emacs/not-emacs (not)wordchar cases.

Fri Dec 20 08:11:38 1991 Karl Berry (karl at hayley)

* regexinc.c (regexpcode) [emacs]: only define the Emacs opcodes
if we are ifdef emacs.

* regex.c (BUF_PUSH*): rename to PAT_PUSH*.

* regex.c (regex_compile): in $ case, go back to essentially the
original code for deciding endline op vs. normal char.
(at_endline_op_p): new routine.
* regex.h (RE_ANCHORS_ONLY_AT_ENDS, RE_CONTEXT_INVALID_ANCHORS,
RE_REPEATED_ANCHORS_AWAY, RE_NO_ANCHOR_AT_NEWLINE): remove
these. POSIX has simplified the rules for anchors in draft
11.2.
(RE_NEWLINE_ORDINARY): new syntax bit.
(RE_CONTEXT_INDEP_ANCHORS): change description to be compatible
with POSIX.
* regex.texinfo (Syntax Bits): remove the descriptions.

Mon Dec 16 08:12:40 1991 Karl Berry (karl at hayley)

* regex.c (re_match_2): in jump_past_next_alt, unconditionally
goto no_pop. The only register we were finding was one which
enclosed the whole alternative expression, not one around an
individual alternative. So we were never doing what we
thought we were doing, and this way makes (|a) against the
empty string fail.

* regex.c (regex_compile): remove `highest_ever_regnum', and
don't restore regnum from the stack; just put it into a
temporary to put into the stop_memory. Otherwise, groups
aren't numbered consecutively.

* regex.c (is_in_compile_stack): rename to
`group_in_compile_stack'; remove unnecessary test for the
stack being empty.

* regex.c (re_match_2): in on_failure_jump, skip no_op's before
checking for the start_memory, in case we were called from
succeed_n.

Sun Dec 15 16:20:48 1991 Karl Berry (karl at hayley)

* regex.c (regex_compile): in duplicate case, use
highest_ever_regnum instead of regnum, since the latter is
reverted at stop_memory.

* regex.c (re_match_2): in on_failure_jump, if the * applied to
a group, save the information for that group and all inner
groups (by making it active), even though we're not inside it
yet.

Sat Dec 14 09:50:59 1991 Karl Berry (karl at hayley)

* regex.c (PUSH_FAILURE_ITEM, POP_FAILURE_ITEM): new macros.
Use them instead of copying the stack manipulating a zillion
times.

* regex.c (PUSH_FAILURE_POINT, pop_failure_point) [DEBUG]: save
and restore a unique identification value for each failure point.

* regexinc.c (partial_compiled_pattern_printer): don't print an
extra / after duplicate commands.

* regex.c (regex_compile): in back-reference case, allow a back
reference to register `regnum'. Otherwise, even `\(\)\1'
fails, since regnum is 1 at the back-reference.

* regex.c (re_match_2): in fail, don't examine the pattern if we
restored to pend.

* test_private.h: rename to private_tst.h. Change includes.

* regex.c (extend_bits_list): compute existing size for realloc
in bytes, not blocks.

* regex.c (re_match_2): in jump_past_next_alt, the for loop was
missing its (empty) statement. Even so, some register tests
still fail, although in a different way than in the previous change.

Fri Dec 13 15:55:08 1991 Karl Berry (karl at hayley)

* regex.c (re_match_2): in jump_past_next_alt, unconditionally
goto no_pop, since we weren't properly detecting if the
alternative matched something anyway. No, we need to not jump
to keep the register values correct; just change to not look at
register zero and not test RE_NO_EMPTY_ALTS (which is a
compile-time thing).

* regex.c (SET_REGS_MATCHED): start the loop at 1, since we never
care about register zero until the very end. (I think.)

* regex.c (PUSH_FAILURE_POINT, pop_failure_point): go back to
pushing and popping the active registers, instead of only doing
the registers before a group: (fooq|fo|o)*qbar against fooqbar
fails, since we restore back into the middle of group 1, yet it
isn't active, because the previous restore clobbered the active flag.

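Note on the PUSH_FAILURE_POINT entry above: the failing case makes a compact regression check, since the only way to match is `fo', then `o', then `qbar', which requires backing out of the first alternative. The sketch below assumes the pattern is meant in POSIX extended syntax and uses the system <regex.h> rather than this package's test harness; `matches_whole' is a hypothetical helper.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Return 1 if PATTERN (POSIX extended syntax) matches all of TEXT,
   and 0 if it matches only part of it, does not match, or does not
   compile.  */
static int
matches_whole (const char *pattern, const char *text)
{
  regex_t preg;
  regmatch_t whole[1];
  int ok = 0;

  if (regcomp (&preg, pattern, REG_EXTENDED) != 0)
    return 0;

  if (regexec (&preg, text, 1, whole, 0) == 0)
    ok = (whole[0].rm_so == 0
          && (size_t) whole[0].rm_eo == strlen (text));

  regfree (&preg);
  return ok;
}
```

A matcher that loses register state while backtracking would report no match for `matches_whole ("(fooq|fo|o)*qbar", "fooqbar")', which is the symptom the entry describes.
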
Thu Dec 12 17:25:36 1991 Karl Berry (karl at hayley)

* regex.c (PUSH_FAILURE_POINT): do not call
`equivalent_failure_points' after all; it causes the registers
to be ``wrong'' (according to POSIX), and an infinite loop on
`((a*)*)*' against `ab'.

* regex.c (re_compile_fastmap): don't push `pend' on the failure
stack.

Tue Dec 10 10:30:03 1991 Karl Berry (karl at hayley)

* regex.c (PUSH_FAILURE_POINT): if pushing same failure point that
is on the top of the stack, fail.
(equivalent_failure_points): new routine.

* regex.c (re_match_2): add debug statements for every opcode we
execute.

* regex.c (regex_compile/handle_close): restore
`fixup_inner_group_count' and `regnum' from the stack.

Mon Dec 9 13:51:15 1991 Karl Berry (karl at hayley)

* regex.c (PUSH_FAILURE_POINT): declare `this_reg' as int, so
unsigned arithmetic doesn't happen when we don't want to save
the registers.

Tue Dec 3 08:11:10 1991 Karl Berry (karl at hayley)

* regex.c (extend_bits_list): divide size by bits/block.

* regex.c (init_bits_list): remove redundant assignment to
`bits_list_ptr'.

* regexinc.c (partial_compiled_pattern_printer): don't do *p++
twice in the same expr.

* regex.c (re_match_2): at on_failure_jump, use the correct
pattern positions for getting the stuff following the start_memory.

* regex.c (struct register_info): remove the bits_list for the
inner groups; make that a separate variable.

Mon Dec 2 10:42:07 1991 Karl Berry (karl at hayley)

* regex.c (PUSH_FAILURE_POINT): don't pass `failure_stack' as an
arg; change callers.

* regex.c (PUSH_FAILURE_POINT): print items in order they are
pushed.
(pop_failure_point): likewise.

* regex.c (main): prompt for the pattern and string.

* regex.c (FREE_VARIABLES) [!REGEX_MALLOC]: declare as nothing;
remove #ifdefs from around calls.

* regex.c (extract_number, extract_number_and_incr): declare static.

* regex.c: remove the canned main program.
* main.c: new file.
* Makefile (COMMON): add main.o.

Tue Sep 24 06:26:51 1991 Kathy Hargreaves (kathy at fosse)

* regex.c (re_match_2): Made `pend' and `dend' not register variables.
Only set string2 to string1 if string1 isn't null.
Send address of p, d, regstart, regend, and reg_info to
pop_failure_point.
Put in more debug statements.

* regex.c [debug]: Added global variable.
(DEBUG_*PRINT*): Only print if `debug' is true.
(DEBUG_DOUBLE_STRING_PRINTER): Changed DEBUG_STRING_PRINTER's
name to this.
Changed some comments.
(PUSH_FAILURE_POINT): Moved and added some debugging statements.
Was saving regstart on the stack twice instead of saving both
regstart and regend; remedied this.
[NUM_REGS_ITEMS]: Changed from 3 to 4, as now save lowest and
highest active registers instead of highest used one.
[NUM_NON_REG_ITEMS]: Changed name of NUM_OTHER_ITEMS to this.
(NUM_FAILURE_ITEMS): Use active registers instead of number 0
through highest used one.
(re_match_2): Have pop_failure_point put things in the variables.
(pop_failure_point): Have it do what the fail case in re_match_2
did with the failure stack, instead of throwing away the stuff
popped off. re_match_2 can ignore results when it doesn't
need them.

Thu Sep 5 13:23:28 1991 Kathy Hargreaves (kathy at fosse)

* regex.c (banner): Changed copyright years to be separate.

* regex.c [CHAR_UNSIGNED]: Put __ at both ends of this name.
[DEBUG, debug_count, *debug_p, DEBUG_PRINT_1, DEBUG_PRINT_2,
DEBUG_COMPILED_PATTERN_PRINTER, DEBUG_STRING_PRINTER]:
defined these for debugging.
(extract_number): Added this (debuggable) routine version of
the macro EXTRACT_NUMBER. Ditto for EXTRACT_NUMBER_AND_INCR.
(re_compile_pattern): Set return_default_num_regs if the
syntax bit RE_ALLOCATE_REGISTERS is set.
[REGEX_MALLOC]: Renamed USE_ALLOCA to this.
(BUF_POP): Got rid of this, as don't ever use it.
(regex_compile): Made the type of `pattern' not be register.
If DEBUG, print the pattern to compile.
(re_match_2): If had a `$' in the pattern before a `^' then
don't record the `^' as an anchor.
Put (enum regexpcode) before references to b, as suggested
[RE_NO_BK_BRACES]: Changed RE_NO_BK_CURLY_BRACES to this.
(remove_pattern_offset): Removed this unused routine.
(PUSH_FAILURE_POINT): Changed to only save active registers.
Put in debugging statements.
(re_compile_fastmap): Made `pattern' not a register variable.
Use routine for extracting numbers instead of macro.
(re_match_2): Made `p', `mcnt' and `mcnt2' not register variables.
Added `num_regs_pushed' for debugging.
Only malloc registers if the syntax bit RE_ALLOCATE_REGISTERS is set.
Put in debug statements.
Put the macro NOTE_INNER_GROUP's code inline, as it was
only called in one place.
For debugging, extract numbers using routines instead of macros.
In case fail: only restore pushed active registers, and added
debugging statements.
(pop_failure_point): Test for underfull stack.
(group_can_match_nothing, common_op_can_match_nothing): For
debugging, extract numbers using routines instead of macros.
(regexec): Changed formal parameters to not be prototypes.
Don't initialize `regs' or `private_preg' in their declarations.

Tue Jul 23 18:38:36 1991 Kathy Hargreaves (kathy at hayley)

* regex.h [RE_CONTEXT_INDEP_OPS]: Moved the anchor stuff out of
this bit.
[RE_UNMATCHED_RIGHT_PAREN_ORD]: Defined this bit.
[RE_CONTEXT_INVALID_ANCHORS]: Defined this bit.
[RE_CONTEXT_INDEP_ANCHORS]: Defined this bit.
Added RE_CONTEXT_INDEP_ANCHORS to all syntaxes which had
RE_CONTEXT_INDEP_OPS.
Took RE_ANCHORS_ONLY_AT_ENDS out of the POSIX basic syntax.
Added RE_UNMATCHED_RIGHT_PAREN_ORD to the POSIX extended
syntax.
Took RE_REPEATED_ANCHORS_AWAY out of the POSIX extended syntax.
Defined REG_NOERROR (which will probably have to go away again).
Changed the type `off_t' to `regoff_t'.

* regex.c: Changed some comments.
(regex_compile): Added variable `had_an_endline' to keep track
of if hit a `$' since the beginning of the pattern or the last
alternative (if any).
Changed RE_CONTEXT_INVALID_OPS and RE_CONTEXT_INDEP_OPS to
RE_CONTEXT_INVALID_ANCHORS and RE_CONTEXT_INDEP_ANCHORS where
appropriate.
Put a `no_op' in the pattern if a repeat is only zero or one
times; in this case and if it is many times (whereupon a jump
backwards is pushed instead), keep track of the operator for
verify_and_adjust_endlines.
If RE_UNMATCHED_RIGHT_PAREN is set, make an unmatched
close-group operator match `)'.
Changed all error exits to exit (1).
(remove_pattern_offset): Added this routine, but don't use it.
(verify_and_adjust_endlines): At top of routine, if initialize
routines run out of memory, return true after setting
enough_memory false.
At end of endline, et al. case, don't set *p to no_op.
Repetition operators also set the level and active groups'
match statuses, unless RE_REPEATED_ANCHORS_AWAY is set.
(get_group_match_status): Put a return in front of call to get_bit.
(re_compile_fastmap): Changed is_a_succeed_n to a boolean.
If at end of pattern, then if the failure stack isn't empty,
go back to the failure point.
In *jump* case, only pop the stack if what's on top of it is
where we've just jumped to.
(re_search_2): Return -2 instead of val if val is -2.
(group_can_match_nothing, alternative_can_match_nothing,
common_op_can_match_nothing): Now pass in reg_info for the
`duplicate' case.
(re_match_2): Don't skip over the next alternative also if
empty alternatives aren't allowed.
In fail case, if failed to a backwards jump that's part of a
repetition loop, pop the current failure point and use the
next one.
(pop_failure_point): Check that there's as many register items
on the failure stack as the stack says there are.
(common_op_can_match_nothing): Added variables `ret' and
`reg_no' so can set reg_info for the group encountered.
Also break without doing anything if hit a no_op or the other
kinds of `endline's.
If not done already, set reg_info in start_memory case.
Put in no_pop_jump for an optimized succeed_n of zero repetitions.
In succeed_n case, if the number isn't zero, then return false.
Added `duplicate' case.

Sat Jul 13 11:27:38 1991 Kathy Hargreaves (kathy at hayley)

* regex.h (REG_NOERROR): Added this error code definition.

* regex.c: Took some redundant parens out of macros.
(enum regexpcode): Added jump_past_next_alt.
Wrapped some macros in `do..while (0)'.
Changed some comments.
(regex_compile): Use `fixup_alt_jump' instead of `fixup_jump'.
Use `maybe_pop_jump' instead of `maybe_pop_failure_jump'.
Use `jump_past_next_alt' instead of `no_pop_jump' when at the
end of an alternative.
(re_match_2): Used REGEX_ALLOCATE for the registers stuff.
In stop_memory case: Add more boolean tests to see if the
group is in a loop.
Added jump_past_next_alt case, which doesn't jump over the
next alternative if the last one didn't match anything.
Unfortunately, to make this work with, e.g., `(a+?*|b)*'
against `bb', I also had to pop the alternative's failure
point, which in turn broke backtracking!
In fail case: Detect a dummy failure point by looking at
failure_stack.avail - 2, not stack[-2].
(pop_failure_point): Only pop if the stack isn't empty; don't
give an error if it is. (Not sure yet this is correct.)
(group_can_match_nothing): Make it return a boolean instead of int.
Make it take an argument indicating the end of where it should look.
If find a group that can match nothing, set the pointer
argument to past the group in the pattern.
Took out cases which can share with alternative_can_match_nothing
and call common_op_can_match_nothing.
Took ++ out of switch, so could call common_op_can_match_nothing.
Wrote lots more for on_failure_jump case to handle alternatives.
Main loop now doesn't look for matching stop_memory, but
rather the argument END; return true if hit the matching
stop_memory; this way can call itself for inner groups.
(alternative_can_match_nothing): Added for alternatives.
(common_op_can_match_nothing): Added for previous two routines'
common operators.
(regerror): Returns a message saying there's no error if gets
sent REG_NOERROR.

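Note on the `do..while (0)' change in the entry above: the wrapper lets a multi-statement macro behave as one statement, so it can sit in an unbraced `if' that has an `else'. The macro and function below are hypothetical illustrations, not the macros from regex.c.

```c
#include <assert.h>

/* Hypothetical two-statement macro.  With bare braces instead of
   do..while (0), the semicolon after the call would end the `if'
   statement and leave the `else' below dangling.  */
#define PUSH_ITEM(stack, avail, item)   \
  do                                    \
    {                                   \
      (stack)[(avail)] = (item);        \
      (avail)++;                        \
    }                                   \
  while (0)

/* Push VALUE if it is positive; otherwise count it as rejected.
   The macro call is the whole `if' branch, with no extra braces.  */
static void
push_positive (int *stack, int *avail, int *rejected, int value)
{
  if (value > 0)
    PUSH_ITEM (stack, *avail, value);
  else
    (*rejected)++;
}
```

Pushing 5, -1, and 7 through `push_positive' leaves two items on the stack and one rejection, which shows that both statements of the macro stay inside the `if' branch.
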
Wed Jul 3 10:43:15 1991 Kathy Hargreaves (kathy at hayley)

* regex.c: Removed unnecessary enclosing parens from several macros.
Put `do..while (0)' around a few.
Corrected some comments.
(INIT_FAILURE_STACK_SIZE): Deleted in favor of using
INIT_FAILURE_ALLOC.
(INIT_FAILURE_STACK, DOUBLE_FAILURE_STACK, PUSH_PATTERN_OP,
PUSH_FAILURE_POINT): Made routines of the same name (but with all
lowercase letters) into these macros, so could use `alloca'
when USE_ALLOCA is defined. The reason is stated below for
bits lists. Deleted analogous routines.
(re_compile_fastmap): Added variable void *destination for
PUSH_PATTERN_OP.
(re_match_2): Added variable void *destination for REGEX_REALLOCATE.
Used the failure stack macros in place of the routines.
Detected a dummy failure point by inspecting the failure stack's
(avail - 2)th element, not failure_stack.stack[-2]. This bug
arose when used the failure stack macros instead of the routines.

* regex.c [USE_ALLOCA]: Put this conditional around previous
alloca stuff and defined these to work differently depending
on whether or not USE_ALLOCA is defined:
(REGEX_ALLOCATE): Uses either `alloca' or `malloc'.
(REGEX_REALLOCATE): Uses either `alloca' or `realloc'.
(INIT_BITS_LIST, EXTEND_BITS_LIST, SET_BIT_TO_VALUE): Defined
macro versions of routines with the same name (only with all
lowercase letters) so could use `alloca' in re_match_2. This
is to prevent core leaks when C-g is used in Emacs and to make
things faster and avoid storage fragmentation. These things
have to be macros because the results of `alloca' go away with
the routine by which it's called.
(BITS_BLOCK_SIZE, BITS_BLOCK, BITS_MASK): Moved to above the
above-mentioned macros instead of before the routines defined
below regex_compile.
(set_bit_to_value): Compacted some code.
(reg_info_type): Changed inner_groups field to be bits_list_type
so could be arbitrarily long and thus handle arbitrary nesting.
(NOTE_INNER_GROUP): Put `do...while (0)' around it so could
use as a statement.
Changed code to use bits lists.
Added variable void *destination for REGEX_REALLOCATE (whose call
is several levels in).
Changed variable name of `this_bit' to `this_reg'.
(FREE_VARIABLES): Only define and use if USE_ALLOCA is defined.
(re_match_2): Use REGEX_ALLOCATE instead of malloc.
Instead of setting INNER_GROUPS of reg_info to zero, have to
use INIT_BITS_LIST and return -2 (and free variables if
USE_ALLOCA isn't defined) if it fails.

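The `do...while (0)' wrapping mentioned in the entry above can be
illustrated with a minimal sketch. SWAP_INT and sort_pair are
hypothetical names made up for this example and do not appear in
regex.c; the sketch only shows why a multi-statement macro wrapped
this way can be used as a single statement, for instance between
`if' and `else'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical multi-statement macro (not from regex.c).  The
   do...while (0) wrapper makes the expansion a single statement.  */
#define SWAP_INT(a, b)          \
  do                            \
    {                           \
      int swap_tmp = (a);       \
      (a) = (b);                \
      (b) = swap_tmp;           \
    }                           \
  while (0)

/* Put the smaller value in *lo and the larger in *hi.
   Return 1 if a swap was needed, 0 otherwise.  */
int
sort_pair (int *lo, int *hi)
{
  assert (lo != NULL && hi != NULL);
  if (*lo > *hi)
    SWAP_INT (*lo, *hi);   /* One statement, so the else below still parses.  */
  else
    return 0;
  return 1;
}
```

Without the wrapper, the semicolon after the macro call would end the
`if' statement early and the `else' would fail to compile.
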
Fri Jun 28 13:45:07 1991 Karl Berry (karl at hayley)

* regex.c (re_match_2): set value of `dend' when we restore `d'.

* regex.c: remove declaration of alloca.

* regex.c (MISSING_ISGRAPH): rename to `ISGRAPH_MISSING'.

* regex.h [_POSIX_SOURCE]: remove these conditionals; always
define POSIX stuff.
* regex.c (_POSIX_SOURCE): change conditionals to use `POSIX'
instead.

Sat Jun 1 16:56:50 1991 Kathy Hargreaves (kathy at hayley)

* regex.*: Changed RE_CONTEXTUAL_* to RE_CONTEXT_*,
RE_TIGHT_VBAR to RE_TIGHT_ALT, RE_NEWLINE_OR to
RE_NEWLINE_ALT, and RE_DOT_MATCHES_NEWLINE to RE_DOT_NEWLINE.

Wed May 29 09:24:11 1991 Karl Berry (karl at hayley)

* regex.texinfo (POSIX Pattern Buffers): cross-reference the
correct node name (Match-beginning-of-line, not ..._line).
(Syntax Bits): put @code around all syntax bits.

Sat May 18 16:29:58 1991 Karl Berry (karl at hayley)

* regex.c (global): add casts to keep broken compilers from
complaining about malloc and realloc calls.

* regex.c (isgraph) [MISSING_ISGRAPH]: change test to this,
instead of `#ifndef isgraph', since broken compilers can't
have both a macro and a symbol by the same name.

* regex.c (re_comp, re_exec) [_POSIX_SOURCE]: do not define.
(regcomp, regfree, regexec, regerror) [_POSIX_SOURCE && !emacs]:
only define in this case.

Mon May 6 17:37:04 1991 Kathy Hargreaves (kathy at hayley)

* regex.h (re_search, re_search_2): Changed BUFFER to not be const.

* regex.c (re_compile_pattern): `^' is in a leading position if
it precedes a newline.
(various routines): Added or changed header comments.
(double_pattern_offsets_list): Changed name from
`extend_pattern_offsets_list'.
(adjust_pattern_offsets_list): Changed return value from
unsigned to void.
(verify_and_adjust_endlines): Now returns `true' and `false'
instead of 1 and 0.
`$' is in a leading position if it follows a newline.
(set_bit_to_value, get_bit_value): Exit with error if POSITION < 0
so now calling routines don't have to.
(init_failure_stack, inspect_failure_stack_top,
pop_failure_stack_top, push_pattern_op, double_failure_stack):
Now return value unsigned instead of boolean.
(re_search, re_search_2): Changed BUFP to not be const.
(re_search_2): Added variable const `private_bufp' to send to
re_match_2.
(push_failure_point): Made return value unsigned instead of boolean.

Sat May 4 15:32:22 1991 Kathy Hargreaves (kathy at hayley)

* regex.h (re_compile_fastmap): Added extern for this.
Changed some comments.

* regex.c (re_compile_pattern): In case handle_bar: put invalid
pattern test before levels matching stuff.
Changed some comments.
Added optimizing test for detecting an empty alternative that
ends with a trailing '$' at the end of the pattern.
(re_compile_fastmap): Moved failure_stack stuff to before this
so could use it. Made its stack dynamic.
Made it return an int so that it could return -2 if its stack
couldn't be allocated.
Added to header comment (about the return values).
(init_failure_stack): Wrote so both re_match_2 and
re_compile_fastmap could use similar stacks.
(double_failure_stack): Added for above reasons.
(push_pattern_op): Wrote for re_compile_fastmap.
(re_search_2): Now return -2 if re_compile_fastmap does.
(re_match_2): Made regstart and regend type failure_stack_element*.
(push_failure_point): Made pattern_place and string_place type
failure_stack_element*.
Call double_failure_stack now.
Return true instead of 1.

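The dynamically growing stack described in the entry above (a stack
that is doubled when full, with the caller told when the allocation
fails) might look roughly like the following sketch. Everything here
is an assumption for illustration: the type sketch_stack, its int
elements, and both function names are hypothetical, and the real
failure stack in regex.c stores different data and reports failure
with its own return values. The field names `size' and `avail' are
borrowed from the wording of neighboring entries.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stack of ints; the real failure stack holds other data.  */
typedef struct
{
  int *stack;       /* The allocated slots.  */
  unsigned size;    /* Number of allocated slots.  */
  unsigned avail;   /* Index of the next free slot.  */
} sketch_stack;

/* Allocate INITIAL slots.  Return 1 on success, 0 if out of memory.  */
int
sketch_stack_init (sketch_stack *s, unsigned initial)
{
  assert (s != NULL && initial > 0);
  s->stack = malloc (initial * sizeof *s->stack);
  s->size = s->stack != NULL ? initial : 0;
  s->avail = 0;
  return s->stack != NULL;
}

/* Push VALUE, doubling the stack first if it is full.
   Return 1 on success, 0 if the stack could not be grown.  */
int
sketch_stack_push (sketch_stack *s, int value)
{
  assert (s != NULL && s->stack != NULL);
  if (s->avail == s->size)
    {
      unsigned new_size = s->size << 1;
      int *grown = realloc (s->stack, new_size * sizeof *grown);
      if (grown == NULL)
        return 0;               /* The old block is still valid.  */
      s->stack = grown;
      s->size = new_size;
    }
  s->stack[s->avail++] = value;
  return 1;
}
```

The sketch ignores overflow of the doubled size; a caller that can
push without bound would need an upper limit as well.
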
Wed May 1 12:57:21 1991 Kathy Hargreaves (kathy at hayley)

* regex.c (remove_intervening_anchors): Avoid erroneously making
ops into no_op's by making them no_op only when they're beglines.
(verify_and_adjust_endlines): Don't make '$' a normal character
if it's before a newline.
Look for the endline op in *p, not p[1].
(failure_stack_element): Added this declaration.
(failure_stack_type): Added this declaration.
(INIT_FAILURE_STACK_SIZE, FAILURE_STACK_EMPTY,
FAILURE_STACK_PTR_EMPTY, REMAINING_AVAIL_SLOTS): Added for
failure stack.
(FAILURE_ITEM_SIZE, PUSH_FAILURE_POINT): Deleted.
(FREE_VARIABLES): Now free failure_stack.stack instead of stackb.
(re_match_2): deleted variables `initial_stack', `stackb',
`stackp', and `stacke' and added `failure_stack' to replace them.
Replaced calls to PUSH_FAILURE_POINT with those to
push_failure_point.
(push_failure_point): Added for re_match_2.
(pop_failure_point): Rewrote to use a failure_stack_type of stack.
(can_match_nothing): Moved definition to below re_match_2.
(bcmp_translate): Moved definition to below re_match_2.

Mon Apr 29 14:20:54 1991 Kathy Hargreaves (kathy at hayley)

* regex.c (enum regexpcode): Added codes endline_before_newline
and repeated_endline_before_newline so could detect these
types of endlines in the intermediate stages of a compiled
pattern.
(INIT_FAILURE_ALLOC): Renamed NFAILURES to this and set it to 5.
(BUF_PUSH): Put `do {...} while 0' around this.
(BUF_PUSH_2): Defined this to cut down on expansion of EXTEND_BUFFER.
(regex_compile): Changed some comments.
Now push endline_before_newline if find a `$' before a newline
in the pattern.
If a `$' might turn into an ordinary character, set laststart
to point to it.
In '^' case, if syntax bit RE_TIGHT_VBAR is set, then for `^'
to be in a leading position, it must be first in the pattern.
Don't have to check in one of the else clauses that it's not set.
If RE_CONTEXTUAL_INDEP_OPS isn't set but RE_ANCHORS_ONLY_AT_ENDS
is, make '^' a normal character if it isn't first in the pattern.
Can only detect at the end if a '$' after an alternation op is a
trailing one, so can't immediately detect empty alternatives
if a '$' follows a vbar.
Added a picture of the ``success jumps'' in alternatives.
Have to set bufp->used before calling verify_and_adjust_endlines.
Also do it before returning all error strings.
(remove_intervening_anchors): Now replaces the anchor with
repeated_endline_before_newline if it's an endline_before_newline.
(verify_and_adjust_endlines): Deleted SYNTAX parameter (could
use bufp's) and added GROUP_FORWARD_MATCH_STATUS so could
detect back references referring to empty groups.
Added variable `bend' to point past the end of the pattern buffer.
Added variable `previous_p' so wouldn't have to reinspect the
pattern buffer to see what op we just looked at.
Added endline_before_newline and repeated_endline_before_newline
cases.
When checking if in a trailing position, added case where '$'
has to be at the pattern's end if either of the syntax bits
RE_ANCHORS_ONLY_AT_ENDS or RE_TIGHT_VBAR are set.
Since `endline' can have the intermediate form `endline_in_repeat',
have to change it to `endline' if RE_REPEATED_ANCHORS_AWAY
isn't set.
Now disallow empty alternatives with trailing endlines in them
if RE_NO_EMPTY_ALTS is set.
Now don't make '$' an ordinary character if it precedes a newline.
Don't make it an ordinary character if it's before a newline.
Back references now affect the level matching something only if
they refer to nonempty groups.
(can_match_nothing): Now increment p1 in the switch, which
changes many of the cases, but makes the code more like what
it was derived from.
Adjust the return statement to reflect above.
(struct register_info): Made `can_match_nothing' field an int
instead of a bit so could have -1 in it if never set.
(MAX_FAILURE_ITEMS): Changed name from MAX_NUM_FAILURE_ITEMS.
(FAILURE_ITEM_SIZE): Defined how much space a failure item uses.
(PUSH_FAILURE_POINT): Changed variable `last_used_reg's name
to `highest_used_reg'.
Added variable `num_stack_items' and changed `len's name to
`stack_length'.
Test failure stack limit in terms of number of items in it, not
in terms of its length. rms' fix tested length against number
of items, which was a misunderstanding.
Use `realloc' instead of `alloca' to extend the failure stack.
Use shifts instead of multiplying by 2.
(FREE_VARIABLES): Free `stackb' instead of `initial_stack', as
it may have been reallocated.
(re_match_2): When mallocing `initial_stack', now multiply
the number of items wanted (what was there before) by
FAILURE_ITEM_SIZE.
(pop_failure_point): Need this procedure form of the macro of
the same name for debugging, so left it in and deleted the
macro.
(recomp): Don't free the pattern buffer's translate field.

Mon Apr 15 09:47:47 1991 Kathy Hargreaves (kathy at hayley)

* regex.h (RE_DUP_MAX): Moved to outside of #ifdef _POSIX_SOURCE.
* regex.c (#include <sys/types.h>): Removed #ifdef _POSIX_SOURCE
condition.
(malloc, realloc): Made return type void* #ifdef __STDC__.
(enum regexpcode): Added endline_in_repeat for the compiler's
use; this never ends up on the final compiled pattern.
(INIT_PATTERN_OFFSETS_LIST_SIZE): Initial size for
pattern_offsets_list_type.
(pattern_offset_type): Type for pattern offsets.
(pattern_offsets_list_type): Type for keeping a list of
pattern offsets.
(anchor_list_type): Changed to above type.
(PATTERN_OFFSETS_LIST_PTR_FULL): Tests if a pattern offsets
list is full.
(ANCHOR_LIST_PTR_FULL): Changed to above.
(BIT_BLOCK_SIZE): Changed to BITS_BLOCK_SIZE and moved to
above bits list routines below regex_compile.
(op_list_type): Defined to be pattern_offsets_list_type.
(compile_stack_type): Changed offsets to be
pattern_offset_type instead of unsigned.
(pointer): Changed the name of all structure fields from this
to `avail'.
(COMPILE_STACK_FULL): Changed so the stack is full if `avail'
is equal to `size' instead of `size' - 1.
(GET_BUFFER_SPACE): Changed `>=' to `>' in the while statement.
(regex_compile): Added variable `enough_memory' so could check
that routine that verifies '$' positions could return an
allocation error.
(group_count): Deleted this variable, as `regnum' already does
this work.
(op_list): Added this variable to keep track of operations
needed for verifying '$' positions.
(anchor_list): Now initialize using routine
`init_pattern_offsets_list'.
Consolidated the three bits_list initializations.
In case '$': Instead of trying to go past constructs which can
follow '$', merely detect the special case where it has to be
at the pattern's end, fix up any fixup jumps if necessary,
record the anchor if necessary and add an `endline' (and
possibly two `no-op's) to the pattern; will call a routine at
the end to verify if it's in a valid position or not.
(init_pattern_offsets_list): Added to initialize pattern
offsets lists.
(extend_anchor_list): Renamed this extend_pattern_offsets_list
and renamed parameters and internal variables appropriately.
(add_pattern_offset): Added this routine which both
record_anchor_position and add_op call.
(adjust_pattern_offsets_list): Add this routine to adjust by
some increment all the pattern offsets in a list of such after a
given position.
(record_anchor_position): Now send in offset instead of
calculating it and just call add_pattern_offset.
(adjust_anchor_list): Replaced by above routine.
(remove_intervening_anchors): If the anchor is an `endline'
then replace it with `endline_in_repeat' instead of `no_op'.
(add_op): Added this routine to call in regex_compile
wherever push something relevant to verifying '$' positions.
(verify_and_adjust_endlines): Added routine to (1) verify that
'$'s in a pattern buffer (represented by `endline') were in
valid positions and (2) whether or not they were anchors.
(BITS_BLOCK_SIZE): Renamed BIT_BLOCK_SIZE and moved to right
above bits list routines.
(BITS_BLOCK): Defines which array element of a bits list the
bit corresponding to a given position is in.
(BITS_MASK): Has a 1 where the bit (in a bit list array element)
for a given position is.

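As an illustration of the block and mask descriptions in the entry
above, a bits list stored as an array of unsigned values can be
addressed as in the following sketch. The SKETCH_ macros and the two
functions are assumptions written for this example; they follow the
descriptions of BITS_BLOCK_SIZE, BITS_BLOCK and BITS_MASK but are not
copied from regex.c, whose definitions may differ.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumed definitions for illustration; regex.c may differ.  */
#define SKETCH_BITS_BLOCK_SIZE (sizeof (unsigned) * CHAR_BIT)

/* Index of the array element holding the bit for POSITION.  */
#define SKETCH_BITS_BLOCK(position) ((position) / SKETCH_BITS_BLOCK_SIZE)

/* A value with a 1 only where the bit for POSITION sits in its element.  */
#define SKETCH_BITS_MASK(position) \
  (1u << ((position) % SKETCH_BITS_BLOCK_SIZE))

/* Set the bit for POSITION to 1 if VALUE is nonzero, else to 0.
   BITS must already be large enough to hold POSITION.  */
void
sketch_set_bit (unsigned *bits, unsigned position, int value)
{
  assert (bits != NULL);
  if (value)
    bits[SKETCH_BITS_BLOCK (position)] |= SKETCH_BITS_MASK (position);
  else
    bits[SKETCH_BITS_BLOCK (position)] &= ~SKETCH_BITS_MASK (position);
}

/* Return 1 if the bit for POSITION is set, 0 otherwise.  */
int
sketch_get_bit (const unsigned *bits, unsigned position)
{
  assert (bits != NULL);
  return (bits[SKETCH_BITS_BLOCK (position)]
          & SKETCH_BITS_MASK (position)) != 0;
}
```

The sketch leaves out the extension step: the routines named in this
ChangeLog grow the list when a position falls past its end.
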
Mon Apr 1 12:09:06 1991 Kathy Hargreaves (kathy at hayley)

* regex.c (BIT_BLOCK_SIZE): Defined this for using with
bits_list_type, abstracted from level_list_type so could use
for more things than just the level match status.
(regex_compile): Renamed `level_list' variable to
`level_match_status'.
Added variable `group_match_status' of type bits_list_type.
Kept track of whether or not for all groups any of them
matched other than the empty string, so detect if a back
reference in front of a '^' made it nonleading or not.
Do this by setting a match status bit for all active groups
whenever leave a group that matches other than the empty string.
Could detect which groups are active by going through the
stack each time, but or-ing a bits list of active groups with
a bits list of group match status is faster, so make a bits
list of active groups instead.
Have to check that '^' isn't in a leading position before
going to normal_char.
Whenever set level match status of the current level, also set
the match status of all active groups.
Increase the group count and make that group active whenever
open a group.
When close a group, only set the next level down if the
current level matches other than the empty string, and make
the current group inactive.
At a back reference, only set a level's match status if the
group to which the back reference refers matches other than
the empty string.
(init_bits_list): Added to initialize a bits list.
(get_level_value): Deleted this. (Made into
get_level_match_status.)
(extend_bits_list): Added to extend a bits list. (Made this
from deleted routine `extend_level_list'.)
(get_bit): Added to get a bit value from a bits list. (Made
this from deleted routine `get_level_value'.)
(set_bit_to_value): Added to set a bit in a bits list. (Made
this from deleted routine `set_level_value'.)
(get_level_match_status): Added this to get the match status
of a given level. (Made from get_level_value.)
(set_this_level, set_next_lower_level): Made all routines
which set bits extend the bits list if necessary, thus they
now return an unsigned value to indicate whether or not the
reallocation failed.
(increase_level): No longer extends the level list.
(make_group_active): Added to mark as active a given group in
an active groups list.
(make_group_inactive): Added to mark as inactive a given group
in an active groups list.
(set_match_status_of_active_groups): Added to set the match
status of all currently active groups.
(get_group_match_status): Added to get a given group's match status.
(no_levels_match_anything): Removed the parameter LEVEL.
(PUSH_FAILURE_POINT): Added rms' bug fix and changed RE_NREGS
to num_internal_regs.

Sun Mar 31 09:04:30 1991 Kathy Hargreaves (kathy at hayley)

* regex.h (RE_ANCHORS_ONLY_AT_ENDS): Added syntax so could
constrain '^' and '$' to only be anchors if at the beginning
and end of the pattern.
(RE_SYNTAX_POSIX_BASIC): Added the above bit.

* regex.c (enum regexpcode): Changed `unused' to `no_op'.
(this_and_lower_levels_match_nothing): Deleted forward reference.
(regex_compile): case '^': if the syntax bit RE_ANCHORS_ONLY_AT_ENDS
is set, then '^' is only an anchor if at the beginning of the
pattern; only record anchor position if the syntax bit
RE_REPEATED_ANCHORS_AWAY is set; the '^' is a normal char if
the syntax bit RE_ANCHORS_ONLY_AT_ENDS is set and we're not at
the beginning of the pattern (and neither RE_CONTEXTUAL_INDEP_OPS
nor RE_CONTEXTUAL_INDEP_OPS syntax bits are set).
Only adjust the anchor list if the syntax bit
RE_REPEATED_ANCHORS_AWAY is set.

* regex.c (level_list_type): Use to detect when '^' is
in a leading position.
(regex_compile): Added level_list_type level_list variable in
which we keep track of whether or not a grouping level (in its
current or most recent incarnation) matches anything besides the
empty string. Set the bit for the i-th level when detect it
should match something other than the empty string and the bit
for the (i-1)-th level when leave the i-th group. Clear all
bits for the i-th and higher levels if none of 0--(i - 1)-th's
bits are set when encounter an alternation operator on that
level. If no levels are set when hit a '^', then it is in a
leading position. We keep track of which level we're at by
increasing a variable current_level whenever we encounter an
open-group operator and decreasing it whenever we encounter a
close-group operator.
Have to adjust the anchor list contents whenever insert
something ahead of them (such as on_failure_jump's) in the
pattern.
(adjust_anchor_list): Adjusts the offsets in an anchor list by
a given increment starting at a given start position.
(get_level_value): Returns the bit setting of a given level.
(set_level_value): Sets the bit of a given level to a given value.
(set_this_level): Sets (to 1) the bit of a given level.
(set_next_lower_level): Sets (to 1) the bit of (LEVEL - 1) for a
given LEVEL.
(clear_this_and_higher_levels): Clears the bits for a given
level and any higher levels.
(extend_level_list): Adds sizeof(unsigned) more bits to a level list.
(increase_level): Increases by 1 the value of a given level variable.
(decrease_level): Decreases by 1 the value of a given level variable.
(lower_levels_match_nothing): Checks if any levels lower than
the given one match anything.
(no_levels_match_anything): Checks if any levels match anything.
(re_match_2): At case wordbeg: before looking at d-1, check that
we're not at the string's beginning.
At case wordend: Added some illuminating parentheses.

Mon Mar 25 13:58:51 1991 Kathy Hargreaves (kathy at hayley)

* regex.h (RE_NO_ANCHOR_AT_NEWLINE): Changed syntax bit name
from RE_ANCHOR_NOT_NEWLINE because an anchor never matches the
newline itself, just the empty string either before or after it.
(RE_REPEATED_ANCHORS_AWAY): Added this syntax bit for ignoring
anchors inside groups which are operated on by repetition
operators.
(RE_DOT_MATCHES_NEWLINE): Added this bit so the match-any-character
operator could match a newline when it's set.
(RE_SYNTAX_POSIX_BASIC): Set RE_DOT_MATCHES_NEWLINE in this.
(RE_SYNTAX_POSIX_EXTENDED): Set RE_DOT_MATCHES_NEWLINE and
RE_REPEATED_ANCHORS_AWAY in this.
(regerror): Changed prototypes to new POSIX spec.

* regex.c (anchor_list_type): Added so could null out anchors inside
repeated groups.
(ANCHOR_LIST_PTR_FULL): Added for above type.
(compile_stack_element): Changed name from stack_element.
(compile_stack_type): Changed name from compile_stack.
(INIT_COMPILE_STACK_SIZE): Changed name from INIT_STACK_SIZE.
(COMPILE_STACK_EMPTY): Changed name from STACK_EMPTY.
(COMPILE_STACK_FULL): Changed name from STACK_FULL.
(regex_compile): Changed SYNTAX parameter to non-const.
Changed variable name `stack' to `compile_stack'.
If syntax bit RE_REPEATED_ANCHORS_AWAY is set, then naively put
anchors in a list when encounter them and then set them to
`unused' when detect they are within a group operated on by a
repetition operator. Need something more sophisticated than
this, as they should only get set to `unused' if they are in
positions where they would be anchors. Also need a better way to
detect contextually invalid anchors.
Changed some comments.
(is_in_compile_stack): Changed name from `is_in_stack'.
(extend_anchor_list): Added to do anchor stuff.
(record_anchor_position): Added to do anchor stuff.
(remove_intervening_anchors): Added to do anchor stuff.
(re_match_2): Now match a newline with the match-any-character
operator if RE_DOT_MATCHES_NEWLINE is set.
Compacted some code.
(regcomp): Added new POSIX newline information to the header
comment.
If REG_NEWLINE cflag is set, then now unset RE_DOT_MATCHES_NEWLINE
in syntax.
(put_in_buffer): Added to do new POSIX regerror spec. Called
by regerror.
(regerror): Changed to take a pattern buffer, error buffer and
its size, and return type `size_t', the size of the full error
message, and the first ERRBUF_SIZE - 1 characters of the full
error message in the error buffer.

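The buffer-and-size convention described for regerror in the entry
above lets a caller ask for the message length first and allocate
exactly that much. The sketch below uses the POSIX <regex.h>
interface as standardized today, where the error code is the first
argument; the draft interface this entry was written against may have
differed. The function name compile_error_message is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

/* Try to compile PATTERN as a POSIX basic regular expression.
   Return NULL if it compiles (or if memory runs out); otherwise
   return a malloc'd copy of the full error message, which the
   caller must free.  */
char *
compile_error_message (const char *pattern)
{
  regex_t compiled;
  int errcode;
  size_t needed;
  char *message;

  assert (pattern != NULL);
  errcode = regcomp (&compiled, pattern, 0);
  if (errcode == 0)
    {
      regfree (&compiled);
      return NULL;
    }

  /* With a zero-sized buffer, regerror only reports the size needed,
     counting the terminating null.  */
  needed = regerror (errcode, &compiled, NULL, 0);
  message = malloc (needed);
  if (message != NULL)
    regerror (errcode, &compiled, message, needed);
  return message;
}
```

A caller with a fixed-size buffer can instead pass it directly and
compare the return value with the buffer size to detect truncation.
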
Wed Feb 27 16:38:33 1991 Kathy Hargreaves (kathy at hayley)

* regex.h (#include <sys/types.h>): Removed this as new POSIX
standard has the user include it.
(RE_SYNTAX_POSIX_BASIC and RE_SYNTAX_POSIX_EXTENDED): Removed
RE_HAT_LISTS_NOT_NEWLINE as new POSIX standard has the cflag
REG_NEWLINE now set this. Similarly, added syntax bit
RE_ANCHOR_NOT_NEWLINE as this is now unset by REG_NEWLINE.
(RE_SYNTAX_POSIX_BASIC): Removed syntax bit
RE_NO_CONSECUTIVE_REPEATS as POSIX now allows them.

* regex.c (#include <sys/types.h>): Added this as new POSIX
standard has the user include it instead of us putting it in
regex.h.
(extern char *re_syntax_table): Made into an extern so the
user could allocate it.
(DO_RANGE): If don't find a range end, now goto invalid_range_end
instead of unmatched_left_bracket.
(regex_compile): Made variable SYNTAX non-const.????
Reformatted some code.
(re_compile_fastmap): Moved is_a_succeed_n's declaration to
inner braces.
Compacted some code.
(SET_NEWLINE_FLAG): Removed and put inline.
(regcomp): Made variable `syntax' non-const so can unset
RE_ANCHOR_NOT_NEWLINE syntax bit if cflag REG_NEWLINE is set.
If cflag REG_NEWLINE is set, set the RE_HAT_LISTS_NOT_NEWLINE
syntax bit and unset RE_ANCHOR_NOT_NEWLINE one of `syntax'.

Wed Feb 20 16:33:38 1991 Kathy Hargreaves (kathy at hayley)

* regex.h (RE_NO_CONSECUTIVE_REPEATS): Changed name from
RE_NO_CONSEC_REPEATS.
(REG_ENESTING): Deleted this POSIX return value, as the stack
is now unbounded.
(struct re_pattern_buffer): Changed some comments.
(re_compile_pattern): Changed a comment.
Deleted check on stack upper bound and corresponding error.
Now when there's no interval contents and it's the end of the
pattern, go to unmatched_left_curly_brace instead of end_of_pattern.
Removed nesting_too_deep error, as the stack is now unbounded.
(regcomp): Removed REG_ENESTING case, as the stack is now unbounded.
(regerror): Removed REG_ENESTING case, as the stack is now unbounded.

* regex.c (MAX_STACK_SIZE): Deleted because don't need upper
bound on array indexed with an unsigned number.

Sun Feb 17 15:50:24 1991 Kathy Hargreaves (kathy at hayley)

* regex.h: Changed and added some comments.

* regex.c (init_syntax_once): Made `_' a word character.
(re_compile_pattern): Added a comment.
(re_match_2): Redid header comment.
(regexec): With header comment about PMATCH, corrected and
removed details found in regex.h, adding a reference.

Fri Feb 15 09:21:31 1991 Kathy Hargreaves (kathy at hayley)

* regex.c (DO_RANGE): Removed argument parentheses.
Now get untranslated range start and end characters and set
list bits for the translated (if at all) versions of them and
all characters between them.
(re_match_2): Now use regs->num_regs instead of num_regs_wanted
wherever possible.
(regcomp): Now build case-fold translate table using isupper
and tolower facilities so will work on foreign language characters.

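A case-fold translate table built with isupper and tolower, as the
regcomp item above describes, can be sketched as follows. The names
SKETCH_CHAR_SET_SIZE and sketch_build_case_fold_table are
hypothetical, and the table size of 256 assumes 8-bit characters;
regcomp's own code is not reproduced here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define SKETCH_CHAR_SET_SIZE 256   /* Assumes 8-bit characters.  */

/* Fill TABLE, which must have SKETCH_CHAR_SET_SIZE elements, so that
   each uppercase character maps to its lowercase form and every
   other character maps to itself.  */
void
sketch_build_case_fold_table (unsigned char *table)
{
  int c;

  assert (table != NULL);
  for (c = 0; c < SKETCH_CHAR_SET_SIZE; c++)
    table[c] = (unsigned char) (isupper (c) ? tolower (c) : c);
}
```

Because the classification comes from <ctype.h> rather than from
hard-coded ASCII ranges, the mapping follows the current locale.
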
Sat Feb 9 16:40:03 1991 Kathy Hargreaves (kathy at hayley)

* regex.h (RE_HAT_LISTS_NOT_NEWLINE): Changed syntax bit name
from RE_LISTS_NOT_NEWLINE as it only affects nonmatching lists.
Changed all references to the match-beginning-of-string
operator to match-beginning-of-line operator, as this is what
it does.
(RE_NO_CONSEC_REPEATS): Added this syntax bit.
(RE_SYNTAX_POSIX_BASIC): Added above bit to this.
(REG_PREMATURE_END): Changed name to REG_EEND.
(REG_EXCESS_NESTING): Changed name to REG_ENESTING.
(REG_TOO_BIG): Changed name to REG_ESIZE.
(REG_INVALID_PREV_RE): Deleted this POSIX return value.
Added and changed some comments.

* regex.c (re_compile_pattern): Now sets the pattern buffer's
`return_default_num_regs' field.
(typedef struct stack_element, stack_type, INIT_STACK_SIZE,
MAX_STACK_SIZE, STACK_EMPTY, STACK_FULL): Added for regex_compile.
(INIT_BUF_SIZE): Changed value from 28 to 32.
(BUF_PUSH): Changed name from BUFPUSH.
(MAX_BUF_SIZE): Added so could use in many places.
(IS_CHAR_CLASS_STRING): Replaced is_char_class with this.
(regex_compile): Added a stack which could grow dynamically
and which has struct elements.
Go back to initializing `zero_times_ok' and `many_times_ok' to
0 and |=ing them inside the loop.
Now disallow consecutive repetition operators if the syntax
bit RE_NO_CONSEC_REPEATS is set.
Now detect trailing backslash when the compiler is expecting a
`?' or a `+'.
Changed calls to GET_BUFFER_SPACE which asked for 6 to ask for
3, as that's all they needed.
Now check for trailing backslash inside lists.
Now disallow an empty alternative right before an end-of-line
operator.
Now get buffer space before leaving space for a fixup jump.
Now check if at pattern end when at open-interval operator.
Added some comments.
Now check if non-interval repetition operators follow an
interval one if the syntax bit RE_NO_CONSEC_REPEATS is set.
Now only check if what precedes an interval repetition
operator isn't a regular expression which matches one
character if the syntax bit RE_NO_CONSEC_REPEATS is set.
Now return "Unmatched [ or [^" instead of "Unmatched [".
(is_in_stack): Added to check if a given register number is in
the stack.
(re_match_2): If initial variable allocations fail, return -2,
instead of -1.
Now set reg's `num_regs' field when allocating regs.
Now before allocating them, free regs->start and end if they
aren't NULL and return -2 if either allocation fails.
Now use regs->num_regs instead of num_regs_wanted to control
regs loops.
Now increment past the newline when matching it with an
end-of-line operator.
(regcomp): Added to the header comment.
Now return REG_ESUBREG if regex_compile returns "Unmatched [
or [^" instead of doing so if it returns "Unmatched [".
Now return REG_BADRPT if in addition to returning "Missing
preceding regular expression", regex_compile returns "Invalid
preceding regular expression".
Now return new return value names (see regex.h changes).
(regexec): Added to header comment.
Initialize regs structure.
Now match whole string.
Now always free regs.start and regs.end instead of just when
the string matched.
(regerror): Now return "Regex error: Unmatched [ or [^.\n"
instead of "Regex error: Unmatched [.\n".
Now return "Regex error: Preceding regular expression either
missing or not simple.\n" instead of "Regex error: Missing
preceding regular expression.\n".
Removed REG_INVALID_PREV_RE case (it got subsumed into the
REG_BADRPT case).

Thu Jan 17 09:52:35 1991 Kathy Hargreaves (kathy at hayley)

* regex.h: Changed a comment.

* regex.c: Changed and added large header comments.
(re_compile_pattern): Now if detect that `laststart' for an
interval points to a byte code for a regular expression which
matches more than one character, make it an internal error.
(regerror): Return error message, don't print it.

Tue Jan 15 15:32:49 1991 Kathy Hargreaves (kathy at hayley)

* regex.h (regcomp return codes): Added GNU ones.
Updated some comments.

* regex.c (DO_RANGE): Changed `obscure_syntax' to `syntax'.
(regex_compile): Added `following_left_brace' to keep track of
where pseudo interval following a valid interval starts.
Changed some instances that returned "Invalid regular
expression" to instead return error strings coinciding with
POSIX error codes.
Changed some comments.
Now consider only things between `[:' and `:]' to be possible
character class names.
Now a character class expression can't end a pattern; at
least a `]' must close the list.
Now if the syntax bit RE_NO_BK_CURLY_BRACES is set, then a
valid interval must be followed by yet another to get an error
for preceding an interval (in this case, the second one) with
a regular expression that matches more than one character.
Now if what follows a valid interval begins with an open
interval operator but doesn't begin a valid interval, then set
following_left_bracket to it, put it in C and go to
normal_char label.
Added some comments.
Return "Invalid character class name" instead of "Invalid
character class".
(regerror): Return messages for all POSIX error codes except
REG_ECOLLATE and REG_NEWLINE, along with all GNU error codes.
Added `break's after all cases.
(main): Call re_set_syntax instead of setting `obscure_syntax'
directly.

Sat Jan 12 13:37:59 1991 Kathy Hargreaves (kathy at hayley)
|
||
|
||
* regex.h (Copyright): Updated date.
|
||
(#include <sys/types.h>): Include unconditionally.
(RE_CANNOT_MATCH_NEWLINE): Deleted this syntax bit.
(RE_SYNTAX_POSIX_BASIC, RE_SYNTAX_POSIX_EXTENDED): Removed
setting the RE_ANCHOR_NOT_NEWLINE syntax bit from these.
Changed and added some comments.
(struct re_pattern_buffer): Changed some flags from chars to bits.
Added field `syntax'; holds which syntax pattern was compiled with.
Added bit flag `return_default_num_regs'.
(externs for GNU and Berkeley UNIX routines): Added `const's to
parameter types to be compatible with POSIX.
(#define const): Added to support old C compilers.

* regex.c (Copyright): Updated date.
(enum regexpcode): Deleted `newline'.
(regex_compile): Renamed re_compile_pattern to this, added a
syntax parameter so it can set the pattern buffer's `syntax'
field.
Made `pattern' and `size' `const's so could pass to POSIX
interface routines; also made `const' whatever interval
variables had to be to make this work.
Changed references to `obscure_syntax' to new parameter `syntax'.
Deleted putting `newline' in buffer when see `\n'.
Consider invalid character classes which have nothing wrong
except the character class name; if so, return character-class error.
(is_char_class): Added routine for regex_compile.
(re_compile_pattern): Added a new one which calls
regex_compile with `obscure_syntax' as the actual parameter
for the formal `syntax'.
Gave this the old routine's header comments.
Made `pattern' and `size' `const's so could use POSIX interface
routine parameters.
(re_search, re_search_2, re_match, re_match_2): Changed
`pbufp' to `bufp'.
(re_search_2, re_match_2): Changed `mstop' to `stop'.
(re_search, re_search_2): Made all parameters except `regs'
`const's so could use POSIX interface routines parameters.
(re_search_2): Added private copies of `const' parameters so
could change their values.
(re_match_2): Made all parameters except `regs' `const's so
could use POSIX interface routines parameters.
Changed `size1' and `size2' parameters to `size1_arg' and
`size2_arg' so could change them; added local `size1' and
`size2' and set to these.
Added some comments.
Deleted `newline' case.
`begline' can also possibly match if `d' contains a newline;
if it does, we have to increment d to point past the newline.
Replaced references to `obscure_syntax' with `bufp->syntax'.
(re_comp, re_exec): Made parameter `s' a `const' so could use POSIX
interface routines parameters.
Now call regex_compile, passing `obscure_syntax' via the
`syntax' parameter.
(re_exec): Made local `len' a `const' so could pass to re_search.
(regcomp): Added header comment.
Added local `syntax' to set and pass to regex_compile rather
than setting global `obscure_syntax' and passing it.
Call regex_compile with its `syntax' parameter rather than
re_compile_pattern.
Return REG_ECTYPE if character-class error.
(regexec): Don't initialize `regs' to anything.
Made `private_preg' a nonpointer so could set to what the
constant `preg' points.
Initialize `private_preg's `return_default_num_regs' field to
zero because want to return `nmatch' registers, not however
many there are subexpressions in the pattern.
Also test if `nmatch' > 0 to see if should pass re_match `regs'.

Tue Jan 8 15:57:17 1991 Kathy Hargreaves (kathy at hayley)

* regex.h (struct re_pattern_buffer): Reworded comment.

* regex.c (EXTEND_BUFFER): Also reset beg_interval.
(re_search_2): Return val if val = -2.
(NUM_REG_ITEMS): Listed items in comment.
(NUM_OTHER_ITEMS): Defined this for use in more than one definition.
(MAX_NUM_FAILURE_ITEMS): Replaced `+ 2' with NUM_OTHER_ITEMS.
(NUM_FAILURE_ITEMS): As with definition above and added to
comment.
(PUSH_FAILURE_POINT): Replaced `* 2's with `<< 1's.
(re_match_2): Test for equality with 1 to see if pbufp->bol and
pbufp->eol are set.

Fri Jan 4 15:07:22 1991 Kathy Hargreaves (kathy at hayley)

* regex.h (struct re_pattern_buffer): Reordered some fields.
Updated some comments.
Added not_bol and not_eol fields.
(extern regcomp, regexec, regerror): Added return types.
(extern regfree): Added `extern'.

* regex.c (min): Deleted unused macro.
(re_match_2): Compacted some code.
Removed call to macro `min' from `for' loop.
Fixed so unused registers get filled with -1's.
Fail if the pattern buffer's `not_bol' field is set and
encounter a `begline'.
Fail if the pattern buffer's `not_eol' field is set and
encounter an `endline'.
Deleted redundant check for empty stack in fail case.
Don't free pattern buffer's components in re_comp.
(regexec): Initialize variable regs.
Added `private_preg' pattern buffer so could set `not_bol' and
`not_eol' fields and hand to re_match.
Deleted naive attempt to detect anchors.
Set private pattern buffer's `not_bol' and `not_eol' fields
according to eflags value.
`nmatch' must also be > 0 for us to bother allocating
registers to send to re_match and filling pmatch
with their results after the call to re_match.
Send private pattern buffer instead of argument to re_match.
If use the registers, always free them and then set them to NULL.
(regerror): Added this Posix routine.
(regfree): Added this Posix routine.

Tue Jan 1 15:02:45 1991 Kathy Hargreaves (kathy at hayley)

* regex.h (RE_NREGS): Deleted this definition, as now the user
can choose how many registers to have.
(REG_NOTBOL, REG_NOTEOL): Defined these Posix eflag bits.
(REG_NOMATCH, REG_BADPAT, REG_ECOLLATE, REG_ECTYPE,
REG_EESCAPE, REG_ESUBREG, REG_EBRACK, REG_EPAREN, REG_EBRACE,
REG_BADBR, REG_ERANGE, REG_ESPACE, REG_BADRPT, REG_ENEWLINE):
Defined these return values for Posix's regcomp and regexec.
Updated some comments.
(struct re_pattern_buffer): Now typedef this as regex_t
instead of the other way around.
(struct re_registers): Added num_regs field. Made start and
end fields pointers to char instead of fixed size arrays.
(regmatch_t): Added this Posix register type.
(regcomp, regexec, regerror, regfree): Added externs for these
Posix routines.

* regex.c (enum boolean): Typedefed this.
(re_pattern_buffer): Reformatted some comments.
(re_compile_pattern): Updated some comments.
Always push start_memory and its attendant number whenever
encounter a group, not just when its number is less than the
previous maximum number of registers; same for stop_memory.
Get 4 bytes of buffer space instead of 2 when pushing a
set_number_at.
(can_match_nothing): Added this to elaborate on and replace
code in re_match_2.
(reg_info_type): Made can_match_nothing field a bit instead of int.
(MIN): Added for re_match_2.
(re_match_2 macros): Changed all `for' loops which used
RE_NREGS to now use num_internal_regs as upper bounds.
(MAX_NUM_FAILURE_ITEMS): Use num_internal_regs instead of RE_NREGS.
(POP_FAILURE_POINT): Added check for empty stack.
(FREE_VARIABLES): Added this to free (and set to NULL)
variables allocated in re_match_2.
(re_match_2): Rearranged parameters to be in order.
Added variables num_regs_wanted (how many registers the user wants)
and num_internal_regs (how many groups there are).
Allocated initial_stack, regstart, regend, old_regstart,
old_regend, reginfo, best_regstart, and best_regend---all
which used to be fixed size arrays. Free them all and return
-1 if any fail.
Free above variables if starting position pos isn't valid.
Changed all `for' loops which used RE_NREGS to now use
num_internal_regs as upper bounds---except for the loops which
fill regs; then use num_regs_wanted.
Allocate regs if the user has passed it and wants more than 0
registers filled.
Set regs->start[i] and regs->end[i] to -1 if either
regstart[i] or regend[i] equals -1, not just the first.
Free allocated variables before returning.
Updated some comments.
(regcomp): Return REG_ESPACE, REG_BADPAT, REG_EPAREN when
appropriate.
Free translate array.
(regexec): Added this Posix interface routine.

Mon Dec 24 14:21:13 1990 Kathy Hargreaves (kathy at hayley)

* regex.h: If _POSIX_SOURCE is defined then #include <sys/types.h>.
Added syntax bit RE_CANNOT_MATCH_NEWLINE.
Defined Posix cflags: REG_EXTENDED, REG_NEWLINE, REG_ICASE, and
REG_NOSUB.
Added fields re_nsub and no_sub to struct re_pattern_buffer.
Typedefed regex_t to be `struct re_pattern_buffer'.

* regex.c (CHAR_SET_SIZE): Defined this to be 256 and replaced
occurrences of this value with this constant.
(re_compile_pattern): Added switch case for `\n' and put
`newline' into the pattern buffer when encounter this.
Increment the pattern_buffer's `re_nsub' field whenever open a
group.
(re_match_2): Match a newline with `newline'---provided the
syntax bit RE_CANNOT_MATCH_NEWLINE isn't set.
(regcomp): Added this Posix interface routine.
(enum test_type): Added interface_test tag.
(main): Added Posix interface test.

Tue Dec 18 12:58:12 1990 Kathy Hargreaves (kathy at hayley)

* regex.h (struct re_pattern_buffer): reformatted so would fit
in texinfo documentation.

Thu Nov 29 15:49:16 1990 Kathy Hargreaves (kathy at hayley)

* regex.h (RE_NO_EMPTY_ALTS): Added this bit.
(RE_SYNTAX_POSIX_EXTENDED): Added above bit.

* regex.c (re_compile_pattern): Disallow empty alternatives only
when RE_NO_EMPTY_ALTS is set, not when RE_CONTEXTUAL_INVALID_OPS is.
Changed RE_NO_BK_CURLY_BRACES to RE_NO_BK_PARENS when testing
for empty groups at label handle_open.
At label handle_bar: disallow empty alternatives if RE_NO_EMPTY_ALTS
is set.
Rewrote some comments.

(re_compile_fastmap): cleaned up code.

(re_search_2): Rewrote comment.

(struct register_info): Added field `inner_groups'; it records
which groups are inside of the current one.
Added field can_match_nothing; it's set if the current group
can match nothing.
Added field ever_matched_something; it's set if current group
ever matched something.

(INNER_GROUPS): Added macro to access inner_groups field of
struct register_info.

(CAN_MATCH_NOTHING): Added macro to access can_match_nothing
field of struct register_info.

(EVER_MATCHED_SOMETHING): Added macro to access
ever_matched_something field of struct register_info.

(NOTE_INNER_GROUP): Defined macro to record that a given group
is inside of all currently active groups.

(re_match_2): Added variables *p1 and mcnt2 (multipurpose).
Added old_regstart and old_regend arrays to hold previous
register values if they need be restored.
Initialize added fields and variables.
case start_memory: Find out if the group can match nothing.
Save previous register values in old_regstart and old_regend.
Record that current group is inside of all currently active
groups.
If the group is inside a loop and it ever matched anything,
restore its registers to values before the last failed match.
Restore the registers for the inner groups, too.
case duplicate: Can back reference to a group that never
matched if it can match nothing.

Thu Nov 29 11:12:54 1990 Karl Berry (karl at hayley)

* regex.c (bcopy, ...): define these if either _POSIX_SOURCE or
STDC_HEADERS is defined; same for including <stdlib.h>.

Sat Oct 6 16:04:55 1990 Kathy Hargreaves (kathy at hayley)

* regex.h (struct re_pattern_buffer): Changed field comments.

* regex.c (re_compile_pattern): Allow a `$' to precede an
alternation operator (`|' or `\|').
Disallow `^' and/or `$' in empty groups if the syntax bit
RE_NO_EMPTY_GROUPS is set.
Wait until have parsed a valid `\{...\}' interval expression
before testing RE_CONTEXTUAL_INVALID_OPS to see if it's
invalidated by that.
Don't use RE_NO_BK_CURLY_BRACES to test whether or not a validly
parsed interval expression is invalid if it has no preceding re;
rather, use RE_CONTEXTUAL_INVALID_OPS.
If an interval parses, but there is no preceding regular
expression, yet the syntax bit RE_CONTEXTUAL_INDEP_OPS is set,
then that interval can match the empty regular expression; if
the bit isn't set, then the characters in the interval
expression are parsed as themselves (sans the backslashes).
In unfetch_interval case: Moved PATFETCH to above the test for
RE_NO_BK_CURLY_BRACES being set, which would force a goto
normal_backslash; the code at both normal_backslash and normal_char
expect a character in `c.'

Sun Sep 30 11:13:48 1990 Kathy Hargreaves (kathy at hayley)

* regex.h: Changed some comments to use the terms used in the
documentation.
(RE_CONTEXTUAL_INDEP_OPS): Changed name from `RE_CONTEXT_INDEP_OPS'.
(RE_LISTS_NOT_NEWLINE): Changed name from `RE_HAT_NOT_NEWLINE.'
(RE_ANCHOR_NOT_NEWLINE): Added this syntax bit.
(RE_NO_EMPTY_GROUPS): Added this syntax bit.
(RE_NO_HYPHEN_RANGE_END): Deleted this syntax bit.
(RE_SYNTAX_...): Reformatted.
(RE_SYNTAX_POSIX_BASIC, RE_SYNTAX_POSIX_EXTENDED): Added syntax bits
RE_ANCHOR_NOT_NEWLINE and RE_NO_EMPTY_GROUPS, and deleted
RE_NO_HYPHEN_RANGE_END.
(RE_SYNTAX_POSIX_EXTENDED): Added syntax bit RE_DOT_NOT_NULL.

* regex.c (bcopy, bcmp, bzero): Define if _POSIX_SOURCE is defined.
(_POSIX_SOURCE): ifdef this, #include <stdlib.h>
(#ifdef emacs): Changed comment of the #endif for its #else
clause to be `not emacs', not `emacs.'
(no_pop_jump): Changed name from `jump'.
(pop_failure_jump): Changed name from `finalize_jump.'
(maybe_pop_failure_jump): Changed name from `maybe_finalize_jump'.
(no_pop_jump_n): Changed name from `jump_n.'
(EXTEND_BUFFER): Use shift instead of multiplication to double
buf->allocated.
(DO_RANGE, recompile_pattern): Added macro to set the list bits
for a range.
(re_compile_pattern): Fixed grammar problems in some comments.
Checked that RE_NO_BK_VBAR is set to make `$' valid before a `|'
and not set to make it valid before a `\|'.
Checked that RE_NO_BK_PARENS is set to make `$' valid before a `)'
and not set to make it valid before a `\)'.
Disallow ranges starting with `-', unless the range is the
first item in a list, rather than disallowing ranges which end
with `-'.
Disallow empty groups if the syntax bit RE_NO_EMPTY_GROUPS is set.
Disallow nothing preceding `{' and `\{' if they represent the
open-interval operator and RE_CONTEXTUAL_INVALID_OPS is set.
(register_info_type): typedef-ed this using `struct register_info.'
(SET_REGS_MATCHED): Compacted the code.
(re_match_2): Made it fail if back reference a group which we've
never matched.
Made `^' not match a newline if the syntax bit
RE_ANCHOR_NOT_NEWLINE is set.
(really_fail): Added this label so could force a final fail that
would not try to use the failure stack to recover.

Sat Aug 25 14:23:01 1990 Kathy Hargreaves (kathy at hayley)

* regex.h (RE_CONTEXTUAL_OPS): Changed name from RE_CONTEXT_OPS.
(global): Rewrote comments and rebroke some syntax #define lines.

* regex.c (isgraph): Added definition for sequents.
(global): Now refer to character set lists as ``lists.''
Rewrote comments containing ``\('' or ``\)'' to now refer to
``groups.''
(RE_CONTEXTUAL_OPS): Changed name from RE_CONTEXT_OPS.

(re_compile_pattern): Expanded header comment.

Sun Jul 15 14:50:25 1990 Kathy Hargreaves (kathy at hayley)

* regex.h (RE_CONTEXT_INDEP_OPS): the comment's sense got turned
around when we changed how it read; changed it to be correct.

Sat Jul 14 16:38:06 1990 Kathy Hargreaves (kathy at hayley)

* regex.h (RE_NO_EMPTY_BK_REF): changed name to
RE_NO_MISSING_BK_REF, as this describes it better.

* regex.c (re_compile_pattern): changed RE_NO_EMPTY_BK_REF
to RE_NO_MISSING_BK_REF, as above.

Thu Jul 12 11:45:05 1990 Kathy Hargreaves (kathy at hayley)

* regex.h (RE_NO_EMPTY_BRACKETS): removed this syntax bit, as
bracket expressions should *never* be empty regardless of the
syntax. Removed this bit from RE_SYNTAX_POSIX_BASIC and
RE_SYNTAX_POSIX_EXTENDED.

* regex.c (SET_LIST_BIT): in the comment, now refer to character
sets as (non)matching sets, as bracket expressions can now match
other things in addition to characters.
(re_compile_pattern): refer to groups as such instead of `\(...\)'
or somesuch, because groups can now be enclosed in either plain
parens or backslashed ones, depending on the syntax.
In the '[' case, added a boolean just_had_a_char_class to detect
whether or not a character class begins a range (which is invalid).
Restore way of breaking out of a bracket expression to original way.
Add way to detect a range if the last thing in a bracket
expression was a character class.
Took out check for c != ']' at the end of a character class in
the else clause, as it had already been checked in the if part
that also checked the validity of the string.
Set or clear just_had_a_char_class as appropriate.
Added some comments. Changed references to character sets to
``(non)matching lists.''

Sun Jul 1 12:11:29 1990 Karl Berry (karl at hayley)

* regex.h (BYTEWIDTH): moved back to regex.c.

* regex.h (re_compile_fastmap): removed declaration; this
shouldn't be advertised.

Mon May 28 15:27:53 1990 Kathy Hargreaves (kathy at hayley)

* regex.c (ifndef Sword): Made comments more specific.
(global): include <stdio.h> so can write fatal messages on
standard error. Replaced calls to assert with fprintfs to
stderr and exit (1)'s.
(PREFETCH): Reformatted to make more readable.
(AT_STRINGS_BEG): Defined to test if we're at the beginning of
the virtual concatenation of string1 and string2.
(AT_STRINGS_END): Defined to test if at the end of the virtual
concatenation of string1 and string2.
(AT_WORD_BOUNDARY): Defined to test if are at a word boundary.
(IS_A_LETTER(d)): Defined to test if the contents of the pointer D
is a letter.
(re_match_2): Rewrote the wordbound, notwordbound, wordbeg, wordend,
begbuf, and endbuf cases in terms of the above four new macros.
Called SET_REGS_MATCHED in the matchsyntax, matchnotsyntax,
wordchar, and notwordchar cases.

Mon May 14 14:49:13 1990 Kathy Hargreaves (kathy at hayley)

* regex.c (re_search_2): Fixed RANGE to not ever take STARTPOS
outside of virtual concatenation of STRING1 and STRING2.
Updated header comment as to this.
(re_match_2): Clarified comment about MSTOP in header.

Sat May 12 15:39:00 1990 Kathy Hargreaves (kathy at hayley)

* regex.c (re_search_2): Checked for out-of-range STARTPOS.
Added comments.
When searching backwards, not only get the character with which
to compare to the fastmap from string2 if the starting position
>= size1, but also if size1 is zero; this is so won't get a
segmentation fault if string1 is null.
Reformatted code at label advance.

Thu Apr 12 20:26:21 1990 Kathy Hargreaves (kathy at hayley)

* regex.h: Added #pragma once and #ifdef...endif __REGEXP_LIBRARY.
(RE_EXACTN_VALUE): Added for search.c to use.
Reworded some comments.

* regex.c: Punctuated some comments correctly.
(NULL): Removed this.
(RE_EXACTN_VALUE): Added for search.c to use.
(<ctype.h>): Moved this include to top of file.
(<assert.h>): Added this include.
(enum regexpcode): Assigned 0 to unused and 1 to exactn
because of RE_EXACTN_VALUE.
Added comment.
(various macros): Lined up backslashes near end of line.
(insert_jump): Cleaned up the header comment.
(re_search): Corrected the header comment.
(re_search_2): Cleaned up and completed the header comment.
(re_max_failures): Updated comment.
(struct register_info): Constructed as bits so as to save space
on the stack when pushing register information.
(IS_ACTIVE): Macro for struct register_info.
(MATCHED_SOMETHING): Macro for struct register_info.
(NUM_REG_ITEMS): How many register information items for each
register we have to push on the stack at each failure.
(MAX_NUM_FAILURE_ITEMS): If push all the registers on failure,
this is how many items we push on the stack.
(PUSH_FAILURE_POINT): Now pushes whether or not the register is
currently active, and whether or not it matched something.
Checks that there's enough space allocated to accommodate all the
items we currently want to push. (Before, a test for an empty
stack sufficed because we always pushed and popped the same
number of items).
Replaced ``2'' with MAX_NUM_FAILURE_ITEMS when ``2'' refers
to how many things get pushed on the stack each time.
When copy the stack into the newly allocated storage, now only copy
the area in use.
Clarified comment.
(POP_FAILURE_POINT): Defined to use in places where put number
of registers on the stack into a variable before using it to
decrement the stack, so as to not confuse the compiler.
(IS_IN_FIRST_STRING): Defined to check if a pointer points into
the first string.
(SET_REGS_MATCHED): Changed to use the struct register_info
bits; also set the matched-something bit to false if the
register isn't currently active. (This is a redundant setting.)
(re_match_2): Cleaned up and completed the header comment.
Updated the failure stack comment.
Replaced the ``2'' with MAX_NUM_FAILURE_ITEMS in the static
allocation of initial_stack, because now more than two (now up
to MAX_NUM_FAILURE_ITEMS) items get pushed on the failure stack each
time.
Ditto for stackb.
Trashed regstart_seg1, regend_seg1, best_regstart_seg1, and
best_regend_seg1 because they could have erroneous information
in them, such as when matching ``a'' (in string1) and ``ab'' (in
string2) with ``(a)*ab''; before using IS_IN_FIRST_STRING to see
whether or not the register starts or ends in string1,
regstart[1] pointed past the end of string1, yet regstart_seg1
was 0!
Added variable reg_info of type struct register_info to keep
track of currently active registers and whether or not they
currently match anything.
Commented best_regs_set.
Trashed reg_active and reg_matched_something and put the
information they held into reg_info; saves space on the stack.
Replaced NULL with '\000'.
In begline case, compacted the code.
Used assert to exit if had an internal error.
In begbuf case, because now force the string we're working on
into string2 if there aren't two strings, now allow d == string2
if there is no string1 (and the check for that is size1 == 0!);
also now succeeds if there aren't any strings at all.
(main, ifdef canned): Put test type into a variable so could
change it while debugging.

Sat Mar 24 12:24:13 1990 Kathy Hargreaves (kathy at hayley)

* regex.c (GET_UNSIGNED_NUMBER): Deleted references to num_fetches.
(re_compile_pattern): Deleted num_fetches because could keep
track of the number of fetches done by saving a pointer into the
pattern.
Added variable beg_interval to be used as a pointer, as above.
Assert that beg_interval points to something when it's used as above.
Initialize succeed_n's to lower_bound because re_compile_fastmap
needs to know it.
(re_compile_fastmap): Deleted unnecessary variable is_a_jump_n.
Added comment.
(re_match_2): Put number of registers on the stack into a
variable before using it to decrement the stack, so as to not
confuse the compiler.
Updated comments.
Used error routine instead of printf and exit.
In exactn case, restored longer code from ``original'' regex.c
which doesn't test translate inside a loop.

* regex.h: Moved #define NULL and the enum regexpcode definition
to regex.c. Changed some comments.

* regex.c (global): Updated comments about compiling and for the
re_compile_pattern jump routines.
Added #define NULL and the enum regexpcode definition (from
regex.h).
(enum regexpcode): Added set_number_at to reset the n's of
succeed_n's and jump_n's.
(re_set_syntax): Updated its comment.
(re_compile_pattern): Moved its heading comment to after its macros.
Moved its include statement to the top of the file.
Commented or added to comments of its macros.
In start_memory case: Push laststart value before adding
start_memory and its register number to the buffer, as they
might not get added.
Added code to put a set_number_at before each succeed_n and one
after each jump_n; rewrote code in what seemed a more
straightforward manner to put all these things in the pattern so
the succeed_n's would correctly jump to the set_number_at's of
the matching jump_n's, and so the jump_n's would correctly jump
to after the set_number_at's of the matching succeed_n's.
Initialize succeed_n n's to -1.
(insert_op_2): Added this to insert an operation followed by
two integers.
(re_compile_fastmap): Added set_number_at case.
(re_match_2): Moved heading comment to after macros.
Added mention of REGS to heading comment.
No longer turn a succeed_n with n = 0 into an on_failure_jump,
because n needs to be reset each time through a loop.
Check to see if a succeed_n's n is set by its set_number_at.
Added set_number_at case.
Updated some comments.
(main): Added another main to run posix tests, which is compiled
ifdef both test and canned. (Old main is still compiled ifdef
test only).

Tue Mar 19 09:22:55 1990 Kathy Hargreaves (kathy at hayley)

* regex.[hc]: Changed all instances of the word ``legal'' to
``valid'' and all instances of ``illegal'' to ``invalid.''

Sun Mar 4 12:11:31 1990 Kathy Hargreaves (kathy at hayley)

* regex.h: Added syntax bit RE_NO_EMPTY_RANGES which is set if
an ending range point has to collate higher or equal to the
starting range point.
Added syntax bit RE_NO_HYPHEN_RANGE_END which is set if a hyphen
can't be an ending range point.
Set the two above bits in RE_SYNTAX_POSIX_BASIC and
RE_SYNTAX_POSIX_EXTENDED.

* regex.c: (re_compile_pattern): Don't allow empty ranges if the
RE_NO_EMPTY_RANGES syntax bit is set.
Don't let a hyphen be a range end if the RE_NO_HYPHEN_RANGE_END
syntax bit is set.
(ESTACK_PUSH_2): renamed this PUSH_FAILURE_POINT and made it
push all the used registers on the stack, as well as the number
of the highest numbered register used, and (as before) the two
failure points.
(re_match_2): Fixed up comments.
Added arrays best_regstart[], best_regstart_seg1[], best_regend[],
and best_regend_seg1[] to keep track of the best match so far
whenever reach the end of the pattern but not the end of the
string, and there are still failure points on the stack with
which to backtrack; if so, do the saving and force a fail.
If reach the end of the pattern but not the end of the string,
but there are no more failure points to try, restore the best
match so far, set the registers and return.
Compacted some code.
In stop_memory case, if the subexpression we've just left is in
a loop, push onto the stack the loop's on_failure_jump failure
point along with the current pointer into the string (d).
In finalize_jump case, in addition to popping the failure
points, pop the saved registers.
In the fail case, restore the registers, as well as the failure
points.

Sun Feb 18 15:08:10 1990 Kathy Hargreaves (kathy at hayley)

* regex.c: (global): Defined a macro GET_BUFFER_SPACE which
makes sure you have a specified number of buffer bytes
allocated.
Redefined the macro BUFPUSH to use this.
Added comments.

(re_compile_pattern): Call GET_BUFFER_SPACE before storing or
inserting any jumps.

(re_match_2): Set d to string1 + pos and dend to end_match_1
only if string1 isn't null.
Force exit from a loop if it's around empty parentheses.
In stop_memory case, if found some jumps, increment p2 before
extracting address to which to jump. Also, don't need to know
how many more times can jump_n.
In begline case, d must equal string1 or string2, in that order,
only if they are not null.
In maybe_finalize_jump case, skip over start_memorys' and
stop_memorys' register numbers, too.

Thu Feb 15 15:53:55 1990 Kathy Hargreaves (kathy at hayley)

* regex.c (BUFPUSH): off by one goof in deciding whether to
EXTEND_BUFFER.

Wed Jan 24 17:07:46 1990 Kathy Hargreaves (kathy at hayley)

* regex.h: Moved definition of NULL to here.
Got rid of ``In other words...'' comment.
Added to some comments.

* regex.c: (re_compile_pattern): Tried to bulletproof some code,
i.e., checked if backward references (e.g., p[-1]) were within
the range of pattern.

(re_compile_fastmap): Fixed a bug in succeed_n part where was
getting the amount to jump instead of how many times to jump.

(re_search_2): Changed the name of the variable ``total'' to
``total_size.''
Condensed some code.

(re_match_2): Moved the comment about duplicate from above the
start_memory case to above duplicate case.

(global): Rewrote some comments.
Added commandline arguments to testing.

Wed Jan 17 11:47:27 1990 Kathy Hargreaves (kathy at hayley)

* regex.c: (global): Defined a macro STORE_NUMBER which stores a
number into two contiguous bytes. Also defined STORE_NUMBER_AND_INCR
which does the same thing and then increments the pointer to the
storage place to point after the number.
Defined a macro EXTRACT_NUMBER which extracts a number from two
contiguous bytes. Also defined EXTRACT_NUMBER_AND_INCR which
does the same thing and then increments the pointer to the
source to point to after where the number was.

Tue Jan 16 12:09:19 1990 Kathy Hargreaves (kathy at hayley)

* regex.h: Incorporated rms' changes.
Defined RE_NO_BK_REFS syntax bit which is set when want to
interpret back reference patterns as literals.
Defined RE_NO_EMPTY_BRACKETS syntax bit which is set when want
empty bracket expressions to be illegal.
Defined RE_CONTEXTUAL_ILLEGAL_OPS syntax bit which is set when want
it to be illegal for *, +, ? and { to be first in an re or come
immediately after a | or a (, and for ^ not to appear in a
nonleading position and $ in a nontrailing position (outside of
bracket expressions, that is).
Defined RE_LIMITED_OPS syntax bit which is set when want +, ?
and | to always be literals instead of ops.
Fixed up the Posix syntax.
Changed the syntax bit comments from saying, e.g., ``0 means...''
to ``If this bit is set, it means...''.
Changed the syntax bit defines to use shifts instead of integers.

* regex.c: (global): Incorporated rms' changes.

(re_compile_pattern): Incorporated rms' changes.
Made it illegal for a $ to appear anywhere but inside a bracket
|
||
expression or at the end of an re when RE_CONTEXTUAL_ILLEGAL_OPS
|
||
is set. Made the same hold for $ except it has to be at the
|
||
beginning of an re instead of the end.
|
||
Made the re "[]" illegal if RE_NO_EMPTY_BRACKETS is set.
|
||
Made it illegal for | to be first or last in an re, or immediately
|
||
follow another | or a (.
|
||
Added and embellished some comments.
|
||
Allowed \{ to be interpreted as a literal if RE_NO_BK_CURLY_BRACES
|
||
is set.
|
||
Made it illegal for *, +, ?, and { to appear first in an re, or
|
||
immediately follow a | or a ( when RE_CONTEXTUAL_ILLEGAL_OPS is set.
|
||
Made back references interpreted as literals if RE_NO_BK_REFS is set.
|
||
Made recursive intervals either illegal (if RE_NO_BK_CURLY_BRACES
|
||
isn't set) or interpreted as literals (if is set), if RE_INTERVALS
|
||
is set.
|
||
Made it treat +, ? and | as literals if RE_LIMITED_OPS is set.
|
||
Cleaned up some code.
|
||
|
||
Thu Dec 21 15:31:32 1989 Kathy Hargreaves (kathy at hayley)

* regex.c: (global): Moved RE_DUP_MAX to regex.h and made it
equal 2^15 - 1 instead of 1000.
Defined NULL to be zero.
Moved the definition of BYTEWIDTH to regex.h.
Made the global variable obscure_syntax nonstatic so the tests in
another file could use it.

(re_compile_pattern): Defined a maximum length (CHAR_CLASS_MAX_LENGTH)
for character class strings (i.e., what's between the [: and the
:]'s).
Defined a macro SET_LIST_BIT(c) which sets the bit for C in a
character set list.
Took out comments that EXTEND_BUFFER clobbers C.
Made the string "^" match itself, if not RE_CONTEXT_IND_OPS.
Added character classes to bracket expressions.
Changed the laststart pointer saved with the start of each
subexpression to point to start_memory instead of after the
following register number. This is because the subexpression
might be in a loop.
Added comments and compacted some code.
Made intervals only work if preceded by an re matching a single
character or a subexpression.
Made back references to nonexistent subexpressions illegal if
using POSIX syntax.
Made intervals work on the last preceding character of a
concatenation of characters, e.g., ab{0,} matches abbb, not abab.
Moved macro PREFETCH to outside the routine.

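The SET_LIST_BIT macro mentioned above sets one bit per character in a
bitmap. A minimal sketch follows, assuming BYTEWIDTH is 8 and passing
the list explicitly. The one-argument macro named in the entry
presumably uses a pointer already in scope in re_compile_pattern; the
two-argument form and the helper list_bit_is_set are hypothetical.

```c
#include <string.h>

#define BYTEWIDTH 8   /* Bits per byte; assumed value.  */

/* Sketch only: set the bit for character C in the bitmap LIST.
   C is cast to unsigned char so that characters with the high
   bit set do not produce a negative index.  */
#define SET_LIST_BIT(list, c)                                   \
  ((list)[(unsigned char) (c) / BYTEWIDTH]                      \
     |= (unsigned char) (1 << ((unsigned char) (c) % BYTEWIDTH)))

/* Hypothetical helper: nonzero if the bit for C is set in LIST.  */
static int
list_bit_is_set (const unsigned char *list, int c)
{
  return (list[(unsigned char) c / BYTEWIDTH]
          >> ((unsigned char) c % BYTEWIDTH)) & 1;
}
```

A bitmap of 256 / BYTEWIDTH bytes covers every possible character, so
membership in a bracket expression becomes a single index and mask.
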
(re_compile_fastmap): Added succeed_n to work analogously to
on_failure_jump if n is zero and jump_n to work analogously to
the other backward jumps.

(re_match_2): Defined macro SET_REGS_MATCHED to set which
current subexpressions had matches within them.
Changed some comments.
Added reg_active and reg_matched_something arrays to keep track
of which subexpressions have currently matched something.
Defined MATCHING_IN_FIRST_STRING and replaced ``dend == end_match_1''
with it to make code easier to understand.
Fixed so can apply * and intervals to arbitrarily nested
subexpressions. (Lots of previous bugs here.)
Changed so won't match a newline if syntax bit RE_DOT_NOT_NULL is set.
Made the upcase array nonstatic so the testing file could use it also.

(main.c): Moved the tests out to another file.

(tests.c): Moved all the testing stuff here.

Sat Nov 18 19:30:30 1989 Kathy Hargreaves (kathy at hayley)

* regex.c: (re_compile_pattern): Defined RE_DUP_MAX, the maximum
number of times an interval can match a pattern.
Added macro GET_UNSIGNED_NUMBER (used to get below):
Added variables lower_bound and upper_bound for the lower and upper
bounds of intervals.
Added variable num_fetches so intervals could do backtracking.
Added code to handle '{' and "\{" and intervals.
Added to comments.

(store_jump_n): (Added) Stores a jump with a number following the
relative address (for intervals).

(insert_jump_n): (Added) Inserts a jump_n.

(re_match_2): Defined a macro ESTACK_PUSH_2 for the error stack;
it checks for overflow and reallocates if necessary.

* regex.h: Added bits (RE_INTERVALS and RE_NO_BK_CURLY_BRACES)
to obscure syntax to indicate whether or not
a syntax handles intervals and recognizes either \{ and
\} or { and } as operators. Also added two syntaxes
RE_SYNTAX_POSIX_BASIC and RE_POSIX_EXTENDED and two command codes
to the enumeration regexpcode; they are succeed_n and jump_n.

Sat Nov 18 19:30:30 1989 Kathy Hargreaves (kathy at hayley)

* regex.c: (re_compile_pattern): Defined INIT_BUFF_SIZE to get rid
of repeated constants in code. Tested with value 1.
Renamed PATPUSH as BUFPUSH, since it pushes things onto the
buffer, not the pattern. Also made this macro extend the buffer
if it's full (so could do the following):
Took out code at top of loop that checks to see if buffer is going
to be full after 10 additions (and reallocates if necessary).

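The idea behind making BUFPUSH extend the buffer itself can be sketched
as a push that grows on demand. The struct pattern_buffer, the function
buf_push, and the doubling policy below are hypothetical stand-ins
written for this illustration, not the macro from regex.c. Setting
INIT_BUFF_SIZE to 1, as the entry says was done in testing, forces the
growth path to run almost immediately.

```c
#include <assert.h>
#include <stdlib.h>

#define INIT_BUFF_SIZE 1   /* Tiny on purpose, to exercise growth.  */

/* Hypothetical buffer type for this sketch.  */
struct pattern_buffer
{
  unsigned char *data;
  size_t used;
  size_t allocated;
};

/* Append BYTE to BUF, growing the buffer first if it is full.
   Returns 0 on success, -1 if memory could not be allocated.  */
static int
buf_push (struct pattern_buffer *buf, unsigned char byte)
{
  if (buf->used == buf->allocated)
    {
      size_t new_size = buf->allocated ? buf->allocated * 2 : INIT_BUFF_SIZE;
      unsigned char *grown = realloc (buf->data, new_size);
      if (grown == NULL)
        return -1;
      buf->data = grown;
      buf->allocated = new_size;
    }
  assert (buf->used < buf->allocated);
  buf->data[buf->used++] = byte;
  return 0;
}
```

Because every push checks for room on its own, the caller no longer
needs a separate check at the top of the compile loop to reserve space
ahead of time.
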
(insert_jump): Rearranged declaration lines so comments would read
better.

(re_match_2): Compacted exactn code and added more comments.

(main): Defined macros TEST_MATCH and MATCH_SELF to do
testing; took out loop so could use these instead.

Tue Oct 24 20:57:18 1989 Kathy Hargreaves (kathy at hayley)

* regex.c (re_set_syntax): Gave argument `syntax' a type.
(store_jump, insert_jump): Made them void functions.

Local Variables:
mode: indented-text
left-margin: 8
version-control: never
End: