| ------------------------------------------------------------------------------ |
| -- -- |
| -- GNAT COMPILER COMPONENTS -- |
| -- -- |
| -- G N A T . R E G E X P -- |
| -- -- |
| -- S p e c -- |
| -- -- |
| -- Copyright (C) 1998-2005, AdaCore -- |
| -- -- |
| -- GNAT is free software; you can redistribute it and/or modify it under -- |
| -- terms of the GNU General Public License as published by the Free Soft- -- |
| -- ware Foundation; either version 2, or (at your option) any later ver- -- |
| -- sion. GNAT is distributed in the hope that it will be useful, but WITH- -- |
| -- OUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY -- |
| -- or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License -- |
| -- for more details. You should have received a copy of the GNU General -- |
| -- Public License distributed with GNAT; see file COPYING. If not, write -- |
| -- to the Free Software Foundation, 51 Franklin Street, Fifth Floor, -- |
| -- Boston, MA 02110-1301, USA. -- |
| -- -- |
| -- As a special exception, if other files instantiate generics from this -- |
| -- unit, or you link this unit with other files to produce an executable, -- |
| -- this unit does not by itself cause the resulting executable to be -- |
| -- covered by the GNU General Public License. This exception does not -- |
| -- however invalidate any other reasons why the executable file might be -- |
| -- covered by the GNU Public License. -- |
| -- -- |
| -- GNAT was originally developed by the GNAT team at New York University. -- |
| -- Extensive contributions were provided by Ada Core Technologies Inc. -- |
| -- -- |
| ------------------------------------------------------------------------------ |
| |
| -- Simple Regular expression matching |
| |
| -- This package provides a simple implementation of a regular expression |
| -- pattern matching algorithm, using a subset of the syntax of regular |
| -- expressions copied from familiar Unix style utilities. |
| |
| ------------------------------------------------------------ |
| -- Summary of Pattern Matching Packages in GNAT Hierarchy -- |
| ------------------------------------------------------------ |
| |
| -- There are three related packages that perform pattern maching functions. |
| -- the following is an outline of these packages, to help you determine |
| -- which is best for your needs. |
| |
| -- GNAT.Regexp (files g-regexp.ads/g-regexp.adb) |
| -- This is a simple package providing Unix-style regular expression |
| -- matching with the restriction that it matches entire strings. It |
| -- is particularly useful for file name matching, and in particular |
| -- it provides "globbing patterns" that are useful in implementing |
| -- unix or DOS style wild card matching for file names. |
| |
| -- GNAT.Regpat (files g-regpat.ads/g-regpat.adb) |
| -- This is a more complete implementation of Unix-style regular |
| -- expressions, copied from the original V7 style regular expression |
| -- library written in C by Henry Spencer. It is functionally the |
| -- same as this library, and uses the same internal data structures |
| -- stored in a binary compatible manner. |
| |
| -- GNAT.Spitbol.Patterns (files g-spipat.ads/g-spipat.adb) |
| -- This is a completely general patterm matching package based on the |
| -- pattern language of SNOBOL4, as implemented in SPITBOL. The pattern |
| -- language is modeled on context free grammars, with context sensitive |
| -- extensions that provide full (type 0) computational capabilities. |
| |
| with Ada.Finalization; |
| |
| package GNAT.Regexp is |
| |
| -- The regular expression must first be compiled, using the Compile |
| -- function, which creates a finite state matching table, allowing |
| -- very fast matching once the expression has been compiled. |
| |
| -- The following is the form of a regular expression, expressed in Ada |
| -- reference manual style BNF is as follows |
| |
| -- regexp ::= term |
| |
| -- regexp ::= term | term -- alternation (term or term ...) |
| |
| -- term ::= item |
| |
| -- term ::= item item ... -- concatenation (item then item) |
| |
| -- item ::= elmt -- match elmt |
| -- item ::= elmt * -- zero or more elmt's |
| -- item ::= elmt + -- one or more elmt's |
| -- item ::= elmt ? -- matches elmt or nothing |
| |
| -- elmt ::= nchr -- matches given character |
| -- elmt ::= [nchr nchr ...] -- matches any character listed |
| -- elmt ::= [^ nchr nchr ...] -- matches any character not listed |
| -- elmt ::= [char - char] -- matches chars in given range |
| -- elmt ::= . -- matches any single character |
| -- elmt ::= ( regexp ) -- parens used for grouping |
| |
| -- char ::= any character, including special characters |
| -- nchr ::= any character except \()[].*+?^ or \char to match char |
| -- ... is used to indication repetition (one or more terms) |
| |
| -- See also regexp(1) man page on Unix systems for further details |
| |
| -- A second kind of regular expressions is provided. This one is more |
| -- like the wild card patterns used in file names by the Unix shell (or |
| -- DOS prompt) command lines. The grammar is the following: |
| |
| -- regexp ::= term |
| |
| -- term ::= elmt |
| |
| -- term ::= elmt elmt ... -- concatenation (elmt then elmt) |
| -- term ::= * -- any string of 0 or more characters |
| -- term ::= ? -- matches any character |
| -- term ::= [char char ...] -- matches any character listed |
| -- term ::= [char - char] -- matches any character in given range |
| -- term ::= {elmt, elmt, ...} -- alternation (matches any of elmt) |
| |
| -- Important note : This package was mainly intended to match regular |
| -- expressions against file names. The whole string has to match the |
| -- regular expression. If only a substring matches, then the function |
| -- Match will return False. |
| |
| type Regexp is private; |
| -- Private type used to represent a regular expression |
| |
| Error_In_Regexp : exception; |
| -- Exception raised when an error is found in the regular expression |
| |
| function Compile |
| (Pattern : String; |
| Glob : Boolean := False; |
| Case_Sensitive : Boolean := True) return Regexp; |
| -- Compiles a regular expression S. If the syntax of the given |
| -- expression is invalid (does not match above grammar, Error_In_Regexp |
| -- is raised. If Glob is True, the pattern is considered as a 'globbing |
| -- pattern', that is a pattern as given by the second grammar above. |
| -- As a special case, if Pattern is the empty string it will always |
| -- match. |
| |
| function Match (S : String; R : Regexp) return Boolean; |
| -- True if S matches R, otherwise False. Raises Constraint_Error if |
| -- R is an uninitialized regular expression value. |
| |
| private |
| type Regexp_Value; |
| |
| type Regexp_Access is access Regexp_Value; |
| |
| type Regexp is new Ada.Finalization.Controlled with record |
| R : Regexp_Access := null; |
| end record; |
| |
| pragma Finalize_Storage_Only (Regexp); |
| |
| procedure Finalize (R : in out Regexp); |
| -- Free the memory occupied by R |
| |
| procedure Adjust (R : in out Regexp); |
| -- Called after an assignment (do a copy of the Regexp_Access.all) |
| |
| end GNAT.Regexp; |