flex has the following options:
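`-d' makes the generated scanner run in debug mode.
Whenever a pattern is recognized and the global
yy_flex_debug is non-zero (which is the default),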
the scanner will write to stderr a line of the
form:
--accepting rule at line 53 ("the matched text")
The line number refers to the location of the rule
in the file defining the scanner (i.e., the file
that was fed to flex). Messages are also generated
when the scanner backs up, accepts the default
rule, reaches the end of its input buffer (or
encounters a NUL; at this point, the two look the
same as far as the scanner's concerned), or reaches
an end-of-file.
`-h' generates a "help" summary of flex's options to
stdout and then exits. `-?' and `--help' are synonyms
for `-h'.
`-i' instructs flex to generate a case-insensitive
scanner. The case of letters given in the flex input
patterns will be ignored, and tokens in the input
will be matched regardless of case. The matched
text given in yytext will have the preserved case
(i.e., it will not be folded).
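The behavior described above (case ignored while matching, case preserved in the matched text) can be pictured with a small C sketch. The helper `match_ignoring_case` is hypothetical and is not part of flex; it only illustrates that the comparison folds case while the input buffer itself is left untouched.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of flex: returns the length of the prefix
 * of `input` that matches `pattern` when letter case is ignored, or 0 if
 * there is no match. The input is never modified, so the caller still
 * sees the matched text in its original case. */
size_t match_ignoring_case(const char *input, const char *pattern)
{
    size_t i = 0;

    assert(input != NULL && pattern != NULL);
    while (pattern[i] != '\0') {
        unsigned char a = (unsigned char)input[i];
        unsigned char b = (unsigned char)pattern[i];

        /* Fold both sides only for the comparison. */
        if (a == '\0' || tolower(a) != tolower(b))
            return 0;
        i++;
    }
    return i;
}
```

Calling `match_ignoring_case("BeGin x", "begin")` reports a five-character match, and the first five characters of the input still read `BeGin`.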
`-l' turns on maximum compatibility with the original
AT&T lex implementation. Note that this does not
mean full compatibility. Use of this option costs
a considerable amount of performance, and it cannot
be used with the `-+', `-f', `-F', `-Cf', or `-CF' options.
For details on the compatibilities it provides, see
the section "Incompatibilities With Lex And POSIX"
below. This option also results in the name
YY_FLEX_LEX_COMPAT being #define'd in the generated
scanner.
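`-p' generates a performance report to stderr. The
report consists of comments regarding features of
the flex input file which will cause a serious loss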
of performance in the resulting scanner. If you
give the flag twice, you will also get comments
regarding features that lead to minor performance
losses.
Note that the use of REJECT, `%option yylineno' and
variable trailing context (see the Deficiencies / Bugs section below)
entails a substantial performance penalty; use of `yymore()',
the `^' operator, and the `-I' flag entail minor performance
penalties.
`-s' causes the default rule (that unmatched scanner
input is echoed to stdout) to be suppressed. If
the scanner encounters input that does not match
any of its rules, it aborts with an error. This
option is useful for finding holes in a scanner's
rule set.
`-t' instructs flex to write the scanner it generates to
standard output instead of `lex.yy.c'.
`-v' specifies that flex should write to stderr a
summary of statistics regarding the scanner it
generates. Most of the statistics are meaningless to
the casual flex user, but the first line identifies
the version of flex (same as reported by `-V'), and
the next line the flags used when generating the
scanner, including those that are on by default.
`-B' instructs flex to generate a batch scanner, the
opposite of interactive scanners generated by `-I'
(see below). In general, you use `-B' when you are
certain that your scanner will never be used
interactively, and you want to squeeze a little more
performance out of it. If your goal is instead to
squeeze out a lot more performance, you should be
using the `-Cf' or `-CF' options (discussed below),
which turn on `-B' automatically anyway.
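`-F' specifies that the fast scanner table representation should be used (and stdio bypassed). In general, if the pattern set contains both "keywords" and a catch-all "identifier" rule, such as in the set:

    "case"    return TOK_CASE;
    "switch"  return TOK_SWITCH;
    ...
    "default" return TOK_DEFAULT;
    [a-z]+    return TOK_ID;

then you're better off using the full table representation. If only the "identifier" rule is present and you then use a hash table or some such to detect the keywords, you're better off using `-F'. This option is equivalent to `-CFr' (see below). It cannot be used with `-+'.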
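The "identifier rule plus keyword lookup" arrangement mentioned above might look roughly like the following C sketch. It swaps the hash table for a binary search over a sorted table, which is one instance of "some such". The helper name `classify_identifier` and the numeric token values are assumptions made for illustration; only the token names come from the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Token names follow the example rules; the numeric values are arbitrary. */
enum { TOK_ID = 256, TOK_CASE, TOK_DEFAULT, TOK_SWITCH };

struct keyword {
    const char *name;
    int token;
};

/* Must stay sorted by name, because bsearch() requires sorted input. */
static const struct keyword keywords[] = {
    { "case",    TOK_CASE    },
    { "default", TOK_DEFAULT },
    { "switch",  TOK_SWITCH  },
};

static int compare_keyword(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const struct keyword *)elem)->name);
}

/* Hypothetical helper: meant to be called from the action of the single
 * [a-z]+ rule with the matched text, returning the token to report. */
int classify_identifier(const char *text)
{
    const struct keyword *kw;

    assert(text != NULL);
    kw = bsearch(text, keywords, sizeof keywords / sizeof keywords[0],
                 sizeof keywords[0], compare_keyword);
    return kw != NULL ? kw->token : TOK_ID;
}
```

With this in place, the scanner needs only the identifier pattern, and the action would be something like `return classify_identifier(yytext);`.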
`-I' instructs flex to generate an interactive scanner.
An interactive scanner is one that only looks ahead
to decide what token has been matched if it
absolutely must. It turns out that always looking one
extra character ahead, even if the scanner has
already seen enough text to disambiguate the
current token, is a bit faster than only looking ahead
when necessary. But scanners that always look
ahead give dreadful interactive performance; for
example, when a user types a newline, it is not
recognized as a newline token until they enter
another token, which often means typing in another
whole line.
Flex scanners default to interactive unless you use
the `-Cf' or `-CF' table-compression options (see
below). That's because if you're looking for
high-performance you should be using one of these
options, so if you didn't, flex assumes you'd
rather trade off a bit of run-time performance for
intuitive interactive behavior. Note also that you
cannot use `-I' in conjunction with `-Cf' or `-CF'.
Thus, this option is not really needed; it is on by
default for all those cases in which it is allowed.
You can force a scanner to not be interactive by
using `-B' (see above).
`-L' instructs flex not to generate `#line' directives.
Without this option, flex peppers the generated
scanner with #line directives so error messages in
the actions will be correctly located with respect
to either the original flex input file (if the
errors are due to code in the input file), or
`lex.yy.c' (if the errors are flex's fault -- you
should report these sorts of errors to the email
address given below).
`-T' makes flex run in trace mode. It will generate a
lot of messages to stderr concerning the form of
the input and the resultant non-deterministic and
deterministic finite automata. This option is
mostly for use in maintaining flex.
`-V' prints the version number to stdout and exits.
`--version' is a synonym for `-V'.
`-7' instructs flex to generate a 7-bit scanner, i.e.,
one which can only recognize 7-bit characters in
its input. The advantage of using `-7' is that the
scanner's tables can be up to half the size of
those generated using the `-8' option (see below).
The disadvantage is that such scanners often hang
or crash if their input contains an 8-bit
character.
Note, however, that unless you generate your
scanner using the `-Cf' or `-CF' table compression options,
use of `-7' will save only a small amount of table
space, and make your scanner considerably less
portable. Flex's default behavior is to generate
an 8-bit scanner unless you use `-Cf' or `-CF', in
which case flex defaults to generating 7-bit
scanners unless your site was always configured to
generate 8-bit scanners (as will often be the case
with non-USA sites). You can tell whether flex
generated a 7-bit or an 8-bit scanner by inspecting
the flag summary in the `-v' output as described
above.
Note that if you use `-Cfe' or `-CFe' (those table
compression options, but also using equivalence
classes as discussed below), flex still
defaults to generating an 8-bit scanner, since
usually with these compression options full 8-bit
tables are not much more expensive than 7-bit
tables.
`-8' instructs flex to generate an 8-bit scanner, i.e.,
one which can recognize 8-bit characters. This
flag is only needed for scanners generated using
`-Cf' or `-CF', as otherwise flex defaults to
generating an 8-bit scanner anyway.
See the discussion of `-7' above for flex's default
behavior and the tradeoffs between 7-bit and 8-bit
scanners.
`-Ce' directs flex to construct equivalence classes,
i.e., sets of characters which have identical
lexical properties (for example, if the only appearance
of digits in the flex input is in the character
class "[0-9]" then the digits '0', '1', ..., '9'
will all be put in the same equivalence class).
Equivalence classes usually give dramatic
reductions in the final table/object file sizes
(typically a factor of 2-5) and are pretty cheap
performance-wise (one array look-up per character
scanned).
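To make the idea of equivalence classes concrete, here is a minimal C sketch. It is not flex's actual algorithm. It assumes each character class in the input is given as a string of its members, and it treats two characters as equivalent when they belong to exactly the same classes. The function name `build_equivalence_classes` and the limit of 32 classes are illustrative assumptions.

```c
#include <assert.h>
#include <string.h>

#define NCHARS 256

/* Hypothetical sketch, not flex's implementation. Fills ec[c] with the
 * equivalence class number of character c and returns the number of
 * classes. sets[] holds up to 32 character classes, each written as a
 * string of member characters. */
int build_equivalence_classes(const char *const sets[], int nsets,
                              unsigned char ec[NCHARS])
{
    unsigned long signature[NCHARS]; /* bit s set: c belongs to sets[s] */
    unsigned long seen[NCHARS];      /* distinct signatures found so far */
    int nclasses = 0;
    int c, s, k;

    assert(nsets >= 0 && nsets <= 32);

    for (c = 0; c < NCHARS; c++) {
        signature[c] = 0;
        for (s = 0; s < nsets; s++)
            /* Skip c == 0: strchr() would match the string terminator. */
            if (c != 0 && strchr(sets[s], c) != NULL)
                signature[c] |= 1UL << s;
    }

    /* Characters with the same signature share one class number. */
    for (c = 0; c < NCHARS; c++) {
        for (k = 0; k < nclasses; k++)
            if (seen[k] == signature[c])
                break;
        if (k == nclasses)
            seen[nclasses++] = signature[c];
        ec[c] = (unsigned char)k;
    }
    return nclasses;
}
```

A transition table indexed by `ec[c]` instead of `c` then needs one column per class rather than one per character. The price is the single `ec[c]` array look-up per character scanned that the text mentions.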
`-Cf' specifies that the full scanner tables should
be generated - flex should not compress the tables
by taking advantage of similar transition
functions for different states.
`-CF' specifies that the alternate fast scanner
representation (described above under the `-F' flag)
should be used. This option cannot be used with
`-+'.
`-Cm' directs flex to construct meta-equivalence
classes, which are sets of equivalence classes (or
characters, if equivalence classes are not being
used) that are commonly used together.
Meta-equivalence classes are often a big win when using
compressed tables, but they have a moderate
performance impact (one or two "if" tests and one array
look-up per character scanned).
`-Cr' causes the generated scanner to bypass use of
the standard I/O library (stdio) for input.
Instead of calling `fread()' or `getc()', the scanner
will use the `read()' system call, resulting in a
performance gain which varies from system to
system, but in general is probably negligible unless
you are also using `-Cf' or `-CF'. Using `-Cr' can cause
strange behavior if, for example, you read from
yyin using stdio prior to calling the scanner
(because the scanner will miss whatever text your
previous reads left in the stdio input buffer).
`-Cr' has no effect if you define YY_INPUT (see The
Generated Scanner above).
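The hazard of mixing stdio reads with `read()` on the same file can be reproduced without flex. The following C sketch assumes a POSIX system; the function name `bytes_visible_to_read_after_getc` is made up for the demonstration. It reads one character through stdio and then counts how many bytes a direct `read()` on the same descriptor can still see.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical demonstration helper. Writes `text` to a temporary file,
 * consumes one character through stdio, then counts the bytes that a
 * direct read() on the underlying descriptor still returns.
 * Returns -1 on error (including an empty `text`). */
long bytes_visible_to_read_after_getc(const char *text)
{
    char buf[256];
    long total = 0;
    ssize_t n;
    FILE *fp;

    assert(text != NULL);
    fp = tmpfile();
    if (fp == NULL)
        return -1;
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    rewind(fp); /* flush what was written and seek back to the start */

    /* stdio usually pulls a whole buffer from the file at this point,
     * not just the single character requested. */
    if (fgetc(fp) == EOF) {
        fclose(fp);
        return -1;
    }

    /* Whatever stdio already buffered is invisible to read(). */
    while ((n = read(fileno(fp), buf, sizeof buf)) > 0)
        total += (long)n;

    fclose(fp);
    return n < 0 ? -1 : total;
}
```

For a short input, a buffered stdio implementation leaves `read()` with fewer bytes than the unread remainder of the file, and often with none at all. That buffered text is what a `-Cr' scanner would miss.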
A lone `-C' specifies that the scanner tables should
be compressed but neither equivalence classes nor
meta-equivalence classes should be used.
The options `-Cf' or `-CF' and `-Cm' do not make sense
together - there is no opportunity for
meta-equivalence classes if the table is not being
compressed. Otherwise the options may be freely
mixed, and are cumulative.
The default setting is `-Cem', which specifies that
flex should generate equivalence classes and
meta-equivalence classes. This setting provides the
highest degree of table compression. You can trade
larger tables for faster-executing scanners, with
the following ordering generally holding:
slowest & smallest
-Cem
-Cm
-Ce
-C
-C{f,F}e
-C{f,F}
-C{f,F}a
fastest & largest
Note that scanners with the smallest tables are
usually generated and compiled the quickest, so
during development you will usually want to use the
default, maximal compression.
`-Cfe' is often a good compromise between speed and
size for production scanners.
`-ooutput' directs flex to write the scanner to the file
output instead of `lex.yy.c'. If you combine `-o' with
the `-t' option, then the scanner is written to
stdout but its `#line' directives (see the `-L' option
above) refer to the file output.
`-Pprefix' changes the default `yy' prefix used by flex for all
globally-visible variable and function names to instead be prefix.
For example, `-Pfoo' changes the name of yytext to
`footext'. It also changes the name of the default output file
from `lex.yy.c' to `lex.foo.c' (`lexfoo.c' on MS-DOS).
Here are all of the names affected:
    yy_create_buffer
    yy_delete_buffer
    yy_flex_debug
    yy_init_buffer
    yy_flush_buffer
    yy_load_buffer_state
    yy_switch_to_buffer
    yyin
    yyleng
    yylex
    yylineno
    yyout
    yyrestart
    yytext
    yywrap
(If you are using a C++ scanner, then only yywrap
and yyFlexLexer are affected.) Within your scanner
itself, you can still refer to the global variables
and functions using either version of their name;
but externally, they have the modified name.
This option lets you easily link together multiple
flex programs into the same executable. Note,
though, that using this option also renames
`yywrap()', so you now must either provide your own
(appropriately-named) version of the routine for
your scanner, or use `%option noyywrap', as linking
with `-lfl' no longer provides one for you by
default.
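One way to picture the renaming is as a set of `#define` directives placed ahead of the scanner code, so that the source keeps saying `yylex` while the linker sees the prefixed name. The C sketch below assumes that mechanism. It uses two stub functions in place of real generated scanners, and their return values are arbitrary.

```c
#include <assert.h>

/* Stand-in for a scanner generated with `-Pfoo': inside this region the
 * code still says yylex, but the symbol actually defined is foolex. */
#define yylex foolex
int yylex(void) { return 1; } /* stub body; really defines foolex() */
#undef yylex

/* Stand-in for a second scanner generated with `-Pbar'. */
#define yylex barlex
int yylex(void) { return 2; } /* stub body; really defines barlex() */
#undef yylex

/* External code must use the modified names. Both scanners can now live
 * in the same executable without a symbol clash. */
int run_both_scanners(void)
{
    int a = foolex();
    int b = barlex();

    assert(a != b); /* two distinct functions were defined */
    return a * 10 + b;
}
```

In a real build the two scanners would live in separate files such as `lex.foo.c` and `lex.bar.c`, each needing its own `foowrap()` or `barwrap()` unless `%option noyywrap' is used.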
`-Sskeleton_file' overrides the default skeleton file from which flex
constructs its scanners. You'll never need this
option unless you are doing flex maintenance or
development.
flex also provides a mechanism for controlling options
within the scanner specification itself, rather than from
the flex command-line. This is done by including `%option'
directives in the first section of the scanner
specification. You can specify multiple options with a single
`%option' directive, and multiple directives in the first
section of your flex input file. Most options are given
simply as names, optionally preceded by the word "no"
(with no intervening whitespace) to negate their meaning.
A number are equivalent to flex flags or their negation:
    7bit            -7 option
    8bit            -8 option
    align           -Ca option
    backup          -b option
    batch           -B option
    c++             -+ option
    caseful or
    case-sensitive  opposite of -i (default)
    case-insensitive or
    caseless        -i option
    debug           -d option
    default         opposite of -s option
    ecs             -Ce option
    fast            -F option
    full            -f option
    interactive     -I option
    lex-compat      -l option
    meta-ecs        -Cm option
    perf-report     -p option
    read            -Cr option
    stdout          -t option
    verbose         -v option
    warn            opposite of -w option
                    (use "%option nowarn" for -w)
    array           equivalent to "%array"
    pointer         equivalent to "%pointer" (default)
Some `%option' directives provide features otherwise not available:
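`main' directs flex to provide a default `main()' program
for the scanner, which simply calls `yylex()'. This option
implies noyywrap (see below).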
`stdinit', if unset (i.e., `%option nostdinit'), initializes yyin
and yyout to nil FILE pointers, instead of stdin
and stdout.
`yylineno' directs flex to generate a scanner that maintains the number
of the current line read from its input in the global variable
yylineno. This option is implied by `%option lex-compat'.
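`yywrap', if unset (i.e., `%option noyywrap'), makes the scanner
not call `yywrap()' upon an end-of-file, but simply assume that
there are no more files to scan (until the user points
yyin at a new file and calls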
`yylex()' again).
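When `yywrap()' is kept, a user-supplied version typically points yyin at the next input file and returns 0 to continue scanning, or returns non-zero when nothing is left. The C sketch below illustrates that contract under some assumptions. The file list and the helper `set_input_files` are hypothetical, and `yyin` is declared here only as a stand-in for the variable that a real flex scanner defines itself.

```c
#include <assert.h>
#include <stdio.h>

FILE *yyin; /* stand-in: a real flex scanner provides this variable */

/* Hypothetical NULL-terminated list of files still to be scanned. */
static const char *const *remaining_files;

void set_input_files(const char *const *files)
{
    assert(files != NULL);
    remaining_files = files;
}

/* Returns 0 after pointing yyin at the next readable file, so scanning
 * continues; returns 1 when there are no more files to scan. */
int yywrap(void)
{
    while (remaining_files != NULL && *remaining_files != NULL) {
        FILE *next = fopen(*remaining_files++, "r");

        if (next != NULL) {
            if (yyin != NULL && yyin != stdin)
                fclose(yyin);
            yyin = next;
            return 0;
        }
        /* Unreadable file: skip it and try the following one. */
    }
    return 1;
}
```

Skipping unreadable files silently is a design choice of this sketch; a production scanner would more likely report the failure.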
flex scans your rule actions to determine whether you use
the REJECT or `yymore()' features. The reject and yymore
options are available to override its decision as to
whether you use the options, either by setting them (e.g.,
`%option reject') to indicate the feature is indeed used, or
unsetting them to indicate it actually is not used (e.g.,
`%option noyymore').
Three options take string-delimited values, offset with '=':
%option outfile="ABC"
is equivalent to `-oABC', and
%option prefix="XYZ"
is equivalent to `-PXYZ'.
Finally,
%option yyclass="foo"
only applies when generating a C++ scanner (`-+' option). It
informs flex that you have derived `foo' as a subclass of
yyFlexLexer so flex will place your actions in the member
function `foo::yylex()' instead of `yyFlexLexer::yylex()'.
It also generates a `yyFlexLexer::yylex()' member function that
emits a run-time error (by invoking `yyFlexLexer::LexerError()')
if called. See Generating C++ Scanners, below, for additional
information.
A number of options are available for lint purists who want to suppress the appearance of unneeded routines in the generated scanner. Each of the following, if unset, results in the corresponding routine not appearing in the generated scanner:
    input, unput
    yy_push_state, yy_pop_state, yy_top_state
    yy_scan_buffer, yy_scan_bytes, yy_scan_string
(though `yy_push_state()' and friends won't appear anyway unless you use `%option stack').