The plain text scanner is intended for human-language documents, or as the scanner of last resort for files for which no more specific scanner exists. It can be customized by designating character classes as either token constituents or token delimiters. By default, the alphanumeric characters are token constituents and all other characters are token delimiters.
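The constituent/delimiter split described above can be illustrated with a minimal Python sketch. This is not the scanner's actual implementation: the function name `scan_text` and its `include` and `exclude` parameters are hypothetical, and the default constituent set is assumed to be the ASCII letters and digits.

```python
import string

# Assumption: "alphanumeric" is taken to mean ASCII letters and digits.
DEFAULT_CONSTITUENTS = frozenset(string.ascii_letters + string.digits)


def scan_text(text, include="", exclude=""):
    """Split text into tokens (hypothetical helper, for illustration only).

    Characters in `include` are added to the constituent set; characters in
    `exclude` are removed from it and therefore act as delimiters.
    """
    constituents = (DEFAULT_CONSTITUENTS | set(include)) - set(exclude)
    tokens = []
    current = []
    for ch in text:
        if ch in constituents:
            # Constituent character: extend the token being built.
            current.append(ch)
        elif current:
            # Delimiter character: close the token in progress, if any.
            tokens.append("".join(current))
            current = []
    if current:
        # Flush a token that runs to the end of the input.
        tokens.append("".join(current))
    return tokens
```

With the defaults, `scan_text("foo_bar baz-42!")` returns `["foo", "bar", "baz", "42"]`, because the underscore, space, hyphen, and exclamation mark all act as delimiters. Passing `include="_"` makes the underscore a constituent, so `scan_text("foo_bar baz", include="_")` returns `["foo_bar", "baz"]`.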