Lex and yacc VTU programs
Lex and yacc help you write programs that transform structured input. This covers an enormous range of applications, from a simple text search program that looks for patterns in its input file to a C compiler that transforms a source program into optimized object code. In programs with structured input, two tasks occur over and over: dividing the input into meaningful units, and then discovering the relationships among the units.

For a C program, the units are variable names, constants, strings, operators, punctuation, and so forth. This division into units, which are usually called tokens, is known as lexical analysis, or lexing for short. Lex helps you by taking a set of descriptions of possible tokens and producing a C routine that can identify those tokens; we call that routine a lexical analyzer, or a lexer, or a scanner for short.

The set of descriptions you give to lex is called a lex specification. The token descriptions that lex uses are known as regular expressions, extended versions of the familiar patterns used by the grep and egrep commands.
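To give a feel for the notation, here are a few regular expressions of the kind a lex specification might use (illustrative patterns, not taken from the text):

```lex
[0-9]+                  /* one or more digits: an integer constant */
[0-9]+\.[0-9]+          /* digits, a literal dot, more digits: a simple real number */
[a-zA-Z_][a-zA-Z0-9_]*  /* a letter or underscore, then letters, digits, or underscores: a C-style identifier */
\"[^"]*\"               /* a double-quoted string with no embedded quotes */
```

Each pattern describes a whole class of strings; the lexer matches them against the input one token at a time.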

Lex turns these regular expressions into a form that the lexer can use to scan the input text extremely fast, independent of the number of expressions that it is trying to match. A lex lexer is almost always faster than a lexer that you might write in C by hand. As the input is divided into tokens, a program often needs to establish the relationship among the tokens.

A C compiler needs to find the expressions, statements, declarations, blocks, and procedures in the program. This task is known as parsing, and the list of rules that define the relationships that the program understands is a grammar.

Yacc takes a concise description of a grammar and produces a C routine that can parse that grammar: a parser. A yacc parser is generally not as fast as a parser you could write by hand, but the ease of writing and modifying the parser is invariably worth any loss of speed.
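As a sketch of what such a grammar description looks like (a minimal illustrative example, not one from the text), a yacc specification pairs grammar rules with C actions; here a tiny grammar evaluates additions and subtractions:

```yacc
%{
#include <stdio.h>
void yyerror(const char *s) { fprintf(stderr, "%s\n", s); }
int yylex(void);   /* the lexer is supplied separately, e.g. by lex */
%}
%token NUMBER
%left '+' '-'      /* left-associative, resolves the grammar's ambiguity */
%%
line: expr '\n'      { printf("= %d\n", $1); }
    ;
expr: expr '+' expr  { $$ = $1 + $3; }  /* add the two subexpressions */
    | expr '-' expr  { $$ = $1 - $3; }
    | NUMBER         /* default action: $$ = $1 */
    ;
%%
```

Yacc turns this description into a C parser that repeatedly calls yylex() for tokens and runs the action code each time it recognizes a rule.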

The amount of time a program spends in a parser is rarely enough to be an issue anyway. Whenever a task involves dividing the input into units and establishing some relationship among those units, you should think of lex and yacc. We do not intend this chapter to be a complete tutorial on lex and yacc, but rather a gentle introduction to their use. The simplest possible lex specification merely copies its input to its output unchanged; it acts very much like the UNIX cat command run with no arguments. Lex automatically generates the actual C program code needed to handle reading the input file and sometimes, as in this case, writing the output as well.
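That copy-everything specification can be written in two lines (a reconstruction of the classic minimal example; the %% line separates the definition section from the rules section):

```lex
%%
.|\n    ECHO;   /* match any character or a newline, and echo it to the output */
```

Because the single rule matches every possible input character and its action echoes it, the generated lexer simply reproduces its input.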

We start by identifying parts of speech (noun, verb, etc.). A short lex specification suffices to recognize a list of verbs. The first section of a specification, the definition section, introduces any initial C program code we want copied into the final program. This is especially important if, for example, we have header files that must be included for code later in the file to work.
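The verb-recognizing specification discussed here might look like the following (a reconstruction along the lines of the book's example; the exact list of verbs is illustrative):

```lex
%{
/*
 * This sample demonstrates very simple recognition:
 * a verb / not a verb.
 */
%}
%%
[\t ]+                /* ignore whitespace */ ;

is |
am |
are |
was |
were |
be |
being |
been |
do |
does |
did |
should |
can |
could |
has |
have |
had |
go          { printf("%s: is a verb\n", yytext); }

[a-zA-Z]+   { printf("%s: is not a verb\n", yytext); }

.|\n        { ECHO; /* normal default anyway */ }
%%

int main(void)
{
    yylex();
    return 0;
}
```

Running the generated lexer on a sentence tags each word as a verb or a non-verb and echoes everything else unchanged.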

In this example, the only thing in the definition section is some C comments, placed between %{ and %} delimiters so that lex copies them verbatim into the generated C file. You might wonder whether we could have included the comments without the delimiters. The next section is the rules section. Each rule is made up of two parts, a pattern and an action, separated by whitespace. The lexer that lex generates will execute the action when it recognizes the pattern.

These patterns are UNIX-style regular expressions, a slightly extended version of the same expressions used by tools such as grep, sed, and ed. Chapter 6 describes all the rules for regular expressions. The first rule in our example is the whitespace rule, whose pattern is [\t ]+: a tab or space character, repeated one or more times. Thus, this pattern describes whitespace, any combination of tabs and spaces.

The second part of the rule, the action, is simply a semicolon, a do-nothing C statement. Its effect is to ignore the input. The action for each verb except the last is a vertical bar, |. This is a special action that means to use the same action as the next pattern, so all of the verbs use the action specified for the last one.

Our patterns match any of the verbs in the list. Once we recognize a verb, we execute the action, a C printf statement. The array yytext contains the text that matched the pattern. But how does lex decide that a word such as is is a verb rather than just another string matching [a-zA-Z]+? The answer is that lex has a set of simple disambiguating rules. The two that make our lexer work are these: lex executes the action for the longest possible match for the current input, and when two patterns match the same length of input, lex prefers the pattern that appears earlier in the specification. If you think about how a lex lexer matches patterns, you should be able to see how our example matches only the verbs listed.
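A two-rule fragment makes the interplay of these disambiguating rules concrete (an illustrative fragment, not from the text):

```lex
%%
is          { printf("verb\n"); }   /* specific pattern, listed first */
[a-zA-Z]+   { printf("word\n"); }   /* catch-all for any other word */
```

Given the input is, both patterns match two characters, so the earlier rule wins and the lexer reports a verb; given island, the catch-all matches all six characters, the longest match wins, and the lexer reports an ordinary word rather than the verb is followed by land.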

The last line is the default case. The special action ECHO prints the matched pattern on the output, copying any punctuation or other characters. We explicitly list this case although it is the default behavior. We have seen some complex lexers that worked incorrectly because of this very feature, producing occasional strange output when the default pattern matched unanticipated input characters.


When the generated executable is run, it analyzes its input for occurrences of the regular expressions. Whenever it finds one, it executes the corresponding C code. Once you are proficient with Bison, you can use it to develop a wide range of language parsers, from those used in simple desk calculators to complex programming languages. Bison is upward compatible with Yacc: all properly written Yacc grammars ought to work with Bison with no change.

Anyone familiar with Yacc should be able to use Bison with little trouble.


