What are BNF grammar examples

BNF - EBNF

1. Backus-Naur form (BNF)

Formal languages ​​are clearly described by grammars. The Backus-Naur-Form (BNF) is a descriptive language for such grammars. It knows the elements "terminal symbols", "non-terminal symbols" and "productions".

Terminal symbols are characters (letters, numbers, special characters) that cannot be broken down further. Non-terminal symbols (hereinafter referred to as "variables") are gradually converted into terminal symbols by production rules. The sum of all productions forms the grammar.

A production rule has the structure

Variable :: = expression.

The expression on the right describes the structure of the non-terminal symbol on the left. It consists of a list of terminal and / or non-terminal symbols. Variable names are put in angle brackets for better readability. A vertical line means "or".

An example should clarify this. In the C programming language, identifiers must follow the following rules: "Identifiers consist of digits, letters and underscores, whereby the first character cannot be a digit".

Examples of valid identifiers are:

x, _x21, _a_b_c_, a1_b2

Examples of invalid identifiers are:

123, 2_a, 42bcd

A grammar according to BNF describes the rules for identifiers as follows:

[1] :: = | _ | | _ [2] :: = | _ | | | _ | [3] :: = A | ... | Z | a | ... | z [4] :: = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Please note the recursion in rule [2], which allows any number of additional letters, digits and underscores. Rule [3] is also written in a short form, which is actually Not Is part of the BNF.

[Back to table of contents]

2. Extended Backus-Naur Form (EBNF)

By adding the following elements, the BNF becomes the EBNF:

  • Grouping through brackets
  • Optionality
  • Repetitions

In addition, variables are not placed in angle brackets; instead, terminal symbols are enclosed in single or double quotation marks.

The characters for optionality and repetition have the following meanings:

? – one or none
* – any number (including none)
+ – any number, but at least one

The above example of the C identifier is written as follows in the EBNF:

[1] Identifier :: = (letter | '_') (letter | '_' | digit) * [2] letter :: = 'A' | ... | 'Z' | 'a' | ... | 'z' [3] digit :: = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'

You can see that the expansion of the EBNF saves one rule compared to the BNF and a more compact representation is achieved. The "remainder" rule is no longer required and the remaining "identifier" rule is formulated more compact and clearer and therefore more readable than [1] and [2] in the BNF representation.

[Back to table of contents]