Rules

The rules tag is the most important tag in the specification because it defines the actual grammar for the binary file format. The user can define any number of nonterminal rules for parsing the format, and each rule has a type. A rule's type is either a simple type or a functional type. A simple type is just the result type of the rule, which must be a valid Java type. A functional type has one or more parameters, each of which is a simple type, and a result type, which is also a simple type. Only a rule that takes parameters may have a functional type.

Terminal Rules

There are many predefined terminal symbols that can be used to parse atomic elements of a binary file. We describe those terminal symbols below:

Nonterminal Rules

In addition to the predefined terminal rules, the user may define new nonterminal rules.

Examples:
         u8 ::= byte(cast=int) [[ $0 = $1; ]] ;;

         color ::= u8 u8 u8 [[ $0 = new int[]{$1, $2, $3}; ]] ;;

         colorTable(size) ::= list(type=color; length=[[size]])
                              [[ $0 = new ColorTable();
                                 for (int[] color : $1)
                                    $0.addColor(color);
                              ]] ;;
      

Switch Rule

The switch rule selects one of several rule items based on a given input value. The value is compared against several case values, and the first matching case value causes the corresponding rule item to be used.

Syntax:
        switch [[target]] {
          [[case1]] -> action1 ;;
          [[case2]] -> action2 ;;
          ...
          [[caseN]] -> actionN ;;
          default -> defaultAction ;;
        }
      

The target of a switch rule is a Java expression. The expression can evaluate to any valid Java type. The case expressions are also Java expressions, and they will be evaluated one by one and compared against the target expression. The comparison is done through the equals method, and primitives are wrapped before comparison is performed. The first case expression that matches the target will have its action rule item executed. The default case is required and its action will be executed if none of the other cases match.
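
The matching behaviour described above can be sketched in plain Java. This is only an illustration, not the generator's actual output: the class SwitchRuleSketch, the method selectCase, and the DEFAULT constant are hypothetical names. The sketch assumes each value is boxed according to its own static type before equals is called.

```java
import java.util.Objects;

// Hypothetical helper (not part of the parser generator): models how a
// switch rule chooses which case applies.
final class SwitchRuleSketch {
    // Returned when no case value matches, i.e. the default action applies.
    static final int DEFAULT = -1;

    // Returns the 0-based index of the first case value that equals the target.
    // Primitive arguments are boxed by the compiler when passed as Object.
    static int selectCase(Object target, Object... caseValues) {
        for (int i = 0; i < caseValues.length; i++) {
            // Comparison goes through equals(), as the text describes.
            if (Objects.equals(target, caseValues[i])) {
                return i; // first match wins; later duplicates are ignored
            }
        }
        return DEFAULT;
    }
}
```

For instance, selectCase(2, 1, 2, 2) returns index 1, the first of the two matching cases. Under the boxing assumption above, selectCase(3L, 1, 2, 3) falls through to DEFAULT, because a Long never equals an Integer. If the generator behaves the same way, the target and case expressions need to have the same type.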

Expression Rule

The expression rule item injects the value of a Java expression into the item numbering, so the value takes a numbered position alongside the parsed items. It contains both the type of the expression and the Java code for the expression itself.

Syntax:
        [:[ java type ]:[ java expression ]:]
      

Subparse Rule

The subparse rule item allows you to parse a Java buffer or stream according to a rule.

Syntax:
        subparse(input=[[ java expression  ]]; rule=any rule item)
      

The input value is a Java expression whose type must be either byte[] or java.io.InputStream. The rule value is any valid rule item symbol. The result value of the subparse is the result of parsing the given rule item on the given input expression.
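
As a rough illustration of the two accepted input kinds, the sketch below normalizes a byte[] or a java.io.InputStream into a stream and hands it to a rule. The names SubparseSketch, Rule, asStream, and subparse are hypothetical and chosen for this example only. The generated parser may handle the input differently.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical model of a subparse: not the generator's real implementation.
final class SubparseSketch {
    // Stand-in for "any rule item": consumes a stream and yields a result.
    interface Rule<T> {
        T parse(InputStream in) throws IOException;
    }

    // Accepts the two input kinds named in the text and rejects anything else.
    static InputStream asStream(Object input) {
        if (input instanceof byte[]) {
            return new ByteArrayInputStream((byte[]) input);
        }
        if (input instanceof InputStream) {
            return (InputStream) input;
        }
        throw new IllegalArgumentException(
                "subparse input must be byte[] or java.io.InputStream");
    }

    // The result of the subparse is simply the result of the rule on the input.
    static <T> T subparse(Object input, Rule<T> rule) {
        try {
            return rule.parse(asStream(input));
        } catch (IOException e) {
            // Assumption: I/O failures are reported as unchecked exceptions here.
            throw new UncheckedIOException(e);
        }
    }
}
```

With a rule such as `in -> in.read()`, calling subparse on the array {42, 7} yields 42, the first byte of the nested input.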

Predicate Rule

The predicate rule does not produce a value. Instead, it evaluates a boolean Java expression. If the Java expression evaluates to false, then the parsing fails and backtracks. If it evaluates to true, then no action is taken and parsing continues.

Syntax:
        {{ java boolean expression }}
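
One way to picture the fail-and-backtrack behaviour is the sketch below, which reads a byte, checks a predicate, and restores the stream position when the predicate is false. The class PredicateSketch and its methods are hypothetical. How the real parser records and restores positions is not specified here, so treat the saved-index approach as an assumption.

```java
import java.util.function.IntPredicate;

// Hypothetical cursor over a byte buffer, used only to illustrate a predicate
// that makes the parse fail and backtrack.
final class PredicateSketch {
    private final byte[] data;
    private int pos;

    PredicateSketch(byte[] data) {
        this.data = data;
    }

    int position() {
        return pos;
    }

    // Reads one unsigned byte and then evaluates the predicate on it.
    // On success the byte value (0..255) is returned and the position advances.
    // On failure the position is restored and -1 is returned.
    int readByteIf(IntPredicate predicate) {
        int saved = pos;            // remember where this attempt started
        if (pos >= data.length) {
            return -1;              // nothing left to read
        }
        int value = data[pos++] & 0xFF;
        if (!predicate.test(value)) {
            pos = saved;            // backtrack: undo the consumed byte
            return -1;
        }
        return value;               // predicate held: parsing continues
    }
}
```

A failed attempt leaves the cursor where it started, so a different alternative can then be tried from the same position.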
      

Code Rule

The code rule does not produce a value. It executes a piece of Java code that runs in the current context. It has access to all of the $N values that have been parsed so far. If multiple code blocks are used in the same rule, then they execute in the same context and may share variables.

Syntax:
        [[ java code ]]
      

Align Rule

The align rule does not produce a value. It takes an integer Java expression as a parameter, and adjusts the current stream index to be aligned on the given byte boundary.

Syntax:
        align( [[ java int expression ]] )
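
The index adjustment can be sketched as follows. The sketch assumes that alignment only ever moves the index forward to the next multiple of the boundary, and that an index already on the boundary is left unchanged. The class AlignSketch and the method align are hypothetical names.

```java
// Hypothetical helper showing the arithmetic behind an align rule.
final class AlignSketch {
    // Returns the smallest multiple of boundary that is >= index.
    // Assumes index >= 0 and boundary > 0.
    static int align(int index, int boundary) {
        if (index < 0 || boundary <= 0) {
            throw new IllegalArgumentException("index must be >= 0 and boundary > 0");
        }
        int remainder = index % boundary;
        // Already aligned: stay put. Otherwise skip the padding bytes.
        return remainder == 0 ? index : index + (boundary - remainder);
    }
}
```

Under these assumptions, align(5, 4) gives 8, while align(8, 4) stays at 8.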
      

EOF Rule

The EOF rule does not produce a value. It matches against the end of the current stream. The EOF rule will only succeed if it is the very last rule of the grammar.

Syntax:
        EOF