PIGBARF is, in essence, a grammar-based, YACC-like tool that produces Java code for parsing a particular binary file format. The syntax is intentionally similar to YACC's, so users familiar with YACC should find it easy to use.
The grammar is defined inside the <rules> tag and consists of a set of nonterminal rules. A nonterminal consists of a name, some (optional) parameters, and a right-hand side. The right-hand side of a rule is a list of rule items. Rule items are of two types: value items and non-value items. A value item reads something from the data stream and produces a value as a result. A non-value item, such as a code rule item or a predicate rule item, produces no value. Every nonterminal rule defines a value rule item, so a nonterminal can itself appear on the right-hand side of another rule.

The result value of the nonterminal is stored in a variable named $0. This variable must be set inside the nonterminal rule by a code rule item. The values of the other value rule items are assigned, in order, to variables named $1, $2, etc. These variables can be referenced and manipulated by code rule items or predicate rule items.
Example:

    <rules start="mynonterm">
      mynonterm ::=
        int                               <!-- value rule item, stored into $1 -->
        int                               <!-- value rule item, stored into $2 -->
        {{ $1 == $2 }}                    <!-- predicate rule item, has no value -->
        [[ $0 = "the result value"; ]]    <!-- code rule item, has no value, assigns value to $0 -->
      ;;
    </rules>
In the example above, the type of the variables $1 and $2 is the type associated with the int terminal rule. The type of the $0 variable is the result type of the nonterminal being defined. This type is specified in the <types> tag.
Example:

    <types>
      String : mynonterm ;
    </types>
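To make the $0/$1/$2 convention more concrete, here is a hand-written Java sketch of roughly what the mynonterm rule describes. It is not the code PIGBARF generates. The class and method names are hypothetical, and two behaviors are assumed for illustration only: that the int terminal reads a 4-byte big-endian integer, and that a failed predicate is reported by throwing an exception.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical hand-written equivalent of the mynonterm rule; not PIGBARF output.
final class MyNontermSketch {

    // Mirrors: mynonterm ::= int int {{ $1 == $2 }} [[ $0 = "the result value"; ]] ;;
    static String parseMynonterm(DataInputStream in) throws IOException {
        String result;            // plays the role of $0 (type String, from the <types> tag)
        int v1 = in.readInt();    // $1; assumption: int is a 4-byte big-endian integer
        int v2 = in.readInt();    // $2
        if (!(v1 == v2)) {        // predicate rule item {{ $1 == $2 }}
            // Assumption: a failed predicate aborts the parse with an exception.
            throw new IllegalStateException("predicate $1 == $2 failed");
        }
        result = "the result value";   // code rule item assigning $0
        return result;
    }

    // Convenience wrapper that parses from an in-memory byte array.
    static String parse(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return parseMynonterm(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {0, 0, 0, 7, 0, 0, 0, 7};   // two equal ints
        System.out.println(parse(data));
    }
}
```

The value items become local variables in the order they appear, the predicate becomes a check on those variables, and the code item is the only place where the result is assigned.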
Nonterminals may also take parameters. The parameters are Java values that are passed from one rule to another. The parameters are named in the rule definition, and their types are specified in the types definition.
Example:

    <rules start="mynonterm3">
      mynonterm3 ::=
        int
        mynonterm2([[$1]])
        [[ $0 = $2; ]]
      ;;
      mynonterm2(numelements) ::=
        chars(length=[[numelements]])
        [[ $0 = $1; ]]
      ;;
    </rules>
    <types>
      (int) char[] : mynonterm2 ;
      char[] : mynonterm3 ;
    </types>
The nonterminal 'mynonterm2' takes one parameter named 'numelements'. The type of numelements is 'int', as specified in the types tag. For a nonterminal with parameters, the parameter types come first, in parentheses, separated by commas. The result type of the nonterminal follows the closing parenthesis.
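Parameter passing between rules can be pictured as ordinary Java method arguments. The sketch below is a hand-written illustration of the mynonterm3/mynonterm2 pair, not PIGBARF output. The class and method names are hypothetical, and it assumes that int is a 4-byte big-endian integer and that chars reads one byte per character; the real terminal rules may behave differently.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical hand-written equivalent of mynonterm3 and mynonterm2; not PIGBARF output.
final class ParamPassingSketch {

    // Mirrors: mynonterm2(numelements) ::= chars(length=[[numelements]]) [[ $0 = $1; ]] ;;
    // The "(int)" in the types declaration becomes the int parameter here.
    static char[] parseMynonterm2(DataInputStream in, int numelements) throws IOException {
        if (numelements < 0) {
            throw new IllegalStateException("negative length: " + numelements);
        }
        char[] v1 = new char[numelements];            // $1
        for (int i = 0; i < numelements; i++) {
            // Assumption: each char is stored as a single unsigned byte.
            v1[i] = (char) in.readUnsignedByte();
        }
        return v1;                                    // $0 = $1
    }

    // Mirrors: mynonterm3 ::= int mynonterm2([[$1]]) [[ $0 = $2; ]] ;;
    static char[] parseMynonterm3(DataInputStream in) throws IOException {
        int v1 = in.readInt();                        // $1: the element count
        char[] v2 = parseMynonterm2(in, v1);          // $2: $1 is passed as the parameter
        return v2;                                    // $0 = $2
    }

    // Convenience wrapper that parses from an in-memory byte array.
    static char[] parse(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return parseMynonterm3(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {0, 0, 0, 3, 'a', 'b', 'c'};    // count = 3, then three chars
        System.out.println(new String(parse(data)));
    }
}
```

The count read by the outer rule controls how much data the inner rule consumes, which is the typical reason to pass a parameter when parsing length-prefixed binary data.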
Now you know the basic idea behind creating nonterminals and assigning their types. To get more details on the different types of rule items you can use, go here.
The PIGBARF system is a Java JAR that runs on the command line. The command-line interface takes two parameters: the name of the XML specification file, and the desired name of the parser class that PIGBARF produces as output. The output itself will be Java code that is sent to STDOUT. The JAR file is executable, and the main class is pigbarf.Main.
Example Usage:

    // Terminal
    $ java -jar pigbarf.jar
    USAGE: Main <spec.xml> <class name>
    $ java -jar pigbarf.jar stl.xml STLParser > STLParser.java
    $ // now STLParser.java contains the Java code that was produced by PIGBARF
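If you would rather drive the generator from a Java build step than from a shell, the same invocation can be sketched with the standard ProcessBuilder class. The helper class and method names below are hypothetical. Only the jar name, the two arguments, and the fact that the generated code goes to STDOUT come from the usage above.

```java
import java.io.File;

// Hypothetical helper; only the jar name and its two arguments come from the usage above.
final class PigbarfRunner {

    // Builds the equivalent of: java -jar <jarPath> <specFile> <className> > <className>.java
    static ProcessBuilder buildCommand(String jarPath, String specFile, String className) {
        ProcessBuilder builder = new ProcessBuilder(
                "java", "-jar", jarPath, specFile, className);
        // The generated code is written to STDOUT, so capture it in <className>.java.
        builder.redirectOutput(new File(className + ".java"));
        // Pass anything written to stderr through to the caller's console.
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        return builder;
    }

    public static void main(String[] args) throws Exception {
        // Assumes pigbarf.jar and stl.xml are in the current working directory.
        int exitCode = buildCommand("pigbarf.jar", "stl.xml", "STLParser").start().waitFor();
        System.out.println("PIGBARF exited with code " + exitCode);
    }
}
```

Redirecting STDOUT to a file named after the parser class keeps the file name and the class name in step, which the Java compiler requires if the generated class is public.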