Building Parsers with MGrammar for .NET applications

by Doug Finke on November 2, 2008

in Intellipad, MGrammar, Oslo, Parser

MGrammar, part of the M Language, is Microsoft’s new tool for writing textual domain-specific languages (DSLs). Combined with Microsoft’s Intellipad, it lowers the barrier to developing DSLs for the .NET platform.

"M" is a core feature of "Oslo" and is a language for textually describing and authoring domains. "M" comprises the following: MGraph, MSchema, and MGrammar. MGraph is for serializing data values to a graph structure similar to syntaxes like JSON. MSchema builds on MGraph by providing a structural type system, extent declarations for storing values, and computed values, which are queries over values and extents. MGrammar is used to describe domain-specific languages in terms of rules that are used to transform input text to MGraph.

The following is right out of Steve Metsker’s book Building Parsers with Java. It is amazing how easy it is to use his grammar shorthand to ramp up with MGrammar.

Example: Designing a Grammar for a Track Robot

The idea is to create a command language for a simple factory. The robot can pick up, place and scan material on different conveyor belts. Here are some example commands for the robot:

   pick carrier from LineIn
   place carrier at DBOut
   scan DBOut

Grammar Rules

Using the shorthand rules, we have this:

command = pickCommand | placeCommand | scanCommand;
pickCommand = "pick" "carrier" "from" location;
placeCommand = "place" "carrier" "at" location;
scanCommand = "scan" location;
location = Word;
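
To make the shorthand concrete, here is a minimal Python sketch of a hand-written parser for a single command. It is not MGrammar and not code from Metsker's book. The function names (parse_command, expect, parse_location) are hypothetical, the place and scan rules are taken from the full grammar listed at the end of this post, and a Word is assumed to be a run of letters.

```python
import re

def expect(words, pos, literal):
    # Match one literal keyword such as "pick" or "carrier".
    if pos < len(words) and words[pos] == literal:
        return pos + 1
    raise SyntaxError(f"expected {literal!r} at word {pos}")

def parse_location(words, pos):
    # location = Word;  (a Word is assumed to be letters only)
    if pos < len(words) and re.fullmatch(r"[A-Za-z]+", words[pos]):
        return words[pos], pos + 1
    raise SyntaxError(f"expected a location at word {pos}")

def parse_command(text):
    # command = pickCommand | placeCommand | scanCommand;
    rules = {
        "pick": ("pick", "carrier", "from"),
        "place": ("place", "carrier", "at"),
        "scan": ("scan",),
    }
    words = text.split()
    if not words or words[0] not in rules:
        raise SyntaxError("expected pick, place or scan")
    pos = 0
    for literal in rules[words[0]]:
        pos = expect(words, pos, literal)
    location, pos = parse_location(words, pos)
    if pos != len(words):
        raise SyntaxError(f"unexpected input at word {pos}")
    return (words[0], location)

print(parse_command("pick carrier from LineIn"))
```

Each shorthand rule maps to a small function or table entry, which is the kind of plumbing MGrammar generates for you from the rule definitions.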

Intellipad and MGrammar

Create a file called Robot.mg, then start Intellipad. Be sure to launch "Intellipad (Samples Enabled)".

image

Once it has launched, press Ctrl-Shift-T. In the open file dialog, specify the location of Robot.mg. You should now see four panes.

image

The middle pane, in MGrammarMode, is where we will define the Robot grammar using MGrammar. The left pane, DynamicParserMode, is where we can type the example commands. The right pane, MGrammarPreviewMode, will contain the parse tree when the example commands are successfully processed by the grammar. The bottom pane, HyperlinkMode, shows errors from either the left pane or middle pane.

As you specify the grammar in the middle pane, errors show up in the bottom pane. These cover both invalid MGrammar syntax and example commands that your grammar cannot parse.

Productivity

Intellipad is productive. Typing in either the left or middle pane kicks off the cycle: Intellipad compiles the grammar, then runs whatever is in the DynamicParserMode pane against the resulting parser. Errors show up as squiggles and as hyperlinked elements in the error pane.

Let’s start by defining the pickCommand. In the DynamicParserMode pane I enter the example command and then flesh out the grammar in the middle.

image

Here, I use MGrammar’s module, language and syntax keywords. I do not compile the grammar myself; Intellipad does this as I type and reports errors as I go. Intellipad also runs the text in the DynamicParserMode pane against the grammar. This streamlines the process: I don’t have to set up a test harness, compile the grammar, or write code to feed the commands to my DSL.

Defining the pickCommand

Below, I define pickCommand so that it matches the literal “pick”. The MGrammar statement syntax Main = command; is the starting rule that kicks off the matching process.

Intellipad reports an error showing a red squiggle in my robot command. You can click the error line in the HyperlinkMode window and it will jump to that location.

image

Add a few more literals to pickCommand and the interleave keyword to handle whitespace.

Notice that the error moves to LineIn in the example command. LineIn is a ‘variable’ and we’ll treat it as a Word. We’ll build this pattern next.
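
Why the error moves can be sketched in a few lines of Python. This is only an illustration of the idea, not how Intellipad or MGrammar is implemented. The helper first_error_offset is hypothetical, and the whitespace characters are assumed to be the ones in the interleave rule of the full grammar at the end of this post. The exact spot where the real tool draws its squiggle may differ.

```python
WHITESPACE = " \r\n"  # assumed to mirror: interleave Whitespace = " " | "\r" | "\n";

def skip(text, pos):
    # Step over any interleaved whitespace starting at pos.
    while pos < len(text) and text[pos] in WHITESPACE:
        pos += 1
    return pos

def first_error_offset(text, literals, interleave):
    """Return the offset where matching stops, or None if all input is consumed."""
    pos = 0
    for literal in literals:
        if interleave:
            pos = skip(text, pos)
        if not text.startswith(literal, pos):
            return pos
        pos += len(literal)
    if interleave:
        pos = skip(text, pos)
    return None if pos == len(text) else pos

command = "pick carrier from LineIn"

# Only "pick" is defined and whitespace is not handled yet.
early = first_error_offset(command, ["pick"], interleave=False)

# More literals plus whitespace handling: matching now gets further.
later = first_error_offset(command, ["pick", "carrier", "from"], interleave=True)

print(early, later, command[later:])
```

With only "pick" defined, matching stops right after the keyword. Once the other literals and whitespace handling are in place, the first unmatched text is LineIn, because nothing in the grammar describes it yet.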

image

Finishing the pickCommand

Below a few things have been done:

  • Added another example command
  • Defined Word and added it to our pickCommand syntax
  • Indicated in Main that commands can be 1 or more with the + sign

Our grammar compiles, the example commands are processed, and the parsed results appear in the MGrammarPreviewMode pane on the right.

image

The Full Robot Grammar

The complete grammar is reproduced as text under Source for Robot Grammar at the end of this post.

Next Steps

This is just scratching the surface. Both Intellipad and MGrammar have far more to explore. Intellipad is highly customizable. Its use of IronPython is just one way to extend it. MGrammar also supports modularity, advanced grammar techniques (including parameterization and recursion), and custom projections.

Next, we need to use the language in a .NET application. This requires the mgx compiler and two new namespaces, System.Dataflow and Microsoft.M.Grammar.

Source for Robot Grammar

module RobotLibrary
{
    language Robot
    {        
        syntax command 
            = pickCommand 
            | placeCommand 
            | scanCommand;
            
        syntax pickCommand 
            = "pick" "carrier" "from" Word;
 
        syntax placeCommand 
            = "place" "carrier" "at" Word;
 
        syntax scanCommand 
            = "scan" Word;
        
        token Word = ('A'..'Z' | 'a'..'z')+;
        
        syntax Main = command+;
        interleave Whitespace = " " | "\r" | "\n";
    }
}
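
For readers without Intellipad at hand, the following Python sketch approximates what the grammar above accepts. It is an analogue under stated assumptions, not the MGrammar runtime and not its output format. The names tokenize, parse_main and RULES are hypothetical, and the returned list of dictionaries is an invented shape, not MGraph. It also does not model how MGrammar treats a keyword such as pick used where a Word is expected.

```python
import re

# token Word = ('A'..'Z' | 'a'..'z')+;
# interleave Whitespace = " " | "\r" | "\n";
TOKEN = re.compile(r"(?P<word>[A-Za-z]+)|(?P<ws>[ \r\n]+)")

# Literal prefix of each command rule; every rule ends with one Word.
RULES = {
    "pickCommand": ["pick", "carrier", "from"],
    "placeCommand": ["place", "carrier", "at"],
    "scanCommand": ["scan"],
}

def tokenize(text):
    # Keep words, drop interleaved whitespace, reject anything else.
    words, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup == "word":
            words.append(match.group())
        pos = match.end()
    return words

def parse_main(text):
    # syntax Main = command+;  (one or more commands)
    words = tokenize(text)
    commands, pos = [], 0
    while pos < len(words):
        for name, literals in RULES.items():
            end = pos + len(literals)
            if words[pos:end] == literals and end < len(words):
                commands.append({name: words[end]})
                pos = end + 1
                break
        else:
            raise SyntaxError(f"no command matches at word {pos}: {words[pos]!r}")
    if not commands:
        raise SyntaxError("expected at least one command")
    return commands

program = "pick carrier from LineIn\nplace carrier at DBOut\nscan DBOut"
print(parse_main(program))
```

Two details of the grammar show up in the sketch: a Word is letters only, so a location such as Line1 is rejected, and the interleave rule lists space, carriage return and newline but not tab, so tab-separated commands fail.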

2 comments

1 Grant 11.07.08 at 4:45 pm

Thanks for explaining how to augment Parsers and DSL by picking up where Steve Metsker left off. As you explained, this is now a functional approach, rather than Metsker's Java (your C#) object approach. This approach is easily understood and I look forward to seeing the full round trip integration with embedding a M based DSL in a C# program. Keep up the good sleuthing and kudos for recognizing the value this approach brings to the Modern programmer’s toolkit.

2 Grant 12.05.08 at 6:28 pm

Good tutorial for those just starting out with M.
The plus ordinal indicator and pipe | all speak to an easier syntax than BNF.
