Proposal for a Morphable Programming Language

Gil Müller

30.6.2007

In nova fert animus mutatas dicere formas corpora. ("My mind is bent to tell of bodies changed into new forms." Ovid: Metamorphoses)

This document argues for a morphable programming language as a tool for creating domain languages and constructing middleware or frameworks. This language would provide facilities for extending and modifying its syntax and semantics. To simplify the morphing, a modular language structure is proposed.

Introduction

Why another programming language? Because we need more languages, namely domain languages. These are languages dedicated to a particular application area such as databases, hypertext or 3D programming. They are typically not general-purpose languages; rather, they are restricted to the domain or embedded in a full programming language. They fill the gap between a merely configurable user interface and a general-purpose language.

Their specialization makes them the ideal means to provide applications or services in their domain. But quite often they are foreign to their host languages: they are either realized as libraries or are just data (like strings). Such hybrid programs are more difficult to develop because the domain languages do not share the abilities of the host language and the IDE.

But is this really the fault of the domain languages? One may ask why modern host languages such as C++, Java, C# or the many scripting languages are so rigid. Why is it not possible to morph such a language so that the syntax and semantics of the domain language can be expressed in the host language?

We will discuss such a language, which we will call Morla.

Will this be a kind of specification language? Certainly, the language mechanisms are borrowed from those languages. But the language is designed to provide a unified approach: there is no distinction between the meta-level and the language level. Another problem of specification languages is that they are rather declarative, which means that the language model differs considerably from the underlying execution model, leading to rather inefficient programs. The aim of Morla is to provide language facilities that are close to the means of the target machine.

Before we start discussing the language, we will take a look at the overall process of constructing a domain language.

Creating a Domain Language

The task of constructing a domain language requires some experience. First of all, knowledge of the given domain is fundamental. Then the designer has to be aware of how the language shall be used. Last but not least, the designer has to be trained in language design.

All in all, this requires at least three different roles. The domain analyst contributes his or her knowledge of the domain. The language designer is responsible for shaping the language. Finally, the language engineer realizes the language.

At present, the construction of domain languages is more a matter of art than of real engineering. People draw on their own experience and on present or former languages. There is no real methodology for language design.

Of course, there is literature on programming language concepts and on methods for specifying syntax and semantics. Still, we need methodologies for the whole design process. A good starting point could be to look at how GUI applications are built. Beyond that, we also need criteria by which to judge domain languages.

If we create a domain language, what kind of fundamental requirements do we have to consider? The following shall be just a first attempt:

In terms of a life cycle, it seems to me that the creation of a domain language is an evolutionary process. It could be handled similarly to the construction of GUI applications, although it is much more difficult. Actually, it has much more in common with frameworks, which also mature slowly.

The aim for Morla is to offer it as a rapid prototyping tool for domain language creators. It shall provide standard mechanisms for specifying a language. Additionally, it would simplify the task of creation, as the language designer need not start from scratch but could instead extend or modify existing language concepts.

Example: Report Generation

We will now demonstrate how a morphable language can be used to generate a report. In this scenario we would like to write some text in which certain parts are program fragments that fill in text during execution. Thus the syntax of our report generation language becomes:

  report = tokens
  tokens = token | token tokens
  token = text | "{" program "}" 

Our report is just a sequence of tokens. Each token is either some text or a program, where program is the standard start symbol of our grammar. The interpretation is given as follows ([[ . ]] is the interpretation function):

  [[ report ]] = [[ tokens ]]
  [[ tokens ]] = [[ token | token tokens ]]
  [[ token | token tokens ]] = [[ token ]] | [[ token tokens ]]
  [[ token tokens ]] = [[ token ]]; [[ tokens ]]
  [[ token ]] = [[ text | "{" program "}" ]]
  [[ text | "{" program "}" ]] = [[ text ]] | [[ "{" program "}" ]]
  [[ text ]] = print text
  [[ "{" program "}" ]] = print [[ program ]]

This interpretation outputs all text parts. Program fragments are expected to return strings as their result, which are then printed. Note that | is a semantic or: it first tries to execute its first operand; if that one fails, the second one is tried.
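The interpretation above can be approximated in an existing language. The following is only a minimal sketch in Python, not Morla itself: it assumes that a program fragment is a Python expression whose value is turned into a string, and the names parse_report and interpret are made up for illustration. The output is collected and returned instead of being printed piece by piece.

```python
import re

# Assumption: a "program" is a Python expression without nested braces.
# Morla's own program syntax is not defined in this document.
TOKEN = re.compile(r"\{([^{}]*)\}|([^{}]+)")

def parse_report(source):
    """Split a report into ('program', code) and ('text', text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unbalanced brace at offset {pos}")
        if match.group(1) is not None:
            tokens.append(("program", match.group(1)))
        else:
            tokens.append(("text", match.group(2)))
        pos = match.end()
    return tokens

def interpret(tokens, env=None):
    """Collect, in order, the output that the rules above would print."""
    env = {} if env is None else env
    out = []
    for kind, value in tokens:
        if kind == "text":
            out.append(value)                       # [[ text ]] = print text
        else:
            out.append(str(eval(value, {}, env)))   # print [[ program ]]
    return "".join(out)

report = "Total: {price * quantity} EUR for {quantity} items."
print(interpret(parse_report(report), {"price": 3, "quantity": 4}))
# prints: Total: 12 EUR for 4 items.
```

Here the report language is still foreign to its host: the report is a string and the fragments are evaluated through eval. This is exactly the situation that a morphable language is meant to avoid.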

A variation of the interpretation would let us support literate programming. In literate programming the program and the documentation are part of the same file. The file is interpreted in two ways. For execution, all text parts are ignored and only the program parts are executed. For documentation, the file is transformed into some presentation format.

  [[ report ]] = if (docMode)
                   printHeader();
                 [[ tokens ]];
                 if (docMode)
                   printFooter();
  [[ tokens ]] = [[ token | token tokens ]]
  [[ token | token tokens ]] = [[ token ]] | [[ token tokens ]]
  [[ token tokens ]] = [[ token ]]; [[ tokens ]]
  [[ token ]] = [[ text | "{" program "}" ]]
  [[ text | "{" program "}" ]] = [[ text ]] | [[ "{" program "}" ]]
  [[ text ]] = if (docMode)
                 print text;
  [[ "{" program "}" ]] = if (docMode)
                            printProgram(program);
                          else
                            [[ program ]];

The flag docMode indicates the mode of interpretation: either documentation or execution. The functions printHeader and printFooter produce data for the presentation format (e.g. an HTML header and footer). printProgram takes the syntactical representation of the program and pretty-prints it.
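The two modes can again be sketched in Python. This is an illustration under several assumptions: the tokens are given as an already parsed list, program fragments are Python statements run with exec, HTML serves as the presentation format, and "pretty printing" is reduced to wrapping the escaped source in a pre element. The name interpret and the parameter doc_mode are invented for this sketch.

```python
import html

def interpret(tokens, doc_mode, env=None):
    """Return the documentation in doc mode; otherwise run the programs."""
    env = {} if env is None else env
    out = []
    if doc_mode:
        out.append("<html><body>\n")                                # printHeader()
    for kind, value in tokens:
        if kind == "text":
            if doc_mode:
                out.append(f"<p>{html.escape(value)}</p>\n")        # print text
        elif doc_mode:
            out.append(f"<pre>{html.escape(value)}</pre>\n")        # printProgram(program)
        else:
            exec(value, env)                                        # [[ program ]]
    if doc_mode:
        out.append("</body></html>\n")                              # printFooter()
    return "".join(out)

tokens = [
    ("text", "Compute the area of a 3 x 4 rectangle."),
    ("program", "area = 3 * 4"),
    ("text", "Then double it."),
    ("program", "area = area * 2"),
]

env = {}
interpret(tokens, doc_mode=False, env=env)   # execution mode: text is ignored
print(env["area"])                           # prints: 24
print(interpret(tokens, doc_mode=True))      # documentation mode: nothing is executed
```

The same token sequence serves both purposes; only the interpretation differs, which is the point of the variation above.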

The Architecture of Morla

The language, or rather its interpretation process, is shaped by the language stack. In a standard setting it consists of the components that are typical of a compiler or interpreter: a scanner for lexical analysis, a parser for syntactic analysis, a type checker, and an evaluator that interprets or compiles the program.

In contrast to other languages, which might include access to a compiler, Morla's language components would be configurable. This means the rules for the different components are truly dynamic and may even change during the interpretation of the program. Additionally, the language stack as a whole shall be configurable, so that components can be added or removed.
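To make the idea of a configurable stack more concrete, here is a small Python sketch. It is not the design of Morla: the class LanguageStack, its methods and the toy arithmetic language are assumptions made for illustration, and the type checker is left out for brevity. Each component is simply a function from its input to its output, and the stack can be changed between runs.

```python
class LanguageStack:
    """An ordered list of named stages; each stage maps its input to an output."""

    def __init__(self):
        self.stages = []          # list of (name, function) pairs

    def add(self, name, stage, before=None):
        """Append a stage, or insert it in front of the stage named `before`."""
        if before is None:
            self.stages.append((name, stage))
            return
        index = [n for n, _ in self.stages].index(before)
        self.stages.insert(index, (name, stage))

    def remove(self, name):
        self.stages = [(n, s) for n, s in self.stages if n != name]

    def run(self, source):
        value = source
        for _, stage in list(self.stages):   # iterate over a copy of the stage list
            value = stage(value)
        return value

def scan(text):
    return text.split()                      # lexical analysis: whitespace-separated words

def parse(words):
    # Toy syntax: number (operator number)*, kept as a flat list.
    return [int(w) if w.isdigit() else w for w in words]

def evaluate(items):
    result = items[0]
    for op, operand in zip(items[1::2], items[2::2]):
        result = result + operand if op == "+" else result - operand
    return result

stack = LanguageStack()
stack.add("scanner", scan)
stack.add("parser", parse)
stack.add("evaluator", evaluate)
print(stack.run("1 + 2 - 4"))                # prints: -1

# Morph the language: a new stage lets the word "plus" stand for "+".
stack.add("aliases", lambda words: ["+" if w == "plus" else w for w in words],
          before="parser")
print(stack.run("1 plus 2"))                 # prints: 3
```

In this sketch the stack is only reconfigured between two runs. Changing rules while a program is being interpreted, as envisioned above, would require the stages to have access to the stack itself.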

The result of the interpretation will be executed by the runtime system of Morla. The instructions are given in the core language of Morla. In the initial setting, the language stack is configured to support the standard language of Morla.

© 2007 by Gil Müller (www.gil-mueller.com)