Grammar freaks assemble!

In recent years, the attention paid to modeling and modeling tools has declined sharply. While there could be multiple reasons for this*, the sheer bulk of the standard modeling language (UML) and of the tools built around it is certainly not helping. In many projects it would be reasonable to use smaller, more domain-specific languages instead, but these often mean giving up the convenience features of today’s development environments. Or do they?

Ever since we started working with computers, professionals have tried to reduce the complexity of both the work process and the end product, and have created tools that accelerate and simplify coding. This is achieved by raising the level of abstraction of development: organising it better and managing logically connected units together as independent blocks or modules. In short: building models.

Software modeling has since become a heavily standardised, practically over-regulated field, which frequently forces us to mobilise a huge range of tools to deal with relatively simple issues. UML contains far more language elements than most projects need; only a fraction of them is ever used. Moreover, most modeling tools do not support bulk modification of the model. Typically, models have to be created by “drawing”, which feels unnatural to most developers, who are accustomed to text-based interfaces. Part of the solution lies in DSLs (domain-specific languages): small languages, typically created for a single, simple purpose, with which smaller problems can be handled easily. However, apart from deviating from standard solutions, DSLs have a practical limitation: a language we create ourselves doesn’t automatically come with the editor features (such as syntax checking and code completion) that are so important for effective work.

So, what should a tool for building such languages provide?

First of all, it has to be simple, because it is used for languages with a limited scope that don’t require a high level of complexity. Another indispensable requirement is flexibility, as we may want to take a unique approach in our meta model. The third condition is that the tool has to be primarily text-based (as opposed to the visual language of UML), so that nobody has to draw diagrams in order to enter data into the system.

Created in 2006 and later developed at Eclipse (under the Eclipse Modeling Project from 2008), Xtext fully satisfies these conditions. One of its great advantages is that it’s based on ANTLR (ANother Tool for Language Recognition), so the system automatically provides full lexing and parsing support for whatever language we create. Xtext is also integrated with EMF (the Eclipse Modeling Framework, used to create meta models in the Eclipse world): if an existing meta model is imported, Xtext can automatically generate a corresponding syntax for it.

Let’s look at a specific example: a simple database description language containing tables and sequences. Tables have columns, and each column has a data type and a flag indicating whether it is nullable. Additionally, foreign keys can be added to tables: a given column of one table can point to a given column of another table (self-references are not allowed).

[Figure: Database description language meta model]
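As a rough, language-neutral illustration of the meta model described above, the same structure can be sketched with plain Python dataclasses. This is not the EMF model from the figure: all class and field names (Database, Table, Column, ForeignKey, Sequence, data_type and so on) and the sample data are assumptions chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Column:
    name: str
    data_type: str
    nullable: bool = True  # flag: nullable or non-nullable


@dataclass
class ForeignKey:
    column: str         # column of the owning table
    target_table: str   # must be a different table (no self-references)
    target_column: str


@dataclass
class Table:
    name: str
    columns: List[Column] = field(default_factory=list)
    foreign_keys: List[ForeignKey] = field(default_factory=list)


@dataclass
class Sequence:
    name: str


@dataclass
class Database:
    tables: List[Table] = field(default_factory=list)
    sequences: List[Sequence] = field(default_factory=list)


# Hypothetical sample data: two tables linked by one foreign key, plus a sequence.
customer = Table("Customer", [Column("id", "NUMBER", nullable=False),
                              Column("name", "VARCHAR")])
order = Table("Order",
              [Column("id", "NUMBER", nullable=False),
               Column("customer_id", "NUMBER", nullable=False)],
              [ForeignKey("customer_id", "Customer", "id")])
db = Database([customer, order], [Sequence("order_seq")])
```

Note that nothing in these types prevents a foreign key from pointing back at its own table; that restriction has to be enforced separately.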


This can be concisely defined in the following syntax: 

[Figure: Database description syntax]
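To give a feel for what such a textual notation might look like, here is a minimal sketch that uses a made-up concrete syntax and a small hand-written Python parser. The keywords (table, sequence, foreign key, not null) and the sample content are assumptions for illustration, not the grammar shown in the figure. With Xtext the parser would be generated from a grammar file rather than written by hand.

```python
import re

# Hypothetical concrete syntax; the keywords are assumptions made for this sketch.
SAMPLE = """
table Customer {
    id NUMBER not null
    name VARCHAR
}
table Order {
    id NUMBER not null
    customer_id NUMBER not null
    foreign key customer_id -> Customer.id
}
sequence order_seq
"""

TABLE_RE = re.compile(r"table\s+(\w+)\s*\{(.*?)\}", re.S)
SEQUENCE_RE = re.compile(r"^sequence\s+(\w+)\s*$", re.M)
FK_RE = re.compile(r"foreign\s+key\s+(\w+)\s*->\s*(\w+)\.(\w+)")
COLUMN_RE = re.compile(r"(\w+)\s+(\w+)(\s+not\s+null)?")


def parse(text):
    """Parse the hypothetical syntax into plain dictionaries."""
    model = {"tables": {}, "sequences": SEQUENCE_RE.findall(text)}
    for table_name, body in TABLE_RE.findall(text):
        table = {"columns": [], "foreign_keys": []}
        for line in body.splitlines():
            line = line.strip()
            if not line:
                continue
            fk = FK_RE.fullmatch(line)
            column = COLUMN_RE.fullmatch(line)
            if fk:
                # (source column, target table, target column)
                table["foreign_keys"].append(fk.groups())
            elif column:
                name, data_type, not_null = column.groups()
                # (name, data type, nullable)
                table["columns"].append((name, data_type, not_null is None))
            else:
                raise ValueError(f"Cannot parse line in table {table_name}: {line!r}")
        model["tables"][table_name] = table
    return model


model = parse(SAMPLE)
```

The parser accepts any table and column name in a foreign key, including ones that do not exist, so it illustrates the notation only and does not check references.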

The syntax as defined doesn’t yet restrict, for example, which columns can take part in a foreign key expression. As a result, we could easily create invalid references (e.g., between the columns of two entirely unrelated tables or, in our special case, from a table to itself). To prevent this, we would like the editor to offer only the elements that are valid in the given context, much like the code completion of most IDEs. Similarly, for simple formal mistakes, such as a name starting with a lower-case letter, we would like the editor to warn about the inappropriate format automatically in a pop-up tooltip. Xtext’s scoping and validation features help in exactly these situations.

[Figure: Scoping, where even complex context-dependent restrictions can be coded (in Xtend)]
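The filtering idea behind scoping can be sketched in a few lines of Python. This is not Xtext’s scoping API and not the Xtend code from the figure; the model layout (a dictionary of table names to column names) and the function name foreign_key_targets are assumptions made for illustration. In Xtext, the equivalent logic would live in a scope provider.

```python
from typing import Dict, List

# Hypothetical model: table name -> column names.
MODEL: Dict[str, List[str]] = {
    "Customer": ["id", "name"],
    "Order": ["id", "customer_id"],
}


def foreign_key_targets(model: Dict[str, List[str]], owning_table: str) -> List[str]:
    """List the 'Table.column' candidates a foreign key of owning_table may point to."""
    return [
        f"{table}.{column}"
        for table, columns in model.items()
        if table != owning_table  # rule from the text: no self-references
        for column in columns
    ]


candidates = foreign_key_targets(MODEL, "Order")
```

Because the editor only ever offers what this function returns, an invalid self-reference never shows up as a completion proposal in the first place.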


[Figure: Table (name) validation]
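The naming rule mentioned above (a table name should start with an upper-case letter) could be checked along the following lines. This is a plain Python sketch, not the validator from the figure: the Issue type, the check_table_names function and the input format (name and line number pairs) are assumptions. In Xtext, such rules are typically written as methods annotated with @Check in a validator class.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Issue:
    severity: str
    line: int
    message: str


def check_table_names(tables: List[Tuple[str, int]]) -> List[Issue]:
    """Return a warning for every table whose name does not start with an upper-case letter."""
    issues = []
    for name, line in tables:
        if not name[:1].isupper():
            issues.append(Issue(
                "warning",
                line,
                f"Table name '{name}' should start with an upper-case letter",
            ))
    return issues


# Hypothetical input: (table name, line number where it is declared).
issues = check_table_names([("Customer", 1), ("order", 5)])
```

The rule is reported as a warning rather than an error, so the model stays usable while the editor still flags the naming problem.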


So, what's Xtext's Achilles heel?

One might say that it’s Eclipse itself: many of us no longer work in this ecosystem, and given current trends this will increasingly be the case. Before 2016 this was a legitimate concern, but since then Microsoft has introduced the Language Server Protocol (LSP), which allows syntax checking to take place “remotely”, regardless of the tool used for coding. As long as the editor speaks LSP, a language server created with Xtext can seamlessly support development, even in several different IDEs at the same time. Therefore, even though a growing number of popular coding platforms have appeared over the last 6-8 years, the option of adding language extensions has been retained, even in web-based IDEs.
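To make the “remote” part more tangible, the sketch below builds the kind of message a language server sends to an editor when it finds a problem. LSP messages are JSON-RPC payloads preceded by a Content-Length header, and textDocument/publishDiagnostics is the notification used to report issues. The file URI, the source name and the message text are assumptions for illustration; this is not output captured from an Xtext server.

```python
import json


def frame(payload):
    """Wrap a JSON-RPC payload in the LSP base-protocol header."""
    body = json.dumps(payload).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def unframe(message):
    """Recover the JSON payload from a framed message (single-header case only)."""
    header, body = message.split(b"\r\n\r\n", 1)
    length = int(header.decode("ascii").split(":", 1)[1].strip())
    return json.loads(body[:length].decode("utf-8"))


# Hypothetical warning about a lower-case table name on the first line of a file.
notification = {
    "jsonrpc": "2.0",
    "method": "textDocument/publishDiagnostics",
    "params": {
        "uri": "file:///project/schema.dbdsl",  # assumed file name and extension
        "diagnostics": [
            {
                "range": {
                    "start": {"line": 0, "character": 6},
                    "end": {"line": 0, "character": 11},
                },
                "severity": 2,  # 2 = Warning in the LSP specification
                "source": "dbdsl",  # assumed name of the reporting tool
                "message": "Table name should start with an upper-case letter",
            }
        ],
    },
}

wire = frame(notification)
```

Any editor that understands this message format can display the warning, which is why the same server can work with several IDEs.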

[Figure: Xtext server used from IDEA with an LSP plugin]




[Figure: The same function in VSCode]



* One might think of the decline of fixed-price projects (and methodologies), the decreasing emphasis on (technologically) monolithic, and thus possibly easy-to-generate, architectures, as well as the widespread use of tried and tested frameworks (e.g., Spring) that successfully handle the technological challenges of the development projects that make up the vast majority of the market.

Written by Ferenc Kovács, Java Architect

