Sebastian Zarnekow's Blog: Xtext Corner #9

Wednesday, November 14, 2012

Xtext Corner #9 - About Keywords, Again

Over the last few weeks, I compiled some information about the proper usage of keywords, and about terminals in general, in Xtext:

Keywords may help to recover from parse errors in the sense that they guide the parser.
It's recommended to use libraries instead of a hard-wired, keyword-ish representation for some built-in language features.
Data type rules are the way to go if you want to represent complex syntactical concepts as atomic values in the AST.
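
As a reminder of what the last hint refers to, here is a minimal sketch of a data type rule. The rule names QualifiedName and Import are only illustrative assumptions and do not come from any particular grammar; the sketch also assumes the default ID terminal.

```xtext
// Hypothetical data type rule: it contains no assignments or actions,
// so it produces a plain value instead of an object in the AST.
QualifiedName:
    ID ('.' ID)*;

// Hypothetical parser rule that uses it: the feature 'name' holds the
// dotted name as a single string value.
Import:
    'import' name=QualifiedName;
```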

In addition to these hints, there is one particular issue that arises quite often in the Xtext forum. People often wonder why their grammar does not work properly for some input files but perfectly well for others. What it boils down to in many of these examples is this:

Spaces are evil!

This seems to be a bold statement, but let me explain why I think that keywords should never contain space characters. I'm assuming you use the default terminals, but this actually applies to almost all terminal rules that I've seen so far. There is usually a concept of an ID, which is defined similarly to this:

terminal ID:
('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9')*;

IDs and Keywords

IDs start with a letter followed by an arbitrary number of additional letters or digits, and keywords usually look quite similar to an ID. No surprises so far. Now let's assume a keyword definition like 'some' 'input' compared to 'some input'. Here is what happens if the lexer encounters the input sequence 'som '. It starts to consume the leading 's' and has not yet decided which token to emit, since it could become a keyword or an identifier. The same holds for the 'o' and the 'm'. The trailing space is the character where it can finally decide that 'som ' contains two tokens: an identifier and a whitespace token. So far so good.
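
To make the two keyword definitions concrete, they could look like the following sketch. The rule names SplitVariant and JoinedVariant and the feature 'name' are hypothetical, and the two rules are meant as alternatives for comparison, not as parts of the same grammar.

```xtext
// Variant 1: two separate keywords. Each one is its own token, and
// hidden tokens such as whitespace may appear between them.
SplitVariant:
    'some' 'input' name=ID;

// Variant 2: one keyword that contains a space. The lexer has to match
// the complete sequence, including the space, as a single token.
JoinedVariant:
    'some input' name=ID;
```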

Let the Parser Fail - For Free

Now comes the tricky part: the user continues to type an 'e' after the 'm', so the input becomes 'some '. Again, the lexer starts with the 's' and continues to consume the 'o', 'm' and 'e'. No decision has been made yet: it could still be an ID or the start of the keyword 'some input'. The next character is a space, and that's the crucial part here: if the grammar contains a keyword 'some input', the space is expected since it is part of the keyword. Now the lexer has only one valid alternative left. After the space, it expects to consume an 'i', 'n', 'p', 'u' and 't'. Unfortunately, there is no 'i' in the text, since the lexer has already reached the end of the file.

As already mentioned in an earlier post, the lexer will never roll back to the token 'some' in order to create an ID token and a subsequent whitespace token. In fact, the space character was expected as part of a single token, so it was safe to consume it. Instead of rolling back and creating two tokens, the lexer will emit an error token which cannot be handled by the parser. Even though the text appeared to be a perfectly valid ID followed by whitespace, the parser will fail. That's why spaces in keywords are considered harmful.

In contrast, the variant of the grammar with two separate keywords works fine. Here, the user is free to apply all sorts of formatting to the two adjacent keywords: any number of spaces, line breaks or even comments can appear between them, and all of these are valid and handled well by the parser. If you are concerned about the convenience in the editor (after all, a single keyword with a space seems to be more user-friendly in the content assistant), I recommend tweaking the content assistant instead of using an error-prone grammar definition.

14 comments:

Aaron Digulla said...: Shouldn't that be "current parser technologies are evil"?

Spaces are a great way to visually align input.

The problem starts when (lazy?) developers cut corners and create lexers and parsers which aren't smart enough to handle common cases like dangling else or mixing grammars (i.e. when you want to embed SQL into a Java grammar).; November 14, 2012 at 4:58 PM
Unknown said...: Aaron,

visual alignment should not be tied to the semantics of the code. Instead that's a matter of the pretty printer or formatter - and of course it's something that depends on the habits of the developer.

I doubt that there are better solutions for the rather complex problem of mixing languages than a divide and conquer approach so I don't consider that cutting corners but a pragmatic approach to solve most of the common problems.

Regards,
Sebastian; November 14, 2012 at 7:58 PM