Acorn with tolerant mode #431

piotrtomiak · 2016-07-01T09:40:29Z

@marijnh - at Genuitec I have developed a tolerant version of your parser for purpose of source code validation. It is different than the loose parser, which has some heuristics in it to recover from errors in a nice way for content assist. The only area of major difference between those two is how unclosed blocks are handled. Tolerant version of parser, since it doesn't have any heuristics, will behave slightly worse, however, otherwise it behaves much better than the loose parser, which is not maintained as much as the main one. The idea would be to add a switch to main parser to allow tolerant mode.

You can see how our tolerant version of acorn behaves by installing a Webclipse JSjet plugin to Eclipse JEE or by testing MyEclipse. We use it for validation, semantic syntax highlighting and code folding within JavaScript editor. Since it's tolerant, but equally correct to normal parser, we only need to parse the file once, even if it has errors in the code.

We would like to contribute the code to acorn. It has been developed in a similar manner as LooseParser is developed, but using 3.0.4 files as a base. It'll require some work to migrate contribution to main parser code, so please let me know if you would be interested in the contribution at all.

TimothyGu · 2016-07-02T08:49:31Z

You can make it a plugin so that the core doesn't have to be changed.

Can you give an example of where your parser performs better than the loose parser, both source and parsed AST?

piotrtomiak · 2016-07-04T08:04:55Z

@TimothyGu it is a plugin right now, and it has to be simultaneously maintained, similarly to loose parser. So, every time there is an enhancement to acorn parser that enhancement has to be copied over to the tolerant parser. We want to avoid such a huge maintenance burden.

Can you give an example of where your parser performs better than the loose parser, both source and parsed AST?

The biggest difference is that loose parser cannot validate source code. It is doing indentation heuristics to mitigate problems with unclosed blocks, even when blocks are properly closed. Some other problems AFAIR were around new ES6 syntax. It wasn't too good at dealing with errors in it.

However, I don't think that the tolerant parser would be suitable for content assist at the present state. I might require further work on a better brace/parenthesis matching heuristics. Our main goal with tolerant parser was to provide a way to fully validate a file without stopping at the first error. At the time there was no JS open source parser providing such validation mode. There is tolerant mode in Esprima now, though the errors reported are not too verbose, it's mostly "Syntax error", so we prefer Acorn here.

marijnh · 2016-07-04T15:22:36Z

however, otherwise it behaves much better than the loose parser, which is not maintained as much as the main one. The idea would be to add a switch to main parser to allow tolerant mode.

The loose parser is definitely maintained. In what situations does the tolerant parser do better? If it's not suitable for autocompletion (or is significantly slower than the loose parser), and is as such not useable in Tern, I am not really interested in adding it to the project.

See also #199.

piotrtomiak · 2016-07-04T18:22:20Z

The loose parser is definitely maintained.

I know that, but the code has diverged a lot already, AFAICS. It needs doubled effort to support new constructs and there is double parsing needed for content assist.

In what situations does the tolerant parser do better?

I'll prepare some examples.

or is significantly slower than the loose parser

The parser is only a little bit slower than the base acorn one, since there is a little bit of additional code needed to handle errors. I can do some benchmarks to see how much slower it actually is. Keep in mind that for broken files it'll be much faster, probably twice, since only once pass is required.

If it's not suitable for autocompletion, and is as such not useable in Tern, I am not really interested in adding it to the project.

Most of the focus during development of tolerant mode went into ensuring that validation works correctly. However, we observed that broken/incomplete code is pretty well parsed. As a result it is used for code folding and semantic syntax highlighting. Both work pretty well. The missing piece is to match closing braces correctly. I am not sure how good/bad content assist would behave with current state, but from my own experience missing braces are not that common anyway. Many editors automatically insert braces for you. Anyway, if content assist is such a huge factor for you, we might as well ensure that it works with content assist. How would you asses if the parser fulfills your requirements?

It might sound that the tolerant mode introduces a lot of changes to the parser code, but none of the flows are actually modified and there are only few places which required something more than: report and continue approach. The biggest changes are around error reporting which includes start and end of the error and a capability to gather errors. Some error messages were improved as well.

I attach the full patch of my changes on top of 3.0.4: https://gist.github.com/piotrtomiak/7b7c778d8b44bcf41ff18415a9f311e0 . Of course some changes apply only when doing a separate parser and there will have to be a support for tolerant mode switch, most likely when creating dummy idents or when reporting any error, an exception would be thrown as before. Have a look at improvements to error reporting, seen in the test suite.

Last, but not least, my impression is that acorn is more than just a parser for content assist. I saw that there has been a lot of effort put into reporting parsing errors in a user friendly way, which is a great advantage over Esprima parser and not really needed for content assist. Being able to recover from errors and report more than just one, I think, would be very beneficial to all users of the parser.

RReverser · 2016-07-04T20:07:38Z

I know that, but the code has diverged a lot already, AFAICS. It needs doubled effort to support new constructs and there is double parsing needed for content assist.

Not really, all the new constructs were simultaneously added to both of them. The code is surely different as it handles various edge cases differently (errors in one parser, stubs or ignored in the other parser), but functionally it covers the same syntax.

TimothyGu · 2016-09-05T23:45:35Z

~~@piotrtomiak, if you are willing to share the source code that'd be great, but~~ I guess this should be closed due to inactivity.

Edit: found it above.

piotrtomiak · 2016-09-06T07:50:34Z

@TimothyGu - thanks for looking at the issue again :) I am still willing to contribute and waiting for your thoughts guys.

marijnh · 2018-09-10T09:29:22Z

I think there's no way forward here—I'm not willing to take on a big chunk of extra code for this functionality, so a fork or a plugin would be a more appropriate place for it.

marijnh closed this as completed Sep 10, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Acorn with tolerant mode #431

Acorn with tolerant mode #431

piotrtomiak commented Jul 1, 2016

TimothyGu commented Jul 2, 2016

piotrtomiak commented Jul 4, 2016

marijnh commented Jul 4, 2016

piotrtomiak commented Jul 4, 2016

RReverser commented Jul 4, 2016

TimothyGu commented Sep 5, 2016 •

edited

Loading

piotrtomiak commented Sep 6, 2016

marijnh commented Sep 10, 2018

Acorn with tolerant mode #431

Acorn with tolerant mode #431

Comments

piotrtomiak commented Jul 1, 2016

TimothyGu commented Jul 2, 2016

piotrtomiak commented Jul 4, 2016

marijnh commented Jul 4, 2016

piotrtomiak commented Jul 4, 2016

RReverser commented Jul 4, 2016

TimothyGu commented Sep 5, 2016 • edited Loading

piotrtomiak commented Sep 6, 2016

marijnh commented Sep 10, 2018

TimothyGu commented Sep 5, 2016 •

edited

Loading