Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Acorn with tolerant mode #431

Closed
piotrtomiak opened this issue Jul 1, 2016 · 8 comments
Closed

Acorn with tolerant mode #431

piotrtomiak opened this issue Jul 1, 2016 · 8 comments

Comments

@piotrtomiak
Copy link

@marijnh - at Genuitec I have developed a tolerant version of your parser for purpose of source code validation. It is different than the loose parser, which has some heuristics in it to recover from errors in a nice way for content assist. The only area of major difference between those two is how unclosed blocks are handled. Tolerant version of parser, since it doesn't have any heuristics, will behave slightly worse, however, otherwise it behaves much better than the loose parser, which is not maintained as much as the main one. The idea would be to add a switch to main parser to allow tolerant mode.

You can see how our tolerant version of acorn behaves by installing a Webclipse JSjet plugin to Eclipse JEE or by testing MyEclipse. We use it for validation, semantic syntax highlighting and code folding within JavaScript editor. Since it's tolerant, but equally correct to normal parser, we only need to parse the file once, even if it has errors in the code.

We would like to contribute the code to acorn. It has been developed in a similar manner as LooseParser is developed, but using 3.0.4 files as a base. It'll require some work to migrate contribution to main parser code, so please let me know if you would be interested in the contribution at all.

@TimothyGu
Copy link
Contributor

You can make it a plugin so that the core doesn't have to be changed.

Can you give an example of where your parser performs better than the loose parser, both source and parsed AST?

@piotrtomiak
Copy link
Author

@TimothyGu it is a plugin right now, and it has to be simultaneously maintained, similarly to loose parser. So, every time there is an enhancement to acorn parser that enhancement has to be copied over to the tolerant parser. We want to avoid such a huge maintenance burden.

Can you give an example of where your parser performs better than the loose parser, both source and parsed AST?

The biggest difference is that loose parser cannot validate source code. It is doing indentation heuristics to mitigate problems with unclosed blocks, even when blocks are properly closed. Some other problems AFAIR were around new ES6 syntax. It wasn't too good at dealing with errors in it.

However, I don't think that the tolerant parser would be suitable for content assist at the present state. I might require further work on a better brace/parenthesis matching heuristics. Our main goal with tolerant parser was to provide a way to fully validate a file without stopping at the first error. At the time there was no JS open source parser providing such validation mode. There is tolerant mode in Esprima now, though the errors reported are not too verbose, it's mostly "Syntax error", so we prefer Acorn here.

@marijnh
Copy link
Member

marijnh commented Jul 4, 2016

however, otherwise it behaves much better than the loose parser, which is not maintained as much as the main one. The idea would be to add a switch to main parser to allow tolerant mode.

The loose parser is definitely maintained. In what situations does the tolerant parser do better? If it's not suitable for autocompletion (or is significantly slower than the loose parser), and is as such not useable in Tern, I am not really interested in adding it to the project.

See also #199.

@piotrtomiak
Copy link
Author

The loose parser is definitely maintained.

I know that, but the code has diverged a lot already, AFAICS. It needs doubled effort to support new constructs and there is double parsing needed for content assist.

In what situations does the tolerant parser do better?

I'll prepare some examples.

or is significantly slower than the loose parser

The parser is only a little bit slower than the base acorn one, since there is a little bit of additional code needed to handle errors. I can do some benchmarks to see how much slower it actually is. Keep in mind that for broken files it'll be much faster, probably twice, since only once pass is required.

If it's not suitable for autocompletion, and is as such not useable in Tern, I am not really interested in adding it to the project.

Most of the focus during development of tolerant mode went into ensuring that validation works correctly. However, we observed that broken/incomplete code is pretty well parsed. As a result it is used for code folding and semantic syntax highlighting. Both work pretty well. The missing piece is to match closing braces correctly. I am not sure how good/bad content assist would behave with current state, but from my own experience missing braces are not that common anyway. Many editors automatically insert braces for you. Anyway, if content assist is such a huge factor for you, we might as well ensure that it works with content assist. How would you asses if the parser fulfills your requirements?

It might sound that the tolerant mode introduces a lot of changes to the parser code, but none of the flows are actually modified and there are only few places which required something more than: report and continue approach. The biggest changes are around error reporting which includes start and end of the error and a capability to gather errors. Some error messages were improved as well.

I attach the full patch of my changes on top of 3.0.4: https://gist.github.com/piotrtomiak/7b7c778d8b44bcf41ff18415a9f311e0 . Of course some changes apply only when doing a separate parser and there will have to be a support for tolerant mode switch, most likely when creating dummy idents or when reporting any error, an exception would be thrown as before. Have a look at improvements to error reporting, seen in the test suite.

Last, but not least, my impression is that acorn is more than just a parser for content assist. I saw that there has been a lot of effort put into reporting parsing errors in a user friendly way, which is a great advantage over Esprima parser and not really needed for content assist. Being able to recover from errors and report more than just one, I think, would be very beneficial to all users of the parser.

@RReverser
Copy link
Member

I know that, but the code has diverged a lot already, AFAICS. It needs doubled effort to support new constructs and there is double parsing needed for content assist.

Not really, all the new constructs were simultaneously added to both of them. The code is surely different as it handles various edge cases differently (errors in one parser, stubs or ignored in the other parser), but functionally it covers the same syntax.

@TimothyGu
Copy link
Contributor

TimothyGu commented Sep 5, 2016

@piotrtomiak, if you are willing to share the source code that'd be great, but I guess this should be closed due to inactivity.

Edit: found it above.

@piotrtomiak
Copy link
Author

@TimothyGu - thanks for looking at the issue again :) I am still willing to contribute and waiting for your thoughts guys.

@marijnh
Copy link
Member

marijnh commented Sep 10, 2018

I think there's no way forward here—I'm not willing to take on a big chunk of extra code for this functionality, so a fork or a plugin would be a more appropriate place for it.

@marijnh marijnh closed this as completed Sep 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants