Implement IGrammar tokenizeLine2 like vscode-textmate #38

Open
angelozerr opened this issue Jan 27, 2017 · 6 comments

Comments

@angelozerr
Contributor

angelozerr commented Jan 27, 2017

See microsoft/vscode#16206 (comment) and https://github.com/Microsoft/vscode-textmate/blob/master/src/tests/themedTokenizer.ts#L25

tokenizeLine2 seems to provide:

  • a binary result (which may reduce memory use)
  • a result that merges the grammar with the theme (which may improve performance)
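
To make the two points above concrete, here is a minimal sketch of what a "binary" token list can look like. It assumes the bit layout vscode-textmate documented for its token metadata at the time (language id in bits 0-7, token type in bits 8-10, font style in bits 11-13, foreground color id in bits 14-22, background color id in bits 23-31). The class and method names (`BinaryTokenSketch`, `pack`, `push`) are hypothetical and are not the TM4E API. Each token becomes two ints, a start index and a packed metadata word, and a token whose metadata equals the previous token's is not stored at all.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the TM4E or vscode-textmate implementation.
final class BinaryTokenSketch {

    // Packs theme-resolved attributes into one int, using the assumed bit layout.
    static int pack(int languageId, int tokenType, int fontStyle, int foreground, int background) {
        return (languageId & 0xFF)
                | ((tokenType & 0x7) << 8)
                | ((fontStyle & 0x7) << 11)
                | ((foreground & 0x1FF) << 14)
                | ((background & 0x1FF) << 23);
    }

    // Appends (startIndex, metadata) unless the previous token has identical metadata.
    static void push(List<Integer> out, int startIndex, int metadata) {
        if (!out.isEmpty() && out.get(out.size() - 1) == metadata) {
            return; // same style as the previous token, so the tokens are merged
        }
        out.add(startIndex);
        out.add(metadata);
    }

    public static void main(String[] args) {
        // Color ids 1, 2 and 5 are made-up indexes into a theme's color map.
        int keywordStyle = pack(0, 0, 0, 5, 2);
        int defaultStyle = pack(0, 0, 0, 1, 2);

        List<Integer> tokens = new ArrayList<>();
        push(tokens, 0, keywordStyle);  // "function"
        push(tokens, 8, defaultStyle);  // " "
        push(tokens, 9, defaultStyle);  // "add" has the same style, so it is merged
        System.out.println(tokens.size() / 2 + " tokens stored: " + tokens);
    }
}
```

Because scopes are resolved against the theme before storage, neighbouring tokens that end up with the same style cost nothing, which is where the memory saving would come from.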
angelozerr added a commit that referenced this issue May 10, 2017
angelozerr added a commit that referenced this issue May 11, 2017
mickaelistria pushed a commit to mickaelistria/textmate.java that referenced this issue Sep 10, 2017
@angelozerr
Contributor Author

@sebthom I see that you are very active on TM4E. Thanks for your contribution!

If you have (a lot of) time, I think it would be really nice to work on this issue. I implemented tokenizeLine2, but nothing consumes it yet. It would be really nice to consume it.

Why use tokenizeLine2? I suggest that you read https://code.visualstudio.com/blogs/2017/02/08/syntax-highlighting-optimizations

@sebthom
Member

sebthom commented May 4, 2022

@angelozerr I had a look at tokenizeLine2 but I am unsure if the current implementation is supposed to work already.

I only ever get two int values back per line. E.g. I have the following test:

	@Test
	void testTokenizeLine2() throws Exception {
		final var path = "JavaScript.tmLanguage";
		try (var in = Data.class.getResourceAsStream(path)) {
			final var grammar = new Registry().loadGrammarFromPathSync(path, in);

			final var lineTokens = grammar.tokenizeLine("function add(a,b) { return a+b; }");
			for (int i = 0; i < lineTokens.getTokens().length; i++) {
				final var token = lineTokens.getTokens()[i];
				final String s = "Token from " + token.getStartIndex() + " to " + token.getEndIndex() + " with scopes "
						+ token.getScopes();
				System.out.println(s);
			}

			System.out.println("----------");

			final var lineTokens2 = grammar.tokenizeLine2("function add(a,b) { return a+b; }");
			for (int i = 0; i < lineTokens2.getTokens().length; i++) {
				int token = lineTokens2.getTokens()[i];
				System.out.println(token);
			}
		}
	}

It outputs:

Token from 0 to 8 with scopes [source.js, meta.function.js, storage.type.function.js]
Token from 8 to 9 with scopes [source.js, meta.function.js]
Token from 9 to 12 with scopes [source.js, meta.function.js, entity.name.function.js]
Token from 12 to 13 with scopes [source.js, meta.function.js, meta.function.type.parameter.js, meta.brace.round.js]
Token from 13 to 14 with scopes [source.js, meta.function.js, meta.function.type.parameter.js, parameter.name.js, variable.parameter.js]
Token from 14 to 15 with scopes [source.js, meta.function.js, meta.function.type.parameter.js]
Token from 15 to 16 with scopes [source.js, meta.function.js, meta.function.type.parameter.js, parameter.name.js, variable.parameter.js]
Token from 16 to 17 with scopes [source.js, meta.function.js, meta.function.type.parameter.js, meta.brace.round.js]
Token from 17 to 18 with scopes [source.js, meta.function.js]
Token from 18 to 19 with scopes [source.js, meta.function.js, meta.decl.block.js, meta.brace.curly.js]
Token from 19 to 20 with scopes [source.js, meta.function.js, meta.decl.block.js]
Token from 20 to 26 with scopes [source.js, meta.function.js, meta.decl.block.js, keyword.control.js]
Token from 26 to 28 with scopes [source.js, meta.function.js, meta.decl.block.js]
Token from 28 to 29 with scopes [source.js, meta.function.js, meta.decl.block.js, keyword.operator.arithmetic.js]
Token from 29 to 32 with scopes [source.js, meta.function.js, meta.decl.block.js]
Token from 32 to 33 with scopes [source.js, meta.function.js, meta.decl.block.js, meta.brace.curly.js]
----------
0
16793600

I would have expected to get more than two ints back from tokenizeLine2.
I also tested it with an old commit, from before I started my refactoring attempts, to make sure that I didn't break anything along the way, but the behavior there is the same.

Any thoughts?
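
One way to read the two ints above is as a single (start index, metadata) pair. The sketch below decodes 16793600 under the assumption that TM4E used the same bit layout vscode-textmate documented at the time (language id in bits 0-7, token type in bits 8-10, font style in bits 11-13, foreground in bits 14-22, background in bits 23-31). The class and method names are hypothetical, not the TM4E API.

```java
// Hypothetical decoder for one packed metadata int; the bit layout is an assumption.
final class TokenMetadataSketch {

    static int languageId(int metadata) { return metadata & 0xFF; }
    static int tokenType(int metadata)  { return (metadata >>> 8) & 0x7; }
    static int fontStyle(int metadata)  { return (metadata >>> 11) & 0x7; }
    static int foreground(int metadata) { return (metadata >>> 14) & 0x1FF; }
    static int background(int metadata) { return (metadata >>> 23) & 0x1FF; }

    public static void main(String[] args) {
        int[] tokens = {0, 16793600}; // the output reported in the test above
        // Entries come in pairs: start index, then metadata.
        for (int i = 0; i + 1 < tokens.length; i += 2) {
            int m = tokens[i + 1];
            System.out.println("start=" + tokens[i]
                    + " languageId=" + languageId(m)
                    + " tokenType=" + tokenType(m)
                    + " fontStyle=" + fontStyle(m)
                    + " foreground=" + foreground(m)
                    + " background=" + background(m));
        }
    }
}
```

Under that assumption the pair describes one token starting at index 0 with foreground color id 1, background color id 2, and no font style. That would be consistent with no theme being loaded: every scope resolves to the same default style, and vscode-textmate skips a binary token whose metadata equals the previous one, so the whole line collapses into a single entry.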

@angelozerr
Contributor Author

To be honest with you, when I implemented that I just copied the code from vscode-textmate and translated it from TypeScript to Java without understanding it. I did the same for the tests, if I remember correctly.

I cannot help you more, but I think it would be good to study it, because vscode uses this strategy and not the old strategy that tm4e is using.

@zulus

zulus commented Nov 13, 2023

Is it possible that embedded grammars will be detected better after this change? Or might this be a problem in the Java Oniguruma implementation?

@angelozerr
Contributor Author

@zulus to be honest with you, I don't know. I did that to try to get the same behavior as vscode-textmate.

@zulus

zulus commented Nov 13, 2023

@zulus to be honest with you, I don't know. I did that to try to get the same behavior as vscode-textmate.

Thanks, I'll try ;) Currently Vue grammars behave differently compared to vscode (and other TextMate-based editors like Nova on macOS). For example, I don't get JavaScript coloring inside v-if, @event, :attribute-bind, etc.
