[performance] Avoid reading SourceFile twice #2910

Closed
wants to merge 1 commit into from

jukzi

@jukzi jukzi commented Sep 4, 2024

During compilation, parsing happens in two stages:

  1. diet parse (any blocks like method bodies are skipped)
  2. parse bodies

Both phases read the source .java file from the file system. With this change the file contents are kept until no longer needed. They are cached in a SoftReference to avoid an OutOfMemoryError.
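
The caching idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from this pull request: `SoftCachedContents`, its `ofFile` factory, and `release()` are hypothetical names, and the sketch assumes UTF-8 sources. The first call loads the contents, later calls reuse them while the garbage collector has not cleared the soft reference.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.ref.SoftReference;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Supplier;

/** Hypothetical helper (not a JDT class): caches loaded contents behind a SoftReference. */
final class SoftCachedContents {
    private final Supplier<char[]> loader;
    // Softly reachable: the GC may clear it under memory pressure instead of failing with OutOfMemoryError.
    private SoftReference<char[]> cache = new SoftReference<>(null);

    SoftCachedContents(Supplier<char[]> loader) {
        this.loader = loader;
    }

    /** Assumed wiring to the file system; the UTF-8 encoding is an assumption of this sketch. */
    static SoftCachedContents ofFile(Path path) {
        return new SoftCachedContents(() -> {
            try {
                return new String(Files.readAllBytes(path), StandardCharsets.UTF_8).toCharArray();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    /** Returns the cached contents, reloading only if nothing is cached or the GC cleared the reference. */
    synchronized char[] getContents() {
        char[] contents = cache.get();
        if (contents == null) {
            contents = loader.get();
            cache = new SoftReference<>(contents);
        }
        return contents;
    }

    /** Drops the cached contents once no later phase needs them. */
    synchronized void release() {
        cache.clear();
    }
}
```

In this sketch a diet parse and a later body parse would both call `getContents()` on the same instance, so the file is read once unless memory pressure cleared the reference in between.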

#2691

@jukzi jukzi linked an issue Sep 4, 2024 that may be closed by this pull request
@jukzi jukzi force-pushed the SourceFile branch 2 times, most recently from 87e11f3 to e2f4ffc Compare September 5, 2024 08:47
@jukzi

jukzi commented Sep 9, 2024

@stephan-herrmann could you review, please?
On my computer, building the whole platform workspace, it reduces the time spent reading sources from 7 to 3.5 seconds, which is expected (halving that time).

@stephan-herrmann

@stephan-herrmann could you review, please? On my computer, building the whole platform workspace, it reduces the time spent reading sources from 7 to 3.5 seconds, which is expected (halving that time).

My focus is still on shipping complete support for Java 23. Any other reviews will have to wait until after 17 Sept.


@stephan-herrmann stephan-herrmann left a comment

This solution looks fine, but I still wonder if SourceFile is the best location for this cache.

Given the large number of clients of this class, and many of them calling getContents(), it is difficult to see what will be the effect for clients other than Compiler.

  • Do any of those read the same source more than once?
  • It would be difficult to define a clean-up protocol for each of them.
    • Hence we need to ask: is holding unused SoftReferences free, or does it put some kind of pressure on the garbage collector?

Since the compiler always accesses the SourceFile via CompilationResult, perhaps the latter could be made responsible for caching (encapsulated as CompilationResult.getContents()). Cleanup could then be hooked into the existing CompilationUnitDeclaration.cleanUp(), with the advantage that we don't add a new protocol to be observed.
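
A rough sketch of that alternative, under stated assumptions: the types below (`SketchSourceUnit`, `SketchCompilationResult`, `SketchUnitDeclaration`) are simplified hypothetical stand-ins, not the real JDT classes, and `releaseContents()` is an invented name. The point is only that the result object owns the cached contents and the existing clean-up step drops them, so clients other than the compiler are unaffected.

```java
/** Hypothetical stand-in for a source unit whose getContents() hits the file system. */
interface SketchSourceUnit {
    char[] getContents();
}

/** Hypothetical stand-in for CompilationResult: owns the cached contents of its unit. */
final class SketchCompilationResult {
    private final SketchSourceUnit unit;
    private char[] contents; // held only between the first read and cleanUp()

    SketchCompilationResult(SketchSourceUnit unit) {
        this.unit = unit;
    }

    /** Reads from the unit at most once until the contents are released. */
    char[] getContents() {
        if (contents == null) {
            contents = unit.getContents();
        }
        return contents;
    }

    /** Invented name: drops the cached contents so they can be garbage collected. */
    void releaseContents() {
        contents = null;
    }
}

/** Hypothetical stand-in for CompilationUnitDeclaration with its existing clean-up hook. */
final class SketchUnitDeclaration {
    final SketchCompilationResult result;

    SketchUnitDeclaration(SketchCompilationResult result) {
        this.result = result;
    }

    /** Cache release piggybacks on the clean-up step; no new protocol is introduced. */
    void cleanUp() {
        result.releaseContents();
    }
}
```

Because the cache lifetime is tied to a clean-up step the compiler already performs, a plain field is enough in this sketch and no SoftReference is required.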

Comment on lines +862 to +886
+    ICompilationUnit sourceUnit = sourceUnits[i];
     if (this.options.verbose) {
         this.out.println(
             Messages.bind(Messages.compilation_request,
                 new String[] {
                     String.valueOf(i + 1),
                     String.valueOf(maxUnits),
-                    new String(sourceUnits[i].getFileName())
+                    new String(sourceUnit.getFileName())
                 }));
     }
     // diet parsing for large collection of units
     CompilationUnitDeclaration parsedUnit;
-    unitResult = new CompilationResult(sourceUnits[i], i, maxUnits, this.options.maxProblemsPerUnit);
+    unitResult = new CompilationResult(sourceUnit, i, maxUnits, this.options.maxProblemsPerUnit);
     long parseStart = System.currentTimeMillis();
     if (this.totalUnits < this.parseThreshold) {
-        parsedUnit = this.parser.parse(sourceUnits[i], unitResult);
+        parsedUnit = this.parser.parse(sourceUnit, unitResult);
     } else {
-        parsedUnit = this.parser.dietParse(sourceUnits[i], unitResult);
+        parsedUnit = this.parser.dietParse(sourceUnit, unitResult);
     }
     long resolveStart = System.currentTimeMillis();
     this.stats.parseTime += resolveStart - parseStart;
     // initial type binding creation
     this.lookupEnvironment.buildTypeBindings(parsedUnit, null /*no access restriction*/);
     this.stats.resolveTime += System.currentTimeMillis() - resolveStart;
-    addCompilationUnit(sourceUnits[i], parsedUnit);
+    addCompilationUnit(sourceUnit, parsedUnit);

why is this method changed?

@jukzi

jukzi commented Sep 30, 2024

encapsulated as CompilationResult.getContents())

please see #3030 for an alternate version based on CompilationResult

@jukzi jukzi closed this Sep 30, 2024
Successfully merging this pull request may close these issues.

Clean build: SourceFile(s) read twice
3 participants