-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid reload block when seeking backward in SegmentTermsEnum. #13253
base: main
Are you sure you want to change the base?
Conversation
Edit: Need more consider about multi floor block. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice idea @vsop-479! It would speed up cases where the caller seeks backwards a bit, remaining in the current frame.
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnumFrame.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnumFrame.java
Outdated
Show resolved
Hide resolved
@mikemccand |
This might be a needle-moving optimization for apps that reuse a single |
Yes.
I just implemented |
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution! |
Thanks @vsop-479 I will try to re-engage here soon! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for being patient @vsop-479 -- I keep trying to review/understand this change and keep struggling :)
I left some superficial-ish comments.
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnumFrame.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java
Outdated
Show resolved
Hide resolved
|
||
// We got lastFrame by comparing target and term, and target less than last seeked term in | ||
// currentFrame. If lastFrame's fp is same with currentFrame's fp, we can reduce entCount to | ||
// nextEnt. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe change to we can rewind without reloading
? We don't seem to reduce entCount
anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't seem to reduce entCount anymore?
We can reduce entCount to nextEnt if we finally seek the same block.
This has been already implemented in:
// We still stay on withOutReload frame, reduce entCount to nextEnt.
int origEntCount = -1;
if (currentFrame.fp == withOutReloadFp && origNextEnt != 0) {
origEntCount = currentFrame.entCount;
currentFrame.entCount = origNextEnt;
}
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java
Outdated
Show resolved
Hide resolved
currentFrame.rewind(); | ||
} else { | ||
// Prepare to reduce entCount. | ||
if (currentIsLast && currentFrame.isLeafBlock) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused why there are two cases under the rewindWithoutReload
case. One case you can roll back entCount
temporarily, because you know the target term is before the current term? And in the 2nd case, you don't know that, but you do know the target term is in this current block? But then why are we even rewinding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One case you can roll back entCount temporarily, because you know the target term is before the current term?
Yes.
but you do know the target term is in this current block? But then why are we even rewinding?
We get lastFrame
from stack[0]
firstly, and compare target
to last seek'd term
to update lastFrame
to reuse seek state, and assign it to currentFrame
. This currentFrame
only got by target
and last seek'd term
's common prefix, we still need to continue walking the index to seek target
term's block.
I think we need rewinding to set frame's state (nextEnt
, entCount
, etc.) to prepare for seeking. Unlike rewind
, rewindWithoutReload
can keep loaded block's data (same FP
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rewindWithoutReload
is called when currentFrame's is loaded and fp
equals fpOrig
.
Doing this can avoid reload a loaded block, when we finally need seek it, and it is still in stack
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I was not clear. I mean we can rewindWithoutReload
when currentFrame.fp == currentFrame.fpOrig
, base on this, if we finally seek the same block with last term
, roll back entCount
temporarily.
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution! |
Conflicts resolved. |
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution! |
In
SegmentTermsEnum.seekExact
andSegmentTermsEnum.seekCeil
, if we seek a smaller target after seeking a greater one, we setnextEnt
to -1 inSegmentTermsEnumFrame.rewind
. Which result inForce reload
a loaded block.I think we could just set
nextEnt
to 0 to let seek from start, and temporarily setentCount
to lastnextEnt
to reduce the seeking range, and resetsuffixesReader
andsuffixLengthsReader
's position. But don't need to reload the block again.