-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add to support byte input stream back by file #10717
Conversation
✅ Deploy Preview for meta-velox canceled.
|
@xiaoxmeng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
964c912
to
66881bd
Compare
@xiaoxmeng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
70226dc
to
4b9a815
Compare
@xiaoxmeng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@xiaoxmeng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
16c5812
to
d94f51f
Compare
@xiaoxmeng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
This pull request was exported from Phabricator. Differential Revision: D61148216 |
Summary: Pull Request resolved: facebookincubator#10717 Differential Revision: D61148216 Pulled By: xiaoxmeng
This pull request was exported from Phabricator. Differential Revision: D61148216 |
Summary: Pull Request resolved: facebookincubator#10717 Differential Revision: D61148216 Pulled By: xiaoxmeng
This pull request was exported from Phabricator. Differential Revision: D61148216 |
@@ -91,64 +94,29 @@ class OStreamOutputStream : public OutputStream { | |||
std::ostream* out_; | |||
}; | |||
|
|||
/// Read-only stream over one or more byte buffers. | |||
/// Read-only byte input stream interface. | |||
class ByteInputStream { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we move this base class to the base folder as it has both memory stream subclass BufferInputStream
and file stream subclass FileInputStream
?
This pull request was exported from Phabricator. Differential Revision: D61148216 |
Summary: Current spill leverage SpillInputStream to handle the byte stream read from spilled file. SpillInputStream is a derived class from ByteInputStream and its implementation a bit hack as most of operations of ByteInputStream doesn't apply for SpillInputStream as we don't support backward seek on a file for now. It just works for now. It is better to split them and make a generic file based input stream which can be used in other cases as well. Pull Request resolved: facebookincubator#10717 Reviewed By: tanjialiang Differential Revision: D61148216 Pulled By: xiaoxmeng
Summary: Current spill leverage SpillInputStream to handle the byte stream read from spilled file. SpillInputStream is a derived class from ByteInputStream and its implementation a bit hack as most of operations of ByteInputStream doesn't apply for SpillInputStream as we don't support backward seek on a file for now. It just works for now. It is better to split them and make a generic file based input stream which can be used in other cases as well. Pull Request resolved: facebookincubator#10717 Reviewed By: tanjialiang Differential Revision: D61148216 Pulled By: xiaoxmeng
This pull request was exported from Phabricator. Differential Revision: D61148216 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM % one nit will resolve it in the follow-up per discuss.
Summary: Current spill leverage SpillInputStream to handle the byte stream read from spilled file. SpillInputStream is a derived class from ByteInputStream and its implementation a bit hack as most of operations of ByteInputStream doesn't apply for SpillInputStream as we don't support backward seek on a file for now. It just works for now. It is better to split them and make a generic file based input stream which can be used in other cases as well. Pull Request resolved: facebookincubator#10717 Reviewed By: tanjialiang Differential Revision: D61148216 Pulled By: xiaoxmeng
This pull request was exported from Phabricator. Differential Revision: D61148216 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except one optimization idea. Also can you have a line in the description to state the new roles of the classes (ByteInputStream
becomes interface, BufferInputStream
is the old ByteInputStream
etc.)?
if (skipBytes == 0) { | ||
return; | ||
} | ||
readNextRange(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to actually do the IO? Can we just manipulate fileOffset_
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I think we could. Let me see how to skip io in followup. Thanks!
Summary: Current spill leverage SpillInputStream to handle the byte stream read from spilled file. SpillInputStream is a derived class from ByteInputStream and its implementation a bit hack as most of operations of ByteInputStream doesn't apply for SpillInputStream as we don't support backward seek on a file for now. It just works for now. It is better to split them and make a generic file based input stream which can be used in other cases as well. Pull Request resolved: facebookincubator#10717 Reviewed By: Yuhta, tanjialiang Differential Revision: D61148216 Pulled By: xiaoxmeng
This pull request was exported from Phabricator. Differential Revision: D61148216 |
Summary: Current spill leverage SpillInputStream to handle the byte stream read from spilled file. SpillInputStream is a derived class from ByteInputStream and its implementation a bit hack as most of operations of ByteInputStream doesn't apply for SpillInputStream as we don't support backward seek on a file for now. It just works for now. It is better to split them and make a generic file based input stream which can be used in other cases as well. With this change, ByteInputStream has two derived implementations: (1) BufferInputStream which support byte stream API back by a set of buffers; (2) FileInputStream which support byte stream API back by a sequential read file and doesn't support for backward seek. FileInputStream can be used by spill, tracing and file-based broadcast join etc. Pull Request resolved: facebookincubator#10717 Reviewed By: Yuhta, tanjialiang, ktsiam Differential Revision: D61148216 Pulled By: xiaoxmeng
This pull request was exported from Phabricator. Differential Revision: D61148216 |
This pull request was exported from Phabricator. Differential Revision: D61148216 |
Summary: Current spill leverage SpillInputStream to handle the byte stream read from spilled file. SpillInputStream is a derived class from ByteInputStream and its implementation a bit hack as most of operations of ByteInputStream doesn't apply for SpillInputStream as we don't support backward seek on a file for now. It just works for now. It is better to split them and make a generic file based input stream which can be used in other cases as well. With this change, ByteInputStream has two derived implementations: (1) BufferInputStream which support byte stream API back by a set of buffers; (2) FileInputStream which support byte stream API back by a sequential read file and doesn't support for backward seek. FileInputStream can be used by spill, tracing and file-based broadcast join etc. Pull Request resolved: facebookincubator#10717 Reviewed By: Yuhta, tanjialiang, ktsiam Differential Revision: D61148216 Pulled By: xiaoxmeng
Summary: Current spill leverage SpillInputStream to handle the byte stream read from spilled file. SpillInputStream is a derived class from ByteInputStream and its implementation a bit hack as most of operations of ByteInputStream doesn't apply for SpillInputStream as we don't support backward seek on a file for now. It just works for now. It is better to split them and make a generic file based input stream which can be used in other cases as well. With this change, ByteInputStream has two derived implementations: (1) BufferInputStream which support byte stream API back by a set of buffers; (2) FileInputStream which support byte stream API back by a sequential read file and doesn't support for backward seek. FileInputStream can be used by spill, tracing and file-based broadcast join etc. Pull Request resolved: facebookincubator#10717 Reviewed By: Yuhta, tanjialiang, ktsiam Differential Revision: D61148216 Pulled By: xiaoxmeng
This pull request was exported from Phabricator. Differential Revision: D61148216 |
@xiaoxmeng merged this pull request in 9523840. |
Conbench analyzed the 1 benchmark run on commit There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
Current spill leverage SpillInputStream to handle the byte stream read from spilled file. SpillInputStream is a
derived class from ByteInputStream and its implementation a bit hack as most of operations of ByteInputStream
doesn't apply for SpillInputStream as we don't support backward seek on a file for now. It just works for now.
It is better to split them and make a generic file based input stream which can be used in other cases as well.