fix(tf): set visible_device_list for TF C++ #4172
base: devel
Conversation
Fix deepmodeling#4171. Signed-off-by: Jinzhe Zeng <[email protected]>
📝 Walkthrough: The changes primarily involve modifications to the GPU configuration and model execution logic across several files.
Assessment against linked issues
Need to test manually before merging.
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (1)
source/api_cc/src/DataModifierTF.cc (1)
Lines 53-54: Use stable references in comments

The comment links to a specific commit in the TensorFlow repository. Linking to a specific commit may become outdated as the codebase evolves. Consider linking to a stable reference or the latest documentation to ensure the link remains valid over time.

Suggestion: update the link to point to the master branch or official documentation:

```diff
- // https://github.com/tensorflow/tensorflow/blame/8fac27b486939f40bc8e362b94a16a4a8bb51869/tensorflow/core/protobuf/config.proto#L80
+ // https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/protobuf/config.proto#L80
```

Alternatively, reference the TensorFlow GPU configuration guide available in the official documentation.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- source/api_cc/src/DataModifierTF.cc (1 hunks)
- source/api_cc/src/DeepPotTF.cc (1 hunks)
- source/api_cc/src/DeepTensorTF.cc (1 hunks)
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@           Coverage Diff            @@
##            devel    #4172   +/-   ##
=======================================
  Coverage   83.46%   83.46%
=======================================
  Files         537      537
  Lines       52168    52168
  Branches     3046     3046
=======================================
  Hits        43543    43543
  Misses       7678     7678
  Partials      947      947
=======================================
```

☔ View full report in Codecov by Sentry.
Should I assist with the test?

Yes, please.

I've just done the testing, and the issue has been resolved! Now I'm looking forward to merging!
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
source/api_cc/src/DeepTensorTF.cc (2)
Line range hint 639-680: Approved: Enhanced `computew` methods with optimization potential

The new `computew` method overloads are a valuable addition:

- They provide support for ghost atoms and neighbor lists, increasing the flexibility of the framework.
- The conditional execution based on `request_deriv` can improve performance by avoiding unnecessary computations.

These changes align well with the goal of improving the framework's efficiency and flexibility.
Consider extracting the common logic in these methods to reduce code duplication. For example:

```cpp
template <typename VALUETYPE>
void DeepTensorTF::computew(std::vector<VALUETYPE> &global_tensor,
                            std::vector<VALUETYPE> &force,
                            std::vector<VALUETYPE> &virial,
                            std::vector<VALUETYPE> &atom_tensor,
                            std::vector<VALUETYPE> &atom_virial,
                            const std::vector<VALUETYPE> &coord,
                            const std::vector<int> &atype,
                            const std::vector<VALUETYPE> &box,
                            const int nghost,
                            const InputNlist &inlist,
                            const bool request_deriv) {
  if (request_deriv) {
    compute(global_tensor, force, virial, atom_tensor, atom_virial, coord,
            atype, box, nghost, inlist);
  } else {
    compute(global_tensor, coord, atype, box, nghost, inlist);
    force.clear();
    virial.clear();
    atom_tensor.clear();
    atom_virial.clear();
  }
}
```

Then, specialize this template for both `double` and `float` types, and use it for both the new and existing `computew` methods.
Line range hint 1-680: Approved: Robust structure with room for documentation improvement

The overall structure of the `DeepTensorTF` class is well-organized and consistent:

- Template specializations for various compute methods provide type-specific optimizations.
- Error handling improvements, such as in the constructor, enhance the code's robustness.
- The consistent structure across methods makes the code easier to maintain and understand.

Consider adding more inline documentation, especially for complex methods like `compute_inner` and `run_model`. This would improve code readability and make it easier for other developers to understand and maintain the code. For example:

```cpp
/**
 * @brief Computes the inner tensor operations.
 * @param dtensor_ Output tensor
 * @param dcoord_ Input coordinates
 * @param datype_ Atom types
 * @param dbox Box dimensions
 * @param nghost Number of ghost atoms
 * @param nlist_ Neighbor list
 */
template <typename VALUETYPE>
void DeepTensorTF::compute_inner(std::vector<VALUETYPE> &dtensor_,
                                 const std::vector<VALUETYPE> &dcoord_,
                                 const std::vector<int> &datype_,
                                 const std::vector<VALUETYPE> &dbox,
                                 const int nghost,
                                 const InputNlist &nlist_) {
  // ... existing implementation ...
}
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- source/api_cc/src/DataModifierTF.cc (1 hunks)
- source/api_cc/src/DeepPotTF.cc (1 hunks)
- source/api_cc/src/DeepTensorTF.cc (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- source/api_cc/src/DataModifierTF.cc
🔇 Additional comments (3)
source/api_cc/src/DeepTensorTF.cc (1)
Lines 49-53: Improved GPU device selection logic

The changes in the GPU device selection are a significant improvement:

- Replacing the hardcoded `"/gpu:0"` with a dynamic selection based on `gpu_rank % gpu_num` allows for better utilization of multiple GPUs.
- Setting the `visible_device_list` ensures that each process only sees and uses its assigned GPU.

These modifications directly address the VRAM waste issue described in #4171 by preventing each MPI rank from consuming VRAM on all GPUs.
source/api_cc/src/DeepPotTF.cc (2)
Lines 451-452: Clarify the comment for better readability

The existing comment lacks context and may not provide sufficient information without following the link. Consider adding a brief description to enhance clarity and maintainability.
Lines 450-454: Ensure consistency between default device and `visible_device_list`

The variable `str` is set to `"/gpu:0"`, while `visible_device_list` uses `gpu_rank % gpu_num`. This mismatch could lead to inconsistencies where the default device does not match the visible devices, potentially causing unexpected behavior.
Fix #4171.