TL;DR: I'd like to know what exactly `max_cat_threshold` controls, and I may suggest minor improvements to the documentation.
I'm quite interested in XGBoost's support for categorical features. I dug into the documentation but can't work out the exact effect of `max_cat_threshold`. From reading the C++ code (here), I understand that it is used to determine the begin and end points of the double scan of the sorted histogram. Here is an example:
Case with `max_cat_threshold` = 1: in this case all partitions are considered.
Case with `max_cat_threshold` = 2: in this case only partitions with 2+ categories are considered.
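To make the question concrete, here is a minimal plain-Python sketch of my reading of the two cases above. It is not XGBoost's implementation: the function `candidate_partitions` and its `min_size` argument are hypothetical names I made up, and the assumption that the threshold sets the smallest category set visited by each scan is exactly the part I am unsure about.

```python
from typing import List, Tuple


def candidate_partitions(sorted_cats: List[str], min_size: int) -> List[Tuple[str, ...]]:
    """Enumerate the category sets visited by a forward and a backward scan.

    `sorted_cats` stands in for the categories of the sorted histogram.
    `min_size` plays the role I am guessing for max_cat_threshold
    (an assumption, not confirmed behaviour).
    """
    n = len(sorted_cats)
    start = max(1, min_size)  # a partition needs at least one category
    partitions = []
    # Forward scan: growing prefixes, starting at `start` categories.
    for size in range(start, n):
        partitions.append(tuple(sorted_cats[:size]))
    # Backward scan: growing suffixes, starting at `start` categories.
    for size in range(start, n):
        partitions.append(tuple(sorted_cats[n - size:]))
    return partitions


cats = ["a", "b", "c", "d"]
print(candidate_partitions(cats, 1))  # every prefix and suffix
print(candidate_partitions(cats, 2))  # only sets with 2+ categories
```

Under this reading, four sorted categories give six candidate sets with a value of 1 and four with a value of 2, because the single-category sets at either end of the scan are skipped.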
Is this the way `max_cat_threshold` works? If yes, I might open a PR to add a paragraph here. Does that sound like a good idea?