Master's Thesis, Graduate Institute of Business Administration: A Comparison of CatBoost and LightGBM

Some of the datasets used can be downloaded from the links given in the thesis.

Summary: CatBoost tends to choose categorical features, while LightGBM tends to choose numerical ones. When a dataset's important features are numerical, LightGBM performs better than CatBoost. When its prominent features are categorical, CatBoost tends to predict better.
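One rough way to check which kind of feature a model leans on is to sum its feature importances by feature type. The sketch below assumes you have already pulled an importance score per feature out of a trained model. The helper name `importance_share_by_type`, the feature names, and the numbers are placeholders for illustration, not results from the thesis.

```python
def importance_share_by_type(importances, categorical_features):
    """Return the share of total importance held by categorical vs. numerical features.

    importances: dict mapping feature name -> non-negative importance score
    categorical_features: set of feature names treated as categorical
    """
    total = sum(importances.values())
    if total == 0:
        raise ValueError("all importances are zero")
    categorical = sum(
        score for name, score in importances.items() if name in categorical_features
    )
    return {
        "categorical": categorical / total,
        "numerical": (total - categorical) / total,
    }


# Placeholder scores, NOT measured values from any notebook in this repository.
example_importances = {"job": 30.0, "education": 20.0, "age": 25.0, "balance": 25.0}
shares = importance_share_by_type(example_importances, {"job", "education"})
print(shares)
```

Running the same helper on the importances of both models for one dataset gives a quick side-by-side view of how much each model relies on categorical columns.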

Implication: Use CatBoost if you want better predictive performance on common, medium-sized tabular data. As for why CatBoost trains much more slowly, my guess is that CatBoost preprocesses categorical columns on its own, while LightGBM requires the data to be preprocessed before it is fed into the model. Encoding categorical columns is time-consuming, and CatBoost's use of target encoding may be the reason its training is slow.
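To give a feel for the extra work target encoding adds, here is a minimal pure-Python sketch of an ordered target statistic, the general idea behind CatBoost-style categorical encoding. It is a simplification and not CatBoost's actual implementation: the function name, the `prior` and `prior_weight` parameters, and the use of the given row order (rather than random permutations) are all assumptions made for illustration.

```python
from collections import defaultdict


def ordered_target_encode(categories, targets, prior=0.5, prior_weight=1.0):
    """Encode each row using only the targets of EARLIER rows with the same category.

    Using only earlier rows keeps a row's own label out of its encoding.
    """
    target_sums = defaultdict(float)  # running sum of targets per category
    counts = defaultdict(int)         # running row count per category
    encoded = []
    for category, target in zip(categories, targets):
        # Smoothed mean of the targets seen so far for this category.
        value = (target_sums[category] + prior * prior_weight) / (
            counts[category] + prior_weight
        )
        encoded.append(value)
        # Update the running statistics only after encoding the current row.
        target_sums[category] += target
        counts[category] += 1
    return encoded


# Toy data, made up for this example.
cities = ["taipei", "tainan", "taipei", "taipei", "tainan"]
labels = [1, 0, 1, 0, 1]
encoded = ordered_target_encode(cities, labels)
print(encoded)
```

Even in this toy form, every categorical column needs its own pass with running statistics, which is work LightGBM does not do when it receives data that has already been encoded.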

README.md
bank_marketing_eda_and_catboost.ipynb
bank_marketing_lightgbm.ipynb
cat_in_the_dat_eda_and_catboost.ipynb
cat_in_the_dat_lightgbm.ipynb
e_sun_eda_and_catboost.ipynb
e_sun_lightgbm.ipynb
titanic_eda_and_catboost.ipynb
titanic_lightgbm.ipynb
立瑜graduation paper(6.12)_Revised version.docx
邵立瑜_論文口試簡報_final.pptx

The PowerPoint file briefly introduces what I did in this thesis. It is mostly in Chinese, but I believe the graphics still make it clear what I was trying to do.

GISH123/NTU_Master_Thesis_Comparsion_between_LightGBM_and_CatBoost
