
[Performance] Very slow load of ONNX model in Windows #22219

Open
dhatraknilam opened this issue Sep 25, 2024 · 1 comment
Labels
performance issues related to performance regressions platform:windows issues related to the Windows platform

Comments

@dhatraknilam

dhatraknilam commented Sep 25, 2024

Describe the issue

I am trying to load XGBoost ONNX models with onnxruntime on a Windows machine.
The model size is 52 MB, yet it consumes 1378.9 MB of RAM on loading, and loading takes 15 minutes!
This behavior occurs only on Windows; on Linux the models load in a few seconds, although memory consumption is high on Linux as well.

I tried the solution suggested in https://github.com//issues/3802#issuecomment-624464802 but I get this error:
AttributeError: 'onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions' object attribute 'graph_optimization_level' is read-only

This is the simple code I used to load the model:
sess = rt.InferenceSession(modelSav_path, providers=["CPUExecutionProvider"])

To reproduce

Train an XGBoost classification model with the following params:
```python
# Classifier
from sklearn.pipeline import Pipeline
from sklearn.multioutput import MultiOutputClassifier
from xgboost import XGBClassifier
from skl2onnx import convert_sklearn, update_registered_converter
from skl2onnx.common.data_types import FloatTensorType
from skl2onnx.common.shape_calculator import calculate_linear_classifier_output_shapes
from onnxmltools.convert.xgboost.operator_converters.XGBoost import convert_xgboost

update_registered_converter(
    XGBClassifier,
    "XGBoostXGBClassifier",
    calculate_linear_classifier_output_shapes,
    convert_xgboost,
    options={"nocl": [True, False], "zipmap": [True, False, "columns"]},
)

param = {'n_estimators': 3435, 'max_delta_step': 6, 'learning_rate': 0.030567232354470994, 'base_score': 0.700889637773676, 'scale_pos_weight': 0.29833333651319716, 'booster': 'gbtree', 'reg_lambda': 0.0005531812782988272, 'reg_alpha': 4.8213852607021606e-05, 'subsample': 0.9816268623744107, 'colsample_bytree': 0.3187040821569215, 'max_depth': 17, 'min_child_weight': 2, 'eta': 6.2582977222245746e-06, 'gamma': 2.2248460288603035e-07, 'grow_policy': 'depthwise'}

x_train.columns = range(x_train.shape[1])
x_test.columns = range(x_train.shape[1])

pipe = Pipeline([("xgb", MultiOutputClassifier(XGBClassifier(**param)))])
pipe.fit(x_train.to_numpy(), y_train)

model_onnx = convert_sklearn(
    pipe,
    "pipeline_xgboost",
    [("input", FloatTensorType([None, x_train.shape[1]]))],
    verbose=1,
    target_opset={"": 12, "ai.onnx.ml": 2},
)

with open("modelname.onnx", "wb") as f:
    f.write(model_onnx.SerializeToString())
```

Train an XGBoost regressor model with the following params:
```python
# Regressor
from sklearn.pipeline import Pipeline
from sklearn.multioutput import MultiOutputRegressor
from xgboost import XGBRegressor
from skl2onnx import convert_sklearn, update_registered_converter
from skl2onnx.common.data_types import FloatTensorType
from skl2onnx.common.shape_calculator import calculate_linear_regressor_output_shapes
from onnxmltools.convert.xgboost.operator_converters.XGBoost import convert_xgboost

update_registered_converter(
    XGBRegressor,
    "XGBoostXGBRegressor",
    calculate_linear_regressor_output_shapes,
    convert_xgboost,
)

param = {'n_estimators': 3435, 'max_delta_step': 6, 'learning_rate': 0.030567232354470994, 'base_score': 0.700889637773676, 'scale_pos_weight': 0.29833333651319716, 'booster': 'gbtree', 'reg_lambda': 0.0005531812782988272, 'reg_alpha': 4.8213852607021606e-05, 'subsample': 0.9816268623744107, 'colsample_bytree': 0.3187040821569215, 'max_depth': 17, 'min_child_weight': 2, 'eta': 6.2582977222245746e-06, 'gamma': 2.2248460288603035e-07, 'grow_policy': 'depthwise'}

x_train.columns = range(x_train.shape[1])
x_test.columns = range(x_train.shape[1])

pipe = Pipeline([("xgb", MultiOutputRegressor(XGBRegressor(**param)))])
pipe.fit(x_train.to_numpy(), y_train)

model_onnx = convert_sklearn(
    pipe,
    "pipeline_xgboost",
    [("input", FloatTensorType([None, x_train.shape[1]]))],
    verbose=1,
    target_opset={"": 12, "ai.onnx.ml": 2},
    options={type(pipe): {'zipmap': False}},
)

with open("modelname.onnx", "wb") as f:
    f.write(model_onnx.SerializeToString())
```

Load the model with the following code:
sess = rt.InferenceSession(modelSav_path, providers=["CPUExecutionProvider"])
Then observe the load time and RAM usage.
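To make the Windows/Linux comparison concrete, the load time can be measured with a small standard-library helper like the following (the commented session call is illustrative, reusing the issue's own variable names):

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    t0 = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - t0

# e.g.:
# sess, elapsed = timed(rt.InferenceSession, modelSav_path,
#                       providers=["CPUExecutionProvider"])
# print(f"session load took {elapsed:.1f} s")
```
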

Urgency

This is a release-critical issue: we cannot ship these models with such slow loading. Although the models perform well, we are blocked by the load-time issue. We also considered packaging the ML models with other libraries, but we do not have the necessary compliance, and we trust Microsoft.

Platform

Windows

OS Version

11

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.18.1

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

Model File

No response

Is this a quantized model?

No

@dhatraknilam dhatraknilam added the performance issues related to performance regressions label Sep 25, 2024
@github-actions github-actions bot added the platform:windows issues related to the Windows platform label Sep 25, 2024
@dhatraknilam dhatraknilam changed the title [Performance] Very slow load of ONNX model in memory in Windows [Performance] Very slow load of ONNX model in Windows Sep 25, 2024
@xadupre
Member

xadupre commented Sep 27, 2024

This PR should solve this: #22043.
