
high system memory usage when extracting embeddings #687

Open
joshctaylor opened this issue Aug 30, 2024 · 1 comment

joshctaylor commented Aug 30, 2024

Hi,

I'm wondering if there is a simple way to restrict RAM (not vRAM) usage when calculating embeddings - sometimes I need to run on a laptop.

I'm figuring that the multi-threaded audio loading used to keep the GPU fed is taking up a lot of RAM?

Thanks all

@joshctaylor joshctaylor changed the title Is there a way to limit threads to use embedding.py when RAM is tight high system memory usage when extracting embeddings Sep 1, 2024

joshctaylor commented Sep 1, 2024

I've dug into this a little now. It seems that around 150 GB of RAM is needed when the process starts: five memory-allocation warning messages are shown before the process settles down to using 15 GB of system RAM.

As there are five workers for multi_load_audio_window in chirp/audio_utils.py, this could point to where the issue lies.

I'm using an NVIDIA A100 80 GB on a 24-core Intel Xeon VM with 220 GB of RAM. If I try to use a smaller machine, the process is killed by the Linux kernel when it exhausts RAM and swap.

I'm working on FLAC-format files of one hour duration.

Found 0 existing embedding ids. 
Processing 1574 new source infos. 
  0%|          | 0/1574 [00:00<?, ?it/s]2024-09-01 20:47:30.019656: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 453120000 exceeds 10% of free system memory.
2024-09-01 20:47:30.484099: W tensorflow/compiler/tf2xla/kernels/assert_op.cc:38] Ignoring Assert operator jax2tf_infer_fn_/assert_equal_1/Assert/AssertGuard/Assert
2024-09-01 20:47:33.430616: E external/local_xla/xla/service/slow_operation_alarm.cc:65] Trying algorithm eng0{} for conv (f32[708,640,501,1]{3,2,1,0}, u8[0]{0}) custom-call(f32[708,1,160640,1]{3,2,1,0}, f32[640,1,640,1]{3,2,1,0}), window={size=640x1 stride=320x1}, dim_labels=bf01_oi01->bf01, custom_call_target="__cudnn$convForward", backend_config={"conv_result_scale":1,"activation_mode":"kNone","side_input_scale":0,"leakyrelu_alpha":0} is taking a while...

...... similar messages ......

Trying algorithm eng46{k2=5,k5=3,k14=4} for conv (f32[708,144,125,40]{3,2,1,0}, u8[0]{0}) custom-call(f32[708,144,125,40]{3,2,1,0}, f32[144,1,3,3]{3,2,1,0}), window={size=3x3 pad=1_1x1_1}, dim_labels=bf01_oi01->bf01, feature_group_count=144, custom_call_target="__cudnn$convForward", backend_config={"conv_result_scale":1,"activation_mode":"kNone","side_input_scale":0,"leakyrelu_alpha":0} is taking a while...
W0000 00:00:1725223732.544358    5221 graph_launch.cc:671] Fallback to op-by-op mode because memset node breaks graph update
  0%|          | 1/1574 [01:36<42:03:23, 96.25s/it]2024-09-01 20:48:56.767735: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 453120000 exceeds 10% of free system memory.
  0%|          | 2/1574 [01:40<18:27:41, 42.28s/it]2024-09-01 20:48:58.825938: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 453120000 exceeds 10% of free system memory.
  0%|          | 3/1574 [01:43<10:31:21, 24.11s/it]2024-09-01 20:49:00.984838: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 453120000 exceeds 10% of free system memory.
  0%|          | 4/1574 [01:44<6:38:29, 15.23s/it] 2024-09-01 20:49:02.478230: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 453120000 exceeds 10% of free system memory.
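In case it helps anyone hitting the same wall: I haven't checked multi_load_audio_window's actual signature, so the names below are hypothetical, but the pattern I'd expect to bound peak RAM is limiting both the worker count and the prefetch depth, so only a few one-hour files are decoded in memory at once. A generic sketch (not chirp's real API), with a synthetic work function standing in for audio decoding:

```python
import collections
import concurrent.futures
import itertools

_SENTINEL = object()


def bounded_map(fn, items, max_workers=2, prefetch=2):
    """Apply fn to items with a thread pool, yielding results in order.

    At most max_workers + prefetch items are in flight at any time,
    so peak memory is bounded by that count times the size of one
    decoded item, instead of growing with the whole input list.
    """
    items = iter(items)
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as ex:
        # Prime the pipeline with a bounded number of submissions.
        pending = collections.deque(
            ex.submit(fn, x)
            for x in itertools.islice(items, max_workers + prefetch))
        while pending:
            # Wait for the oldest submission, preserving input order.
            yield pending.popleft().result()
            nxt = next(items, _SENTINEL)
            if nxt is not _SENTINEL:
                pending.append(ex.submit(fn, nxt))


if __name__ == "__main__":
    # Stand-in for "decode one audio file": just doubles a number.
    results = list(bounded_map(lambda x: x * 2, range(10),
                               max_workers=2, prefetch=1))
    print(results)
```

With this shape, dropping max_workers to 1 and prefetch to 1 trades throughput for a peak working set of roughly (max_workers + prefetch) times one decoded file; given the ~450 MB allocations in the log above, that difference adds up quickly on a laptop.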
