Memory issue #42

cmehta126 · 2017-11-13T20:21:40Z

I'm running Broccoli for permutation tests on MRI data. I'm getting errors of this type:

Run kernel error for kernel 'CalculateBetaWeightsGLM' is 'CL_MEM_OBJECT_ALLOCATION_FAILURE'
Run kernel error for kernel 'CalculateStatisticalMapsGLMTTest' is 'CL_MEM_OBJECT_ALLOCATION_FAILURE'
Run kernel error for kernel 'CalculateStatisticalMapsGLMTTestSecondLevelPermutation' is 'CL_MEM_OBJECT_ALLOCATION_FAILURE'
Run kernel error for kernel 'TransformData' is 'CL_MEM_OBJECT_ALLOCATION_FAILURE'

It seems this is a memory issue. Is there any way of getting around this? The data I'm permuting are spatial maps of dimension 256x256x256 for several hundred samples. I'm using a mask generated from "Smoothing".

What is more, prior to this error, the output of RandomiseGroupLevel states:

Permutation threshold for contrast 1 for a significance level of 0.050000 is 0.000000
Permutation threshold for contrast 2 for a significance level of 0.050000 is 0.000000
Permutation threshold for contrast 3 for a significance level of 0.050000 is 0.000000

Could this imply that the data are too highly correlated across voxels for "RandomiseGroupLevel" to work properly, or is it most likely a memory issue? The volumes file is 2.5 GB in total with 356 subjects. My device information is:

Device info

Platform number: 0

Platform vendor: NVIDIA Corporation
Platform name: NVIDIA CUDA
Platform extentions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer
Platform profile: FULL_PROFILE

Device number: 0

Device vendor: NVIDIA Corporation
Device name: Tesla K80
Hardware version: OpenCL 1.2 CUDA
Software version: 375.66
OpenCL C version: OpenCL C 1.2
Device extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer
Global memory size in MB: 11439
Size of largest memory object in MB: 2859
Global memory cache size in KB: 208
Local memory size in KB: 48
Constant memory size in KB: 64
Parallel compute units: 13
Clock frequency in MHz: 823
Max number of threads per block: 1024
Max number of threads in each dimension: 1024 1024 64

It seems the hardware could theoretically handle loading 2.5 GB of data (though I'm not sure whether that is enough for permutation tests).

Thank you.

Best,
Chintan

wanderine · 2017-11-13T21:33:19Z

Can you show the output of GetOpenCLInfo and the full output of your call to RandomiseGroupLevel?


-- Anders Eklund, PhD

cmehta126 · 2017-11-13T23:26:02Z

RandomiseGroupLevel worked as expected on my dataset after I downsampled the spatial maps in the input volume from 256x256x256 (1mm x 1mm x 1mm) to 128x128x128 (2mm x 2mm x 2mm). That significantly reduced the memory needed to load the data, without sacrificing specificity. The voxel resolution of the original diffusion weighted imaging (DWI) data was on the order of 2mm x 2mm x 2mm to begin with. I registered the DWI to FreeSurfer's CVS template, which has a voxel resolution of 1mm x 1mm x 1mm, to enable group analysis, but I don't believe much information is lost by downsampling.
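As a rough illustration of why this helps: doubling the voxel size along each axis halves every grid dimension, so the voxel count drops by a factor of 2 x 2 x 2 = 8. The helper names below are made up for this sketch and are not part of Broccoli.

```python
def downsampled_shape(shape, factor):
    """Grid size after merging `factor` voxels along each axis.

    Assumes `factor` divides every dimension evenly.
    """
    if any(n % factor for n in shape):
        raise ValueError("factor must divide every dimension")
    return tuple(n // factor for n in shape)


def voxel_count(shape):
    """Total number of voxels in a grid of the given shape."""
    count = 1
    for n in shape:
        count *= n
    return count


original = (256, 256, 256)                 # 1 mm isotropic grid
reduced = downsampled_shape(original, 2)   # 2 mm isotropic grid
ratio = voxel_count(original) // voxel_count(reduced)

print("Reduced grid:", reduced)
print("Voxel count shrinks by a factor of", ratio)
```

The same factor applies to the memory needed per volume, whatever the storage type.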

Regardless, here is the output from GetOpenCLInfo and RandomiseGroupLevel when using volumes of spatial maps with dimensionality 256x256x256 (the original setup that produced the error):

GetOpenCLInfo:

Device info

Platform number: 0

Platform vendor: NVIDIA Corporation
Platform name: NVIDIA CUDA
Platform extentions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer
Platform profile: FULL_PROFILE

Device number: 0

Device vendor: NVIDIA Corporation
Device name: Tesla K80
Hardware version: OpenCL 1.2 CUDA
Software version: 375.66
OpenCL C version: OpenCL C 1.2
Device extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer
Global memory size in MB: 11439
Size of largest memory object in MB: 2859
Global memory cache size in KB: 208
Local memory size in KB: 48
Constant memory size in KB: 64
Parallel compute units: 13
Clock frequency in MHz: 823
Max number of threads per block: 1024
Max number of threads in each dimension: 1024 1024 64

The output of the call to RandomiseGroupLevel (this run used 500 permutations, but the same thing happened with 5000 permutations):

Authored by K.A. Eklund
Data size: 256 x 256 x 256 x 362
Number of permutations: 500
Number of regressors: 8
Number of contrasts: 3
Performing 3 t-tests
Correlation design detected for t-contrast 1
Correlation design detected for t-contrast 2
Correlation design detected for t-contrast 3
Max number of permutations for contrast 1 is inf
Max number of permutations for contrast 2 is inf
Max number of permutations for contrast 3 is inf
Starting permutation 1
Starting permutation 101
Starting permutation 201
Starting permutation 301
Starting permutation 401
Permutation threshold for contrast 1 for a significance level of 0.050000 is 0.000000
Starting permutation 1
Starting permutation 101
Starting permutation 201
Starting permutation 301
Starting permutation 401
Permutation threshold for contrast 2 for a significance level of 0.050000 is 0.000000
Starting permutation 1
Starting permutation 101
Starting permutation 201
Starting permutation 301
Starting permutation 401
Permutation threshold for contrast 3 for a significance level of 0.050000 is 0.000000
Run kernel error for kernel 'CalculateBetaWeightsGLM' is 'CL_MEM_OBJECT_ALLOCATION_FAILURE'
Run kernel error for kernel 'CalculateStatisticalMapsGLMTTest' is 'CL_MEM_OBJECT_ALLOCATION_FAILURE'
Run kernel error for kernel 'CalculateStatisticalMapsGLMTTestSecondLevelPermutation' is 'CL_MEM_OBJECT_ALLOCATION_FAILURE'
Run kernel error for kernel 'TransformData' is 'CL_MEM_OBJECT_ALLOCATION_FAILURE'

wanderine · 2017-11-14T10:37:38Z

362 volumes of size 256 x 256 x 256 require about 22.6 GB of memory in float format, while your graphics card has 11 GB of memory.
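The arithmetic behind that figure can be checked with a short sketch. It assumes 4 bytes per value (32-bit float) and counts only the raw input volumes; any extra buffers BROCCOLI allocates are not included, and the function name is made up for illustration.

```python
# Rough estimate only: raw float32 storage for the input volumes.

BYTES_PER_FLOAT32 = 4
MIB = 1024 ** 2
GIB = 1024 ** 3


def dataset_bytes(nx, ny, nz, n_volumes, bytes_per_value=BYTES_PER_FLOAT32):
    """Bytes needed to hold n_volumes volumes of nx * ny * nz values."""
    return nx * ny * nz * n_volumes * bytes_per_value


# Values copied from the RandomiseGroupLevel and GetOpenCLInfo output above.
needed = dataset_bytes(256, 256, 256, 362)   # "Data size: 256 x 256 x 256 x 362"
global_memory = 11439 * MIB                  # "Global memory size in MB: 11439"

print(f"Data as float32: {needed / GIB:.1f} GiB")
print(f"Device memory:   {global_memory / GIB:.1f} GiB")
print("Fits on device:", needed <= global_memory)
```

The 2.5 GB size of the file on disk is not a reliable guide here, since the on-disk format may be compressed or use a smaller data type than the in-memory float representation.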


cmehta126 · 2017-11-14T16:23:04Z

Thank you. Does Broccoli have a way of using additional memory to augment the RAM of a graphics card, given that my graphics card is limited to 11 GB of memory (with the largest object size capped at 2.8 GB)? I have ~1 TB of fast SSD storage mounted.

wanderine · 2017-11-15T13:47:41Z

No, but you can install an OpenCL driver for the CPU, and then run BROCCOLI in parallel on the CPU cores.


cmehta126 · 2017-11-15T13:49:51Z

Thank you, that is very helpful.


changken1 mentioned this issue May 25, 2018

RandomiseGroupLevel crashing #51

Closed
