
ITtutoria

Ways to resolve the error "cuda_error_out_of_memory"

Answered · Asked by Hunter Williams on May 18, 2022 · In: Programs


I get the error CUDA_ERROR_OUT_OF_MEMORY when I try to run my program. The log output is below:

I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
 I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
 I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
 I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
 I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
 I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
 I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties:
 name: GeForce GTX 1080
 major: 6 minor: 1 memoryClockRate (GHz) 1.7335
 pciBusID 0000:01:00.0
 Total memory: 7.92GiB
 Free memory: 7.81GiB
 I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0
 I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0: Y
 I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device:0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
 E tensorflow/stream_executor/cuda/cuda_driver.cc:965] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
 Iter 20, Minibatch Loss= 40491.636719
 ...
+-----------------------------------------------------------------------------+ 
 | NVIDIA-SMI 367.27 Driver Version: 367.27 
 |-------------------------------+----------------------+----------------------+ 
 | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | 
 | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M.
 |===============================+======================+======================|
 | 0 GeForce GTX 1080 Off | 0000:01:00.0 Off | N/A | 
 | 40% 61C P2 46W / 180W | 8107MiB / 8111MiB | 96% Default | 
 +-------------------------------+----------------------+----------------------+ 
 | 1 GeForce GTX 1080 Off | 0000:02:00.0 Off | N/A | 
 | 0% 40C P0 40W / 180W | 0MiB / 8113MiB | 0% Default | 
 +-------------------------------+----------------------+----------------------+ 
 +-----------------------------------------------------------------------------+ 
 | Processes: GPU Memory | 
 | GPU PID Type Process name Usage | 
 |=============================================================================| 
 | 0 22932 C python 8105MiB |
 +-----------------------------------------------------------------------------+
+-----------------------------------------------------------------------------+ 
 | NVIDIA-SMI 367.27 Driver Version: 367.27 
 |-------------------------------+----------------------+----------------------+ 
 | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | 
 | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M.
 |===============================+======================+======================|
 | 0 GeForce GTX 1080 Off | 0000:01:00.0 Off | N/A | 
 | 40% 61C P2 46W / 180W | 7793MiB / 8111MiB | 99% Default | 
 +-------------------------------+----------------------+----------------------+ 
 | 1 GeForce GTX 1080 Off | 0000:02:00.0 Off | N/A | 
 | 0% 40C P0 40W / 180W | 0MiB / 8113MiB | 0% Default | 
 +-------------------------------+----------------------+----------------------+ 
 +-----------------------------------------------------------------------------+ 
 | Processes: GPU Memory | 
 | GPU PID Type Process name Usage | 
 |=============================================================================| 
 | 0 22932 C python 7791MiB |
 +-----------------------------------------------------------------------------+

When the error occurs, the system reports:

CUDA_ERROR_OUT_OF_MEMORY

I tried another sample that I found in a community forum, but it still failed. If anyone knows a solution, please help. Thanks!


2 Answers

  1. Best Answer
    lyytutoria (Expert), answered on June 26, 2022 at 8:04 am

    The cause:

    By default, TensorFlow allocates most of the GPU's memory for each process in order to reduce memory-management overhead. If another process is already holding the GPU's memory, that allocation fails and TensorFlow reports CUDA_ERROR_OUT_OF_MEMORY.

    Solution:

    Setting allow_growth = True makes TensorFlow allocate GPU memory on demand instead of grabbing almost all of it up front. This keeps memory usage smaller, though it can cost some efficiency because of memory fragmentation.
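As a minimal sketch of how that option is set (this uses the TensorFlow 1.x Session/ConfigProto API, reached through tf.compat.v1 on newer installs; adapt to your own setup):

```python
# Sketch: enable gradual GPU memory growth instead of the default
# grab-nearly-everything allocation (TensorFlow 1.x Session API).
import tensorflow as tf

config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand
# Alternatively, cap the fraction of GPU memory this process may take:
# config.gpu_options.per_process_gpu_memory_fraction = 0.4

sess = tf.compat.v1.Session(config=config)
```

The per_process_gpu_memory_fraction line is an alternative knob, useful when several processes must share one GPU.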

    Another option is to free the GPU memory manually before each run. Use nvidia-smi to check the GPU's memory usage, then reset it:

    nvidia-smi
    
    nvidia-smi --gpu-reset

    However, this will not work if the GPU is being used by another process. In that case, find out which processes are holding the GPU devices:

    sudo fuser -v /dev/nvidia*

    You will receive the following output:

    USER PID ACCESS COMMAND
    /dev/nvidia0: root 2216 F...m Xorg
    sid 6114 F...m krunner
    sid 6116 F...m plasmashell
    sid 7227 F...m akonadi_archive
    sid 7239 F...m akonadi_mailfil
    sid 7249 F...m akonadi_sendlat
    sid 18120 F...m chrome
    sid 18163 F...m chrome
    sid 24154 F...m code
    /dev/nvidiactl: root 2216 F...m Xorg
    sid 6114 F...m krunner
    sid 6116 F...m plasmashell
    sid 7227 F...m akonadi_archive
    sid 7239 F...m akonadi_mailfil
    sid 7249 F...m akonadi_sendlat
    sid 18120 F...m chrome
    sid 18163 F...m chrome
    sid 24154 F...m code
    /dev/nvidia-modeset: root 2216 F.... Xorg
    sid 6114 F.... krunner
    sid 6116 F.... plasmashell
    sid 7227 F.... akonadi_archive
    sid 7239 F.... akonadi_mailfil
    sid 7249 F.... akonadi_sendlat
    sid 18120 F.... chrome
    sid 18163 F.... chrome
    sid 24154 F.... code

    From this output, identify the PID of the process that is holding GPU memory (for example, 24154) and kill it, substituting that PID for MY_PID:

    sudo kill -9 MY_PID

  2. Léo Allard, answered on May 25, 2022 at 9:19 pm

    This happened to me when I tried to run Keras/TensorFlow again after a first attempt failed: the GPU memory could not be allocated because the earlier run still held it. Closing all Python processes that were using the GPU, or opening a new terminal window and running again, resolved it.
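For what it's worth, on TensorFlow 2.x (which has no Session/ConfigProto) the counterpart of allow_growth is set per device; a minimal sketch, assuming a TF 2.x environment:

```python
# Sketch: TF 2.x equivalent of allow_growth. Must run before any
# GPU has been initialized (i.e., before the first GPU op).
import tensorflow as tf

for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)
```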



© 2022 Ittutoria. All Rights Reserved.
