
Using Cloud instances for your Cuda loads


Introduction #

Writing an algorithm that uses CUDA does not require a state-of-the-art GPU. However, reaping the benefits of massively parallel processing does require a powerful one. How about using a cloud instance?

Written by: Santiago Hurtado


Let’s do some quick numbers first:

An Intel i7-9700F processor should deliver 384 GFLOPS at a price of about 400 USD.
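One way to arrive at a figure like 384 GFLOPS is cores × clock × FLOPs per cycle. The sketch below assumes 8 cores, a 3.0 GHz base clock and 16 double-precision FLOPs per cycle per core. Those three inputs are assumptions used for illustration; only the 384 GFLOPS and 400 USD figures come from this post.

```shell
# Assumed inputs (not stated in the post): cores, clock, FLOPs per cycle.
cores=8
clock_ghz=3.0
flops_per_cycle=16   # assumption: AVX2 with two FMA units, double precision
price_usd=400        # from the post

# Theoretical peak = cores * clock (GHz) * FLOPs per cycle per core
peak_gflops=$(awk -v c="$cores" -v f="$clock_ghz" -v w="$flops_per_cycle" \
    'BEGIN { printf "%.0f", c * f * w }')

# Price efficiency in GFLOPS per USD
gflops_per_usd=$(awk -v g="$peak_gflops" -v p="$price_usd" \
    'BEGIN { printf "%.2f", g / p }')

echo "peak: ${peak_gflops} GFLOPS"
echo "value: ${gflops_per_usd} GFLOPS per USD"
```

The same two lines of arithmetic can be repeated for any GPU you are considering, which makes the comparison per dollar straightforward.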

Instance Creation #

OK, enough of that. So how do we get an Azure VM that promises a large number of computations per second? *The setup we will be performing is only recommended for the bare usage of CUDA*. An in-depth tutorial can be found on the NVIDIA site. Also, if you are looking for a *Data Science VM*, there is an easier way: follow this tutorial from Microsoft.

Most of the time I recommend using the Azure CLI so that tasks are repeatable. I also assume you use macOS or Linux, since the variables used here were written for Bash.

Don’t spin up a full-fledged GPU instance just to set up the machine. You can use a free version for the initial image and, when done, copy or resize the instance.

Setup #

Some basic info for this tutorial: we use an initial size of Standard_B2ms and East US as the default location.
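The commands in this post rely on a handful of Bash variables. A minimal sketch of how they could be set is shown here; the resource group and VM names are placeholders invented for illustration, and `eastus` is the CLI name of the East US region.

```shell
# All values below are placeholders chosen for illustration; pick your own.
azure_region="eastus"            # CLI name of the East US region
resource_group="cuda-demo-rg"    # hypothetical resource group name
vm_name="cuda-demo-vm"           # hypothetical VM name
USERNAME="${USER:-azureuser}"    # admin user on the VM; falls back to "azureuser"

echo "region=${azure_region} group=${resource_group} vm=${vm_name} user=${USERNAME}"
```

Keeping these in one place (for example a small file you `source`) means every later command can be pasted unchanged.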

Let’s first create a resource group, if you don’t have one already.

  • Find a location and verify that the size you want exists in that location

      az account list-locations
    
      az vm list-sizes -l ${azure_region}
    
  • Create a resource group

      az group create --name ${resource_group} --location ${azure_region}       
    
  • Now create the VM. Make sure you set the storage-sku for later resize compatibility.

      az vm create \
          --resource-group ${resource_group} \
          --name ${vm_name} \
          --image UbuntuLTS \
          --admin-username ${USERNAME} \
          --ssh-key-values <put the path or paths to your .pub ssh key here> \
          --size Standard_B2ms --storage-sku StandardSSD_LRS
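Scanning the full output of `az vm list-sizes` by eye is tedious. As a rough sketch, the check for a single size could be wrapped in a small function; `size_available` is a hypothetical helper name, and it assumes the Azure CLI is installed and logged in.

```shell
# size_available REGION SIZE
# Succeeds when SIZE appears among the VM sizes offered in REGION.
# Hypothetical helper; relies on `az vm list-sizes` printing one name per line.
size_available() {
    local region="$1" wanted="$2"
    az vm list-sizes -l "$region" --query "[].name" -o tsv | grep -qx "$wanted"
}

# Usage (requires a logged-in Azure CLI):
#   size_available "${azure_region}" Standard_NC6_Promo && echo "available"
```

Running this for the GPU size before creating the VM avoids discovering at resize time that the region does not offer it.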
    

Connect to your VM #

The response of the CLI contains the public IP address of your VM. You can also get it as follows:

az vm show -d -g ${resource_group}  -n ${vm_name} --query publicIps -o tsv

Remember that the machine’s IP address can change unless you pay for a static IP.

  • SSH to your instance using the private key that matches your .pub key

      ssh ${USERNAME}@<public_ip>
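Since the IP can change, it may be handy to look it up every time instead of copying it by hand. A minimal sketch, where `vm_ssh_target` is a hypothetical helper built on the `az vm show` query above:

```shell
# vm_ssh_target GROUP NAME USER
# Prints "user@ip" for the VM, or fails when no public IP is reported
# (which can happen while the VM is deallocated).
vm_ssh_target() {
    local group="$1" name="$2" user="$3" ip
    ip=$(az vm show -d -g "$group" -n "$name" --query publicIps -o tsv)
    if [ -z "$ip" ]; then
        echo "no public IP found for ${name} (is the VM running?)" >&2
        return 1
    fi
    printf '%s@%s\n' "$user" "$ip"
}

# Usage:
#   ssh "$(vm_ssh_target "${resource_group}" "${vm_name}" "${USERNAME}")"
```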
    

Install drivers #

  • Upgrade everything. It is a good idea to restart afterwards.

      sudo apt -qq update && sudo apt -yqq upgrade
      sudo reboot
    
  • Set up the repositories and drivers. Note that we don’t need the actual GPU on the VM just yet. Check the NVIDIA website for any updates to these steps:

      wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
      sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
      sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
      sudo add-apt-repository "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
      sudo apt-get -qq update
      sudo apt-get -yqq install cuda
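The repository URLs above are tied to Ubuntu 18.04 through the `ubuntu1804` path segment. If you use a different release, the segment changes accordingly. The sketch below derives it from the release number; the helper names are hypothetical, and it assumes NVIDIA keeps the `ubuntu<release without the dot>` naming used in the URLs above.

```shell
# cuda_repo_name RELEASE
# Maps an Ubuntu release such as "18.04" to the repo directory name "ubuntu1804".
cuda_repo_name() {
    local release="$1"
    printf 'ubuntu%s\n' "$(printf '%s' "$release" | tr -d '.')"
}

# cuda_repo_url RELEASE
# Builds the x86_64 repository URL for that release.
cuda_repo_url() {
    printf 'https://developer.download.nvidia.com/compute/cuda/repos/%s/x86_64/\n' \
        "$(cuda_repo_name "$1")"
}

# Usage on the VM:
#   cuda_repo_url "$(lsb_release -rs)"
```

Whether a repository actually exists for a given release still has to be checked on the NVIDIA site.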
    

While you wait, you could write your CUDA code in a new terminal.

  • Set up the Bash environment

      echo export PATH=/usr/local/cuda-10.2/bin:/usr/local/cuda-10.2/NsightCompute-2019.1'${PATH:+:${PATH}}' >> ~/.bashrc
      echo export LD_LIBRARY_PATH=/usr/local/cuda-10.2/lib64'${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}'>>~/.bashrc
    
  • Power off

      sudo poweroff
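One caveat about the Bash environment step: the two `echo ... >> ~/.bashrc` lines add a new copy every time they run, which matters once you script the setup. A small sketch of an append that is safe to repeat, using a hypothetical helper named `append_once`:

```shell
# append_once LINE FILE
# Appends LINE to FILE unless an identical line is already present.
append_once() {
    local line="$1" file="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage, with the same value as in the Bash environment step:
#   append_once 'export PATH=/usr/local/cuda-10.2/bin:/usr/local/cuda-10.2/NsightCompute-2019.1${PATH:+:${PATH}}' ~/.bashrc
```

The single quotes keep `${PATH...}` unexpanded, so it is evaluated when `.bashrc` is read, just like in the original commands.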
    

Running on a GPU instance #

Warning: Please don’t forget to deallocate your instance. It will keep billing you if you don’t.

The new size we will be using is:

      size=Standard_NC6_Promo

  • Deallocate the VM for resizing

      az vm deallocate --name ${vm_name} --resource-group ${resource_group} 
    
  • Resize the VM

      az vm resize --resource-group ${resource_group}  --name ${vm_name} --size ${size}
    
  • Start the VM

      az vm start --resource-group ${resource_group} --name ${vm_name}
    
  • Get the new IP

      az vm show -d -g ${resource_group}  -n ${vm_name} --query publicIps -o tsv
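The deallocate, resize and start steps always go together, so they can be chained. The following is a sketch under the assumption that you want to stop at the first failing step; `resize_vm` is a hypothetical helper name.

```shell
# resize_vm GROUP NAME SIZE
# Deallocates the VM, changes its size, and starts it again.
# The && chain stops at the first failure, so a failed resize
# is not followed by a start.
resize_vm() {
    local group="$1" name="$2" new_size="$3"
    az vm deallocate --resource-group "$group" --name "$name" &&
    az vm resize --resource-group "$group" --name "$name" --size "$new_size" &&
    az vm start --resource-group "$group" --name "$name"
}

# Usage:
#   resize_vm "${resource_group}" "${vm_name}" "${size}"
```

The same function works in the other direction, to shrink the VM back to the small size after a GPU session.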
    

Testing it works #

  • Check what NVIDIA card you have

      nvidia-smi
    
  • Download some CUDA code, for example https://github.com/advt3/ParallelProgramming/blob/master/cuda/enumerate.cu

      wget https://raw.githubusercontent.com/advt3/ParallelProgramming/master/cuda/enumerate.cu
    
  • Compile

      nvcc -o enumerate enumerate.cu
    
  • Run

      ./enumerate
    
  • Shut down and deallocate

      az vm deallocate --name ${vm_name} --resource-group ${resource_group}
    
  • Verify the status of your VM

      az vm list -d -o table
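Reading the table works, but a script needs a yes or no answer. As a sketch, the power state of a single VM can be queried and compared; the helper names are hypothetical, and the string `VM deallocated` is the value I would expect in the `powerState` field, so verify it against your own CLI output.

```shell
# vm_power_state GROUP NAME
# Prints the power state reported by the CLI, e.g. "VM running".
vm_power_state() {
    az vm show -d -g "$1" -n "$2" --query powerState -o tsv
}

# is_deallocated GROUP NAME
# Succeeds only when the VM is deallocated (and therefore not billing compute).
is_deallocated() {
    [ "$(vm_power_state "$1" "$2")" = "VM deallocated" ]
}

# Usage:
#   is_deallocated "${resource_group}" "${vm_name}" || echo "still allocated!"
```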
    

Cool, we did it. I recommend you script these steps so the process feels more natural and can be repeated often.
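As a starting point for such a script, here is a minimal sketch that starts the VM, runs one command, and deallocates afterwards even when the command fails. `run_on_gpu` is a hypothetical name, and the sketch does not handle interruptions such as Ctrl-C.

```shell
# run_on_gpu GROUP NAME COMMAND...
# Starts the VM, runs COMMAND locally (for example an ssh invocation),
# then deallocates the VM and returns COMMAND's exit status.
run_on_gpu() {
    local group="$1" name="$2" status=0
    shift 2
    az vm start --resource-group "$group" --name "$name" || return 1
    "$@" || status=$?
    az vm deallocate --resource-group "$group" --name "$name" ||
        echo "warning: deallocate failed, check the VM state" >&2
    return "$status"
}

# Usage:
#   run_on_gpu "${resource_group}" "${vm_name}" ssh "${USERNAME}@<public_ip>" ./enumerate
```

Returning the command's own exit status keeps the wrapper usable in other scripts, while the deallocate step still runs on failure.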

Final Thoughts #

In this post, we have explained the complete environment setup using the latest CUDA driver and Ubuntu 18.04. However, you can find a more complete but outdated description in the Microsoft documentation.

It would be nice to automate running the code in a CI/CD pipeline, so that the instance runs only for the time your code actually needs. We will see when we have the time to test it.

Happy coding!
