CUDA Unified Memory Matrix Multiplication

GPU Teaching Kit – Accelerated Computing

Objective

Implement a tiled dense matrix multiplication routine while using efficient Unified Memory practices.

Prerequisites

Before starting this lab, make sure that:

Instructions

Edit the code in the code tab to perform the following:

The //@@ comment lines in the code mark where each part should be placed.

Local Setup Instructions

The most recent version of the source code for this lab, along with the build scripts, can be found in the Bitbucket repository. The README document in the root of the repository describes how to use the CMake tool and how to build the labs for local development.

The executable generated as a result of compiling the lab can be run using the following command:

./UMMatrixMultiplication_Template -i <A rows>,<A cols>,<B cols>

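where <A rows> and <A cols> are the numbers of rows and columns of matrix A, and <B cols> is the number of columns of matrix B. Matrix B must have <A cols> rows for the product to be defined, so the result matrix C has <A rows> rows and <B cols> columns.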
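The following is a minimal sketch, not the lab's reference solution, of how the three dimensions could map onto Unified Memory allocations and a tiled kernel. The names tiledMatMul, ceilDiv, and TILE_WIDTH, the tile size of 16, and the hard-coded dimensions are assumptions made for illustration. The template's own argument parsing and //@@ markers are not reproduced here.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

#define TILE_WIDTH 16  // assumed tile size; the lab template may use another value

// Rounds the quotient a / b up; used for the grid size and the tile count.
__host__ __device__ inline int ceilDiv(int a, int b) { return (a + b - 1) / b; }

// Computes C = A * B, where A is numARows x numACols and B is numACols x numBCols.
__global__ void tiledMatMul(const float *A, const float *B, float *C,
                            int numARows, int numACols, int numBCols) {
  __shared__ float tileA[TILE_WIDTH][TILE_WIDTH];
  __shared__ float tileB[TILE_WIDTH][TILE_WIDTH];

  int row = blockIdx.y * TILE_WIDTH + threadIdx.y;
  int col = blockIdx.x * TILE_WIDTH + threadIdx.x;
  float acc = 0.0f;

  for (int t = 0; t < ceilDiv(numACols, TILE_WIDTH); ++t) {
    int aCol = t * TILE_WIDTH + threadIdx.x;
    int bRow = t * TILE_WIDTH + threadIdx.y;
    // Out-of-range elements are loaded as zero so partial tiles stay correct.
    tileA[threadIdx.y][threadIdx.x] =
        (row < numARows && aCol < numACols) ? A[row * numACols + aCol] : 0.0f;
    tileB[threadIdx.y][threadIdx.x] =
        (bRow < numACols && col < numBCols) ? B[bRow * numBCols + col] : 0.0f;
    __syncthreads();
    for (int k = 0; k < TILE_WIDTH; ++k)
      acc += tileA[threadIdx.y][k] * tileB[k][threadIdx.x];
    __syncthreads();
  }
  if (row < numARows && col < numBCols) C[row * numBCols + col] = acc;
}

int main() {
  // Illustrative values, as if the program were run with "-i 64,100,48".
  const int numARows = 64, numACols = 100, numBCols = 48;
  const int numBRows = numACols;  // required for A * B to be defined
  const int numCRows = numARows, numCCols = numBCols;

  // One pointer per matrix, valid on both host and device.
  float *A = nullptr, *B = nullptr, *C = nullptr;
  cudaMallocManaged(&A, sizeof(float) * numARows * numACols);
  cudaMallocManaged(&B, sizeof(float) * numBRows * numBCols);
  cudaMallocManaged(&C, sizeof(float) * numCRows * numCCols);

  // Host-side initialization writes directly through the managed pointers.
  for (int i = 0; i < numARows * numACols; ++i) A[i] = (i % 7) * 0.5f;
  for (int i = 0; i < numBRows * numBCols; ++i) B[i] = (float)(i % 5 - 2);

  dim3 block(TILE_WIDTH, TILE_WIDTH);
  dim3 grid(ceilDiv(numCCols, TILE_WIDTH), ceilDiv(numCRows, TILE_WIDTH));
  tiledMatMul<<<grid, block>>>(A, B, C, numARows, numACols, numBCols);

  // Synchronize before the host reads C; the kernel launch is asynchronous.
  cudaError_t err = cudaGetLastError();
  if (err == cudaSuccess) err = cudaDeviceSynchronize();
  if (err != cudaSuccess) {
    printf("CUDA error: %s\n", cudaGetErrorString(err));
    return 1;
  }

  // Compare against a straightforward CPU reference.
  bool ok = true;
  for (int r = 0; r < numCRows && ok; ++r)
    for (int c = 0; c < numCCols && ok; ++c) {
      float ref = 0.0f;
      for (int k = 0; k < numACols; ++k)
        ref += A[r * numACols + k] * B[k * numBCols + c];
      ok = fabsf(ref - C[r * numCCols + c]) < 1e-3f;
    }
  printf("%s\n", ok ? "PASS" : "FAIL");

  cudaFree(A);
  cudaFree(B);
  cudaFree(C);
  return ok ? 0 : 1;
}
```

The dimensions are deliberately not multiples of the tile size in the inner dimension (100 is not divisible by 16), so the boundary checks in the kernel are exercised. Memory hints such as prefetching are left out to keep the sketch portable; whether they help depends on the GPU and the access pattern.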