The lab’s objective is to implement a tiled image convolution using both shared and constant memory. We will have a constant 5x5 convolution mask, but will have arbitrarily sized image (assume the image dimensions are greater than 5x5 for this Lab).
To use the constant memory for the convolution mask, you can first
transfer the mask data to the device. Consider the case where the
pointer to the device array for the mask is named M. You can use
const float * __restrict__ M
as one of the parameters
during your kernel launch. This informs the compiler that the contents
of the mask array are constants and will only be accessed through
pointer variable M
. This will enable the compiler to place
the data into constant memory and allow the SM hardware to aggressively
cache the mask data at runtime.
Convolution is used in many fields, such as image processing for
image filtering. A standard image convolution formula for a 5x5
convolution filter M
with an Image I
is:
\[ P_{i,j,c} = \sum_{x=-2}^{2} \sum_{y=-2}^2 I_{i+x,j+y,c} * M_{x,y} \]
where \(P_{i,j,c}\) is the output
pixel at position i,j
in channel c
, \(I_{i,j,c}\) is the input pixel at
i,j
in channel c
(the number of channels will
always be 3 for this MP corresponding to the RGB values), and \(M_{x,y}\) is the mask at position
x,y
.
Before starting this lab, make sure that:
The input is an interleaved image of
height x width x channels
. By interleaved, we mean that the
the element I[y][x]
contains three values representing the
RGB channels. This means that to index a particular element’s value, you
will have to do something like:
= (yIndex*width + xIndex)*channels + channelIndex; index
For this assignment, the channel index is 0 for R, 1 for G, and 2 for
B. So, to access the G value of I[y][x]
, you should use the
linearized expression
I[(yIndex*width+xIndex)*channels + 1]
.
For simplicity, you can assume that channels
is always
set to 3
.
Edit the code in the code tab to perform the following:
Instructions about where to place each part of the code is demarcated
by the //@@
comment lines.
A sequential pseudo code would look something like this:
= 5
maskWidth := maskWidth/2 # this is integer division, so the result is 2
maskRadius :for i from 0 to height do
for j from 0 to width do
for k from 0 to channels
= 0
accum :for y from -maskRadius to maskRadius do
for x from -maskRadius to maskRadius do
= j + x
xOffset := i + y
yOffset :if xOffset >= 0 && xOffset < width &&
>= 0 && yOffset < height then
yOffset = I[(yOffset * width + xOffset) * channels + k]
imagePixel := K[(y+maskRadius)*maskWidth+x+maskRadius]
maskValue :+= imagePixel * maskValue
accum end
end
end
# pixels are in the range of 0 to 1
[(i * width + j)*channels + k] = clamp(accum, 0, 1)
Pend
end
end
where clamp
is defined as
def clamp(x, lower, upper)
return min(max(x, lower), upper)
end
The most recent version of source code for this lab along with the build-scripts can be found on the Bitbucket repository. A description on how to use the CMake tool in along with how to build the labs for local development found in the README document in the root of the repository.
The executable generated as a result of compiling the lab can be run using the following command:
./Convolution_Template -e <expected.ppm> \
-i <input0.ppm>,<input1.raw> -o <output.ppm> -t image`.
where <expected.ppm>
is the expected output,
<input.ppm>
is the input dataset, and
<output.ppm>
is an optional path to store the
results. The datasets can be generated using the dataset generator built
as part of the compilation process.
The images are stored in PPM (P6
) format, this means
that you can (if you want) create your own input images. The easiest way
to create image is via external tools such as bmptoppm
. The
masks are stored in a CSV format. Since the input is small, it is best
to edit it by hand.