Setting Up a CUDA Build Environment on 32-bit Windows 7

2012-06-09 19:10

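1. Setting up the environment

1.1 Install the Visual Studio IDE

First make sure Visual Studio is installed on the machine. I used VS2008.

1.2 Download the CUDA development kit

Go to the NVIDIA website (http://developer.nvidia.com/cuda-downloads) and download the driver, the SDK, and the Toolkit.

Note: pick the desktop or notebook build, 64-bit or 32-bit, that matches your machine, and make sure the driver, SDK, and Toolkit are all the same version.

1.3 Install the driver

Choose the Custom (Advanced) option and click Next.

Selecting "Perform a clean installation" is recommended.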

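1.4 Install the Toolkit

Run the installer, choose a custom installation, and change the install path to D:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.2\

1.5 Install the SDK

Run the installer and change the install path to D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2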

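2. Configuring Visual Studio 2008

2.1 Install Visual Assist X and set up syntax highlighting

(1) Syntax highlighting:

Copy the usertype.dat file from D:\NVIDIA Corporation\NVIDIA CUDA SDK 4.2\C\doc\syntax_highlighting\visual_studio_8 into the Microsoft Visual Studio 8\Common7\IDE directory.
(2) Associate .cu files with Visual Assist:
Close any running Visual Studio instance, then open the registry editor and go to HKEY_CURRENT_USER\Software\Whole Tomato\Visual Assist X\VANet9\. Find the ExtSource entry in the right-hand pane, append .cu;.cuh; to its value, and close the editor.
(3) In VS, under Tools | Options | Projects and Solutions | VC++ Project Settings, add the directory that contains CUDA.rules (the common directory of the CUDA SDK installation, C:\ProgramData\NVIDIA Corporation\NVIDIA CUDA SDK\common\) to Rule File Search Paths.
(4) Create a new project, right-click the project name, choose Custom Build Rules..., and tick the rule file you just added. Add a .cu file to the project, right-click it, choose Properties -> Configuration Properties -> General, and in the Tool drop-down list select the build rule ticked in the previous step, for example "CUDA Build Rule v2.2.0". Click OK.
(5) With these settings in place you can write CUDA programs fairly conveniently. One remaining issue is the lib files. If everything builds, skip this step. Otherwise, copy the lib files under NVIDIA CUDA SDK\bin into a directory listed in the system PATH environment variable (the files from one folder are enough; there is no need to copy them all). There are two folders there, so check whether your system is 32-bit or 64-bit. I copied them to C:\CUDA\bin.
Note: recent versions of Visual Assist X integrate well. If the CUDA development environment was already installed beforehand, steps (2) to (5) above can apparently be skipped. Adjust according to your own setup.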

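2.2 Set up the Visual Studio 2008 environment

Open VS2008 and go to [Tools] -> [Options] -> [Projects and Solutions] -> [VC++ Directories].

Note: change the paths below to match your own CUDA installation directories.

Under [Executable files], add: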

D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\C\common\bin

D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\C\bin\win32\Release

D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\C\bin\win32\Debug

D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\shared\bin\win32\Release

D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\shared\bin\win32\Debug

Under [Include files], add:

D:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.2\include

D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\CUDALibraries\common\inc

D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\C\common\inc

D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\shared\inc

Under [Library files], add:

D:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.2\lib\Win32

D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\C\common\lib

D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\C\common\lib\Win32

D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\shared\lib\Win32

D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\CUDALibraries\common\lib

D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\CUDALibraries\common\lib\Win32

Under [Source files], add:

D:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.2\src

D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\C\common\src

D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\shared\src

D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\CUDALibraries\common\src

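Select [VC++ Project Settings]. Add *.cu to [C/C++ File Extensions] and .cuh to [Extensions to Include].

Select [Text Editor] -> [File Extension], type cu in the edit box, choose Microsoft Visual C++ from the [Editor] drop-down list, and click Add.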

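2.3 Adding the build rules

At this point you can run D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\SDK Browser\browser.exe and launch the bundled sample programs.

For example, select Device Query. If it runs, the configuration from the steps above is complete.

If it does not, copy the 4 .rules files from the CUDA Toolkit directory (D:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.2\extras\visual_studio_integration\rules) into D:\Program Files\Microsoft Visual Studio 9.0\VC\VCProjectDefaults.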

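2.4 Build the cutil library and set environment variables

The cutil library is needed by the SDK samples and by the test program in section 3.2 (which includes cutil.h), but CUDA v4.2 does not ship a prebuilt copy, so you have to build it yourself. Go to D:\Program Files\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\C\common, open cutil_vs2008.vcproj, set the build platform to Win32, and build both the Debug and Release configurations.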

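3. Creating a project

3.1 Create an empty project

Create a Win32 Console Application and tick Empty project under [Additional options], then finish the wizard.

Right-click the project name, choose [Custom Build Rules], and tick [CUDA Runtime API Build Rule (v4.2)].

Right-click the project name and choose [Properties]. Under [Configuration Properties] -> [Linker] -> [General], add the directories that contain cudart.lib, cutil32D.lib, and the other required libraries to [Additional Library Directories]: $(CUDA_PATH)\lib\$(PlatformName);..\..\common\lib\$(PlatformName).

Under [Input] -> [Additional Dependencies], add cudart.lib cutil32D.lib cuda.lib and so on. Otherwise the build fails with errors such as "error LNK2019: unresolved external symbol".

Right-click the [Source Files] folder, choose [Add] -> [New Item], pick the C++ template, and give the file a name of the form ***.cu, that is, with the .cu extension.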
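
As an alternative to listing the libraries under [Additional Dependencies], MSVC can be told from source code which libraries to link. The sketch below is an illustration under stated assumptions rather than part of the original setup: the macro CUTIL_LIB and the function selectedCutilLib are hypothetical names, and the 64-bit cutil libraries are assumed to follow the same naming pattern as the 32-bit ones. The library directories configured under [Additional Library Directories] are still required.

```cpp
#include <string>

// Pick the cutil import library that matches the build.
// _WIN64 and _DEBUG are predefined by MSVC for 64-bit and Debug builds.
#if defined(_WIN64) && defined(_DEBUG)
#define CUTIL_LIB "cutil64D.lib"   // assumed name of the 64-bit Debug build
#elif defined(_WIN64)
#define CUTIL_LIB "cutil64.lib"    // assumed name of the 64-bit Release build
#elif defined(_DEBUG)
#define CUTIL_LIB "cutil32D.lib"   // 32-bit Debug, as used in this article
#else
#define CUTIL_LIB "cutil32.lib"    // 32-bit Release
#endif

// MSVC-specific: record the library names in the object file so the linker
// picks them up without an [Additional Dependencies] entry.
#ifdef _MSC_VER
#pragma comment(lib, "cudart.lib")
#pragma comment(lib, CUTIL_LIB)
#endif

// Hypothetical helper that reports which cutil library this build selected.
std::string selectedCutilLib()
{
    return CUTIL_LIB;
}
```

In a Win32 Debug build this selects cutil32D.lib, the same name entered in the linker settings above.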

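3.2 Test program

Once the project has been created you can write, build, and run code. The test program below can be used to check that the build environment is configured correctly.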

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <cuda.h>
#include <cuda_runtime_api.h>
#include <cutil.h>

////////////////////////////////////////////////////////////////////////////////
// Program main
////////////////////////////////////////////////////////////////////////////////
int main( int argc, char** argv)
{
    printf("CUDA Device Query (Runtime API) version (CUDART static linking)\n");

    // Initialized so the checks below are well defined even if the call fails.
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);

    // This function call returns 0 if there are no CUDA capable devices.
    if (deviceCount == 0)
        printf("There is no device supporting CUDA\n");

    int dev;
    for (dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp deviceProp;
        cudaGetDeviceProperties(&deviceProp, dev);

        if (dev == 0) {
            // This function call returns 9999 for both major & minor fields, if no CUDA capable devices are present
            if (deviceProp.major == 9999 && deviceProp.minor == 9999)
                printf("There is no device supporting CUDA.\n");
            else if (deviceCount == 1)
                printf("There is 1 device supporting CUDA\n");
            else
                printf("There are %d devices supporting CUDA\n", deviceCount);
        }

        printf("\nDevice %d: \"%s\"\n", dev, deviceProp.name);

#if CUDART_VERSION >= 2020
        // Versions are encoded as 1000 * major + 10 * minor (4020 means 4.2).
        int driverVersion = 0, runtimeVersion = 0;
        cudaDriverGetVersion(&driverVersion);
        printf(" CUDA Driver Version: %d.%d\n", driverVersion/1000, (driverVersion%1000)/10);
        cudaRuntimeGetVersion(&runtimeVersion);
        printf(" CUDA Runtime Version: %d.%d\n", runtimeVersion/1000, (runtimeVersion%1000)/10);
#endif

        printf(" CUDA Capability Major revision number: %d\n", deviceProp.major);
        printf(" CUDA Capability Minor revision number: %d\n", deviceProp.minor);
        printf(" Total amount of global memory: %u bytes\n", deviceProp.totalGlobalMem);

#if CUDART_VERSION >= 2000
        printf(" Number of multiprocessors: %d\n", deviceProp.multiProcessorCount);
        // 8 cores per multiprocessor holds only for compute capability 1.x devices.
        printf(" Number of cores (assuming 8 per multiprocessor): %d\n", 8 * deviceProp.multiProcessorCount);
#endif

        printf(" Total amount of constant memory: %u bytes\n", deviceProp.totalConstMem);
        printf(" Total amount of shared memory per block: %u bytes\n", deviceProp.sharedMemPerBlock);
        printf(" Total number of registers available per block: %d\n", deviceProp.regsPerBlock);
        printf(" Warp size: %d\n", deviceProp.warpSize);
        printf(" Maximum number of threads per block: %d\n", deviceProp.maxThreadsPerBlock);
        printf(" Maximum sizes of each dimension of a block: %d x %d x %d\n",
               deviceProp.maxThreadsDim[0],
               deviceProp.maxThreadsDim[1],
               deviceProp.maxThreadsDim[2]);
        printf(" Maximum sizes of each dimension of a grid: %d x %d x %d\n",
               deviceProp.maxGridSize[0],
               deviceProp.maxGridSize[1],
               deviceProp.maxGridSize[2]);
        printf(" Maximum memory pitch: %u bytes\n", deviceProp.memPitch);
        printf(" Texture alignment: %u bytes\n", deviceProp.textureAlignment);
        // clockRate is reported in kHz.
        printf(" Clock rate: %.2f GHz\n", deviceProp.clockRate * 1e-6f);

#if CUDART_VERSION >= 2000
        printf(" Concurrent copy and execution: %s\n", deviceProp.deviceOverlap ? "Yes" : "No");
#endif

#if CUDART_VERSION >= 2020
        printf(" Run time limit on kernels: %s\n", deviceProp.kernelExecTimeoutEnabled ? "Yes" : "No");
        printf(" Integrated: %s\n", deviceProp.integrated ? "Yes" : "No");
        printf(" Support host page-locked memory mapping: %s\n", deviceProp.canMapHostMemory ? "Yes" : "No");
        printf(" Compute mode: %s\n", deviceProp.computeMode == cudaComputeModeDefault ?
               "Default (multiple host threads can use this device simultaneously)" :
               deviceProp.computeMode == cudaComputeModeExclusive ?
               "Exclusive (only one host thread at a time can use this device)" :
               deviceProp.computeMode == cudaComputeModeProhibited ?
               "Prohibited (no host thread can use this device)" :
               "Unknown");
#endif
    }

    printf("\nTest PASSED\n");

    // cutil macro: waits for a key press (unless run non-interactively) and exits.
    CUT_EXIT(argc, argv);
}
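
The test program estimates the core count as 8 times the number of multiprocessors, which is only right for compute capability 1.x devices. A minimal sketch of a lookup by compute capability follows. The function names coresPerMultiprocessor and totalCores are hypothetical, and the table only covers the compute capabilities that CUDA 4.2 targets (1.x, 2.0, 2.1, 3.0); anything else is reported as unknown.

```cpp
#include <cassert>

// Cores per multiprocessor for a given compute capability.
// Returns 0 when the capability is not in this (deliberately small) table.
int coresPerMultiprocessor(int ccMajor, int ccMinor)
{
    assert(ccMajor >= 0 && ccMinor >= 0);
    if (ccMajor == 1) return 8;                    // 1.0 to 1.3
    if (ccMajor == 2 && ccMinor == 0) return 32;   // 2.0
    if (ccMajor == 2 && ccMinor == 1) return 48;   // 2.1
    if (ccMajor == 3 && ccMinor == 0) return 192;  // 3.0
    return 0;                                      // unknown
}

// Total cores on a device; the arguments correspond to deviceProp.major,
// deviceProp.minor and deviceProp.multiProcessorCount in the test program.
int totalCores(int ccMajor, int ccMinor, int multiProcessorCount)
{
    return coresPerMultiprocessor(ccMajor, ccMinor) * multiProcessorCount;
}
```

In the test program, the "Number of cores" line could then print totalCores(deviceProp.major, deviceProp.minor, deviceProp.multiProcessorCount), treating 0 as "unknown".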

Output of a run: [figure: screenshot of the program output]
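
cudaDriverGetVersion and cudaRuntimeGetVersion report a version as a single integer of the form 1000 * major + 10 * minor, so 4020 means 4.2. The sketch below shows one way to decode that value and to check the version-matching requirement from section 1.2. The names CudaVersion, decodeCudaVersion, and driverSupportsRuntime are hypothetical, and the compatibility rule used here (the driver must be at least as new as the runtime) is the general CUDA rule rather than something stated in this article.

```cpp
#include <cassert>

struct CudaVersion {
    int majorVersion;
    int minorVersion;
};

// Split an encoded version such as 4020 into its major and minor parts.
CudaVersion decodeCudaVersion(int version)
{
    assert(version >= 0);
    CudaVersion v;
    v.majorVersion = version / 1000;
    v.minorVersion = (version % 1000) / 10;
    return v;
}

// The installed driver must be at least as new as the runtime the program
// was built against; both arguments use the encoded integer form.
bool driverSupportsRuntime(int driverVersion, int runtimeVersion)
{
    return driverVersion >= runtimeVersion;
}
```

With the values from the test program, decodeCudaVersion(driverVersion) yields the two numbers to print, and driverSupportsRuntime(driverVersion, runtimeVersion) returning false points to a driver that is older than the Toolkit.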
