VTune: Using the amplxe-cl Command Line
2012-03-17 21:07
References
http://software.intel.com/sites/products/documentation/hpc/amplifierxe/en-us/2011Update/lin/ug_docs/index.htm
amplxe-cl -collect hotspots -- ./driver /home/zxx/work_autumn_2011/matrices/rma10.mtx
Reading sparse matrix from file (/home/zxx/work_autumn_2011/matrices/rma10.mtx): done
Using 46835-by-46835 matrix with 2374001 nonzero values
------------------------------------------
#### Testing COO Kernels ####
creating coo_matrix:coo transform time elapsed 0.013690
do coo spmv time elapsed 5.434732 seconds
orignal do coo spmv time elapsed 5.429192 seconds
Using result path `/home/zxx/work_autumn_2011/all_format/r001hs'
Executing actions 75 % Generating a report
Summary
-------
Elapsed Time: 11.312
CPU Time: 11.280
Executing actions 100 % done
amplxe-cl -report hotspots -result-dir r001hs
Using result path `/home/zxx/work_autumn_2011/all_format/r001hs'
Executing actions 75 % Generating a report
Function Module CPU Time
__spmv_coo_serial_host_sse driver 5.420
__spmv_coo_serial_host<unsigned int, double> driver 5.410
read_coo_matrix<unsigned int, double> driver 0.350
test_coo_matrix_kernels<unsigned int, double> driver 0.060
coo_to_csr<unsigned int, double> driver 0.020
csr_to_coo<unsigned int, double> driver 0.020
Executing actions 100 % done
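The hotspots report above is plain text, so it is easy to post-process. Here is a minimal Python sketch, assuming the three-column layout shown above, where the last two whitespace-separated fields are the module and the CPU time. `parse_hotspots` and `cpu_share` are hypothetical helper names, not part of VTune.

```python
def parse_hotspots(report_text):
    """Return (function, module, cpu_seconds) rows from a hotspots report.

    Assumes the last two whitespace-separated fields on a data line are the
    module and the CPU time; everything before them is the function name,
    which may itself contain spaces (e.g. template arguments).
    """
    rows = []
    for line in report_text.splitlines():
        parts = line.rsplit(None, 2)
        if len(parts) != 3:
            continue
        function, module, cpu = parts
        try:
            rows.append((function, module, float(cpu)))
        except ValueError:
            # Header and progress lines do not end in a number; skip them.
            continue
    return rows


def cpu_share(rows):
    """Map each function to its fraction of the total CPU time in the report."""
    total = sum(cpu for _, _, cpu in rows)
    return {function: cpu / total for function, _, cpu in rows}


# Report text copied from the run above.
REPORT = """\
Executing actions 75 % Generating a report
Function Module CPU Time
__spmv_coo_serial_host_sse driver 5.420
__spmv_coo_serial_host<unsigned int, double> driver 5.410
read_coo_matrix<unsigned int, double> driver 0.350
test_coo_matrix_kernels<unsigned int, double> driver 0.060
coo_to_csr<unsigned int, double> driver 0.020
csr_to_coo<unsigned int, double> driver 0.020
Executing actions 100 % done
"""

rows = parse_hotspots(REPORT)
for function, share in cpu_share(rows).items():
    print(f"{share:6.1%}  {function}")
```

For this run the two SpMV kernels together account for about 96% of the 11.280 s of CPU time, so they are the only functions worth optimizing.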
amplxe-cl -report summary -result-dir r001hs
Using result path `/home/zxx/work_autumn_2011/all_format/r001hs'
Executing actions 75 % Generating a report
Summary
-------
Elapsed Time: 11.312
CPU Time: 11.280
Executing actions 100 % done
This is the same summary that is printed at the end of the collect run.
This example runs the hardware event-based sampling collector for the
sample application and displays the default summary report.
$ amplxe-cl -collect-with runsa -knob event-config=CPU_CLK_UNHALTED.CORE,CPU_CLK_UNHALTED.REF,INST_RETIRED.ANY -- /home/test/sample
Commonly used options:
collect
collect-with
event-config
knob
$ amplxe-cl -collect-with runsa -knob event-config=CPU_CLK_UNHALTED.CORE,CPU_CLK_UNHALTED.REF,INST_RETIRED.ANY -- /home/test/sample
Viewing the report for this kind of collection is a little different:
$ amplxe-cl -report sfdump -result-dir r000rs
Currently, the only way to view the sample-after values is to display the results of a run with the default values using the 'sfdump' report type, e.g.,
sudo amplxe-cl -collect-with runsa -knob event-config=UOPS_EXECUTED.PORT2_CORE:sa=1000,UOPS_EXECUTED.PORT3_CORE:sa=1000,UOPS_EXECUTED.PORT4_CORE:sa=1000 -- ./driver
In my experience sa should be at least 1000; otherwise the machine tends to hang.
I tried 100 and 1, and the machine hung both times.
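The collection commands above all follow one pattern: `-collect-with runsa`, a `-knob event-config=` list with optional `:sa=` suffixes, then `--` and the target program. Here is a minimal Python sketch that assembles such a command as an argument list. `build_runsa_command` and `MIN_SAMPLE_AFTER` are hypothetical names. The floor of 1000 only encodes my own experience above and is not a limit documented by Intel. The script only builds and prints the command; it does not run it.

```python
import shlex

# Rule of thumb from this post, not a documented VTune limit.
MIN_SAMPLE_AFTER = 1000


def build_runsa_command(events, target, sample_after=None):
    """Build an amplxe-cl hardware event-based sampling command line.

    events       -- list of hardware event names
    target       -- the program to profile and its arguments, as a list
    sample_after -- optional sample-after value applied to every event
    """
    if not events:
        raise ValueError("at least one event is required")
    if sample_after is not None and sample_after < MIN_SAMPLE_AFTER:
        raise ValueError(
            f"sa={sample_after} is below {MIN_SAMPLE_AFTER}; the machine may hang"
        )
    suffix = "" if sample_after is None else f":sa={sample_after}"
    event_config = ",".join(f"{name}{suffix}" for name in events)
    return [
        "amplxe-cl", "-collect-with", "runsa",
        "-knob", f"event-config={event_config}",
        "--", *target,  # "--" separates amplxe-cl options from the target
    ]


cmd = build_runsa_command(
    ["UOPS_EXECUTED.PORT2_CORE", "UOPS_EXECUTED.PORT3_CORE", "UOPS_EXECUTED.PORT4_CORE"],
    ["./driver"],
    sample_after=1000,
)
print(shlex.join(cmd))
```

Keeping the command as a list avoids the easy mistake of dropping the `--` separator or the space before the target path.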
$ amplxe-cl -report hw-events -r r010runsa/
This report type is the better one for viewing the results of native hardware events.
This option enables multiple runs to achieve more precise results for hardware event-based collections.
When disabled, the collector uses event multiplexing.
sudo amplxe-cl -collect-with runsa -knob event-config=UOPS_EXECUTED.PORT2_CORE,UOPS_EXECUTED.PORT3_CORE,UOPS_EXECUTED.PORT4_CORE -- ./driver
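After using this option, I could not run the collection a second time.
The measured results are not very accurate, which is frustrating.
I do not know why yet. I really need to learn computer architecture and operating systems properly
and track down the cause.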
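One possible contributor to the inaccuracy is the event multiplexing mentioned above. When events take turns on the hardware counters, each event is observed for only part of the run, and its full-run count has to be estimated. The Python sketch below illustrates the generic linear extrapolation commonly used for multiplexed counters. It is an assumption for illustration, not VTune's documented algorithm, and the function name and all numbers are hypothetical.

```python
def extrapolate_count(raw_count, time_counted, time_total):
    """Scale a partially observed event count up to the whole run.

    Assumes the event rate while the counter was active is representative
    of the entire run, which is exactly what fails for phased workloads.
    """
    if time_counted <= 0 or time_counted > time_total:
        raise ValueError("time_counted must be in (0, time_total]")
    return raw_count * time_total / time_counted


# Hypothetical two-phase workload:
#   phase A: 1 second at 1000 events per second
#   phase B: 1 second at 3000 events per second
true_total = 1000 * 1.0 + 3000 * 1.0

# Suppose the event happened to be on the counter only during phase A.
estimate = extrapolate_count(raw_count=1000, time_counted=1.0, time_total=2.0)

print(f"true total: {true_total:.0f}")
print(f"estimate:   {estimate:.0f}")
print(f"relative error: {abs(estimate - true_total) / true_total:.0%}")
```

If the program's behaviour changes between phases, as it does when reading a matrix and then running SpMV, the extrapolated counts can be far from the true ones. Collecting fewer events per run reduces the need for multiplexing.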