Errors you may run into with MPI
2010-05-20 14:11
Reposted from:
http://hi.baidu.com/linzch/blog/item/7e7d750e18329ec07acbe14f.html
1. p1_xxxxx: p4_error: interrupt SIGSEGV: 11
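This error usually means that one of the processes hit a segmentation fault. Mistakes in my own code that have produced it:
a. Allocating memory for a pointer on only one process and not on the others, so the broadcast fails.
b. Accessing an array out of bounds.
Someone online put it well: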
"There are 2 things to check.
** Run one of the test programs like pi3.f or cpi.c to see whether your cluster's OK.
** if it is, the fault is in your code. See if you're exceeding array bounds or accessing memory which you haven't allocated, There's a SIGSEGV error - that's a segmentation violation. That might explain stuff like
bm_list_21829: p4_error: interrupt SIGINT: 2
Once you have a seg. violation, all the 4 processors are sent a signal to interrupt the process (SIGINT). Signals are defined in /usr/include/sys/signal.h (at least on the SGIs; might be different on other systems)."
2. p1_10401: p4_error: : 14
1 - MPI_BCAST : Message truncated
[1] Aborting program !
[1] Aborting program!
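This one is also caused by a receive buffer that is too small for MPI_Bcast. Allocate enough space on every process before calling MPI_Bcast, and pass a count that matches the root's, so the message is not truncated.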
3. p4_error: alloc_p4_msg failed:
p0_6773: (7.828703) xx_shmalloc: returning NULL; requested 1048616 bytes
p0_6773: (7.828762) p4_shmalloc returning NULL; request = 1048616 bytes
Not enough memory was allocated. You can increase the memory available to the program by setting the environment variable P4_GLOBMEMSIZE (in bytes):
export P4_GLOBMEMSIZE=32000000 (for bash users)
setenv P4_GLOBMEMSIZE 32000000 (for csh or tcsh users)
4. libcprts.so.5: cannot open shared object file: No such file or directory
/home/jbrandt/tests/test.exe: error while loading shared libraries: libcprts.so.5: cannot open shared object file: No such file or directory
p0_792: p4_error: Child process exited while making connection to remote process on compute-0-0.local: 0
/opt/mpich/intel/bin/mpirun: line 1: 792 Broken pipe /home/jbrandt/tests/test.exe -p4pg /home/jbrandt/tests/PI646 -p4wd /home/jbrandt/tes
The program was not linked statically, so the shared library could not be found at run time. Recompiling with -static fixed it.