
Installing the JDK, Nutch, and Tomcat on CentOS

2008-05-08 17:26
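Installing the JDK on CentOS
./jdk-6u6-linux-i586-rpm.bin
This produces jdk-6u6-linux-i586.rpm in the current directory.
rpm -ivh jdk-6u6-linux-i586.rpm // must be run as root
After that, java can be run from the shell. // no need to set any environment variables
The JDK is now installed in /usr/java/jdk1.6.0_06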
======================
bin/nutch crawl urls -dir crawl.demo -depth 2 -threads 4 >& crawl.log
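Result: error: JAVA_HOME not set.
export NUTCH_JAVA_HOME=/usr/java/jdk1.6.0_06/jre // this only changes the environment variable in the current terminal (another terminal still does not have it)
Changed to: export JAVA_HOME=/usr/java/jdk1.6.0_06 // export alone also applies only to the current shell and its child processes; to have it in every terminal, put the line in a startup file such as ~/.bash_profile or /etc/profile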
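
An export typed at the prompt lasts only for that shell session. Below is a minimal sketch for making the setting permanent, assuming bash is the login shell and ~/.bash_profile is the startup file. The helper name persist_env is made up for illustration and is not part of Nutch or the JDK.

```shell
# Append an export line to a profile file unless the identical line is already there.
# $1 = variable name, $2 = value, $3 = profile file (assumed default: ~/.bash_profile)
persist_env() {
  name="$1"
  value="$2"
  profile="${3:-$HOME/.bash_profile}"
  line="export $name=$value"
  # -x matches whole lines only, -F treats the pattern as a fixed string, -q is silent
  if ! grep -qxF "$line" "$profile" 2>/dev/null; then
    printf '%s\n' "$line" >> "$profile"
  fi
}

# Example, using the path from the install above:
# persist_env JAVA_HOME /usr/java/jdk1.6.0_06
```

New terminals pick the value up at login; an already open terminal needs `. ~/.bash_profile` to reload it.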

In the nutch-0.9 directory, create a file named urls that contains the top-level URL of some site, for example:
www.jiuyao123.com

Then run again: bin/nutch crawl urls -dir crawl.demo -depth 2 -threads 4 >& crawl.log
This time it runs, but the crawl hits a problem:
LinkDb: done
Indexer: starting
Indexer: linkdb: crawl.demo/linkdb
Indexer: adding segment: crawl.demo/segments/20080505214027
Indexer: adding segment: crawl.demo/segments/20080505214020
Optimizing index.
Indexer: done
Dedup: starting
Dedup: adding indexes in: crawl.demo/indexes
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
~
// PS: to copy a selection in PuTTY: select the text, then just press Enter.
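
The stack trace only says that the dedup job failed, not why. The underlying exception is usually written to a log file. The sketch below assumes the default Nutch logging setup, where details go to logs/hadoop.log under the Nutch directory; that path is an assumption and should be adjusted if the log lives elsewhere. The helper name show_log_errors is hypothetical.

```shell
# Print the lines of a log file that mention an exception or an ERROR/FATAL level,
# each followed by a few lines of context.
# $1 = log file (assumed default: logs/hadoop.log), $2 = context lines (default 5)
show_log_errors() {
  log="${1:-logs/hadoop.log}"
  context="${2:-5}"
  # -n prefixes line numbers, -A prints trailing context, -E enables alternation
  grep -n -A "$context" -E 'Exception|ERROR|FATAL' "$log"
}

# Example, run from the nutch-0.9 directory:
# show_log_errors logs/hadoop.log 10
```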

The failure is inside Hadoop. I am not sure whether the code there relies on Tomcat, but Tomcat is not installed yet, so install Tomcat first:

----------
Installing Tomcat on CentOS:
install the binary distribution (or use ant to build the src version)
export TOMCAT_HOME=/opt/tomcat6.0

../bin/catalina.sh start
This fails with a message saying that a directory cannot be found.
The setup notes on the Tomcat site say:
CATALINA_HOME May point at your Catalina "build" directory.
export CATALINA_HOME=/opt/tomcat6.0
export PATH=$PATH:$CATALINA_HOME/bin
The final settings are:
JAVA_HOME=/usr/java/jdk1.6.0_06
TOMCAT_HOME=/opt/tomcat6.0
CATALINA_BASE=/opt/tomcat6.0
CATALINA_HOME=/opt/tomcat6.0
PATH=$PATH:$JAVA_HOME/bin:$CATALINA_HOME/bin // PATH needs the bin directories; the home directories themselves hold no executables
export PATH CATALINA_BASE CATALINA_HOME JAVA_HOME TOMCAT_HOME
// TOMCAT_HOME does not need to be set for 6.0
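
Before starting Tomcat it can help to confirm that the variables point at real installs. This is a minimal sketch; the function name check_tomcat_env is made up for illustration, and it only checks for bin/java and bin/catalina.sh, nothing more.

```shell
# Check that JAVA_HOME and CATALINA_HOME point at usable installs.
# Prints one line per problem and returns non-zero if any check fails.
check_tomcat_env() {
  status=0
  if [ ! -x "${JAVA_HOME:-}/bin/java" ]; then
    echo "JAVA_HOME has no executable bin/java: ${JAVA_HOME:-unset}"
    status=1
  fi
  if [ ! -f "${CATALINA_HOME:-}/bin/catalina.sh" ]; then
    echo "CATALINA_HOME has no bin/catalina.sh: ${CATALINA_HOME:-unset}"
    status=1
  fi
  return "$status"
}

# Example:
# check_tomcat_env && echo "environment looks fine"
```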

Starting Tomcat then fails with a puzzling error:
The BASEDIR environment variable is not defined correctly This environment variable is needed to run this program
Adding execute permission to every .sh file under $CATALINA_HOME/bin fixes it.
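
The message apparently appears when the scripts under bin are not executable, which would explain why changing the permissions helps. The permission fix can be sketched as follows; the function name make_scripts_executable is hypothetical, and it assumes the scripts sit directly under bin.

```shell
# Make every .sh file directly under a Tomcat bin directory executable.
# $1 = Tomcat home (defaults to $CATALINA_HOME)
make_scripts_executable() {
  tomcat_home="${1:-$CATALINA_HOME}"
  chmod +x "$tomcat_home"/bin/*.sh
}

# Example:
# make_scripts_executable /opt/tomcat6.0
```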

But the server still cannot be reached at any of these addresses:
http://221.4.245.71:8080
http://221.4.245.71:8080/index.html
http://121.10.119.71:8080

netstat -antp|grep 8080 shows that port 8080 is already held by tomcat (java), so Tomcat is running and listening.
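
The local-address column of the netstat output also shows which interfaces the port is bound to. This helps separate a Tomcat that only listens on loopback from one that listens everywhere but is blocked on the way in. The sketch below classifies the binding for a port. The function name listen_scope is made up for illustration, and it assumes the Linux net-tools column layout (local address in column 4, state in column 6).

```shell
# Classify how a TCP port is bound, reading `netstat -ant` style lines on stdin.
# Prints one of: all-interfaces, specific-address, loopback-only, not-listening.
listen_scope() {
  port="$1"
  awk -v port="$port" '
    $6 == "LISTEN" {
      # The port is the part after the last colon (works for 0.0.0.0:8080 and :::8080).
      n = split($4, parts, ":")
      if (parts[n] != port) next
      host = substr($4, 1, length($4) - length(parts[n]) - 1)
      if (host == "0.0.0.0" || host == "::") any = 1
      else if (host == "127.0.0.1" || host == "::1") loop = 1
      else other = 1
    }
    END {
      if (any) print "all-interfaces"
      else if (other) print "specific-address"
      else if (loop) print "loopback-only"
      else print "not-listening"
    }
  '
}

# Example:
# netstat -ant | listen_scope 8080
```

If the result is all-interfaces and outside machines still cannot connect, the cause is more likely a firewall or port filter than Tomcat itself.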

On my local Windows machine:
Java is already installed
CATALINA_HOME=E:/software/tomcat6.0
added to PATH
http://localhost:8080 is reachable right away

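The JDK and Tomcat on the server are the same versions as on my Windows machine.
Strange. Looking at the Tomcat site again, I saw the instructions for installing it as a daemon service on Unix, so I tried this:
./bin/jsvc -cp ./bin/bootstrap.jar \
-outfile ./logs/catalina.out -errfile ./logs/catalina.err \
org.apache.catalina.startup.Bootstrap
It still did not work.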

Later I found catalina.out:
When startup.sh is executed, its output is recorded in tomcat6.0/logs/catalina.out.
The log contained a warning, so I changed "localhost" to the server's IP and
restarted Tomcat. After that the log was the same as the one on Windows.
But the default page was still unavailable.

Then, on the server itself:
wget http://localhost:8080
wget http://221.4.245.71:8080
wget http://127.0.0.1:8080
All three download index.html successfully.

Now that is odd.
The problem is most likely either a permissions issue or something between the outside machines and the server.

A while later jianl said he remembered: he had blocked port 8080 earlier.
That guy is getting a spanking when he comes back!

After that the server was reachable at http://221.4.245.71:8080 and http://121.10.119.71:8080.

This shows that my habit of setting aside a problem I cannot solve and doing something else for a while really can save a lot of time.