Building Nutch with Ant fails: what is causing this?

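Accepted answer:
1. Recompiling the Nutch source is fairly common, for example after swapping in a different word segmenter.
2. If you are rebuilding the WAR package of the Nutch demo and cannot find the corresponding source package, you can try running a decompiler on a few of the core classes.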
hadoop - Can not compile Nutch1.4 with ant - Stack Overflow
I'm trying to deploy Nutch 1.4 to a Hadoop cluster (following ). I ran into some problems when compiling Nutch with ant.
When I run the ant command, I get the following error:
/home/xenserver/apache-nutch-1.4-bin/build.xml:71: invalid Date syntax in "01/25/ pm"
I removed the attribute "datetime" from line 71 of build.xml and ran ant again. Then I got another problem.
The error is:
/home/xenserver/apache-nutch-1.4/build.xml:412: syntax errors in ivy file: java.text.ParseException:
in file:/home/xenserver/apache-nutch-1.4/ivy/ivy.xml
at org.apache.ivy.plugins.parser.xml.XmlModuleDescriptorParser$Parser.parse(XmlModuleDescriptorParser.java:273)
What's wrong with the steps above? Is there any tutorial for compiling Nutch1.4?
I need your help. Thanks in advance.
For compiling nutch 1.4, all you have to do is run ant clean deploy from the nutch directory. The output is created in the directory named 'runtime' with 2 folders: one for local mode and other one for cluster mode.
Please check the date settings and the ant installation on your machine. I think that is causing the issue. Also, have you edited or otherwise tampered with /home/xenserver/apache-nutch-1.4/ivy/ivy.xml? Please check that file too.
There is some problem with the build file when executed on your Linux box.
Check these out:
These are the things that you should verify on your setup:
1. Java version and ant version: don't use old ones. Get the latest ones, or ones that are compatible with your Nutch release. FYI: for nutch-1.4 I am using apache-ant-1.8.3 and java jdk1.6.0_18. This combination works perfectly fine for me.
2. Check that you have installed a JDK and not a JRE.
3. Check that your JAVA_HOME environment variable points to the JDK. The system PATH variable must have $JAVA_HOME/bin and $ANT_HOME/bin appended to it. The ANT_HOME variable must point to the ant installation directory.
4. Can you successfully run normal ant targets on any other build file? Try it with a small ant build file.
5. If you are still facing the same issue, run the ant command with the -v option. This will provide more information about the error:
    ant -v clean deploy
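The JDK-versus-JRE and environment-variable checks above can also be done from Java itself. The following is a minimal sketch, not part of the original answer: the class name JdkCheck and the method describe are made up for illustration. It relies on the standard javax.tools.ToolProvider.getSystemJavaCompiler() call, which returns null when the running Java installation ships no compiler. Note that it reports on the JVM that runs this program, which is not necessarily the one Ant picks up.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical helper for illustration; not part of Nutch or Ant.
class JdkCheck {
    /** Summarizes whether a compiler is available and which Java/Ant homes are set. */
    static String describe() {
        // Null means the running Java installation has no compiler (JRE only).
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        String kind = (compiler != null)
                ? "JDK (compiler available)"
                : "JRE only (no compiler)";
        return kind
                + ", java.version=" + System.getProperty("java.version")
                + ", java.home=" + System.getProperty("java.home")
                + ", JAVA_HOME=" + System.getenv("JAVA_HOME")
                + ", ANT_HOME=" + System.getenv("ANT_HOME");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If JAVA_HOME or ANT_HOME prints as null, the variable is not set in the environment of the process that ran the check.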
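Fixing a bug in Nutch's Chinese support
Problem description:
Nutch does not support Chinese natively, and its developers did not anticipate that Chinese word segmentation can produce tokens that overlap one another. As a result, building the page summary from the tokens of the user's query string throws a StringIndexOutOfBoundsException. For example, "教育方针" may be segmented into "教育方针", "教育", and "方针", and these tokens overlap.
(There is another write-up online by bupo.Jung. I tested it as well, but it only handles the case from his own example: "可爱的小女生" may be segmented into "可爱", "小女", and "女生", where "小女" and "女生" overlap. That is a partial overlap, not one token fully containing another. His solution is reproduced at the end.)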
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.StringIndexOutOfBoundsException: String index out of range: -4
at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:316)
at org.apache.nutch.searcher.NutchBean.getSummary(NutchBean.java:357)
at org.apache.nutch.searcher.NutchBean.main(NutchBean.java:429)
Caused by: java.util.concurrent.ExecutionException: java.lang.StringIndexOutOfBoundsException: String index out of range: -4
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:311)
... 2 more
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -4
at java.lang.String.substring(String.java:1937)
at org.apache.nutch.summary.basic.BasicSummarizer.getSummary(BasicSummarizer.java:190)
at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:275)
at org.apache.nutch.searcher.FetchedSegments$SummaryTask.call(FetchedSegments.java:65)
at org.apache.nutch.searcher.FetchedSegments$SummaryTask.call(FetchedSegments.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
As a result, details such as the text and URL are not returned.
The error log traces the root cause to
org.apache.nutch.summary.basic.BasicSummarizer.getSummary
in the file
nutch/src/plugin/summary-basic/src/java/org/apache/nutch/summary/basic/BasicSummarizer.java
The code starting at line 188 of that file is:
    if (highlight.contains(t.term())) {
        excerpt.addToken(t.term());
        // When two tokens overlap, offset > t.startOffset()
        excerpt.add(new Fragment(text.substring(offset, t.startOffset()))); // the exception is thrown here when offset > t.startOffset()
        excerpt.add(new Highlight(text.substring(t.startOffset(), t.endOffset())));
        offset = t.endOffset();
        endToken = Math.min(j + sumContext, tokens.length);
    }
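The failure can be reproduced outside Nutch. The following is a minimal sketch under stated assumptions: OverlapFailureDemo, Tok, naiveExcerpt, and throwsOnInput are hypothetical names, Tok stands in for a token's start and end offsets, and square brackets stand in for a Highlight. It keeps only the unguarded substring logic and leaves out the highlight-set check and the summary window.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone model of the unguarded excerpt loop; not Nutch code.
class OverlapFailureDemo {
    /** Stand-in for a token: only its character offsets matter here. */
    static final class Tok {
        final int start;
        final int end;
        Tok(int start, int end) { this.start = start; this.end = end; }
    }

    /** Always emits text[offset, start) before each token, with no overlap guard. */
    static List<String> naiveExcerpt(String text, Tok[] tokens) {
        List<String> parts = new ArrayList<String>();
        int offset = 0;
        for (Tok t : tokens) {
            // Throws StringIndexOutOfBoundsException when offset > t.start.
            parts.add(text.substring(offset, t.start));
            parts.add("[" + text.substring(t.start, t.end) + "]");
            offset = t.end;
        }
        return parts;
    }

    /** Returns true if the naive logic throws on the given input. */
    static boolean throwsOnInput(String text, Tok[] tokens) {
        try {
            naiveExcerpt(text, tokens);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // "教育方针" segmented as "教育方针" (0-4), "教育" (0-2), "方针" (2-4).
        Tok[] overlapping = { new Tok(0, 4), new Tok(0, 2), new Tok(2, 4) };
        System.out.println("throws on overlap: " + throwsOnInput("教育方针", overlapping));
    }
}
```

After the first token, offset is 4; the second token starts at 0, so the loop calls substring(4, 0). The difference of -4 matches the "String index out of range: -4" in the stack trace.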
So change the code to the following (you can replace the whole while loop, or only part of it):
    while ((j < endToken) && (j - startToken < sumLength)) {
        Token t = tokens[j];
        if (highlight.contains(t.term())) {
            excerpt.addToken(t.term());
            if (offset < t.startOffset()) {
                // No overlap: add the text before the token, then the whole token
                excerpt.add(new Fragment(text.substring(offset, t.startOffset())));
                excerpt.add(new Highlight(text.substring(t.startOffset(), t.endOffset())));
            }
            if (offset >= t.startOffset()) {
                // Overlap: highlight only the part of the token not yet emitted
                if (offset < t.endOffset()) {
                    excerpt.add(new Highlight(text.substring(offset, t.endOffset())));
                }
            }
            offset = Math.max(offset, t.endOffset());
            endToken = Math.min(j + sumContext, tokens.length);
        }
        j++;
    }
Also change the following:
    if (j < tokens.length) {
        excerpt.add(new Fragment(text.substring(offset, tokens[j].endOffset())));
    }
to:
    if (j < tokens.length) {
        if (offset < tokens[j].endOffset()) {
            excerpt.add(new Fragment(text.substring(offset, tokens[j].endOffset())));
        }
    }
Recompile by running ant in the nutch/ directory.
This completes the fix.
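The guarded logic can be tried in isolation. The following is a minimal sketch of the same idea, not the Nutch implementation: OverlapSafeExcerpt, Tok, and build are hypothetical names, square brackets stand in for a Highlight, and the highlight-set check and summary window are left out. The trailing-text guard is a simplified analogue of the second change, running to the end of the text instead of to the next token.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone model of the guarded excerpt loop; not Nutch code.
class OverlapSafeExcerpt {
    /** Stand-in for a token: only its character offsets matter here. */
    static final class Tok {
        final int start;
        final int end;
        Tok(int start, int end) { this.start = start; this.end = end; }
    }

    /** Builds excerpt parts while tolerating partially and fully overlapping tokens. */
    static List<String> build(String text, Tok[] tokens) {
        List<String> parts = new ArrayList<String>();
        int offset = 0;
        for (Tok t : tokens) {
            if (offset < t.start) {
                // No overlap: plain text before the token, then the whole token.
                parts.add(text.substring(offset, t.start));
                parts.add("[" + text.substring(t.start, t.end) + "]");
            } else if (offset < t.end) {
                // Overlap: highlight only the tail that was not emitted yet.
                parts.add("[" + text.substring(offset, t.end) + "]");
            }
            // A token that is already fully covered adds nothing.
            offset = Math.max(offset, t.end); // never move backwards
        }
        if (offset < text.length()) {
            // Trailing text, guarded so substring never sees begin > end.
            parts.add(text.substring(offset));
        }
        return parts;
    }

    public static void main(String[] args) {
        // "教育方针" segmented as "教育方针" (0-4), "教育" (0-2), "方针" (2-4).
        Tok[] contained = { new Tok(0, 4), new Tok(0, 2), new Tok(2, 4) };
        System.out.println(build("教育方针", contained));
        // "可爱的小女生" segmented as "可爱" (0-2), "小女" (3-5), "女生" (4-6).
        Tok[] partial = { new Tok(0, 2), new Tok(3, 5), new Tok(4, 6) };
        System.out.println(build("可爱的小女生", partial));
    }
}
```

Using Math.max when advancing offset is what keeps a fully contained token, such as "教育" inside "教育方针", from moving the offset backwards.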
Below is bupo.Jung's solution.
The code starting at line 188 of the file:
    if (highlight.contains(t.term())) {
        excerpt.addToken(t.term());
        // When two tokens overlap, offset > t.startOffset()
        excerpt.add(new Fragment(text.substring(offset, t.startOffset()))); // the exception is thrown here
        excerpt.add(new Highlight(text.substring(t.startOffset(), t.endOffset())));
        offset = t.endOffset();
        endToken = Math.min(j + sumContext, tokens.length);
is changed to:
    if (highlight.contains(t.term())) {
        excerpt.addToken(t.term());
        // When two tokens overlap, offset > t.startOffset()
        // bupo changed the code to fix the chinese token overlap
        if (offset < t.startOffset()) {
            excerpt.add(new Fragment(text.substring(offset, t.startOffset())));
            excerpt.add(new Highlight(text.substring(t.startOffset(), t.endOffset())));
        } else {
            excerpt.add(new Highlight(text.substring(offset, t.endOffset())));
        }
        ...
Recompile by running ant in the nutch/ directory. This generates summary-basic.jar under nutch/build/summary-basic/. Copy it to nutch/plugins/summary-basic/, overwriting the original file.
This completes the fix.