Compiling Nutch with Ant fails: what is the cause?


Source: WebSpider crawl. Time: 2016-05-08 06:14. Tags: nutch2.3.1 compilation

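Compiling Nutch with Ant fails: what is the cause? (Baidu Zhidao)
Accepted answer:
1. Recompiling the Nutch source is quite common, for example when swapping in a different tokenizer.
2. If you are rebuilding the WAR package of the Nutch demo and cannot find the matching source package, you can decompile a few of the core classes with a decompiler.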
hadoop - Can not compile Nutch1.4 with ant - Stack Overflow
I'm trying to deploy Nutch 1.4 to a Hadoop cluster (following ). I ran into some problems when compiling Nutch with ant.
When I run the ant command, I get the following error:
/home/xenserver/apache-nutch-1.4-bin/build.xml:71: invalid Date syntax in "01/25/ pm"
I removed the "datetime" attribute from line 71 of build.xml and ran ant again. Then I hit another problem.
The error is:
/home/xenserver/apache-nutch-1.4/build.xml:412: syntax errors in ivy file: java.text.ParseException:
in file:/home/xenserver/apache-nutch-1.4/ivy/ivy.xml
at org.apache.ivy.plugins.parser.xml.XmlModuleDescriptorParser$Parser.parse(XmlModuleDescriptorParser.java:273)
What's wrong with the steps above? Is there a tutorial for compiling Nutch 1.4?
I need your help. Thanks in advance.
To compile Nutch 1.4, all you have to do is run ant clean deploy from the Nutch directory. The output is created in a directory named 'runtime' with two folders: one for local mode and the other for cluster mode.
Please check the date settings and the Ant installation on your machine; I think that is causing the issue. Also, have you tampered with or edited /home/xenserver/apache-nutch-1.4/ivy/ivy.xml? Please check that file too.
There is some problem with the build file when it is executed on your Linux box.
These are the things that you should verify on your setup:
Java version and Ant version: don't use old ones. Get the latest ones, or ones that are compatible with your Nutch release. FYI: for nutch-1.4 I am using apache-ant-1.8.3 and java jdk1.6.0_18. This combination works perfectly fine for me.
Check that you have installed a JDK and not a JRE.
Check that your JAVA_HOME environment variable points to the JDK. The system PATH variable must have $JAVA_HOME/bin and $ANT_HOME/bin appended to it. The ANT_HOME variable must point to the Ant installation directory.
Can you successfully run normal Ant targets on any other build files? Try it out with a small Ant build file.
If you are still facing the same issue, run the ant command with the -v option. This will provide more information about the error.
ant -v clean deploy
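The environment checks listed above can be scripted. The following is a minimal sketch, not part of the original answer: the class name BuildEnvCheck and its helper methods are hypothetical, and it assumes that the presence of bin/javac (or bin/javac.exe) is a good enough signal that JAVA_HOME points to a JDK rather than a JRE. It needs Java 7 or later for java.nio.file.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical helper: checks the JAVA_HOME / ANT_HOME / PATH conditions from the answer.
final class BuildEnvCheck {

    /** True if the directory has bin/javac or bin/javac.exe (assumed to indicate a JDK, not a JRE). */
    static boolean looksLikeJdk(Path javaHome) {
        Path bin = javaHome.resolve("bin");
        return Files.isRegularFile(bin.resolve("javac"))
            || Files.isRegularFile(bin.resolve("javac.exe"));
    }

    /** True if the PATH-style string lists the given directory as one of its entries. */
    static boolean pathContains(String pathVar, String dir, String separator) {
        if (pathVar == null || dir == null) {
            return false;
        }
        Path wanted = Paths.get(dir).normalize();
        for (String entry : pathVar.split(Pattern.quote(separator))) {
            if (entry.isEmpty()) {
                continue;
            }
            try {
                if (Paths.get(entry).normalize().equals(wanted)) {
                    return true;
                }
            } catch (InvalidPathException e) {
                // Skip PATH entries that are not valid paths on this platform.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String javaHome = System.getenv("JAVA_HOME");
        String antHome = System.getenv("ANT_HOME");
        String path = System.getenv("PATH");

        System.out.println("JAVA_HOME set: " + (javaHome != null));
        System.out.println("ANT_HOME set: " + (antHome != null));
        if (javaHome != null) {
            System.out.println("JAVA_HOME looks like a JDK: " + looksLikeJdk(Paths.get(javaHome)));
            System.out.println("PATH contains JAVA_HOME/bin: "
                + pathContains(path, Paths.get(javaHome, "bin").toString(), File.pathSeparator));
        }
        if (antHome != null) {
            System.out.println("PATH contains ANT_HOME/bin: "
                + pathContains(path, Paths.get(antHome, "bin").toString(), File.pathSeparator));
        }
    }
}
```

If any line prints false, fix that setting before running ant -v clean deploy again.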
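Fixing a bug in Nutch's Chinese-language support
Problem description:
Nutch does not natively support Chinese, and its developers did not anticipate that Chinese word segmentation can produce overlapping tokens. As a result, a StringIndexOutOfBoundsException is thrown when the page summary is built from the tokens of the user's query string. For example, "教育方针" may be segmented into "教育方针", "教育", and "方针", and these tokens overlap.
(There is another write-up online by bupo.Jung, which I also tested, but it only handles the case in his own example: "可爱的小女生" may be segmented into "可爱", "小女", and "女生", where the tokens "小女" and "女生" overlap. In other words, it covers only overlaps in which neither token fully contains the other. His solution is reproduced at the end.)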
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.StringIndexOutOfBoundsException: String index out of range: -4
at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:316)
at org.apache.nutch.searcher.NutchBean.getSummary(NutchBean.java:357)
at org.apache.nutch.searcher.NutchBean.main(NutchBean.java:429)
Caused by: java.util.concurrent.ExecutionException: java.lang.StringIndexOutOfBoundsException: String index out of range: -4
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:311)
... 2 more
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -4
at java.lang.String.substring(String.java:1937)
at org.apache.nutch.summary.basic.BasicSummarizer.getSummary(BasicSummarizer.java:190)
at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:275)
at org.apache.nutch.searcher.FetchedSegments$SummaryTask.call(FetchedSegments.java:65)
at org.apache.nutch.searcher.FetchedSegments$SummaryTask.call(FetchedSegments.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
As a result, details such as the page text and URL are not returned.
The error log traces the root cause to
org.apache.nutch.summary.basic.BasicSummarizer.getSummary
in the file
nutch/src/plugin/summary-basic/src/java/org/apache/nutch/summary/basic/BasicSummarizer.java
The code starting at line 188 of that file is:
if (highlight.contains(t.term())) {
  excerpt.addToken(t.term());
  // when two tokens overlap, offset > t.startOffset()
  excerpt.add(new Fragment(text.substring(offset, t.startOffset()))); // the exception is thrown here when offset > t.startOffset()
  excerpt.add(new Highlight(text.substring(t.startOffset(), t.endOffset())));
  offset = t.endOffset();
  endToken = Math.min(j + sumContext, tokens.length);
}
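The failure can be reproduced outside Nutch with plain strings. This is a minimal sketch under stated assumptions: the class OverlapDemo and the method gapBefore are hypothetical stand-ins for the unguarded substring call, tokens are modeled as {start, end} offset pairs, and the tokenizer is assumed to emit the full token "教育方针" before its sub-tokens.

```java
// Hypothetical stand-alone reproduction; it does not use any Nutch classes.
final class OverlapDemo {

    /** Mimics the unguarded call: the text between the previous token's end and this token's start. */
    static String gapBefore(String text, int offset, int tokenStart) {
        return text.substring(offset, tokenStart); // throws when offset > tokenStart
    }

    public static void main(String[] args) {
        String text = "教育方针";
        // Assumed {start, end} offsets for the tokens "教育方针", "教育", "方针".
        int[][] tokens = { {0, 4}, {0, 2}, {2, 4} };
        int offset = 0;
        for (int[] t : tokens) {
            try {
                gapBefore(text, offset, t[0]);
                offset = t[1]; // same update as in the original code
            } catch (StringIndexOutOfBoundsException e) {
                System.out.println("overlap at token [" + t[0] + ", " + t[1] + "): " + e.getMessage());
                break;
            }
        }
    }
}
```

After the first token, offset is 4, while the second token starts at 0, so substring(4, 0) is requested. The difference 0 - 4 matches the "-4" reported in the stack trace.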
So change the code to the following (you can replace the whole while loop or only part of it):
while ((j < endToken) && (j - startToken < sumLength)) {
  Token t = tokens[j];
  if (highlight.contains(t.term())) {
    excerpt.addToken(t.term());
    if (offset < t.startOffset()) {
      excerpt.add(new Fragment(text.substring(offset, t.startOffset())));
      excerpt.add(new Highlight(text.substring(t.startOffset(), t.endOffset())));
    }
    if (offset >= t.startOffset()) {
      if (offset < t.endOffset()) {
        excerpt.add(new Highlight(text.substring(offset, t.endOffset())));
      }
    }
    offset = Math.max(offset, t.endOffset());
    endToken = Math.min(j + sumContext, tokens.length);
  }
  j++;
}
The following code must also be changed from
if (j < tokens.length) {
  excerpt.add(new Fragment(text.substring(offset, tokens[j].endOffset())));
}
to
if (j < tokens.length) {
  if (offset < tokens[j].endOffset()) {
    excerpt.add(new Fragment(text.substring(offset, tokens[j].endOffset())));
  }
}
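The guarded logic can be tried out without building Nutch. The sketch below is a simplified stand-in, not the BasicSummarizer code itself: the class GuardedExcerpt is hypothetical, every token is treated as a highlighted query term, highlights are marked with square brackets, and the trailing fragment runs to the end of the text rather than to tokens[j].endOffset().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simplified model of the guarded excerpt loop; it does not use any Nutch classes.
final class GuardedExcerpt {

    /** Builds excerpt pieces from {start, end} token offsets; highlights are wrapped in [ ]. */
    static List<String> build(String text, int[][] tokens) {
        List<String> parts = new ArrayList<String>();
        int offset = 0;
        for (int[] t : tokens) {
            int start = t[0];
            int end = t[1];
            if (offset < start) {
                parts.add(text.substring(offset, start));            // plain fragment before the token
                parts.add("[" + text.substring(start, end) + "]");   // the whole token, highlighted
            } else if (offset < end) {
                parts.add("[" + text.substring(offset, end) + "]");  // only the part not yet emitted
            }
            // Otherwise the token is fully covered by earlier output and nothing is added.
            offset = Math.max(offset, end); // never move the offset backwards
        }
        if (offset < text.length()) {
            parts.add(text.substring(offset)); // trailing text, guarded in the same way
        }
        return parts;
    }

    public static void main(String[] args) {
        // Fully contained tokens: "教育方针", "教育", "方针".
        System.out.println(build("国家教育方针好", new int[][] { {2, 6}, {2, 4}, {4, 6} }));
        // Partially overlapping tokens: "可爱", "小女", "女生".
        System.out.println(build("可爱的小女生", new int[][] { {0, 2}, {3, 5}, {4, 6} }));
    }
}
```

The key design choice is Math.max when advancing offset: a token that ends before text already emitted can no longer pull the offset backwards, so every later substring call has a start index no greater than its end index.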
Recompile by running ant in the nutch/ directory.
This completes the fix.
Below is bupo.Jung's solution.
The code starting at line 188 of the file:
if (highlight.contains(t.term())) {
  excerpt.addToken(t.term());
  // when two tokens overlap, offset > t.startOffset()
  excerpt.add(new Fragment(text.substring(offset, t.startOffset()))); // this is the line that throws
  excerpt.add(new Highlight(text.substring(t.startOffset(), t.endOffset())));
  offset = t.endOffset();
  endToken = Math.min(j + sumContext, tokens.length);
}
is changed to:
if (highlight.contains(t.term())) {
  excerpt.addToken(t.term());
  // when two tokens overlap, offset > t.startOffset()
  // bupo changed the code to fix the chinese token overlap
  if (offset < t.startOffset()) {
    excerpt.add(new Fragment(text.substring(offset, t.startOffset())));
    excerpt.add(new Highlight(text.substring(t.startOffset(), t.endOffset())));
  } else {
    excerpt.add(new Highlight(text.substring(offset, t.endOffset())));
  }
  // ...
Recompile by running ant in the nutch/ directory. This generates summary-basic.jar under nutch/build/summary-basic/.
Copy it to nutch/plugins/summary-basic/, overwriting the original file.
This completes the fix.
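The earlier remark that this solution only covers overlaps without full containment can be checked with a small sketch. It assumes the second branch of the changed code is a plain else, which the source does not show explicitly; the class ElseOnlyGuard and its method are hypothetical and model only the substring calls for the highlighted part.

```java
// Hypothetical model of an if/else guard with no check against the token's end offset.
final class ElseOnlyGuard {

    /** Returns the highlighted text for one token given the current offset. */
    static String highlight(String text, int offset, int start, int end) {
        if (offset < start) {
            return text.substring(start, end);
        } else {
            return text.substring(offset, end); // still throws when offset > end
        }
    }

    public static void main(String[] args) {
        // Partial overlap: "小女" ended at 5, "女生" spans [4, 6), so only "生" is left to highlight.
        System.out.println(highlight("可爱的小女生", 5, 4, 6));
        try {
            // Full containment: "教育方针" ended at 4, then "教育" spans [0, 2).
            highlight("教育方针", 4, 0, 2);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("contained token still fails: " + e.getMessage());
        }
    }
}
```

The contained token fails because the offset (4) is already past the token's end (2), which is the case that the additional offset < t.endOffset() check handles.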