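I took another look at Moses and found a few changes. Most notably, the whole Moses open-source project moved from SourceForge to GitHub a few months ago, which shows how popular GitHub has become lately. The build process has also changed substantially: Moses used to be built with Make and is now built with bjam. The Boost library used to be an optional dependency and is now required; without Boost installed, Moses essentially cannot be compiled.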

  In practice, on Ubuntu you can install Boost quickly with "sudo apt-get install libboost-all-dev" and then check out the source code:
git clone git://github.com/moses-smt/mosesdecoder.git

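  After checking out the Moses code, if you only want to test Moses itself rather than set up a complete statistical machine translation pipeline, you can build it directly with bjam: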
cd ~/mosesdecoder
./bjam -j2
The number after -j sets how many build jobs run in parallel, which speeds things up on a multi-core machine.
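
Rather than hard-coding the job count, it can be derived from the machine. The following is a minimal sketch, assuming nproc (GNU coreutils) is available; the helper names moses_jobs and build_moses are made up for illustration and are not part of Moses.

```shell
# moses_jobs: print the number of available cores,
# falling back to 2 (the value used above) if nproc is missing.
moses_jobs() {
    nproc 2>/dev/null || echo 2
}

# build_moses DIR: run bjam inside the mosesdecoder checkout at DIR.
# The subshell keeps the caller's working directory unchanged.
build_moses() {
    (cd "$1" && ./bjam -j"$(moses_jobs)")
}

# Usage (path assumed to match the example above):
#   build_moses ~/mosesdecoder
```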

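If all goes well (a few inconsequential errors can be tolerated), the build creates a bin directory and a lib directory under dist. The former holds the executables, such as moses and moses_chart; the latter holds the libraries, such as libmoses.a.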
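
A quick way to confirm that the build produced what is described above is to look for those files. This is only a sketch: check_moses_build is a hypothetical helper, and it checks just the two binaries and the lib directory named in this post.

```shell
# check_moses_build DIR: report whether the expected outputs exist under DIR/dist.
# Returns 0 only if both binaries are executable and dist/lib is a directory.
check_moses_build() {
    root="$1"
    rc=0
    for f in dist/bin/moses dist/bin/moses_chart; do
        if [ -x "$root/$f" ]; then
            echo "ok: $f"
        else
            echo "missing: $f"
            rc=1
        fi
    done
    if [ -d "$root/dist/lib" ]; then
        echo "ok: dist/lib"
    else
        echo "missing: dist/lib"
        rc=1
    fi
    return "$rc"
}

# Usage (path assumed to match the example above):
#   check_moses_build ~/mosesdecoder
```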

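A step-by-step build guide is available in the official Moses documentation:
http://www.statmt.org/moses_steps.html
One problem with that document is that it does not mention installing Boost. Without Boost, the bjam build reports many errors about Boost libraries that cannot be found, and neither the Moses binaries nor the libraries are produced.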
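
Since a missing Boost only shows up as a long list of build errors, it can be worth checking for it before running bjam. The sketch below assumes the Boost headers live under /usr/include, which is where Ubuntu's libboost-all-dev places them; have_boost is a hypothetical helper, and a Boost installed elsewhere would need a different path.

```shell
# have_boost [INCLUDE_DIR]: succeed if Boost headers are found under INCLUDE_DIR.
# Defaults to /usr/include (an assumption that matches the Ubuntu package).
have_boost() {
    [ -f "${1:-/usr/include}/boost/version.hpp" ]
}

if have_boost; then
    echo "Boost headers found"
else
    echo "Boost headers not found; try: sudo apt-get install libboost-all-dev"
fi
```

This only checks for the headers; it does not verify that every compiled Boost library the build needs is present.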

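Another important piece of news is that Moses development is currently supported by MosesCore, an EU-funded project. I looked it up, and it appears to have been launched only this year. As the name suggests, it is closely tied to Moses, and it aims to promote open-source statistical machine translation systems in both academia and industry: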

MosesCore is an EU funded Coordination Action, which aims to encourage the development and usage of open source machine translation.

MosesCore draws together academic and commercial partners sharing a common interest in open source machine translation, and will:

Provide coordination and stewardship of the development of open source software for machine translation, notably the Moses statistical MT toolkit. This will result in at least three major releases of Moses, one in each year of the project.

Outreach to the research community through academic workshops, evaluation campaigns and the machine translation marathons.

Outreach to current and potential users of MT by providing a well maintained web presence, an active newsletter, and three annual outreach events for knowledge sharing and tutorial.

Improve interaction between academic and industrial MT stakeholders through both the outreach events and tutorials, and the marathons.

Author: 52nlp

23 comments on "Some New Changes in Moses"
  1. Is tuning a rule-table trained with -hierarchical and -glue-grammar any different from the usual tuning command?
    During tuning I always get this error: “ERROR:Lexicalized distortion model: Not enough weights, add to [weight-d]
    > Exit code: 1
    > Failed to run moses with the config filtered/moses.ini at
    > /home/ltx/moses/mosesdecoder/scripts/training/mert-moses.pl line 1169.
    ”
    Please help!

    52nlp replied:

    I have not tried tuning with a rule-table, so I am afraid I cannot answer this one.

  2. Hello, I have a question. Installing Moses on 64-bit Ubuntu went wrong, with the following output:
    warning: No toolsets are configured.
    warning: Configuring default toolset "gcc".
    warning: If the default is wrong, your build may not work correctly.
    warning: Use the "toolset=xxxxx" option to override our guess.
    warning: For more configuration options, please consult
    warning: http://boost.org/boost-build2/doc/html/bbv2/advanced/configuration.html
    ...patience...
    ...found 3361 targets...
    SUCCESS
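    This looks like a Boost problem, but none of the fixes I found online solved it. Has anyone else run into this?

    夏天 replied:

    Could it be that my gcc version is not suitable? I am currently on gcc-4.6.

    52nlp replied:

    Are those only warnings? I do not see any error.

    夏天 replied:

    No errors, just these warnings.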

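  3. It works now. It had nothing to do with the gcc version; Boost simply has to be installed completely. Then, in the moses directory, run: ./bjam toolset=gcc-4.6 (I had been writing toolset=gcc-4.6 before, which apparently does not work). All these assorted problems cost me about a week, but they are finally solved. PS: the problems on 32-bit and 64-bit Ubuntu really are different, and I had never run into them before. Anyway, thanks, 52nlp.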

  4. I am back. On an Ubuntu 12.04.2 LTS (GNU/Linux 3.2.0-29-generic x86_64)
    server, the MERT step of Moses failed with the following error:
    nohup: ignoring input
    Using SCRIPTS_ROOTDIR: /home/tempadmin/mtdir/moses/scripts/
    Assuming the tables are already filtered, reusing filtered/moses.ini
    Asking moses for feature names and values from filtered/moses.ini
    Executing: /home/tempadmin/mtdir/moses/bin/moses -threads 4 -config filtered/moses.ini -inputtype 0 -show-weights > ./features.list
    Defined parameters (per moses.ini or switch):
    config: filtered/moses.ini
    distortion-limit: 6
    feature: UnknownWordPenalty WordPenalty PhraseDictionaryMemory name=TranslationModel0 num-features=5 path=/home/tempadmin/mtdir/corpus/mert/filtered/phrase-table.0-0.1.1.gz input-factor=0 output-factor=0 LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/home/tempadmin/mtdir/corpus/mert/filtered/reordering-table.wbe-msd-bidirectional-fe Distortion SRILM name=LM0 factor=0 path=/home/tempadmin/mtdir/lm/mg-lm.txt order=3
    input-factors: 0
    inputtype: 0
    mapping: 0 T 0
    show-weights:
    threads: 4
    ttable-limit: 20
    weight: UnknownWordPenalty0= 1 WordPenalty0= -1 TranslationModel0= 0.2 0.2 0.2 0.2 0.2 LexicalReordering0= 0.3 0.3 0.3 0.3 0.3 0.3 Distortion0= 0.3 LM0= 0.5
    ERROR:Unknown parameter ttable-limit
    Exit code: 1
    Failed to run moses with the config filtered/moses.ini at /home/tempadmin/mtdir/moses/scripts/training/mert-moses.pl line 1271.
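    Is something wrong with the unknown parameter ttable-limit? I have never seen this before. Has anyone else hit it? Any help would be appreciated.

    52nlp replied:

    If it is urgent, you could post on Weibo and @ me, and I will repost it for you.

    夏天 replied:

    Solved. Newer versions of Moses may produce this error. If you run into it, change your configuration as shown in this commit: https://github.com/moses-smt/mosesdecoder/commit/029110c2451be2ecafbb1ed5f912e573647b3401

    52nlp replied: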

  5. Hello everyone. On Ubuntu 14.04 LTS (amd-64)
    in a VMware virtual machine, the MERT step of Moses failed with the following error:
    nohup: ignoring input
    Using SCRIPTS_ROOTDIR: /home/kkdown/work/mt/moses/scripts
    filtering the phrase tables... Thu Jul 23 13:47:12 CST 2015
    exec: /home/kkdown/work/mt/moses/scripts/training/filter-model-given-input.pl ./filtered /home/kkdown/work/mt/working/train/model/moses.ini /home/kkdown/work/mt/corpus/news-test2008.true.fr
    Executing: /home/kkdown/work/mt/moses/scripts/training/filter-model-given-input.pl ./filtered /home/kkdown/work/mt/working/train/model/moses.ini /home/kkdown/work/mt/corpus/news-test2008.true.fr > filterphrases.out 2> filterphrases.err
    Exit code: 255
    ERROR: Failed to run '/home/kkdown/work/mt/moses/scripts/training/filter-model-given-input.pl ./filtered /home/kkdown/work/mt/working/train/model/moses.ini /home/kkdown/work/mt/corpus/news-test2008.true.fr'. at /home/kkdown/work/mt/moses/scripts/training/mert-moses.pl line 1723.

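    Has anyone run into this problem? Any help would be appreciated.

    gg replied:

    I have hit this problem too. Did you manage to solve it?

    Jocelyn replied:

    I have run into this problem as well. Did you manage to solve it?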
