BERT最近太火,蹭个热点,整理一下相关的资源,包括Paper, 代码和文章解读。
1、Google官方:
1) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
一切始于10月Google祭出的这篇Paper, 瞬间引爆整个AI圈包括自媒体圈: https://arxiv.org/abs/1810.04805
2) Github: https://github.com/google-research/bert
11月Google推出了代码和预训练模型,再次引起群体亢奋。
3) Google AI Blog: Open Sourcing BERT: State-of-the-Art Pre-training for Natural Language Processing
2、第三方解读:
1) 张俊林博士的解读, 知乎专栏:从Word Embedding到Bert模型—自然语言处理中的预训练技术发展史
我们在AINLP微信公众号上转载了这篇文章和张俊林博士分享的PPT,欢迎关注:
5) BERT Explained: State of the art language model for NLP
6) BERT介绍
9) 干货 | BERT fine-tune 终极实践教程: 奇点智能BERT实战教程,在AI Challenger 2018阅读理解任务中训练一个79+的模型。
10) 【BERT详解】《Dissecting BERT》by Miguel Romero Calvo
Dissecting BERT Part 1: The Encoder
Understanding BERT Part 2: BERT Specifics
Dissecting BERT Appendix: The Decoder
11)BERT+BiLSTM-CRF-NER用于做ner识别
12)AI赋能法律 | NLP最强之谷歌BERT模型在智能司法领域的实践浅谈
3、第三方代码:
1) pytorch-pretrained-BERT: https://github.com/huggingface/pytorch-pretrained-BERT
Google官方推荐的PyTorch BERB版本实现,可加载Google预训练的模型:PyTorch version of Google AI's BERT model with script to load Google's pre-trained models
2) BERT-pytorch: https://github.com/codertimo/BERT-pytorch
另一个Pytorch版本实现:Google AI 2018 BERT pytorch implementation
3) BERT-tensorflow: https://github.com/guotong1988/BERT-tensorflow
Tensorflow版本:BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
4) bert-chainer: https://github.com/soskek/bert-chainer
Chanier版本: Chainer implementation of "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
5) bert-as-service: https://github.com/hanxiao/bert-as-service
将不同长度的句子用BERT预训练模型编码,映射到一个固定长度的向量上:Mapping a variable-length sentence to a fixed-length vector using pretrained BERT model
这个很有意思,在这个基础上稍进一步是否可以做一个句子相似度计算服务?有没有同学一试?
6) bert_language_understanding: https://github.com/brightmart/bert_language_understanding
BERT实战:Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN
7) sentiment_analysis_fine_grain: https://github.com/brightmart/sentiment_analysis_fine_grain
BERT实战,多标签文本分类,在 AI Challenger 2018 细粒度情感分析任务上的尝试:Multi-label Classification with BERT; Fine Grained Sentiment Analysis from AI challenger
8) BERT-NER: https://github.com/kyzhouhzau/BERT-NER
BERT实战,命名实体识别: Use google BERT to do CoNLL-2003 NER !
9) BERT-keras: https://github.com/Separius/BERT-keras
Keras版: Keras implementation of BERT with pre-trained weights
10) tbert: https://github.com/innodatalabs/tbert
PyTorch port of BERT ML model
11) BERT-Classification-Tutorial: https://github.com/Socialbird-AILab/BERT-Classification-Tutorial
12) BERT-BiLSMT-CRF-NER: https://github.com/macanv/BERT-BiLSMT-CRF-NER
Tensorflow solution of NER task Using BiLSTM-CRF model with Google BERT Fine-tuning
13) bert-Chinese-classification-task
bert中文分类实践
14) bert-chinese-ner: https://github.com/ProHiryu/bert-chinese-ner
使用预训练语言模型BERT做中文NER
15)BERT-BiLSTM-CRF-NER
Tensorflow solution of NER task Using BiLSTM-CRF model with Google BERT Fine-tuning
16) bert-sequence-tagging: https://github.com/zhpmatrix/bert-sequence-tagging
基于BERT的中文序列标注
持续更新,BERT更多相关资源欢迎补充,欢迎关注我们的微信公众号:AINLP
注:原创文章,转载请注明出处及保留链接“我爱自然语言处理”:https://www.52nlp.cn
本文链接地址:BERT相关论文、文章和代码资源汇总 https://www.52nlp.cn/?p=10870
5) bert-as-service: https://github.com/hanxiao/bert-as-service
将不同长度的句子用BERT预训练模型编码,映射到一个固定长度的向量上:Mapping a variable-length sentence to a fixed-length vector using pretrained BERT model
have tried but the results are not good enough comparing with the gensim word2vec
[回复]
52nlp 回复:
17 11 月, 2018 at 11:50
thanks
[回复]
https://github.com/xu-song/bert-as-language-model
[回复]
52nlp 回复:
24 12 月, 2018 at 14:54
谢谢
[回复]