Paper Reading List

Posted on 2021-07-10 Edited on 2021-07-14 In Paper Reading

GNN

Graph Neural Networks for Natural Language Processing: A Survey, 2021

Term Weight

Measuring Fine-Grained Domain Relevance of Terms, ACL2021

Knowledge Graph

Robust Knowledge Graph Completion with Stacked Convolutions and a Student Re-Ranking Network

Information Extraction

Text Generation

Data Augmentation for Text Generation Without Any Augmented Data, ACL2021
Capturing Relations between Scientific Papers: An Abstractive Model for Related Work Section Generation, ACL2021
Employing Argumentation Knowledge Graphs for Neural Argument Generation
Metaphor Generation with Conceptual Mappings, ACL2021
AugNLG: Few-shot Natural Language Generation using Self-trained Data Augmentation, ACL2021

NLP Machine Learning

DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations
Dynamic Contextualized Word Embeddings
Unsupervised Out-of-Domain Detection via Pre-trained Transformers ❤
Tree-Structured Topic Modeling with Nonparametric Neural Variational Inference
Obtaining Better Static Word Embeddings Using Contextual Embedding Models
TAN-NTM: Topic Attention Networks for Neural Topic Modeling ❤
Learning Dense Representations of Phrases at Scale

Chatbot

Topic-Driven and Knowledge-Aware Transformer for Dialogue Emotion Detection

Multimodal

Multimodal Sentiment Detection Based on Multi-channel Graph Neural Networks

Posted on 2021-07-10 Edited on 2021-08-24 In Research Method

General Tools

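NLTK - the Natural Language Toolkit 👍
spacy - a high-performance natural language processing library built with Python and Cython 👍
gensim - a library for unsupervised semantic modeling of plain text, supporting algorithms such as word2vec 👍
StanfordNLP - a multilingual NLP library, available for Java and Python 👍
OpenNLP - a machine-learning-based natural language processing toolkit, written in Java 👍
TextBlob - provides a consistent API for diving into common natural language processing (NLP) tasks
Jieba - a powerful Python library for Chinese word segmentation 👍
HanLP - a multilingual natural language processing toolkit aimed at production environments
SnowNLP - a Python package for Chinese natural language processing; it does not use NLTK, and all algorithms are implemented from scratch
FudanNLP - a Java library for Chinese text processing
THULAC - provides Chinese word segmentation and part-of-speech tagging.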

Term Extraction

Bag of What? Simple Noun Phrase Extraction for Text Analysis, 2016. The accompanying tool, phrasemachine, is a pattern-based phrase extractor written in Python and R.

Basic usage of phrasemachine

pip install phrasemachine

import phrasemachine

text = "Barack Obama supports expanding social security."
# Extract multiword noun phrases together with their counts
print(phrasemachine.get_phrases(text))
# Output: {'num_tokens': 7, 'counts': Counter({'barack obama': 1, 'social security': 1})}

It can also use a more accurate part-of-speech tagger, such as spaCy or Stanford CoreNLP.
Token positions can be obtained as well.
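To make the idea of pattern-based phrase extraction concrete, here is a minimal sketch that works on tokens that are already part-of-speech tagged. It is not phrasemachine's implementation: the pattern is a simplified "adjectives or nouns followed by a noun" rule, and the names `coarse_tag`, `extract_phrases`, `min_len`, and `max_len` are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter

# Assumed coarse mapping from Penn Treebank tag prefixes to one-letter classes.
COARSE = {"JJ": "A", "NN": "N"}


def coarse_tag(tag):
    """Map a Penn Treebank tag to 'A' (adjective), 'N' (noun), or 'O' (other)."""
    return COARSE.get(tag[:2], "O")


# Simplified noun phrase pattern: any run of adjectives/nouns ending in a noun.
PATTERN = re.compile(r"[AN]*N")


def extract_phrases(tagged_tokens, min_len=2, max_len=8):
    """Count every token span whose coarse tag string fully matches PATTERN."""
    words = [word.lower() for word, _ in tagged_tokens]
    tags = "".join(coarse_tag(tag) for _, tag in tagged_tokens)
    counts = Counter()
    for start in range(len(tags)):
        last_end = min(start + max_len, len(tags))
        for end in range(start + min_len, last_end + 1):
            if PATTERN.fullmatch(tags[start:end]):
                counts[" ".join(words[start:end])] += 1
    return counts


# Hand-tagged version of the example sentence (the tags are an assumption).
tagged = [
    ("Barack", "NNP"), ("Obama", "NNP"), ("supports", "VBZ"),
    ("expanding", "VBG"), ("social", "JJ"), ("security", "NN"), (".", "."),
]
phrases = extract_phrases(tagged)
print(phrases)
```

With these hand-assigned tags, the sketch finds the same two phrases as the phrasemachine example, because only "Barack Obama" and "social security" form spans of at least two tokens that end in a noun.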

Ontology Query Endpoints

Wikidata SPARQL online query service
SparqlEndpoints list (some endpoints are not accessible)
Peking University gStore SPARQL Endpoint (DBpedia, Freebase, etc.)
http://dbpedia.org/sparql
Automated Phrase Mining from Massive Text Corpora, 2017. This tool can be run easily from a .sh script, but it requires g++ and Java.
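The SPARQL endpoints listed in this section, such as http://dbpedia.org/sparql, can be queried over plain HTTP. The sketch below only builds the GET request with the standard library; the query text and the helper name `build_request` are illustrative assumptions, and the network call is left commented out because endpoint availability varies.

```python
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"  # endpoint from the list above

# Illustrative query (an assumption, not taken from the original post).
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Natural_language_processing> rdfs:label ?label .
  FILTER (lang(?label) = "en")
} LIMIT 1
"""


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


request = build_request(ENDPOINT, QUERY)
print(request.full_url)

# Sending the request needs network access, so it is left commented out:
# import json
# with urllib.request.urlopen(request, timeout=30) as response:
#     results = json.load(response)
#     for row in results["results"]["bindings"]:
#         print(row["label"]["value"])
```

Passing the query as a URL-encoded `query` parameter follows the SPARQL protocol, so the same helper should work for other endpoints that implement it, although some may need a different result format.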

Measuring Fine-Grained Domain Relevance of Terms ACL2021

Posted on 2021-07-09 Edited on 2021-07-12 In Paper Reading , Term Weight

What's the Problem?

This paper aims to measure the domain relevance of terms.

How to write a qualified academic paper?

Posted on 2021-07-05 Edited on 2021-08-04 In Research Method

This post is a condensed version of Zhiyuan Liu's tutorial on how to write a qualified NLP paper.

Process of Paper Publication

A typical paper publication process is:

Proposal \(\longrightarrow\) Model Design \(\longrightarrow\) Coding \(\longleftrightarrow\) Parameter Tuning \(\longrightarrow\) Paper Writing \(\longleftrightarrow\) Paper Re-editing \(\longrightarrow\) Paper Submission \(\longrightarrow\) Presentation

An excellent paper = excellent work (step 1) + excellent writing (step 2)