Papers of the Week | TTIC's Research Progress on QA


jiqizhixin· 2016-11-24 20:32:09

PaperWeekly has introduced many works related to Question Answering: DeepMind's Attentive Reader, FAIR's Memory Networks, Danqi Chen's Stanford Attentive Reader, the Attention Sum Reader, the Gated Attention Sum Reader, the Attention-over-Attention Reader, and so on. These models are closely related and more or less similar to one another. This article introduces Question Answering work from the Toyota Technological Institute at Chicago (TTIC), covering three papers:

1. Who did What: A Large-Scale Person-Centered Cloze Dataset, 2016
2. Broad Context Language Modeling as Reading Comprehension, 2016
3. Emergent Logical Structure in Vector Representations of Neural Readers, 2016

Who did What: A Large-Scale Person-Centered Cloze Dataset

Takeshi Onishi, Hai Wang, Mohit Bansal, Kevin Gimpel, David McAllester


EMNLP 2016

This paper constructs a new Question Answering dataset, "Who did What".

A sample instance is shown below.

In each question sentence, one of the named entities is cut out, and the other named entities appear as options. This dataset is more difficult than earlier datasets such as CNN/Daily Mail (CNN/DM), and it can serve as a reference dataset when building new models.
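Since the original figure with a sample instance is not reproduced in this text, here is a small, invented illustration of the cloze format (not an actual dataset item; the names and text are made up):

```python
# Invented Who-did-What-style cloze instance (illustrative only).
instance = {
    # The question is a sentence with one named entity blanked out as "XXX".
    "question": "XXX visited Berlin on Monday to discuss the trade agreement.",
    # Candidates are named entities found in a related passage.
    "choices": ["John Smith", "Maria Lopez", "Berlin"],
    "passage": "John Smith arrived in Berlin on Monday. Maria Lopez, his "
               "counterpart, welcomed him ahead of the trade talks.",
    "answer": "John Smith",
}
```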

The construction method of this dataset differs from CNN/DM: the context passage is not a summary. The question and its context both come from the Gigaword corpus, and they are two highly related articles.

Specifically, we first select an article to serve as the question article. We then extract the named entities from its first sentence and delete one named entity, which becomes the answer to be predicted. Using this sentence as the question, we can use an information retrieval system to find a related article from the Gigaword corpus to serve as the passage. This passage is a different article from the question article, but it contains information very similar to the question sentence.

From the passage, we extract named entities to serve as candidate answers.
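The construction procedure described above can be sketched as follows. This is a toy sketch, not the authors' code: `named_entities` and `retrieve` are stand-ins for a real NER system and a real information-retrieval system.

```python
import re

def named_entities(sentence):
    """Toy stand-in for a real NER system: runs of capitalized words."""
    return re.findall(r"(?:[A-Z][a-z]+ )+[A-Z][a-z]+|[A-Z][a-z]+", sentence)

def retrieve(query, corpus):
    """Toy stand-in for an IR system: largest word overlap with the query."""
    q = set(query.lower().split())
    return max(corpus, key=lambda doc: len(q & set(doc.lower().split())))

def build_instance(question_article, corpus):
    # 1. Take the first sentence of the question article.
    first_sentence = question_article.split(".")[0]
    # 2. Delete one named entity from it; that entity becomes the answer.
    entities = named_entities(first_sentence)
    answer = entities[0]
    question = first_sentence.replace(answer, "XXX")
    # 3. Retrieve a different but related article as the passage.
    passage = retrieve(first_sentence,
                       [d for d in corpus if d != question_article])
    # 4. Named entities in the passage become the candidate answers.
    return {"question": question, "passage": passage,
            "choices": named_entities(passage), "answer": answer}
```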

To make the task more difficult, we run some simple baselines (e.g., "first person in passage") to identify the problems they solve easily, and we remove those, leaving only the harder instances. This makes the dataset considerably more difficult than CNN/DM.
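The baseline-suppression idea can be sketched as follows. This is a simplified sketch assuming a single "first person in passage" baseline; the paper applies several such baselines.

```python
def first_in_passage_baseline(instance):
    """Toy baseline: predict the candidate appearing earliest in the passage."""
    passage = instance["passage"]
    return min(instance["choices"],
               key=lambda c: passage.find(c) if c in passage else len(passage))

def suppress(instances, baselines):
    """Keep only the instances that every simple baseline answers wrongly."""
    return [x for x in instances
            if all(b(x) != x["answer"] for b in baselines)]
```

Instances a trivial heuristic can solve are discarded, so the surviving data requires actual comprehension.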


I believe the authors have created a new dataset that brings new problems and challenges to machine comprehension and is a valuable resource. The baseline-suppression method used to increase the difficulty of the problems is also worth referencing.

Broad Context Language Modeling as Reading Comprehension

Zewei Chu, Hai Wang, Kevin Gimpel, David McAllester



The LAMBADA dataset was released shortly before this work; its authors tried many baseline models, all of which gave poor results.

A LAMBADA instance is shown below.

After examining the LAMBADA dataset, we thought that reading comprehension models could be used to improve accuracy, without having to rely on traditional language models.

Reading comprehension models require candidate answers from which to select a prediction, so we take all the words appearing in the context as candidate answers.
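Restricting candidates to words in the context fits naturally with pointer-style readers such as the Attention Sum Reader. A minimal sketch of the prediction rule (the attention weights here are assumed inputs; a real model would compute them from the context and question):

```python
from collections import defaultdict

def predict_from_context(context_tokens, attention_weights):
    """Sum attention mass over repeated tokens (Attention-Sum style) and
    return the context word with the highest total mass."""
    scores = defaultdict(float)
    for tok, w in zip(context_tokens, attention_weights):
        scores[tok] += w
    return max(scores, key=scores.get)
```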

The training set provided with LAMBADA is raw novel text. To make the training set consistent with the data type of the test set, we construct a biased training set. Specifically, we divide the training text into contexts of 4-5 sentences each and keep only the training instances in which the target word appears in the context. We trained a variety of attention-based models on this newly constructed training set, obtaining results much better than the original authors'.
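The filtered training-set construction can be sketched as follows. This is a simplified sketch under assumed conventions: a naive sentence splitter, a fixed context size of four sentences, and the last word of the following sentence taken as the target.

```python
import re

def build_training_instances(text, context_size=4):
    """Slide a window over the text: `context_size` sentences form the
    context, and the last word of the next sentence is the target word.
    Keep only instances whose target also occurs in the context."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    instances = []
    for i in range(len(sentences) - context_size):
        context = " ".join(sentences[i:i + context_size])
        target = re.findall(r"\w+", sentences[i + context_size])[-1]
        if target in re.findall(r"\w+", context):
            instances.append({"context": context, "target": target})
    return instances
```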


In this paper, using a simple method and model, the authors raise the accuracy on the LAMBADA dataset from 7.3% to 45.4%, a very simple and effective approach.

Emergent Logical Structure in Vector Representations of Neural Readers

Hai Wang, Takeshi Onishi, Kevin Gimpel, David McAllester


ICLR 2017 Submission

A variety of attention-based reader models have been proposed recently. The authors of this paper give a comprehensive summary and analysis of them, and demonstrate the correlations between the models through mathematical analysis and experiments.

The authors divide the current attention-based models into two categories: aggregation readers (including the Attentive Reader and the Stanford Reader) and explicit reference readers (including the Attention Sum Reader and the Gated Attention Sum Reader).

The two kinds of reader can be connected by the following formula.

For the above equation to hold, only the following condition needs to be satisfied.

That is to say, the inner product between the hidden vector and the answer embedding should be a non-zero constant only for the correct answer, and zero otherwise. The experimental results support this hypothesis.
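The formulas referenced above are not reproduced in this text, so the following is a reconstruction of the relation as I understand it from the paper (the notation is mine, not copied verbatim): aggregation readers score answers through an aggregated state, explicit reference readers sum attention over answer occurrences directly, and the two coincide under a condition on the inner product.

```latex
% Aggregation readers (e.g., Attentive Reader, Stanford Reader):
\alpha_i = \mathrm{softmax}_i \left( h_i^\top q \right), \qquad
s = \sum_i \alpha_i h_i, \qquad
\hat{a} = \arg\max_a \; e_o(a)^\top s .

% Explicit reference readers (e.g., Attention Sum, Gated Attention Sum),
% where R(a, p) is the set of positions where candidate a occurs in passage p:
\hat{a} = \arg\max_a \sum_{i \in R(a, p)} \alpha_i .

% The two coincide when, for some constant c > 0,
e_o(a)^\top h_i \approx
\begin{cases}
  c & \text{if } a_i = a \\
  0 & \text{otherwise.}
\end{cases}
```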

Because CNN/DM anonymizes entities in both training and testing, the authors argue that this inner product can actually be divided into two parts: one related to the anonymized token ID and one unrelated to the ID. The ID-related part determines the answer, while the ID-independent part should contribute approximately zero, as shown in the following formula.

The paper also uses features to improve the accuracy of attention readers on various reading datasets; we do not describe this in detail here.


This paper is a good summary of attention-based neural reader models. It connects the different models well and explains why seemingly different models can give very similar results.


QA systems are a big problem and a current focus of applied NLP research. This article introduced some of TTIC's results in QA research, the second of which is a recent paper by this article's author. Thanks to @Zewei Chu from the University of Chicago for his hard work.

PaperWeekly is an academic organization for sharing and exchanging knowledge, focusing on the various directions of NLP. If you often read papers, love to share knowledge, and enjoy discussing and studying together with others, please come and join us.

WeChat official account: PaperWeekly

Weibo account: PaperWeekly

jiqizhixin (almosthuman2014)
