Svore, K., Vanderwende, L., and Burges, C. (2007). Enhancing single-document summarization by combining RankNet and third-party sources. In Proceedings of EMNLP-CoNLL 2007, pages 448–457.
Abstract
We present a new approach to automatic summarization based on neural nets, called NetSum. We extract a set of features from each sentence that helps identify its importance in the document. We apply novel features based on news search query logs and Wikipedia entities. Using the RankNet learning algorithm, we train a pair-based sentence ranker to score every sentence in the document and identify the most important sentences. We apply our system to documents gathered from CNN.com, where each document includes highlights and an article. Our system significantly outperforms the standard baseline in the ROUGE-1 measure on over 70% of our document set.
= = = = = = = = = =
[das2007survey]
In 2001-02, DUC issued a task of creating a 100-word summary of a single news article. However, the best-performing systems in the evaluations could not outperform the baseline with statistical significance. This extremely strong baseline has been analyzed by Nenkova (2005) and corresponds to the selection of the first n sentences of a newswire article. This surprising result has been attributed to the journalistic convention of putting the most important part of an article in the initial paragraphs. After 2002, the task of single-document summarization for newswire was dropped from DUC.
Svore et al. (2007) propose an algorithm based on neural nets and the use of third-party datasets to tackle the problem of extractive summarization, outperforming the baseline with statistical significance.
The authors used a dataset of 1,365 documents gathered from CNN.com, each consisting of the title, timestamp, three or four human-generated story highlights, and the article text. They considered the task of generating three machine highlights. The human-generated highlights were not verbatim extractions from the article itself. The authors evaluated their system using two metrics: the first concatenated the three system-produced highlights into one block and the three human-generated highlights into another, and compared the two blocks; the second respected ordering and compared each system highlight against its corresponding human highlight individually.
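To make the two comparison schemes concrete, the Python sketch below computes both: a block-level score over concatenations and an order-respecting, per-highlight score. The sentence-level similarity sim is left abstract (the paper uses ROUGE variants), and pairing highlight k with highlight k in the second metric is an assumption about how ordering was respected.

    def evaluate_highlights(system_highlights, human_highlights, sim):
        """Two comparison schemes: block-level and per-highlight.

        `sim(a, b)` is a sentence-level similarity such as ROUGE-1;
        leaving it pluggable is an assumption of this sketch.
        """
        # Metric 1: concatenate each side into a single block and compare.
        block_score = sim(" ".join(system_highlights),
                          " ".join(human_highlights))
        # Metric 2: respect ordering; compare highlight k with highlight k.
        pair_scores = [sim(s, h)
                       for s, h in zip(system_highlights, human_highlights)]
        return block_score, pair_scores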
Svore et al. (2007) trained a model on the labels and features of each sentence in an article so that it could infer the proper ranking of sentences in a test document. The ranking was accomplished using RankNet (Burges et al., 2005), a pair-based neural-network algorithm that ranks a set of inputs and is trained with gradient descent. For the training set, they used ROUGE-1 (Lin, 2004) to score the similarity between a human-written highlight and a sentence in the document. These similarity scores were used as soft labels during training, in contrast with other approaches where sentences are "hard-labeled" as selected or not.
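A minimal sketch of the two ingredients just described: ROUGE-1 unigram overlap used as a soft label, and the RankNet pairwise cross-entropy loss optimized by gradient descent. A linear scorer stands in for the paper's neural net, and taking the maximum over the highlights for each sentence's soft label is an assumption of this sketch, not a detail from the paper.

    import numpy as np
    from collections import Counter

    def rouge_1(candidate: str, reference: str) -> float:
        """Unigram recall of `reference` covered by `candidate` (ROUGE-1)."""
        cand = Counter(candidate.lower().split())
        ref = Counter(reference.lower().split())
        overlap = sum(min(cand[t], ref[t]) for t in ref)
        return overlap / max(sum(ref.values()), 1)

    def soft_label(sentence: str, highlights: list) -> float:
        """Soft training label: ROUGE-1 similarity to the closest human
        highlight (max over highlights is this sketch's assumption)."""
        return max(rouge_1(sentence, h) for h in highlights)

    def ranknet_train(X, labels, epochs=50, lr=0.1, seed=0):
        """Train a linear scorer with the RankNet pairwise loss.

        For each sentence pair (i, j) with labels[i] > labels[j], the model
        should score i above j; the loss is the cross entropy of
        sigmoid(s_i - s_j), minimized by gradient descent.
        """
        rng = np.random.default_rng(seed)
        w = rng.normal(scale=0.01, size=X.shape[1])
        n = len(labels)
        for _ in range(epochs):
            for i in range(n):
                for j in range(n):
                    if labels[i] > labels[j]:
                        p = 1.0 / (1.0 + np.exp(-(X[i] - X[j]) @ w))
                        w -= lr * (p - 1.0) * (X[i] - X[j])  # grad of -log p
        return w

    # At test time, rank sentences by score and keep the top three:
    #   scores = X_test @ w
    #   top3 = np.argsort(scores)[::-1][:3]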
Some of the features used, based on sentence position or n-gram frequencies, had appeared in previous work. The novelty of the framework lay in features that derived information from the query logs of Microsoft's news search engine and from Wikipedia entries. The authors conjectured that if a sentence in a document contained keywords used in the news search engine, or entities found in Wikipedia articles, then that sentence was more likely to appear in the highlights. The extracts were evaluated using ROUGE-1 and ROUGE-2, and showed statistically significant improvements over the baseline of selecting the first three sentences of a document.
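A hypothetical feature extractor along these lines might combine positional, frequency, query-log, and Wikipedia signals as below; the feature definitions and the query_keywords / wiki_entities inputs are illustrative placeholders, not the paper's exact feature set.

    from collections import Counter

    def sentence_features(sentence: str, position: int, doc_unigrams: Counter,
                          query_keywords: set, wiki_entities: set) -> list:
        """Illustrative feature vector for one sentence.

        `query_keywords` (frequent terms from news search query logs) and
        `wiki_entities` (Wikipedia article titles) are placeholder inputs;
        the paper derives several frequency-weighted variants of both
        signals, so this exact feature set is an assumption.
        """
        tokens = sentence.lower().split()
        return [
            1.0 / (1 + position),                               # position in document
            sum(doc_unigrams[t] for t in tokens)
                / max(len(tokens), 1),                          # mean unigram frequency
            sum(t in query_keywords for t in tokens),           # query-log keyword hits
            sum(e in sentence.lower() for e in wiki_entities),  # Wikipedia entity mentions
        ]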
= = = = = = = = = =