Sunday, April 11, 2010

luhn1958automatic The automatic creation of literature abstracts

Luhn, H. P. (1958). The automatic creation of literature abstracts. IBM Journal of Research and Development, 2(2):159-165.


= = = = = = = = = =
[das2007survey]
"In his work, Luhn proposed that the frequency of a particular word in an article provides a useful measure of its significance. There are several key ideas put forward in this paper that have assumed importance in later work on summarization. As a first step, words were stemmed to their root forms, and stop words were deleted. Luhn then compiled a list of content words sorted by decreasing frequency, the index providing a significance measure of the word.

On a sentence level, a significance factor was derived that reflects the number of occurrences of significant words within a sentence, and the linear distance between them due to the intervention of non-significant words. All sentences are ranked in order of their significance factor, and the top ranking sentences are finally selected to form the auto-abstract."
= = = = = = = = = =
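
To make the pipeline described in the quote concrete, here is a minimal Python sketch of it. It is not Luhn's original procedure. The stop-word list, the suffix-stripping `stem` helper, the `min_count` threshold, and the `max_gap` limit of four non-significant words are all assumptions chosen for illustration. The sentence score follows one common reading of the significance factor: within a cluster of significant words separated by at most `max_gap` other words, the square of the number of significant words divided by the cluster length.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list, not the one Luhn used.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "by", "for", "in", "is",
    "it", "of", "on", "or", "the", "to", "was", "were", "with",
}


def stem(word):
    """Crude stand-in for stemming: strip a few common English suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens, stemmed."""
    return [stem(w) for w in re.findall(r"[a-z]+", text.lower())]


def significant_words(sentences, min_count=2):
    """Content words seen at least min_count times, most frequent first."""
    counts = Counter(
        tok for s in sentences for tok in tokenize(s) if tok not in STOP_WORDS
    )
    return [w for w, c in counts.most_common() if c >= min_count]


def significance_factor(tokens, significant, max_gap=4):
    """Best cluster score: (significant words in cluster)^2 / cluster length."""
    positions = [i for i, tok in enumerate(tokens) if tok in significant]
    if not positions:
        return 0.0
    best = 0.0
    start = prev = positions[0]
    count = 1
    for pos in positions[1:]:
        if pos - prev - 1 <= max_gap:
            # Close enough to the previous significant word: same cluster.
            count += 1
        else:
            # Gap too wide: score the finished cluster and start a new one.
            best = max(best, count ** 2 / (prev - start + 1))
            start, count = pos, 1
        prev = pos
    return max(best, count ** 2 / (prev - start + 1))


def auto_abstract(text, top_n=2):
    """Return the top_n highest-scoring sentences, in document order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    significant = set(significant_words(sentences))
    scores = [significance_factor(tokenize(s), significant) for s in sentences]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:top_n])]
```

Calling `auto_abstract(text, top_n=3)` on an article's text would return its three best-scoring sentences. The selected sentences are returned in their original order rather than by score, a design choice made here so that the resulting abstract reads in sequence.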
