Sunday, April 11, 2010

lin2002manual Manual and automatic evaluation of summaries

Lin, C.-Y. and Hovy, E. (2002). Manual and automatic evaluation of summaries. In Proceedings of the ACL-02 Workshop on Automatic Summarization, pages 45-51, Morristown, NJ, USA.

Abstract
In this paper we discuss manual and automatic evaluations of summaries using data from the Document Understanding Conference 2001 (DUC-2001). We first show the instability of the manual evaluation. Specifically, the low inter-human agreement indicates that more reference summaries are needed. To investigate the feasibility of automated summary evaluation based on the recent BLEU method from machine translation, we use accumulative n-gram overlap scores between system and human summaries. The initial results provide encouraging correlations with human judgments, based on the Spearman rank-order correlation coefficient. However, relative ranking of systems needs to take into account the instability.
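
To make the two quantities named in the abstract concrete, here is a minimal Python sketch of an accumulated n-gram overlap score and a Spearman rank-order correlation. It is an illustration under stated assumptions, not the paper's exact procedure: the abstract does not give the scoring formula, so the clipped-count, reference-normalized overlap, the `max_n=4` default, and the function names `accumulated_overlap` and `spearman` are all assumptions made here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def accumulated_overlap(system, reference, max_n=4):
    """Share of reference n-grams (n = 1..max_n) also found in the system summary.

    Assumption: matches are clipped to the system's counts and pooled over
    all n before dividing; the paper's exact weighting may differ.
    """
    matched = total = 0
    for n in range(1, max_n + 1):
        sys_counts = ngrams(system, n)
        ref_counts = ngrams(reference, n)
        matched += sum(min(c, sys_counts[g]) for g, c in ref_counts.items())
        total += sum(ref_counts.values())
    return matched / total if total else 0.0


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0
```

In use, one would score each system's summaries against the human summaries with `accumulated_overlap`, average per system, and pass those averages together with the corresponding human judgment scores to `spearman` to see how closely the two system rankings agree.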
