Saturday, April 10, 2010

chen2008tsinghua Tsinghua University at the summarization track of TAC 2008

Shouyuan Chen, Yuanming Yu, Chong Long, Feng Jin, Lijing Qin, Minlie Huang, Xiaoyan Zhu, "Tsinghua University at the summarization track of TAC 2008", Text Analysis Conference (TAC 2008), 2008.11.17-19.


What is specific about this method for handling 'update' summarization (compared with, for example, generic summarization)?









Abstract

The paper proposes two novel methods:
- one based on information distance theory, and
- one based on sentence centrality, derived from the centrality concept in graph theory.
The results are reported to be very competitive for generating extractive summaries.

1 Introduction


TAC 2008 update summarization task:
write a short (~100 words) summary of a set of newswire articles

Evaluation: readability and content (content is scored with the Pyramid Method)

1st system: based on Kolmogorov complexity and information distance theory.
- optimal summary: the summary with the smallest information distance to all original news articles
- the text summarization problem is converted into an optimization problem constrained by the summary's information content
- to solve this optimization problem, the authors propose a way to approximate the Kolmogorov complexity K(.) and the information distance D(.,.)
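
To make the idea of the first system concrete, here is a minimal sketch in Python. It is not the authors' approximation of K(.) and D(.,.), which these notes do not record. As a stand-in it uses the well-known compression trick: K(x) is approximated by the zlib-compressed length of x, K(x|y) by K(yx) - K(y), and D(x,y) by max{K(x|y), K(y|x)}. The greedy search, the function names and the 100-word budget (used here in place of a limit on information content) are all assumptions made for illustration.

```python
import zlib


def K(text: str) -> int:
    """Compressed length in bytes: a crude stand-in for Kolmogorov complexity."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def K_cond(x: str, y: str) -> int:
    """Approximate K(x|y) as K(yx) - K(y), floored at zero."""
    return max(K(y + " " + x) - K(y), 0)


def D(x: str, y: str) -> int:
    """Approximate information distance max{K(x|y), K(y|x)}."""
    return max(K_cond(x, y), K_cond(y, x))


def total_distance(selected, documents) -> int:
    """Sum of distances from the candidate summary to every source document."""
    summary = " ".join(selected)
    return sum(D(summary, doc) for doc in documents)


def word_count(selected) -> int:
    return sum(len(s.split()) for s in selected)


def greedy_summary(sentences, documents, max_words=100):
    """Greedy search (an assumption, not the paper's solver): repeatedly add
    the sentence that lowers the total distance the most, within the budget."""
    chosen = []
    remaining = list(sentences)
    best = total_distance(chosen, documents)
    while remaining:
        fits = [s for s in remaining if word_count(chosen + [s]) <= max_words]
        if not fits:
            break
        top = min(fits, key=lambda s: total_distance(chosen + [s], documents))
        cost = total_distance(chosen + [top], documents)
        if cost >= best:  # no candidate improves the objective any further
            break
        chosen.append(top)
        remaining.remove(top)
        best = cost
    return chosen
```

Usage: split the articles into sentences, pass them as `sentences` together with the full article texts as `documents`, and join the returned list to get the extractive summary. A general-purpose compressor is a weak estimator on short texts, so this only illustrates the shape of the optimization.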

2nd system: based on centrality concepts from graph theory.
nodes: sentences, edges: similarities between sentences, calculated by an LSI algorithm.
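
The sentence graph of the second system can be sketched as follows. This is only an illustration under stated assumptions: the notes do not say which term weighting, which number of latent dimensions or which centrality measure the paper uses, so raw term counts, `k=2`, a similarity threshold of 0.5 and plain degree centrality are placeholders, and the function names are hypothetical.

```python
import numpy as np


def lsi_similarity(sentences, k=2):
    """Cosine similarity between sentences in a k-dimensional LSI space."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    # Term-by-sentence count matrix (rows: terms, columns: sentences).
    A = np.zeros((len(vocab), len(sentences)))
    for j, s in enumerate(sentences):
        for w in s.lower().split():
            A[index[w], j] += 1.0
    # Truncated SVD: keep the k largest singular values.
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(S))
    coords = (S[:k, None] * Vt[:k]).T  # one row per sentence
    norms = np.linalg.norm(coords, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid dividing by zero
    unit = coords / norms
    return unit @ unit.T


def degree_centrality(sim, threshold=0.5):
    """Count, for each sentence, the neighbours whose similarity exceeds the threshold."""
    adj = (sim > threshold).astype(float)
    np.fill_diagonal(adj, 0.0)  # a sentence is not its own neighbour
    return adj.sum(axis=1)


def rank_sentences(sentences, k=2, threshold=0.5):
    """Return sentence indices ordered from most to least central."""
    scores = degree_centrality(lsi_similarity(sentences, k), threshold)
    return [int(i) for i in np.argsort(-scores, kind="stable")]
```

Usage: `rank_sentences(sentences)` gives an ordering from which the top sentences can be picked until the ~100-word limit is reached. How the paper adapts this ranking to the update setting is not covered in these notes.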

2 The first system: information-distance based update summarization
Kolmogorov complexity and information distance






3 The second system: sentence centrality based update summarization
