In this paper, we investigate an approach for creating a comprehensive textual overview of a subject composed of information drawn from the Internet. We use the high-level structure of human-authored texts to automatically induce a domain-specific template for the topic structure of a new overview. The algorithmic innovation of our work is a method to learn topic-specific extractors for content selection jointly for the entire template. We augment the standard perceptron algorithm with a global integer linear programming formulation to optimize both local fit of information into each topic and global coherence across the entire overview. The results of our evaluation confirm the benefits of incorporating structural information into the content selection process.
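The joint selection idea can be illustrated with a minimal sketch, assuming a toy setting that is not taken from the paper: each topic has a small list of candidate excerpts, features are plain bag-of-words counts, and global coherence is approximated by a penalty on word overlap between the chosen excerpts. Exhaustive search stands in for the integer linear program, and every name here (`features`, `overlap`, `decode`, `train`) is hypothetical.

```python
from itertools import product


def features(text):
    """Bag-of-words counts (an assumed feature set, for illustration only)."""
    feats = {}
    for tok in text.lower().split():
        feats[tok] = feats.get(tok, 0.0) + 1.0
    return feats


def dot(w, f):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(w.get(k, 0.0) * v for k, v in f.items())


def overlap(a, b):
    """Jaccard word overlap, used here as a crude redundancy measure."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def decode(weights, candidates, penalty=1.0):
    """Choose one excerpt per topic, maximizing summed local scores minus
    pairwise redundancy. Brute force replaces the ILP solver in this sketch."""
    topics = list(candidates)
    best, best_score = None, float("-inf")
    for choice in product(*(range(len(candidates[t])) for t in topics)):
        texts = [candidates[t][i] for t, i in zip(topics, choice)]
        local = sum(dot(weights[t], features(x)) for t, x in zip(topics, texts))
        redundancy = sum(
            overlap(texts[i], texts[j])
            for i in range(len(texts))
            for j in range(i + 1, len(texts))
        )
        score = local - penalty * redundancy
        if score > best_score:
            best, best_score = dict(zip(topics, choice)), score
    return best


def train(examples, topics, epochs=5):
    """Perceptron updates driven by the joint decoder: each topic's weights
    move toward the gold excerpt and away from the jointly predicted one."""
    weights = {t: {} for t in topics}
    for _ in range(epochs):
        for candidates, gold in examples:
            pred = decode(weights, candidates)
            for t in topics:
                if pred[t] == gold[t]:
                    continue
                for k, v in features(candidates[t][gold[t]]).items():
                    weights[t][k] = weights[t].get(k, 0.0) + v
                for k, v in features(candidates[t][pred[t]]).items():
                    weights[t][k] = weights[t].get(k, 0.0) - v
    return weights


# Toy data (invented): the same excerpt is a candidate for both topics.
candidates = {
    "symptoms": ["fever and cough are common signs", "surgery is rarely needed"],
    "treatment": ["fever and cough are common signs", "rest and fluids help recovery"],
}
gold = {"symptoms": 0, "treatment": 1}

weights = train([(candidates, gold)], list(candidates))
prediction = decode(weights, candidates)
print(prediction)
```

Because the update uses the jointly decoded prediction rather than a separate per-topic argmax, a mistake in one topic is corrected in the context of what the other topics selected, which is the sense of "jointly for the entire template" that this sketch tries to capture.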
Flip Korn, Xuezhi Wang, You Wu, Cong Yu
Yuang Cheng, Yue Ding, Damián Pascual, Oliver Richter, Martin Volk, Roger Wattenhofer
Marija Šakota, Maxime Peyrard, Robert West
Yoshihiro Tamura, Yutaka Takase, Yuki Hayashi, Yukiko Nakano