In this paper we presented the use of multi-document summarization as a post-processing step in information retrieval (IR). We examined how the requirements for general multi-document summarization differ from those that arise when it is applied to IR, and highlighted the need for clustering and context information extraction, which help users browse and search for relevant results. To generate this type of summary, we first cluster the retrieved documents by topic using a repeated bisection algorithm and extract the centroid words of each cluster. The final summary is then generated on the basis of the query words and the cluster centroids, so that it contains query-centered information as well as context information.
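The pipeline summarized above can be illustrated with a minimal sketch. It is not the implementation used in this work: the whitespace tokenization, the seed selection in `bisect`, the additive scoring in `summarize`, the choice of whole documents as summary units, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter


def vectorize(text):
    """L2-normalized bag-of-words vector (whitespace tokens; an assumption)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {w: c / norm for w, c in counts.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(w, 0.0) for w, v in a.items())


def centroid(vectors):
    """Normalized sum of the given vectors."""
    total = Counter()
    for vec in vectors:
        for w, x in vec.items():
            total[w] += x
    norm = math.sqrt(sum(x * x for x in total.values()))
    return {w: x / norm for w, x in total.items()} if norm else {}


def bisect(indices, vectors, iters=10):
    """Split one cluster (at least two members) with a deterministic 2-means."""
    # Hypothetical seeding: the first document and the one least similar to it.
    first = indices[0]
    far = min(indices[1:], key=lambda i: cosine(vectors[first], vectors[i]))
    seeds = [vectors[first], vectors[far]]
    parts = ([], [])
    for _ in range(iters):
        parts = ([], [])
        for i in indices:
            closer_to_first = cosine(vectors[i], seeds[0]) >= cosine(vectors[i], seeds[1])
            parts[0 if closer_to_first else 1].append(i)
        if not parts[0] or not parts[1]:
            break
        seeds = [centroid([vectors[i] for i in part]) for part in parts]
    return parts


def repeated_bisection(vectors, k):
    """Repeatedly bisect the largest cluster until k clusters exist."""
    clusters = [list(range(len(vectors)))]
    while len(clusters) < k:
        clusters.sort(key=len, reverse=True)
        largest = clusters[0]
        if len(largest) < 2:
            break
        left, right = bisect(largest, vectors)
        if not left or not right:
            break  # the cluster cannot be split any further
        clusters = clusters[1:] + [left, right]
    return clusters


def centroid_words(cluster, vectors, n=3):
    """The n highest-weighted words of the cluster centroid."""
    c = centroid([vectors[i] for i in cluster])
    ranked = sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:n]]


def summarize(docs, query, k=2):
    """Per cluster: its centroid words plus the unit closest to query and centroid."""
    vectors = [vectorize(d) for d in docs]
    q = vectorize(query)
    summary = []
    for cluster in repeated_bisection(vectors, k):
        c = centroid([vectors[i] for i in cluster])
        # Assumed scoring: query similarity plus centroid (context) similarity.
        best = max(cluster, key=lambda i: cosine(vectors[i], q) + cosine(vectors[i], c))
        summary.append((centroid_words(cluster, vectors), docs[best]))
    return summary
```

Calling `summarize(docs, query, k)` on a list of retrieved texts returns one `(centroid_words, selected_text)` pair per topic cluster, so each entry carries both the query-centered content and the context words that describe its cluster.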
Chong Long, Minlie Huang, Xiaoyan Zhu, Ming Li
Fu Lee Wang, Christopher C. Yang, Xiaodong Shi