Abstract:
This thesis presents a sentence-extraction approach to multi-document
summarization that builds on single-document summarization methods by using
additional information about the document set as a whole and the
relationships between the documents. Multi-document summarization differs from
single-document summarization in that compression, speed, redundancy, and passage
selection are critical to forming useful summaries. Our approach addresses these
issues by combining agglomerative clustering of sentences, GloVe word
embeddings, TextRank, and cosine similarity. The thesis also uses the NLTK
library to filter out stopwords, numeric tokens, punctuation, repeated
whitespace, and short words before each sentence is vectorized with GloVe.
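
The filtering, vectorizing, similarity, and ranking steps named above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: `STOPWORDS` and `TOY_VECTORS` are tiny hypothetical stand-ins for NLTK's stopword list and pretrained GloVe vectors, sentence vectors are assumed to be plain averages of word vectors, all function names are invented for the example, and the agglomerative clustering step is left out for brevity.

```python
import math
import re

# Hypothetical stand-ins: a real run would load NLTK's stopword list and
# pretrained GloVe vectors instead of these tiny hand-made tables.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "in", "on", "and", "to"}
TOY_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.0],
    "sat": [0.1, 0.9, 0.0],
    "mat": [0.2, 0.7, 0.1],
    "stocks": [0.0, 0.1, 0.9],
    "fell": [0.1, 0.0, 0.8],
}


def preprocess(sentence, min_len=3):
    """Lowercase, drop punctuation and digits, then stopwords and short words."""
    text = re.sub(r"[^a-z\s]", " ", sentence.lower())
    return [w for w in text.split() if w not in STOPWORDS and len(w) >= min_len]


def sentence_vector(tokens, dim=3):
    """Average the vectors of known tokens (assumed scheme); zeros if none."""
    known = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not known:
        return [0.0] * dim
    return [sum(col) / len(known) for col in zip(*known)]


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)


def textrank(sim, damping=0.85, iters=50):
    """PageRank-style power iteration over a sentence similarity matrix."""
    n = len(sim)
    # Total outgoing edge weight of each sentence, ignoring self-similarity.
    out = [sum(sim[j][k] for k in range(n) if k != j) for j in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iters):
        new_scores = []
        for i in range(n):
            rank = sum(
                sim[j][i] / out[j] * scores[j]
                for j in range(n)
                if j != i and out[j] > 0.0
            )
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def summarize(sentences, top_k=1):
    """Return the top_k highest-ranked sentences in their original order."""
    vectors = [sentence_vector(preprocess(s)) for s in sentences]
    sim = [[cosine(u, v) for v in vectors] for u in vectors]
    scores = textrank(sim)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:top_k])]


if __name__ == "__main__":
    docs = ["The cat sat on the mat.", "A kitten sat on the mat in 2020!", "Stocks fell."]
    print(summarize(docs, top_k=1))
```

In a multi-document setting, one plausible use of clustering would be to group near-duplicate sentences first and rank one representative per cluster, which would target the redundancy issue mentioned above; that step is an assumption about how the pieces fit together and is not shown here.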