RankSum—An unsupervised extractive text summarization based on rank fusion

Joshi, Akanksha; Fidalgo Fernández, Eduardo; Alegre Gutiérrez, Enrique; Alaiz Rodríguez, Rocío

doi:10.1016/J.ESWA.2022.116846

dc.contributor	Escuela de Ingenierias Industrial, Informática y Aeroespacial	es_ES
dc.contributor.author	Joshi, Akanksha
dc.contributor.author	Fidalgo Fernández, Eduardo
dc.contributor.author	Alegre Gutiérrez, Enrique
dc.contributor.author	Alaiz Rodríguez, Rocío
dc.contributor.other	Ingenieria de Sistemas y Automatica	es_ES
dc.date	2022-08-15
dc.date.accessioned	2024-01-18T13:57:16Z
dc.date.available	2024-01-18T13:57:16Z
dc.identifier.citation	Joshi, A., Fidalgo, E., Alegre, E., & Alaiz-Rodriguez, R. (2022). RankSum—An unsupervised extractive text summarization based on rank fusion. Expert Systems with Applications, 200. https://doi.org/10.1016/J.ESWA.2022.116846	es_ES
dc.identifier.issn	0957-4174
dc.identifier.uri	https://hdl.handle.net/10612/17676
dc.description.abstract	[EN] In this paper, we propose RankSum, an approach for extractive summarization of single documents based on the rank fusion of four multi-dimensional features extracted for each sentence: topic information, semantic content, significant keywords, and position. RankSum obtains the sentence saliency rankings corresponding to each feature in an unsupervised way, followed by a weighted fusion of the four scores to rank the sentences according to their significance. The scores are generated in a completely unsupervised way, although a labeled document set is required to learn the fusion weights. Since we found that the fusion weights generalize to other datasets, we consider RankSum an unsupervised approach. To determine the topic rank, we employ probabilistic topic models, whereas semantic information is captured using sentence embeddings. To derive rankings from sentence embeddings, we use Siamese networks to produce abstractive sentence representations and then formulate a novel strategy to arrange them in order of importance. A graph-based strategy is applied to find the significant keywords and the related sentence rankings in the document. We also formulate a sentence novelty measure based on bigrams, trigrams, and sentence embeddings to eliminate redundant sentences from the summary. The ranks of all the sentences, computed for each feature, are finally fused to obtain the final score for each sentence in the document. We evaluate our approach on the publicly available summarization datasets CNN/DailyMail and DUC 2002. Experimental results show that our approach outperforms other existing state-of-the-art summarization methods.	es_ES
dc.language	eng	es_ES
dc.publisher	Elsevier	es_ES
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject	Documentation	es_ES
dc.subject	Linguistics	es_ES
dc.subject.other	Text summarization	es_ES
dc.subject.other	Extractive	es_ES
dc.subject.other	Topic	es_ES
dc.subject.other	Embeddings	es_ES
dc.subject.other	Keywords	es_ES
dc.title	RankSum—An unsupervised extractive text summarization based on rank fusion	es_ES
dc.type	info:eu-repo/semantics/article	es_ES
dc.identifier.doi	10.1016/J.ESWA.2022.116846
dc.description.peerreviewed	Yes	es_ES
dc.rights.accessRights	info:eu-repo/semantics/embargoedAccess	es_ES
dc.journal.title	Expert Systems with Applications	es_ES
dc.volume.number	200	es_ES
dc.page.initial	116846	es_ES
dc.type.hasVersion	info:eu-repo/semantics/acceptedVersion	es_ES
dc.subject.unesco	5701.02 Automated Documentation	es_ES
dc.subject.unesco	5701.05 Documentary Languages	es_ES
dc.description.project	Research carried out within the framework of the collaboration between the Universidad de León and INCIBE (Instituto Nacional de Ciberseguridad)
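
Illustrative note on the abstract: the weighted fusion of per-feature sentence rankings described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names (`ranks_from_scores`, `fuse_ranks`, `select_top`), the linear rank-to-value conversion, the equal weights, and the toy scores are all hypothetical; the record does not give the actual fusion formula or the learned weights.

```python
from typing import Dict, List


def ranks_from_scores(scores: List[float]) -> List[int]:
    """Return 1-based ranks; the highest score gets rank 1.

    Ties are broken by sentence position (earlier sentence ranks higher),
    because Python's sort is stable.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def fuse_ranks(feature_scores: Dict[str, List[float]],
               weights: Dict[str, float]) -> List[float]:
    """Weighted fusion of per-feature rankings (assumed Borda-style scheme)."""
    n = len(next(iter(feature_scores.values())))
    fused = [0.0] * n
    for name, scores in feature_scores.items():
        for i, r in enumerate(ranks_from_scores(scores)):
            # Map a rank to a value in (0, 1]: rank 1 -> 1.0, rank n -> 1/n.
            fused[i] += weights[name] * (n - r + 1) / n
    return fused


def select_top(fused: List[float], k: int) -> List[int]:
    """Pick the k best sentences and return their indices in document order."""
    top = sorted(range(len(fused)), key=lambda i: -fused[i])[:k]
    return sorted(top)


# Toy scores for a four-sentence document (made-up numbers).
feature_scores = {
    "topic":     [0.2, 0.9, 0.4, 0.1],
    "embedding": [0.5, 0.7, 0.6, 0.3],
    "keyword":   [0.8, 0.3, 0.6, 0.1],
    "position":  [1.0, 0.75, 0.5, 0.25],
}
# Equal weights are a placeholder; the paper learns them from labeled data.
weights = {name: 0.25 for name in feature_scores}

fused = fuse_ranks(feature_scores, weights)
summary_indices = select_top(fused, k=2)
```

Fusing ranks rather than raw scores keeps features with different scales comparable, which is one plausible reason to prefer this design. The redundancy filter (the sentence novelty measure) mentioned in the abstract is not modeled here.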

Files in this item

Name: RankSum_Unsupervised_Extractive.pdf
Embargoed until: 2024-08-15
Size: 488.7 KB
Format: application/pdf

This item appears in the following collection(s)

Articles [5503]

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International