Title
On Detecting and Removing Superficial Redundancy in Vector Databases
Author
Faculty/Center
Knowledge area
Journal title
Mathematical Problems in Engineering
Bibliographic citation
Mathematical Problems in Engineering, Vol. 2018,
Publisher
Hindawi
Date
2018-05
ISSN
1024-123X
Abstract
A mathematical model is proposed that yields an automated tool for vector databases: it removes unnecessary data, computes the level of redundancy, and can recover both the original and the filtered database at any point in the process. A vector database can be modeled as an oriented directed graph and is therefore characterized by an adjacency matrix, so a record is no longer a row but a matrix. The problem of cleaning redundancies is then addressed from a theoretical point of view, with superficial redundancy measured and filtered using the 1-norm of a matrix. Algorithms are presented in Python and MapReduce, and a case study on a real cybersecurity database is performed.
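The abstract does not give the paper's definitions, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that a record is a list of directed edges, that a repeated edge counts as redundancy, and that filtering collapses repeats to a single edge. The helper names (`adjacency_matrix`, `one_norm`, `filtered`, `difference`) and the sample data are hypothetical.

```python
from collections import Counter


def adjacency_matrix(edges, nodes):
    """Count matrix: entry [i][j] is how often edge nodes[i] -> nodes[j] occurs."""
    index = {name: k for k, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for (src, dst), count in Counter(edges).items():
        matrix[index[src]][index[dst]] = count
    return matrix


def one_norm(matrix):
    """Matrix 1-norm: the maximum absolute column sum."""
    if not matrix:
        return 0
    columns = range(len(matrix[0]))
    return max(sum(abs(row[j]) for row in matrix) for j in columns)


def filtered(matrix):
    """Assumed filtering rule: collapse repeated edges to a single edge."""
    return [[1 if value else 0 for value in row] for row in matrix]


def difference(a, b):
    """Entrywise a - b; here it holds the edge copies that filtering removed."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Hypothetical record: the edge a -> b appears three times.
nodes = ["a", "b", "c"]
edges = [("a", "b"), ("a", "b"), ("a", "b"), ("c", "b"), ("b", "c")]

raw = adjacency_matrix(edges, nodes)
clean = filtered(raw)
removed = difference(raw, clean)

print(one_norm(raw))      # 1-norm of the original record
print(one_norm(clean))    # 1-norm after filtering
print(one_norm(removed))  # 1-norm of what was removed
```

Keeping `removed` alongside `clean` means the original matrix can be rebuilt by entrywise addition, which mirrors the abstract's point that both the original and the filtered database stay recoverable.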
Subject
Keywords
Peer review
Yes
URI
DOI
Publisher's version
Appears in collections
- Articles [4625]
Files in this item
Name:
Size:
2.267 MB
Format:
Adobe PDF
Description:
Publisher's PDF