Detecting Textual Information in Images from Onion Domains Using Text Spotting

Blanco Medina, Pablo; Fidalgo Fernández, Eduardo; Alegre Gutiérrez, Enrique; Al Nabki, Mohamed Wesam

Title

Detecting Textual Information in Images from Onion Domains Using Text Spotting

Authors

Blanco Medina, Pablo

Fidalgo Fernández, Eduardo

Alegre Gutiérrez, Enrique

Al Nabki, Mohamed Wesam

Faculty/Center

Escuela de Ingenierías Industrial, Informática y Aeroespacial

Knowledge area

Systems Engineering and Automation

Part of

XXXIX Jornadas de Automática: actas. Badajoz, 5-7 de septiembre de 2018

Bibliographic citation

Blanco Medina, P., Fidalgo Fernández, E., Alegre Gutiérrez, E., & Wesam Al Nabki, M. (2018). Detecting Textual Information in Images from Onion Domains Using Text Spotting. In I. Tejado Balsera, E. Pérez Hernández, A. J. Calderón Godoy, I. González Pérez, P. Merchán García, J. S. Lozano Rogado, S. Salamanca Miño, & B. M. Vinagre Jara (eds.), XXXIX Jornadas de Automática: actas. Badajoz, 5-7 de septiembre de 2018. https://doi.org/10.17979/SPUDC.9788497497565.0975

Publisher

Universidad de Extremadura

Date

2018

Abstract

As authorities step up the fight against illegal activities on the Tor network, traders have developed new ways of circumventing the monitoring tools used to gather evidence of those activities. In particular, embedding textual content in images prevents text analysis based on Natural Language Processing (NLP) algorithms from being used to monitor such onion web content. In this paper, we present a Text Spotting framework dedicated to detecting and recognizing textual information within images hosted in onion domains. We found that the Connectionist Text Proposal Network and the Convolutional Recurrent Neural Network, run as a combined pipeline, achieve an F-measure of 0.57 on a manually labeled subset of 100 images from the TOIC dataset. We also identified the parameters that have a critical influence on the Text Spotting results. The proposed technique might support tools that help the authorities detect these activities.

Subject

Computer Science

Keywords