- Marciel Mario Degasperi
  Instituto Federal de Educação, Ciência e Tecnologia do Espírito Santo, Campus Serra, ES
- Daniel Cruz Cavalieri
  Instituto Federal de Educação, Ciência e Tecnologia do Espírito Santo, Campus Serra, ES
- Fidelis Zanetti de Castro
  Instituto Federal de Educação, Ciência e Tecnologia do Espírito Santo, Campus Serra, ES
Keywords:
Machine Learning, Natural Language Processing, Service Desk Systems, Classification
Abstract
Service Desk systems hold a rich information base in the history of past calls, which can and should serve as a reference for handling subsequent calls. Standard search tools, such as keyword search, are impractical on large databases because queries take a long time and return results unrelated to the problem. This work investigates how well some classical classification algorithms can detect a characteristic defined here as "relevance": the property of a text that contains knowledge that can be reused. The motivation is that non-relevant texts can be removed from the dataset early, so that more complex algorithms can then be applied to a smaller amount of information. The tests used the Naive Bayes, Adaptive Boosting, Random Forest, Stochastic Gradient Descent, Logistic Regression, Support Vector Machine, and Light Gradient Boosting Machine classifiers. All classifiers showed accuracy below 0.8, indicating that, in this scenario, other, more effective approaches should be used.
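The relevance-filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a toy set of invented call texts, whitespace tokenization, and a hand-written multinomial Naive Bayes with add-one smoothing (one of the classifier families named above). The function names `train_naive_bayes`, `predict`, and `filter_relevant` and the labels `"relevant"` / `"not_relevant"` are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; a real pipeline would do more.
    return text.lower().split()


def train_naive_bayes(texts, labels):
    """Fit a multinomial Naive Bayes model (word counts per class)."""
    class_counts = Counter(labels)
    word_counts = {label: Counter() for label in class_counts}
    for text, label in zip(texts, labels):
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior for `text`."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                # Add-one (Laplace) smoothed word likelihood.
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def filter_relevant(model, texts):
    # Keep only texts predicted as relevant, shrinking the set that
    # a more expensive downstream algorithm would have to process.
    return [t for t in texts if predict(model, t) == "relevant"]


# Invented toy data, for illustration only.
train_texts = [
    "restarted the print spooler service and the printer worked again",
    "reset the vpn certificate and the connection was restored",
    "cleared the browser cache to fix the login error",
    "thanks",
    "ticket closed by user",
    "ok resolved",
]
train_labels = ["relevant"] * 3 + ["not_relevant"] * 3

model = train_naive_bayes(train_texts, train_labels)
new_calls = [
    "restarted the vpn service and fixed the connection",
    "thanks ticket closed",
]
kept = filter_relevant(model, new_calls)
```

On this toy data only the first new call is kept, since it shares vocabulary with the texts that describe a reusable fix. Results on real Service Desk data may differ substantially, as the accuracy figures reported in the abstract suggest.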