Abstract
Copyright © 2016, Association for the Advancement of Artificial Intelligence (www.aaai.org). We study automatic replies to business email messages in Brazilian Portuguese. We present a novel corpus of messages from a real application, along with baseline categorization experiments using Naive Bayes and Support Vector Machines (SVM). We then discuss the effect of lemmatization and of part-of-speech filtering on precision and recall. SVM classification combined with a non-lemmatized selection of verbs, nouns, adjectives and adverbs was the best approach, with a maximum accuracy of 87.3%. Straightforward lemmatization in Portuguese led to the lowest classification results in the group, with 85.3% and 81.7% precision for SVM and Naive Bayes, respectively. Thus, while lemmatization reduced precision and recall, part-of-speech filtering improved overall results.
Original language | American English |
---|---|
Pages | 496-501 |
Number of pages | 6 |
Status | Published - 1 Jan 2016 |
Externally published | Yes |
Event | AAAI Workshop - Technical Report - Duration: 1 Jan 2016 → … |
Conference
Conference | AAAI Workshop - Technical Report |
---|---|
Period | 1 Jan 2016 → … |