Back to the Roots of Genres: Text Classification by Language Function

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research

Abstract

The term “genre” covers different aspects of both texts and documents, and it has led to many classification schemes. This makes different approaches to genre identification incomparable and the task itself unclear. We introduce the linguistically motivated text classification task language function analysis, LFA, which focuses on one well-defined aspect of genres. The aim of LFA is to determine whether a text is predominantly expressive, appellative, or informative. LFA can be used in search and mining applications to efficiently filter documents of interest. Our approach to LFA relies on fast machine learning classifiers with features from different research areas. We evaluate this approach on a new corpus with 4,806 product texts from two domains. Within one domain, we correctly classify up to 82% of the texts, but differences in feature distribution limit accuracy on out-of-domain data.

Original language: English
Title of host publication: Proceedings of the 5th International Joint Conference on Natural Language Processing
Editors: Haifeng Wang, David Yarowsky
Publisher: Association for Computational Linguistics (ACL)
Pages: 632-640
Number of pages: 9
ISBN (Electronic): 9789744665645
Publication status: Published - Nov 2011
Externally published: Yes
Event: 5th International Joint Conference on Natural Language Processing, IJCNLP 2011 - Chiang Mai, Thailand
Duration: 8 Nov 2011 – 13 Nov 2011

Conference

Conference: 5th International Joint Conference on Natural Language Processing, IJCNLP 2011
Abbreviated title: IJCNLP 2011
Country/Territory: Thailand
City: Chiang Mai
Period: 8 Nov 2011 – 13 Nov 2011

ASJC Scopus subject areas

  • Language and Linguistics
  • Artificial Intelligence
  • Software
  • Linguistics and Language
