dc.contributor.author |
Paijmans, H |
|
dc.date.accessioned |
2018-06-06T13:46:54Z |
|
dc.date.available |
2018-06-06T13:46:54Z |
|
dc.date.issued |
1998 |
|
dc.identifier.citation |
Paijmans H (1998) Text categorization as an information retrieval task. South African Computer Journal, Number 21, 1998 |
en |
dc.identifier.issn |
2313-7835 |
|
dc.identifier.uri |
http://hdl.handle.net/10500/24285 |
|
dc.description.abstract |
A number of methods for feature reduction and feature selection in text classification and information retrieval systems are compared. These include feature sets that are constructed by Latent Semantic Indexing, 'local dictionaries' in the form of the words that score highest in frequency in positive class examples and feature sets that are constructed by relevance feedback strategies such as Rocchio's feedback algorithm or Genetic algorithms. Also, different derivations from the normal Recall and Precision performance indicators are discussed and compared. It was found that categorizers consisting of the words with highest tf.idf values scored best. |
en |
dc.language.iso |
en |
en |
dc.publisher |
South African Computer Society (SAICSIT) |
en |
dc.subject |
Machine learning |
en |
dc.subject |
Classification |
en |
dc.title |
Text categorization as an information retrieval task |
en |
dc.type |
Article |
en |