The task of information filtering is to classify documents from a
stream as relevant or non-relevant with respect to a particular category or
user interest, which may change over time. A
filtering system should be able to adapt to such concept changes. This paper
explores methods to recognize concept changes and to maintain windows on the
training data, whose size is either fixed or automatically adapted to the
current extent of concept change. Experiments with two simulated concept drift
scenarios based on real-world text data and eight learning methods are
performed to evaluate three indicators for concept changes and to compare
approaches with fixed and adjustable window sizes to each other
and to learning on all previously seen examples. Even a simple
window on the data already improves the performance of the classifiers
significantly compared to learning on all examples. For most of the classifiers,
the window adjustments lead to a further increase in performance compared to
windows of fixed size. The chosen indicators allow concept changes to be
recognized reliably.
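The general idea of an adaptively sized training window can be illustrated with a small sketch. This is not the paper's exact algorithm; it uses a hypothetical error-rate indicator (recent error rate rising above a long-term baseline by more than a chosen threshold) as the drift signal, and shrinks the window when that signal fires so the learner retrains mostly on post-change examples:

```python
from collections import deque

class AdaptiveWindow:
    """Sliding window of training examples whose size shrinks when a
    drift indicator fires. Illustrative sketch only; the threshold and
    window sizes are arbitrary assumptions, not values from the paper."""

    def __init__(self, max_size=200, min_size=20, shrink_factor=0.5):
        self.min_size = min_size
        self.shrink_factor = shrink_factor
        self.window = deque(maxlen=max_size)
        self.recent_errors = deque(maxlen=20)    # short-term error history
        self.baseline_errors = deque(maxlen=100) # long-term error history

    def add(self, example, was_error):
        """Record a new example and whether the classifier misclassified it."""
        self.window.append(example)
        self.recent_errors.append(was_error)
        self.baseline_errors.append(was_error)

    def drift_detected(self, threshold=0.2):
        """Indicator: recent error rate exceeds the long-term rate by
        more than `threshold` (a hypothetical cutoff)."""
        if len(self.recent_errors) < self.recent_errors.maxlen:
            return False
        recent = sum(self.recent_errors) / len(self.recent_errors)
        baseline = sum(self.baseline_errors) / len(self.baseline_errors)
        return recent - baseline > threshold

    def adjust(self):
        """On drift, drop the oldest examples so retraining focuses on
        data observed after the concept change."""
        if self.drift_detected():
            keep = max(self.min_size,
                       int(len(self.window) * self.shrink_factor))
            while len(self.window) > keep:
                self.window.popleft()
            self.recent_errors.clear()
            return True
        return False
```

A fixed-size window corresponds to never calling `adjust()`: the `deque`'s `maxlen` alone then discards the oldest example as each new one arrives.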