This paper analyzes the performance of semi-supervised learning of mixture models. We show that unlabeled data can lead to an increase in classification error even in situations where additional labeled data would decrease classification error. We present a mathematical analysis of this "degradation" phenomenon and show that it arises because bias may be adversely affected by unlabeled data. We discuss the implications of these theoretical results for practical situations.
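The setting can be made concrete with a short sketch in standard notation. The symbols are assumptions introduced here for illustration and do not appear in the abstract: $\theta$ denotes the mixture parameters, $(x_i, y_i)$ for $i = 1, \dots, N_l$ are the labeled samples, and $x_j$ for $j = 1, \dots, N_u$ are the unlabeled samples. A maximum-likelihood learner would then maximize a combined log-likelihood of roughly this form:

```latex
\[
L(\theta) \;=\; \sum_{i=1}^{N_l} \log p(x_i, y_i \mid \theta)
\;+\; \sum_{j=1}^{N_u} \log \sum_{y} p(x_j, y \mid \theta)
\]
```

Under the further assumption that the chosen model family does not contain the true distribution, the two sums are generally maximized at different values of $\theta$. As $N_u$ grows, the second sum dominates and pulls the estimate toward the parameters that best fit the marginal $p(x)$, which need not yield a good classifier. This is one plausible reading of how unlabeled data could adversely affect bias.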
|Original language||American English|
|Number of pages||8|
|State||Published - 1 Dec 2003|
|Event||Proceedings, Twentieth International Conference on Machine Learning - Duration: 1 Dec 2003 → …|