
Empirical Evaluation of the Effects of Concept Complexity on Generalization Error
[full paper]
Roberto Esposito
In this paper we focus on the relationship between concept complexity and the generalization error of learned concept descriptions. After introducing the notion of compressibility, we suggest how it can be exploited to estimate, from the training data, the Kolmogorov complexity of the concept to be learned. We then present an empirical apparatus that allows us to study the relationship between the estimated target concept complexity and the generalization error of different learning algorithms. The results show a linear relationship between the two variables: the generalization error increases as the target concept becomes more complex. While this is expected, it is quite interesting that the relationship appears to be (only) linear, since known theoretical bounds exhibit superlinear behavior. Another interesting finding is that, although the degree of correlation changes across learners, the linear relationship does not seem to be affected by the particular learning algorithm.
Keywords: machine learning, generalization error, complexity
Citation: Roberto Esposito: Empirical Evaluation of the Effects of Concept Complexity on Generalization Error. In R. López de Mántaras and L. Saitta (eds.): ECAI 2004, Proceedings of the 16th European Conference on Artificial Intelligence, IOS Press, Amsterdam, 2004, pp. 1009-1010.