Title: Fast and far generalization from sparse data
Abstract: Much learning is slow, incremental, and limited in generalizability. However, there are clear cases of fast learning with far generalization from quite limited input data. In this talk, I will present a case from a real-world learning domain that provides insights into both incremental learning and the origins of fast learning and far generalization from minimal data. Focusing on the case of multi-digit numbers, I will show that the multiple predictive relations in the names and written forms of number symbols enable young learners to generalize robustly to new instances from experience with just a few examples. I will present studies of early symbol knowledge that is independent of the physical quantities the symbols refer to, as well as studies that demonstrate the rapid learning of this knowledge by both preschoolers and a deep neural network. I will argue that this early learning has the usual characteristics of incremental learning and is not rule-based. Instead, rapid learning and far generalization emerge because the surface properties of multi-digit number names and their written forms present many redundant, overlapping, and co-predicting features that provide imperfect but multiple pathways to the same generalizable principles. I will conjecture that this form of data structure characterizes many of the knowledge domains that support “few-shot” learning and far generalization. I will also propose, and present initial evidence, that this early implicit learning about multi-digit numbers sets the stage for later learning of explicit and generative rules.