Researchers at the Allen Institute for AI have developed a novel framework called I2D2 that generates high-quality commonsense knowledge using AI models far smaller than state-of-the-art systems.
The key innovation in I2D2 is combining constrained text generation with self-imitation learning to iteratively improve a small language model such as GPT-2. During constrained decoding, lexical constraints steer the model toward simple, generic statements about concepts. A supervised critic model then filters out invalid statements. In self-imitation learning, the model is fine-tuned on its own high-quality generations to steer its distribution toward better samples.
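To make the loop concrete, here is a minimal, illustrative sketch of one generate-filter-imitate iteration. The prompt template, critic heuristic, and threshold are assumptions for illustration, not details from the paper; the actual system uses NeuroLogic constrained decoding and a trained critic, both simplified away here.

```python
# Sketch of an I2D2-style iteration: generate candidates, filter with a
# critic, keep the survivors as fine-tuning data for self-imitation.
# Model names, prompt template, and threshold are illustrative assumptions.

from transformers import GPT2LMHeadModel, GPT2Tokenizer


def generate_candidates(model, tokenizer, concept, n=10):
    """Sample candidate generic statements about a concept.

    Real I2D2 applies lexical constraints during decoding; this stand-in
    simply prompts the model and samples."""
    prompt = f"{concept} are"  # assumed prompt template
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        do_sample=True,
        top_p=0.9,
        max_new_tokens=12,
        num_return_sequences=n,
        pad_token_id=tokenizer.eos_token_id,
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]


def critic_score(statement):
    """Placeholder for the supervised critic; returns a validity score in [0, 1].
    The real critic is a trained classifier, not this toy length check."""
    return 1.0 if len(statement.split()) > 3 else 0.0


def self_imitation_iteration(model, tokenizer, concepts, threshold=0.5):
    """One iteration: generate, filter with the critic, and collect the
    accepted statements as fine-tuning data for the next round."""
    kept = []
    for concept in concepts:
        for statement in generate_candidates(model, tokenizer, concept):
            if critic_score(statement) >= threshold:
                kept.append(statement)
    # In the full method, the model would now be fine-tuned on `kept`
    # (self-imitation), shifting its distribution toward valid generics.
    return kept


if __name__ == "__main__":
    tok = GPT2Tokenizer.from_pretrained("gpt2")
    lm = GPT2LMHeadModel.from_pretrained("gpt2")
    print(self_imitation_iteration(lm, tok, ["dogs", "bicycles"])[:5])
```

Each pass through this loop produces a cleaner training set than the last, which is what lets a small model improve on its own outputs over successive iterations.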
After two self-imitation iterations, I2D2 achieved 92% precision in identifying valid commonsense generics, handily beating GPT-3's 82% precision, even though GPT-3 is orders of magnitude larger than the GPT-2 model I2D2 builds on. The research demonstrates that bigger models are not the only path to improving AI capabilities: with the right training algorithms, smaller models can surpass their larger counterparts.
Using I2D2, the researchers created Gen-A-tomic, a knowledge base of over 30 million commonsense statements about 40,000 concepts. Human evaluation found Gen-A-tomic to be more diverse and accurate than existing resources such as GenericsKB. Moreover, I2D2 can generate such knowledge on demand for any concept.
The ability to acquire commonsense and general world knowledge is a long-standing challenge for AI. While large models have shown impressive improvements, their opacity, carbon footprint, and potential for harm have raised concerns. I2D2 shows that developing better training methods is a promising path to overcoming these limitations. If perfected, small yet capable commonsense models could broaden access and enable safer deployment.