Why the AGI discussion is heating up again
Every now and then, arguments crop up about artificial general intelligence (AGI) being just around the corner, and right now we are in the midst of one of those cycles. Tech entrepreneurs warn of the looming arrival of AGI as if it were an alien invasion. The media is inundated with reports of AI systems that master language and inch toward generalization. And social media is full of heated discussions about deep neural networks and consciousness.
In recent years, we’ve made some really impressive advances in AI, and scientists have been able to make headway in some of the most challenging areas of the field.
But as has happened several times during the decades-long history of AI, some of the current rhetoric around AI progress could be unwarranted hype. And there are areas of research that haven’t received much attention, in part because of the growing influence of large tech companies on artificial intelligence.
Overcoming the limits of deep learning
In 2012, a group of researchers won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) by a wide margin using a deep learning model. Since then, deep learning has become the main focus of AI research.
Deep learning has succeeded in making progress on many tasks that were previously very challenging for computers, including image classification, object detection, speech recognition and natural language processing.
However, the growing interest in deep learning also revealed some shortcomings, including limited generalizability, struggles with causality and lack of interpretability. In addition, most deep learning applications required tons of manually annotated training examples, which became a bottleneck.
Interesting developments have taken place in some of these areas in recent years. An important innovation is the transformer model, a deep learning architecture introduced in 2017. An important characteristic of transformers is their scalability: researchers have shown that the performance of transformer models continues to improve as they grow larger and are trained on more data. Transformers can also be pre-trained through unsupervised or self-supervised learning, meaning they can use the terabytes of unlabeled data available on the web.
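To make the self-supervised idea concrete, here is a minimal, library-free sketch of BERT-style token masking, one common way transformers turn raw, unlabeled text into supervised training pairs. The function name, mask rate, and example sentence are illustrative choices, not from the article:

```python
import random

def make_mlm_example(tokens, mask_rate=0.15, rng=None):
    """Turn an unlabeled token sequence into a (masked input, targets)
    training pair: the labels come from the text itself, so no human
    annotation is needed."""
    rng = rng or random.Random(0)  # fixed seed only to make the sketch reproducible
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked.append("[MASK]")
            targets[i] = tok  # the model must predict the original token here
        else:
            masked.append(tok)
    return masked, targets

tokens = "the cat sat on the mat".split()
inp, tgt = make_mlm_example(tokens, mask_rate=0.3)
```

A real pre-training pipeline applies this over billions of web tokens; the point is simply that the "labels" are manufactured from the data itself.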
Transformers have spawned a generation of large language models (LLMs), such as OpenAI’s GPT-3, DeepMind’s Gopher, and Google’s PaLM. In some cases, researchers have shown that LLMs can perform many tasks without additional training or with very few training samples (also known as zero-, one- or few-shot learning). While transformers were initially designed for language tasks, they have expanded into other areas, including computer vision, speech recognition, drug research, and source code generation.
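Few-shot learning in this sense is usually done through the prompt itself: a handful of worked examples are placed in the model's context window, and the model is asked to continue the pattern, with no weight updates at all. A small sketch of how such a prompt is assembled (the format and examples are illustrative assumptions, not a specific model's API):

```python
def few_shot_prompt(examples, query):
    """Assemble a few-shot prompt: the 'learning' happens entirely in the
    model's context window, with no additional training."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    blocks.append(f"Q: {query}\nA:")  # the model is expected to complete this line
    return "\n\n".join(blocks)

prompt = few_shot_prompt(
    [("Translate 'chat' to English.", "cat"),
     ("Translate 'chien' to English.", "dog")],
    "Translate 'oiseau' to English.",
)
```

In zero-shot use, the examples list would simply be empty and only the query would be sent.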
More recent work has focused on bringing multiple modalities together. For example, CLIP, a deep learning architecture developed by researchers at OpenAI, trains a model to find relationships between text and images. Instead of the carefully annotated images used in previous deep learning models, CLIP is trained on images and captions that are abundantly available on the web. This allows it to learn a wide variety of vision and language tasks. CLIP is also used in OpenAI’s DALL-E 2, an AI system that can create stunning images from text descriptions. DALL-E 2 seems to have overcome some of the limitations of previous generative deep learning models, including semantic consistency (i.e., understanding the relationships between different objects in an image).
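The core matching step CLIP learns can be sketched very simply: images and captions are mapped into a shared embedding space, and the caption whose embedding is most similar to the image's embedding wins. The toy vectors below are made-up stand-ins for CLIP's learned embeddings, just to illustrate the comparison:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Assumed toy embeddings; in reality both come from CLIP's trained encoders.
image_emb = [0.9, 0.1, 0.2]
captions = {
    "a photo of a dog": [0.88, 0.15, 0.18],
    "a photo of a car": [0.10, 0.95, 0.05],
}

# Zero-shot classification: pick the caption closest to the image embedding.
best = max(captions, key=lambda c: cosine(image_emb, captions[c]))
```

Training pushes matching image-caption pairs together and mismatched pairs apart in this space, which is what makes the similarity lookup meaningful.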
Gato, DeepMind’s latest AI system, takes the multimodal approach one step further by bringing text, images, proprioceptive information and other types of data into a single transformer model. Gato uses one model to learn and perform many tasks, including playing Atari, captioning images, chatting, and stacking blocks with a real robotic arm. The model has mediocre performance on many of these tasks, but DeepMind’s researchers believe it is only a matter of time before an AI system like Gato can do them all. A research director at DeepMind recently tweeted, “It’s all about scale now! The game is over!”, suggesting that creating larger versions of Gato will eventually achieve general intelligence.
Is deep learning the definitive answer to AGI?
Recent developments in deep learning seem to be in line with the vision of its main proponents. Geoffrey Hinton, Yoshua Bengio and Yann LeCun, three Turing Award-winning scientists known for their pioneering contributions to deep learning, have suggested that better neural network architectures will eventually overcome the current limits of deep learning. LeCun, in particular, is a proponent of self-supervised learning, which is now widely used in training transformers and models like CLIP (although LeCun is working on a more refined variant of self-supervised learning; it is also worth noting that he holds a nuanced opinion on AGI and prefers the term “human-level intelligence”).
On the other hand, some scientists point out that, despite its progress, deep learning still lacks some of the most essential aspects of intelligence. Among them are Gary Marcus and Emily M. Bender, both of whom have thoroughly documented the limits of large language models such as GPT-3 and text-to-image generators such as DALL-E 2.
Marcus, who has written a book about the limits of deep learning, belongs to a group of scientists who advocate a hybrid approach that brings together different AI techniques. One hybrid approach that has recently gained popularity is neuro-symbolic AI, which combines artificial neural networks with symbolic systems, a branch of AI that fell by the wayside with the rise of deep learning.
Several projects show that neuro-symbolic systems can address some of the limits current AI systems suffer from, including the lack of common sense, causal reasoning, compositionality, and intuitive physics. Neuro-symbolic systems have also been found to require much less data and compute than pure deep learning systems.
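The division of labor in such hybrids can be illustrated in miniature: a neural module handles perception and outputs uncertain detections, and a symbolic module then reasons over those detections with explicit rules. Everything below (the stubbed detector, its confidence values, and the rule base) is an invented toy, just to show the shape of the pipeline:

```python
def neural_detect(_image):
    """Stand-in for a neural perception model: returns label -> confidence.
    The values are assumed for illustration; a real system would run a network."""
    return {"cup": 0.90, "table": 0.85}

# Symbolic knowledge: explicit, human-readable rules over detected objects.
RULES = {
    ("cup", "table"): "the cup can rest on the table",
}

def symbolic_infer(detections, threshold=0.5):
    """Keep confident detections, then apply the rule base to derive facts."""
    objects = [label for label, conf in detections.items() if conf >= threshold]
    return [conclusion for (a, b), conclusion in RULES.items()
            if a in objects and b in objects]

facts = symbolic_infer(neural_detect(None))
```

The appeal of the split is that the symbolic half encodes commonsense knowledge directly instead of hoping the network induces it from data, which is one reason such systems can need far fewer examples.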
The role of big tech
The drive to solve AI problems with larger deep learning models has increased the power of companies that can afford the rising costs of research.
In recent years, AI researchers and research labs have been drawn to large tech companies with deep pockets. UK-based DeepMind was acquired by Google in 2014 for $600 million. OpenAI, which started in 2015 as a not-for-profit research lab, switched to a capped-profit outfit in 2019 and received $1 billion in funding from Microsoft. Today, OpenAI no longer releases its AI models as open source projects and has licensed them exclusively to Microsoft. Other big tech companies like Facebook, Amazon, Apple and Nvidia have set up their own money-burning AI research labs and are using lucrative salaries to draw scientists from academia and smaller organizations.
This, in turn, has given these companies the power to steer AI research in a direction that gives them the advantage (i.e. large and expensive deep learning models that only they can fund). While the wealth of big tech has helped deep learning immensely, it has come at the expense of other areas of research, such as neurosymbolic AI.
Nevertheless, for now it seems that throwing more data and computing power at transformers and other deep learning models is still yielding results. It will be interesting to see how far this approach can be stretched, and how close it will bring us to solving the ever-elusive conundrum of thinking machines.