Tech Times | Making AI smarter, faster

Published: Friday | March 10, 2017 | 12:00 AM

Most of us can recognise an object after seeing it once or twice. But the algorithms that power computer vision and voice recognition need thousands of examples to understand a new image or word.

Researchers at Google DeepMind now have a way around this. They made a few clever tweaks to a deep-learning algorithm so that a system can recognise objects in images, and other things, from a single example - something known as 'one-shot learning.' The team demonstrated the trick on a large database of tagged images, as well as on handwriting and language.

The best algorithms can recognise things reliably, but their need for data makes building them time-consuming and expensive. An algorithm trained to spot cars on the road, for instance, needs to ingest many thousands of examples to work reliably in a driverless car. Gathering so much data is often impractical - a robot that needs to navigate an unfamiliar home can't spend countless hours wandering around learning.

Oriol Vinyals, a research scientist at Google DeepMind, a UK-based subsidiary of Alphabet that's focused on artificial intelligence, added a memory component to a deep-learning system - a type of large neural network. Such systems normally need to see lots of images to fine-tune the connections between their virtual neurons. The team demonstrated the capabilities of the system on a database of labelled photographs called ImageNet. The software still needs to analyse several hundred categories of images first, but after that it can learn to recognise new objects from just one picture. It effectively learns to recognise the characteristics in images that make them unique. After seeing just one example, the algorithm was able to recognise images of dogs with an accuracy close to that of a conventional data-hungry system.
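
For readers who want a feel for how recognising something from one example can work, here is a rough sketch in Python. It is not DeepMind's code, and the article does not spell out the mechanism. The sketch simply assumes that each image has already been boiled down to a short list of numbers describing its characteristics, keeps one such list per label in a small "memory", and labels a new image by comparing it with everything stored. The function names and the toy numbers are made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def one_shot_classify(query, memory):
    """Label `query` using a memory of (feature_vector, label) pairs.

    Assumption: the feature vectors come from a network that was trained
    beforehand on other categories. Here they are hand-written toy values.
    """
    sims = [cosine(query, vec) for vec, _ in memory]

    # Softmax turns the similarities into weights that sum to 1.
    peak = max(sims)
    weights = [math.exp(s - peak) for s in sims]
    total = sum(weights)

    # Each stored example votes for its own label with its weight.
    scores = {}
    for w, (_, label) in zip(weights, memory):
        scores[label] = scores.get(label, 0.0) + w / total

    best = max(scores, key=scores.get)
    return best, scores


# One stored example per label (hypothetical toy features).
memory = [
    ([0.9, 0.1, 0.0], "dog"),
    ([0.1, 0.9, 0.1], "cat"),
    ([0.0, 0.2, 0.9], "car"),
]

label, scores = one_shot_classify([0.8, 0.2, 0.1], memory)
print(label, scores)
```

Adding a new category in this sketch means appending one more entry to the memory, with no retraining. In a real system the hard part is the earlier training that produces useful features in the first place.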

Vinyals says the work would be especially useful if the system could quickly recognise the meaning of a new word. This could be important for Google's core search engine, which could then quickly learn the meaning of a search term it hasn't seen before.