Scripps Research scientists create AI that 'watches' videos by mimicking the brain

A new, more sustainable AI model recognizes visual scenes by mirroring brain processes, opening doors for applications in medical diagnostics, drug discovery and beyond

14 Jan 2025

Scientists at Scripps Research have created MovieNet, an innovative artificial intelligence (AI) model that watches and interprets moving images much as our brains interpret real-life scenes as they unfold over time.

Left to right: Hollis Cline, PhD, and Masaki Hiramoto, PhD. Credit: Scripps Research.

This brain-inspired AI model, detailed in a study published in the Proceedings of the National Academy of Sciences, perceives moving scenes by simulating how neurons, or brain cells, make real-time sense of the world. Conventional AI excels at recognizing still images, but MovieNet introduces a method for machine-learning models to recognize complex, changing scenes.

“The brain doesn’t just see still frames; it creates an ongoing visual narrative,” says senior author Dr. Hollis Cline, the director of the Dorris Neuroscience Center and the Hahn Professor of Neuroscience at Scripps Research. “Static image recognition has come a long way, but the brain’s capacity to process flowing scenes — like watching a movie — requires a much more sophisticated form of pattern recognition. By studying how neurons capture these sequences, we’ve been able to apply similar principles to AI.”

To create MovieNet, Cline and first author Dr. Masaki Hiramoto, a staff scientist at Scripps Research, examined how the brain processes real-world scenes as short sequences, similar to movie clips. Specifically, the researchers studied how tadpole neurons responded to visual stimuli.

“Tadpoles have a very good visual system, plus we know that they can detect and respond to moving stimuli efficiently,” explains Hiramoto.

He and Cline identified neurons that respond to movie-like features — such as shifts in brightness and image rotation — and can recognize objects as they move and change. Located in the brain’s visual processing region known as the optic tectum, these neurons assemble parts of a moving image into a coherent sequence. Different neurons process various 'puzzle pieces' of a real-life moving image, which the brain then integrates into a continuous scene.

The researchers also found that the tadpoles’ optic tectum neurons distinguished subtle changes in visual stimuli over time, capturing information in dynamic clips of roughly 100 to 600 milliseconds rather than in still frames. These neurons are highly sensitive to patterns of light and shadow, and each neuron’s response to a specific part of the visual field helps construct a detailed map of a scene to form a 'movie clip'.

Cline and Hiramoto trained MovieNet to emulate this brain-like processing and encode video clips as a series of small, recognizable visual cues. This permitted the AI model to distinguish subtle differences among dynamic scenes.

To test MovieNet, the researchers showed it video clips of tadpoles swimming under different conditions. Not only did MovieNet achieve 82.3 percent accuracy in distinguishing normal from abnormal swimming behaviors, but it also exceeded the abilities of trained human observers by about 18 percent. It even outperformed existing AI models, which achieved just 72 percent accuracy despite their extensive training and processing resources.

Beyond its high accuracy, MovieNet is an eco-friendly AI model. Conventional AI processing demands immense energy, leaving a heavy environmental footprint. MovieNet’s reduced data requirements offer a greener alternative that conserves energy while performing at a high standard.

In addition, as the technology advances, MovieNet could become a valuable tool for identifying subtle changes in early-stage conditions, such as detecting irregular heart rhythms or spotting the first signs of neurodegenerative diseases like Parkinson’s. For example, small motor changes related to Parkinson’s that are often hard for human eyes to discern could be flagged by the AI early on, providing clinicians valuable time to intervene.

Furthermore, MovieNet’s ability to perceive changes in tadpole swimming patterns when tadpoles were exposed to chemicals could lead to more precise drug screening techniques, as scientists could study dynamic cellular responses rather than relying on static snapshots.
