Businesses produce more video than ever before. Years of broadcast archives, thousands of store cameras, and countless hours of production footage sit unused on servers, unmonitored and unanalyzed. This is dark data: a vast, untapped resource that companies collect automatically but almost never use in a meaningful way.
To solve the problem, Aza Kai (CEO) and Hiraku Yanagita (COO), two former Googlers who spent nearly a decade working together at Google Japan, decided to build their own solution. The duo co-founded InfiniMind, a Tokyo-based startup developing infrastructure that converts petabytes of unseen video and audio into structured, queryable business data.
“My co-founder, who spent a decade leading brand and data solutions at Google Japan, and I saw this turning point coming while we were still at Google,” Kai said. By 2024, the technology had matured, and the market demand had become so clear that the co-founders felt compelled to build the company themselves, he added.
Kai, who previously worked at Google Japan across the cloud, machine learning, ad systems, and video recommendation models and later led data science teams, explained that current solutions force a trade-off. Earlier methods could place objects in individual frames, but they could not track narratives, understand causality, or answer complex questions about video content. For clients with decades of broadcast archives and petabytes of footage, even basic questions about their content often go unanswered.
What really changed was the development of vision-language models between 2021 and 2023. That’s when video AI began to move beyond simple object tagging, Kai said. Falling GPU costs and annual performance gains of roughly 15% to 20% over the past decade have helped, but the bigger story is capability — until recently, the models couldn’t do the job, he told TechCrunch.
InfiniMind recently secured $5.8 million in seed funding, led by UTEC and joined by CX2, Headline Asia, Chiba Dojo, and an AI researcher at a16z Scout. The company moved its headquarters to the US, while it continues to operate an office in Japan. Japan provides the perfect testbed: solid hardware, talented engineers, and a supportive startup ecosystem, which allows the team to fine-tune its technology with demanding customers before going global.
Its first product, TV Pulse, launched in Japan in April 2025. The AI-powered platform analyzes television content in real time, helping media and retail companies “track product exposure, brand presence, customer sentiment, and PR impact,” per the startup. After pilot programs with major broadcasters and agencies, it now has paying customers, including wholesalers and media companies.
Now, InfiniMind is ready for the international market. Its flagship product, DeepFrame, a long-form video intelligence platform capable of processing 200 hours of footage to identify specific scenes, speakers, or events, is scheduled for a beta release in March, followed by a full launch in April 2026, Kai said.

The video analysis space is fragmented. Companies like TwelveLabs provide general-purpose video understanding APIs for a wide range of users, including consumers, prosumers, and businesses, Kai said, while InfiniMind focuses specifically on business use cases: monitoring, safety, security, and analyzing video content for deeper understanding.
“Our solution requires no code; clients bring in their data, and our system processes it, providing actionable insights,” says Kai. “We also integrate understanding of audio, voice, and speech, not just visuals. Our system can handle unlimited video length, and cost efficiency is a big differentiator. Most existing solutions focus on accuracy or specific use cases but do not solve cost challenges.”
The seed funding will help the team continue developing the DeepFrame model, expand its engineering infrastructure, hire more engineers, and reach more customers across Japan and the US.
“This is an exciting space, one of the pathways to AGI,” Kai said. “Understanding video general intelligence is about understanding reality. Industrial applications are important, but our ultimate goal is to push the boundaries of technology to better understand reality and help people make better decisions.”