NeuralGarage Harnesses Generative AI to Enhance the Naturalness of Dubbed Shows

Startup Stories, Sep 27, 2023

Imagine Spanish actors effortlessly speaking Tamil, or at least appearing to do so naturally. This seemingly improbable feat becomes feasible with the intervention of generative AI, according to the founders of Bengaluru-based deep tech startup NeuralGarage.

In today's world, content owners and distributors often dub content in multiple languages to expand their audience reach. However, this approach frequently results in a disjointed viewing experience, as the lip and jaw movements of the actors don't align with the dubbed words. For instance, picture a Spanish show dubbed in Tamil; the audio is in Tamil, but the actors on screen still appear to be speaking Spanish. This mismatch can be off-putting for viewers, potentially causing them to lose interest in the content.

NeuralGarage seeks to solve this problem with its flagship product, VisualDub. The concept for VisualDub emerged from a personal experience. Anjan Banerjee, one of NeuralGarage's founders, is a passionate fan of Korean shows and movies. While watching the Korean movie "Train to Busan" dubbed in English, he experienced a disconnect because the dubbed audio didn't synchronize with the actors' facial movements. This disconnect prevented him from fully immersing himself in the story and appreciating its visual aspects.

Driven by curiosity and a desire to bridge the gap between audio and video, Banerjee began exploring the potential of AI. He teamed up with his batchmates from IIT Kanpur, Subhabrata Debnath and Subhashish Saha, to research the possibilities of leveraging AI to address this issue. They reached out to Mandar Natekar, a media and entertainment veteran, for guidance and mentorship, leading to the creation of NeuralGarage.

VisualDub aims to reduce audio-visual disparities in dubbed content by synchronizing the lip and jaw movements of actors with the dubbed audio. It employs proprietary algorithms that map phonemes (the smallest units of sound that distinguish one word from another) to visemes (the corresponding mouth shapes). According to the company, these mappings hold across languages, which lets the same approach synchronize lips regardless of the language pair. The technology then uses generative AI to regenerate lip, jaw, and chin movements so that dubbed content looks realistic and natural.
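To make the phoneme-to-viseme idea concrete, here is a minimal sketch in Python. It is not VisualDub's proprietary algorithm, which the company has not published; the table, the viseme labels, and the `to_visemes` function are all hypothetical and heavily simplified. It only illustrates that several phonemes can share one mouth shape, so a sequence of sounds reduces to a shorter sequence of visible shapes.

```python
# Hypothetical, simplified phoneme-to-viseme table for illustration only.
# Real systems use far larger inventories and timing information.
PHONEME_TO_VISEME = {
    "p": "lips_closed", "b": "lips_closed", "m": "lips_closed",
    "f": "lip_to_teeth", "v": "lip_to_teeth",
    "a": "open_jaw",
    "o": "rounded_lips", "u": "rounded_lips",
    "i": "spread_lips", "e": "spread_lips",
}


def to_visemes(phonemes):
    """Map phonemes to viseme labels, merging consecutive repeats."""
    visemes = []
    for phoneme in phonemes:
        # Unknown phonemes fall back to a neutral mouth shape.
        viseme = PHONEME_TO_VISEME.get(phoneme, "neutral")
        if not visemes or visemes[-1] != viseme:
            visemes.append(viseme)
    return visemes


if __name__ == "__main__":
    # "p" and "b" look the same on screen, so they collapse into one shape.
    print(to_visemes(["p", "b", "a"]))
```

Because the lookup is many-to-one, different words in different languages can call for the same visible mouth movements, which is the property a lip-sync system relies on.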

NeuralGarage offers VisualDub through API integration, SaaS, and desktop software. The technology is applied as a layer on top of already dubbed content, so it does not interfere with the dubbing process itself. The startup uses Amazon Web Services for client delivery, prioritizing security and privacy. The company says its AI and computer vision algorithms are also aimed at improving how content is created, delivered, and consumed.

VisualDub has been tested in over 30 languages worldwide, including many Indian languages and international languages like Italian, German, Spanish, Japanese, Korean, and Mandarin.

NeuralGarage caters to various verticals, including advertising, influencer marketing, content creation, OTT, and films, with film and edtech content contributing to over 90% of its revenues. The startup boasts a client list that includes Amazon India, Microsoft, Hippo Video, and Pixis.

Backed by institutional VC fund Exfinity Venture Partners and prominent angel investors such as Amit Patni, NeuralGarage has secured $1.45 million in seed funding. The startup aims to drive commercial testing and adoption of VisualDub across verticals this year, targeting revenue of $1 million and 50 clients by year-end. India's generative AI landscape comprises more than 60 startups offering solutions and services across diverse industries.