Qwen3.5-Omni has been introduced as the next generation of the Qwen model, designed for native understanding of text, image, audio, and video. The system features major improvements in intelligence and real-time interaction. A key highlight is its 'Audio-Visual Vibe Coding' capability, which allows users to describe a vision to the camera and have Qwen3.5-Omni-Plus instantly create a functional website or game. The model family includes Plus, Flash, and Light variants.

For offline use, Qwen3.5-Omni offers script-level captioning that generates detailed video scripts with timestamps, scene cuts, and speaker mapping. It reportedly surpasses Gemini-3.1 Pro in audio performance and matches it in audio-visual understanding. The model can handle up to 10 hours of audio or 400 seconds of 720p video and has been trained on more than 100 million hours of data. It recognizes speech in 113 languages and can speak 36.

Real-time features include fine-grained voice control for emotion and pace, built-in web search, complex function calling, and voice cloning from short samples. The system also supports human-like conversation with smart turn-taking that filters background noise.

News Source

x.com 31 Mar 26

Qwen3.5-Omni is here! Scaling up to a native omni-modal AGI

Meet the next generation of Qwen, designed for native text, image, audio, and video understanding, with major advances in both intelligence and real-time interaction. A standout feature: 'Audio-Visual Vibe Coding'. Describe your vision to the came

