After ChatGPT, Microsoft working on AI model that takes images as cues

By Binu Mathew | March 4, 2023

As the war over artificial intelligence (AI) chatbots heats up, Microsoft has unveiled Kosmos-1, a new AI model that can respond to visual cues or images in addition to text prompts or messages.

The multimodal large language model (MLLM) can help with an array of new tasks, including image captioning and visual question answering.

Kosmos-1 can pave the way for the next stage beyond ChatGPT’s text prompts.

“A big convergence of language, multimodal perception, action, and world modeling is a key step toward artificial general intelligence. In this work, we introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context and follow instructions,” said Microsoft’s AI researchers in a paper.

More in Business Standard
