Multimodal Foundation Models: From Specialists to General-Purpose Assistants

Name: Multimodal Foundation Models: From Specialists to General-Purpose Assistants
Price: 83.46 USD
Availability: InStock
Author: Li, Chunyuan
ISBN: 9781638283362

List Price: ~~$99.00~~

Save: $15.54 (15%)

In Stock - Ship in 24 hours with Free Online tracking.
FREE DELIVERY by Tuesday, July 22, 2025

Description

This monograph presents a comprehensive survey of the taxonomy and evolution of multimodal foundation models that demonstrate vision and vision-language capabilities, focusing on the transition from specialist models to general-purpose assistants.
The survey covers five core topics in two classes: (i) well-established research areas, namely multimodal foundation models pre-trained for specific purposes, covering two topics: methods of learning vision backbones for visual understanding, and text-to-image generation; and (ii) recent advances in exploratory, open research areas, namely multimodal foundation models that aim to play the role of general-purpose assistants, covering three topics: unified vision models inspired by large language models (LLMs), end-to-end training of multimodal LLMs, and chaining multimodal tools with LLMs.
The monograph is aimed at researchers, graduate students, and professionals in the computer vision and vision-language multimodal communities who want to learn the basics of, and recent advances in, multimodal foundation models.

Product Details

Brand: Now Publishers
Pub Date: May 6, 2024
ISBN-10: 1638283362
ISBN-13: 9781638283362
Language: English
Dimensions: 9.21 in × 0.48 in × 6.14 in
Weight: 1 lb
