Adept has released its new multimodal model, Fuyu-Heavy, which has been designed specifically for digital agents.
According to Adept, it is the third-most capable multimodal model in the world, outranked only by GPT-4V and Gemini Ultra, both of which are 10 to 20 times larger.
Fuyu-Heavy excels at multimodal reasoning and UI understanding, and scores higher on the MMMU benchmark than even Gemini Pro.
On text-based benchmarks, the model matches or exceeds the performance of other models in its compute class, despite having to devote part of its capacity to image modelling.