Performance comparison of different closed- and open-source LMMs on CAMEL-Bench.
# | Model | Source | ALL | MM Understand. & Reasoning | OCR & Document Understanding | Charts & Diagrams Understanding | Video Understanding | Cultural Specific Understanding | Medical Imaging | Agro Specific | Remote Sensing |
1 | GPT-4o 🥇 | Link | 62.40 | 57.90 | 59.11 | 73.57 | 74.27 | 80.86 | 49.90 | 80.75 | 22.85 |
2 | GPT-4o-mini 🥈 | Link | 53.42 | 48.82 | 42.89 | 64.98 | 68.11 | 65.92 | 47.37 | 79.58 | 16.93 |
3 | Gemini-1.5-Pro 🥉 | Link | 44.06 | 46.67 | 36.59 | 47.06 | 42.94 | 56.24 | 33.77 | 72.12 | 17.07 |
4 | Gemini-1.5-Flash | Link | 45.14 | 45.58 | 33.59 | 48.25 | 53.31 | 46.54 | 42.86 | 76.06 | 14.95 |
5 | Qwen2-VL | Link | 32.62 | 40.59 | 25.68 | 27.83 | 38.90 | 34.27 | 29.12 | 52.02 | 12.56 |
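
The source does not say how the ALL column is computed. A minimal sketch, assuming ALL is the unweighted mean of the eight domain scores, can check that assumption against the rows in the table above. The dictionary layout and the `domain_mean` helper are illustrative names, not part of CAMEL-Bench. Under this assumption the recomputed mean agrees with the reported ALL to within rounding for four of the five rows; the GPT-4o-mini row differs by roughly 0.9 points, so the assumption may not hold exactly for every entry.

```python
# Scores transcribed from the table above: model -> (reported ALL, eight domain scores).
# Domain order: MM Understanding & Reasoning, OCR & Document, Charts & Diagrams, Video,
# Cultural Specific, Medical Imaging, Agro Specific, Remote Sensing.
scores = {
    "GPT-4o": (62.40, [57.90, 59.11, 73.57, 74.27, 80.86, 49.90, 80.75, 22.85]),
    "GPT-4o-mini": (53.42, [48.82, 42.89, 64.98, 68.11, 65.92, 47.37, 79.58, 16.93]),
    "Gemini-1.5-Pro": (44.06, [46.67, 36.59, 47.06, 42.94, 56.24, 33.77, 72.12, 17.07]),
    "Gemini-1.5-Flash": (45.14, [45.58, 33.59, 48.25, 53.31, 46.54, 42.86, 76.06, 14.95]),
    "Qwen2-VL": (32.62, [40.59, 25.68, 27.83, 38.90, 34.27, 29.12, 52.02, 12.56]),
}


def domain_mean(domain_scores):
    """Unweighted mean over the domain columns (assumed aggregation, not confirmed by the source)."""
    return sum(domain_scores) / len(domain_scores)


# For each model, store (reported ALL, recomputed mean, absolute gap between them).
report = {}
for model, (reported_all, domains) in scores.items():
    mean = domain_mean(domains)
    report[model] = (reported_all, mean, abs(mean - reported_all))
    print(f"{model:18s} reported={reported_all:6.2f} mean={mean:6.2f} gap={report[model][2]:.2f}")
```

A gap below 0.01 means the reported value matches the recomputed mean at two-decimal precision; a larger gap flags a row worth re-checking against the original leaderboard.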