👀 Putting GPT-4's new rivals to the test

Can Claude 3 and Gemini Ultra beat GPT-4 at everyday business tasks?

Mar 13, 2024

For the past year, GPT-4 has reigned supreme, with no other model coming close to its prowess. In recent weeks, however, serious challengers have emerged. Gemini Ultra and Claude 3 (Opus) have risen as formidable rivals, at least according to abstract benchmarks, threatening to dethrone the once-uncontested ruler of the AI landscape.

Benchmarks, though, often rely on questions that don’t reflect how people actually use these models. So I decided to test the models on tasks I perform day-to-day: summarising a meeting, critiquing an article and sizing a market. These are typical tasks for anyone in business. To keep the evaluation fair and comparable, I did no prompt engineering. The prompts were plain and simple, so that each model’s inherent capabilities could shine through.

PROMPT 1: Summarise the key points, action items and decisions made during the meeti…

Continue reading this post for free, courtesy of Azeem Azhar.

Exponential View
