Putting GPT-4's new rivals to the test
Can Claude 3 and Gemini Ultra beat GPT-4 at everyday business tasks?
For the past year, GPT-4 has reigned supreme, with no other model even coming close to its prowess. However, in recent weeks, an oligarchy has emerged, challenging its dominance. Gemini Ultra and Claude 3 (Opus) have risen as formidable rivals, at least according to abstract benchmarks, threatening to dethrone the once-uncontested ruler of the AI landscape.
However, benchmarks often rely on questions that don't align with how people practically use these models in everyday scenarios. So I decided to test the models on tasks I often perform day-to-day, including summarisation, providing a critique of an article and market sizing. These are very typical tasks for anyone in business. To ensure a fair and comparable evaluation, I refrained from any attempt at prompt engineering, keeping the prompts plain and simple, allowing each model's inherent capabilities to shine through.
PROMPT 1: Summarise the key points, action items and decisions made during the meeti…