10 Comments
Christian Graham

I so enjoyed this, Andrew. Not only a great piece of analysis, but some damn fine storytelling too!

Andrew McAfee

Thanks!

Daniel Tonkin

Outstanding. I learned so much useful and interesting stuff from reading this. Thank you, Azeem and Andrew.

Andrew McAfee

Thanks! Glad you liked it.

Tanj

If your trick is to compare the same people across two time periods, the second being when AI arrived, then what happened to the non-AI group over those same periods? The graphs suggest there was a steady improvement over time in both groups (perhaps some other technology was arriving at the same time). In other words, where is your control group?

Tanj

Why should we believe the counterfactual selection was valid? You gave no explanation of how it was done. Considering how little thought went into explaining the programmers (for example, you never considered whether the ones who did not use AI were doing different kinds of programming that inherently change more slowly, or were doing things like more testing before pulls), why should your magic alternative universe and its error bar be trusted more than a hallucination?

rick davies

Where does the "difference in difference" method fall in your analysis? Does it deal with selection effects? https://chatgpt.com/share/67dad7c2-784c-8013-8e14-5c71da2fa48c

Andrew McAfee

Not entirely, no. As your chat with ChatGPT shows, if the two groups are on different trends (if, for example, one group of people in a weight loss study are dieting and exercising more than the other) then a DiD analysis (did taking a pill make the people in group A lose more weight than those in group B?) will not give you reliable answers.
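
To make the weight loss example concrete, here is a minimal sketch of the basic DiD arithmetic. The function name `did_estimate` and all of the numbers are hypothetical, made up purely for illustration; nothing here comes from the data discussed in the post.

```python
from statistics import mean


def did_estimate(treated_before, treated_after, control_before, control_after):
    """Basic difference-in-differences: the change in the treated group
    minus the change in the control group."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Hypothetical weights in kg (illustrative only, not real study data).
group_a_before = [90, 92, 88]  # group A, before taking the pill
group_a_after = [85, 87, 83]   # group A, after taking the pill
group_b_before = [91, 89, 90]  # group B (no pill), same first period
group_b_after = [89, 87, 88]   # group B (no pill), same second period

estimate = did_estimate(group_a_before, group_a_after,
                        group_b_before, group_b_after)
print(estimate)
```

The estimate is only as good as the assumption that both groups would have followed the same trend without the pill. If group A was also dieting and exercising more than group B, part or all of the gap between the two changes comes from that, and the calculation has no way to separate it from the effect of the pill.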

rick davies

Another common problem, in my field at least, is that before-measures can differ greatly depending on when the measure was taken, as seems likely with your scatter plot above, where behavior was very variable from moment to moment.

Andrew McAfee

Yes, exactly. This is why it's important to gather data for a long time.
