I've been asking most of the LLM tools I use (Perplexity, foundation LLMs, etc.) to provide a confidence score out of 100 when responding to my questions, as a way to gauge the accuracy of their answers. Interestingly, I've never seen a score lower than 90/100… Either they're calibrated to be overconfident, or my questions are just too mundane. :-)
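For anyone curious, here's roughly how I tack the request onto the prompt and pull the score back out. This is just a minimal sketch using the OpenAI Python SDK; the model name, the exact wording of the instruction, and the parsing regex are all placeholders, not anything the tools require:

```python
import re
from openai import OpenAI  # assumes the official OpenAI Python SDK is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask_with_confidence(question: str, model: str = "gpt-4o"):
    """Ask a question and request a self-reported confidence score out of 100."""
    prompt = (
        f"{question}\n\n"
        "After your answer, on a new line, state your confidence in the answer "
        "in the form 'Confidence: N/100'."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    text = response.choices[0].message.content
    # Pull out the self-reported score, if the model followed the format
    match = re.search(r"Confidence:\s*(\d{1,3})/100", text)
    score = int(match.group(1)) if match else None
    return text, score


answer, confidence = ask_with_confidence("What year was the transistor invented?")
print(confidence)  # in my experience this rarely comes back below 90
```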
I think it's most helpful when its response is grounded against an actual source. I get a wide range of scores when I ask it to fact-check against a doc. I don't think I've ever seen a fact it rated at 90% turn out to be wrong - but I have for facts where it was only 50-70% sure.