Have you ever read a paper? You can consider yourself lucky if they have error bars and repeated their measurements more than once. The quality of “benchmarking papers” is comically bad (on average).
What’s the error bar on that statement of yours?
I don’t know, and I won’t pretend it has any statistical significance. I will just say that I have read dozens of papers and, anecdotally, the results were questionable in almost all cases. And not because they might have missed something subtle, but because of basic shortcomings. Some don’t even state how often they repeated their experiments, which software versions they used, or whether they accounted for caching effects, (system) temperature, hardware characteristics, you name it.
That’s why I wouldn’t call papers a prime example of clean benchmarking. The quality of YouTube outlets like Gamers Nexus or Hardware Unboxed is far higher than that of most papers.
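To make the basics listed above concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: `workload` is a hypothetical stand-in for whatever is being measured, the repeat and warm-up counts are arbitrary, and the standard error of the mean is just one rough choice of error bar that assumes independent runs.

```python
import math
import platform
import statistics
import sys
import time


def workload():
    # Hypothetical stand-in for the code under test.
    return sum(i * i for i in range(10_000))


def benchmark(fn, repeats=30, warmup=5):
    # Warm-up runs are discarded so cold caches do not skew the samples.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)

    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    # Standard error of the mean: a rough error bar, assuming independent runs.
    sem = stdev / math.sqrt(repeats)

    # Report the setup along with the numbers so the result can be reproduced.
    return {
        "repeats": repeats,
        "warmup": warmup,
        "mean_s": mean,
        "stdev_s": stdev,
        "sem_s": sem,
        "python": sys.version.split()[0],
        "system": platform.system(),
        "machine": platform.machine(),
    }


if __name__ == "__main__":
    r = benchmark(workload)
    print(f"{r['mean_s']:.6f} s ± {r['sem_s']:.6f} s "
          f"(n={r['repeats']}, warmup={r['warmup']})")
    print(f"Python {r['python']} on {r['system']} {r['machine']}")
```

This does not cover temperature or detailed hardware characteristics; those would have to be recorded separately with platform-specific tools.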