Imperial Should Have Called Winston Wolf
In the film Pulp Fiction, moronic hoodlums Jules (Samuel L. Jackson) and Vincent (John Travolta) pick up a guy who had stolen a briefcase from the back of their boss Marsellus Wallace’s car. While driving him away, Vincent accidentally shoots him, leaving the back of the car splattered with blood and brains. In a panic, they drive to the house of their friend Jimmie Dimmick (Quentin Tarantino). Dimmick tells them his wife will be home in an hour and they can’t stay. They call Wallace, who calls in Winston Wolf. Wolf says: “It’s an hour away. I’ll be there in 10 minutes.” In 9 minutes and 37 seconds, Wolf’s car squeals to a halt in front of Jimmie’s house. Wolf rings the doorbell, and when Jimmie answers, Wolf says: “I’m Winston Wolf. I solve problems.” Within 40 minutes, Wolf solves Jules’ and Vincent’s problem. The car is cleaned up, with the body in the trunk, ready to be driven to the wrecking yard to be crushed.
The Imperial team that relied on Microsoft/GitHub to fix its code should have called Winston Wolf instead, because MS/GitHub left behind some rather messy evidence. “Sue Denim,” who wrote the code analysis I linked to yesterday, has a follow-up describing what Not Winston Wolf left behind:
The hidden history. Someone realised they could unexpectedly recover parts of the deleted history from GitHub, meaning we now have an audit log of changes dating back to April 1st. This is still not exactly the original code Ferguson ran, but it’s significantly closer.
Sadly it shows that Imperial have been making some false statements.
- ICL staff claimed the released and original code are “essentially the same functionally”, which is why they “do not think it would be particularly helpful to release a second codebase which is functionally the same”.
In fact the second change in the restored history is a fix for a critical error in the random number generator. Other changes fix data corruption bugs (another one), algorithmic errors, fixing the fact that someone on the team can’t spell household, and whilst this was taking place other Imperial academics continued to add new features related to contact tracing apps.
The released code at the end of this process was not merely reorganised but contained fixes for severe bugs that would corrupt the internal state of the calculations. That is very different from “essentially the same functionally”.
- The stated justification for deleting the history was to make “the repository rather easier to download” because “the history squash (erase) merged a number of changes we were making with large data files”. “We do not think there is much benefit in trawling through our internal commit histories”.
The entire repository is less than 100 megabytes. Given they recommend a computer with 20 gigabytes of memory to run the simulation for the UK, the cost of downloading the data files is immaterial. Fetching the additional history only took a few seconds on my home WiFi.
Even if the files had been large, the tools make it easy to not download history if you don’t want it, to solve this exact problem.
I don’t quite know what to make of this. Originally I thought these claims were a result of the academics not understanding the tools they’re working with, but the Microsoft employees helping them are actually employees of a recently acquired company: GitHub. GitHub is the service they’re using to distribute the source code and files. To defend this I’d have to argue that GitHub employees don’t understand how to use GitHub, which is implausible.
I don’t think anyone involved here has any ill intent, but it seems via a chain of innocent yet compounding errors – likely trying to avoid exactly the kind of peer review they’re now getting – they have ended up making false claims in public about their work.
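On the point in the quoted passage that the tools already let you skip history: git's `--depth` option to `git clone` fetches only the most recent commits. A minimal sketch in Python of what that invocation looks like. The helper name and the repository URL are hypothetical placeholders, not anything from the Imperial repository, and the function only builds the command without running it.

```python
from typing import List

# Hypothetical placeholder URL, used only for illustration.
EXAMPLE_URL = "https://example.com/some-team/some-model.git"


def shallow_clone_command(repo_url: str, depth: int = 1) -> List[str]:
    """Build a `git clone` command that fetches only the newest `depth` commits."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # `--depth N` asks git for a shallow clone truncated to N commits of history.
    return ["git", "clone", "--depth", str(depth), repo_url]


if __name__ == "__main__":
    # Print the command rather than run it, since the URL is a placeholder.
    print(" ".join(shallow_clone_command(EXAMPLE_URL)))
```

Passing the resulting list to `subprocess.run(..., check=True)` with a real URL would perform the clone. Anyone who wants the full history simply omits `--depth`.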
My favorite one is “a fix for a critical error in the random number generator.” In 2020? WTF? I remember reading in 1987 in the book Numerical Recipes by William H. Press, Saul A. Teukolsky, William T. Vetterling and Brian P. Flannery a statement to the effect that libraries could be filled with papers based on faulty random number generation. (I’d give you the exact quote, but the first edition that I used is in my office which I cannot access right now. Why is that, I wonder?). And they were using a defective RNG 33 years later? Really?
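To give a flavor of how a generator can be quietly defective, here is a sketch of a well-known historical example, IBM's RANDU. This is NOT the generator in the Imperial code, and nothing here says what their bug actually was. RANDU multiplies the previous value by 65539 modulo 2^31, and because 65539 = 2^16 + 3, every value is an exact linear combination of the two before it. The function names below are mine, chosen for illustration.

```python
MODULUS = 2 ** 31
MULTIPLIER = 65539  # equals 2**16 + 3, which is the source of the flaw


def randu(seed: int, count: int) -> list:
    """Return `count` successive RANDU states, starting from `seed`."""
    values = []
    state = seed
    for _ in range(count):
        state = (MULTIPLIER * state) % MODULUS
        values.append(state)
    return values


def lattice_violations(values) -> int:
    """Count consecutive triples that break x[k+2] == 6*x[k+1] - 9*x[k] (mod 2**31)."""
    return sum(
        1
        for a, b, c in zip(values, values[1:], values[2:])
        if (c - 6 * b + 9 * a) % MODULUS != 0
    )


sample = randu(seed=1, count=10_000)
# Every triple obeys the same linear relation, so nothing is counted.
print(lattice_violations(sample))  # prints 0
```

A generator whose output were really unpredictable would fail that relation almost every time. RANDU never does, so any simulation that consumes its numbers three at a time is sampling from a small set of planes rather than from the whole cube.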
“Algorithmic errors” is another eye popper. The algorithms weren’t doing what they were supposed to?
Read the rest. And maybe you’ll conclude that this was a mess that even Winston Wolf couldn’t have cleaned up in 40 days, let alone 40 minutes.
“I don’t think anyone involved here has any ill intent”: there’s no need for you to say that – you are far beyond the reach of English defamation law.
Word to the wise: if you want to defame someone in Britain do it in Scotland. Their laws and courts are much more sensible about defamation than those of England & Wales.
Comment by dearieme — May 11, 2020 @ 6:19 pm
As far as ill intent goes, this exercise is best summed up by the immortal Governor Le Petomane’s quote that we have to do something to protect our phoney baloney jobs.
Comment by Sotosy1 — May 11, 2020 @ 6:45 pm
As late as 2007, the default random number generator was, at best, crap. You had to ask for the better version which was not deployed since it might slow code execution.
I’ve always wondered how much garbage out came of that.
Of course, I only knew to ask because of my early 90s edition of Numerical Recipes in C.
Comment by Highgamma — May 11, 2020 @ 7:20 pm
Matlab was the program with the bad default RNG.
Comment by Highgamma — May 11, 2020 @ 7:22 pm
Sheesh Craig, you’re like a dog with a bone. Wassamatter, has the whole “blame China” thing been taken out of the playbook now that you’re talking trade again?
I’m intrigued – did the US really follow our lead on this? Seems fanciful to say the least. Also, how much “better” would the situation be in the US had you not done so?
Coincidentally, I recall one of your unis (Washington?) did a report on the UK’s likely Covid trajectory, with even gloomier projections than Imperial. We did have a good laugh when we saw it.
Final point. You’re probably unaware that Harvey Keitel has made a good living from his Winston persona here in the UK, having starred in the role in a series of TV commercials for several years. He’s kind of an honorary Brit now.
Comment by David Mercer — May 12, 2020 @ 4:33 am
Ferguson’s code was never validated and verified. And yet its output was used to make critical decisions.
Validated means checking to see that all the numerical code is correct: do the numerical approximations reproduce the outputs of the continuous differential equations? Verified means checking to see whether the code output is correct, that is, that it correctly produces what was intended to be calculated.
Any engineering program used for economically important simulations must go through seriously rigorous validation and verification checks. Hughes and Associates is a company that performs this service for high tech. So is Fauske LLC.
No modeler should be allowed to offer public policy advice without rigorously vetting their model through a professional validation and verification outfit.
Neglect of V&V by a modeler offering public policy advice should be grounds for professional decertification. Use of a non V&V model for public policy ought to be a crime, the penalties of which should be determined by the injuries caused.
Government officials who use non V&V models for public policy should be removed from office — bureaucrats fired, elected officials recalled.
Comment by Pat Frank — May 12, 2020 @ 6:31 pm
I wonder if seeing all of our most important COVID models fail to predict even within the 95% confidence interval is making anyone question our climate models.
Of course not, who am I kidding?
Comment by Ryan — May 14, 2020 @ 8:20 am
@David–Wow. Two posts. About an issue that relates directly to the biggest economic catastrophe in modern history. That makes me obsessive? You are projecting–again. Your ankle biting is obsessive, to the extreme.
Yes, the US placed considerable credence in the ICL model. Its predictions were the staple of early government justifications–at both the federal and state levels. Only when the predictions proved disastrously wrong was there a shift to the home-grown IHME model (which originated at the University of Washington, and was basically funded by Bill Gates). Which was horrible, but not so horrible. But the initial panic in the US can be tied directly to Ferguson et al.
Comment by cpirrong — May 14, 2020 @ 5:58 pm
No Craig, specifically it’s 2 posts on ICL in the past week (3 actually, as I see Neil gets a shout-out in your latest. More on that later). You’ve almost gnawed his ankle off.
BTW I did point out the existential nature of the looming economic crisis way back in early March, in response to your ‘progressives’ post.
Comment by David Mercer — May 15, 2020 @ 2:53 am
@SWP: only apropos of the “Pulp Fiction” clip. I saw this during the week of its premiere. It was billed as ‘terrifying’, but I started laughing once I got Tarantino’s running joke.
My question is:
Who is the most terrifying character in Pulp Fiction?
I will come back next week with the answer.
Comment by Richard Whitney — May 15, 2020 @ 8:33 am