Hi all, I got bored over the weekend, so I finished the analysis for the May pilot board using the ML model I built, and there’s a decent amount to unpack.
First, I updated the training data to include results from the beginning of 2020, and quickly retrained the model.
Running the May board scores through, the model’s predictions were correct 72% of the time and wrong 28% of the time.
Here’s the prediction/actual matrix.
| Actual \ Prediction | Prorec-Y | Prorec-N |
| Prorec-Y | 33 | 4 |
| Prorec-N | 12 | 8 |
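For anyone who wants to check the numbers, here is a quick Python sketch that pulls the basic metrics out of that matrix. It assumes the rows are the actual results and the columns are the predictions, as the header suggests. The variable names are just mine for illustration.

```python
# Counts copied from the prediction/actual matrix
# (assumption: rows = actual, columns = predicted).
tp = 33  # actual Y, predicted Y
fn = 4   # actual Y, predicted N
fp = 12  # actual N, predicted Y
tn = 8   # actual N, predicted N

total = tp + fn + fp + tn

accuracy = (tp + tn) / total   # share of all predictions that were right
precision_y = tp / (tp + fp)   # when the model says Y, how often it is Y
recall_y = tp / (tp + fn)      # share of actual Ys the model caught
recall_n = tn / (tn + fp)      # share of actual Ns the model caught

print(f"n={total}")
print(f"accuracy={accuracy:.1%}")
print(f"precision (Y)={precision_y:.1%}")
print(f"recall (Y)={recall_y:.1%}")
print(f"recall (N)={recall_n:.1%}")
```

The overall accuracy works out to 41 of 57, which is where the 72% comes from. The recall on N is much lower than on Y, so most of the misses are people the model expected to get a Y who did not.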
Taking a look at the false predictions:
View attachment 35927
- There are definitely weird exceptions. A 60 8/8/8 and a 67 8/9/8 were not selected. Maybe it was the age, but there must’ve been something else going on with the package, or this was bad data.
- Some people got hyper lucky getting in with a 5/5/5. By all estimates, this would have been an N based on the model, but they somehow got it. The “N” prediction also probably has to do with the fact that only the nerds post their scores here, so the model is biased toward higher scores.
- Long story short, the model is doing a pretty decent job right now.
Next, I added the May board results to the training data (sans ISPP), so the model should be much more robust than the first version I made last December. If I get really bored before shipping out, I might see about making a simple web app.
A couple of things I’ve noticed after retraining + analysis of the new model:
- OAR and PFAR have the most impact on prorec status, with each contributing about 24% to the final result. That doesn’t necessarily mean the OAR itself is what gets you a Y; it might just mean you’re more likely to do well on the PFAR if you have a higher OAR. Correlation != causation and all that jazz.
- Age and GPA are the next most important, with each contributing 10% to the final result. Age obviously has a negative impact, but it isn’t a huge amount.
- Flight exp, prior service, and sex still have basically no impact on the outcome (combined, they only contribute ~9% to the final result).
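For anyone wondering what “contributing X%” can mean in practice, one generic way to get per-feature shares like these is permutation importance: shuffle one input column at a time, see how much accuracy drops, and normalize the drops so they sum to 1. The sketch below is only an illustration of that idea and not necessarily the method behind the numbers above. The toy model, the made-up rows, and the function name `permutation_importance` are all assumptions.

```python
import random


def permutation_importance(predict, rows, labels, n_features, seed=0):
    """Return each feature's share of the total accuracy drop when shuffled."""
    rng = random.Random(seed)

    def accuracy(data):
        return sum(predict(r) == y for r, y in zip(data, labels)) / len(labels)

    baseline = accuracy(rows)
    drops = []
    for j in range(n_features):
        # Shuffle column j only, leaving every other column untouched.
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        drops.append(max(0.0, baseline - accuracy(shuffled)))

    total = sum(drops)
    return [d / total if total else 0.0 for d in drops]


# Hypothetical toy data: feature 0 is a score, feature 1 is pure noise.
rows = [(score, i % 3) for i, score in enumerate(range(30, 70, 2))]
labels = [score >= 50 for score, _ in rows]


def toy_model(row):
    # Stand-in "model" that only looks at the score (feature 0).
    return row[0] >= 50


importances = permutation_importance(toy_model, rows, labels, n_features=2)
print(importances)
```

Because the toy model ignores the noise column, shuffling it costs nothing and its share is zero. The same caveat as above applies: a large share says the model leans on that input, not that the input causes the outcome.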
And to those of you who requested that I run through your scores, I’ll be PMing you the results!
*disclaimer: this analysis is pretty half-assed and there are big holes in it. But, if you still want me to run your scores through, please let me know.