Maze_soba
Well-Known Member
I only did one round of proper testing, with a randomized 80-20 split between training and test data using sklearn's train_test_split. Like I said in the original post, the data itself is pretty iffy, so I didn't spend a whole lot of time optimizing the model.

Was there an attempt to do some cross-validation on your model? Did you try other models? You mentioned training data. How did you allocate the data to a training set, e.g., top N rows, a random % sample of the data, half-train/half-test, etc.?
Also, I was wondering if you wouldn't mind DMing me the dataset you've been able to compile? I was thinking about creating a predictive model myself today, and when I searched the forum, I came across your post.
And it's good that you did this project!
If time permits, I think I'd be able to create something using R Shiny.
But I'd be happy to send over the dataset via DM!! It would be cool to see what a proper model looks like. I was just looking for an excuse to mess around with SageMaker Canvas, to be honest, and was pretty impressed with no-code ML.
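For anyone following along, here is a minimal sketch of the randomized 80-20 split mentioned above, plus the k-fold cross-validation that was asked about. It assumes synthetic data from make_classification and a LogisticRegression stand-in; neither is the actual dataset or model from this thread, and the variable names are just for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Placeholder data (assumption): 200 synthetic rows with 8 features.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Randomized 80-20 split between training and test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Stand-in model (assumption); any scikit-learn estimator would work here.
model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation on the training portion only,
# so the held-out test set stays untouched until the end.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)

# Fit on all training rows, then score once on the held-out 20%.
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

print("CV accuracy per fold:", cv_scores)
print("Mean CV accuracy:", cv_scores.mean())
print("Held-out test accuracy:", test_accuracy)
```

With a single split, the score depends a lot on which rows happen to land in the test set. The per-fold scores give a rough sense of how much that varies, which matters more when the data is noisy.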