Two Days of Tinkering: The October Changelog
I wanted to write down the story behind the latest SportingCP.ai update in plain English, the same way I explained it to friends and family. This isn’t a commit log; it’s the play-by-play of how the model and site took shape over two fast-paced October days.
October 30 - Starting From Zero
The day began with a blank project folder, a pile of CSVs, and a simple goal: “Can I get a basic machine-learning model to give me honest odds for Sporting matches?” I stitched together a lightweight pipeline that cleaned match histories, calculated straightforward stats like recent goal differential, and trained a logistic regression model. Nothing fancy yet, just enough to produce a baseline probability for a Sporting win, loss, or draw. By sunset I pushed those numbers through a scrappy UI so the homepage could show the next opponent with a confidence score. It worked, but it was obvious the model needed more context than a handful of averages.
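For anyone curious what “nothing fancy” means in practice, here’s a minimal sketch of that day-one pipeline. The file and column names (a matches.csv with date, goals_for, goals_against, and a W/D/L result) are stand-ins for the real schema, not the actual code:

```python
# Minimal baseline: rolling goal differential -> logistic regression.
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("matches.csv", parse_dates=["date"]).sort_values("date")

# Recent goal differential over the last five matches, shifted one game so
# each row only sees information available before kickoff.
df["goal_diff"] = df["goals_for"] - df["goals_against"]
df["recent_gd"] = df["goal_diff"].rolling(5, min_periods=1).mean().shift(1)
df = df.dropna(subset=["recent_gd"])

# One feature, three outcomes: scikit-learn's logistic regression handles
# the multiclass case out of the box.
model = LogisticRegression()
model.fit(df[["recent_gd"]], df["result"])  # result is "W", "D", or "L"

# Baseline probabilities for the next fixture, given current form.
upcoming = pd.DataFrame({"recent_gd": [df["goal_diff"].tail(5).mean()]})
print(dict(zip(model.classes_, model.predict_proba(upcoming)[0])))
```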
October 31 (Morning) - Teaching the Model Nuance
First, I tackled the list of improvements I had scribbled the night before (there’s a code sketch after the list):
- Form momentum: a rolling five-match expected-goals swing so the model sees how hot (or cold) Sporting is entering a match.
- Travel and rest: a simple fatigue proxy using rest days and whether the club just played away in Europe.
- League signal: Elo-style ratings blended with opponent strength so the probabilities respect who is on the other side of the pitch.
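In code, those three features come out to something like the sketch below. The column names (xg_for, was_away_european, own_elo, and friends) are placeholders for whatever the real dataset uses, and the fatigue weighting is just one plausible guess:

```python
import pandas as pd

def add_context_features(df: pd.DataFrame) -> pd.DataFrame:
    df = df.sort_values("date").copy()

    # Form momentum: rolling five-match expected-goals swing, shifted one
    # match so the model never peeks at the game it is predicting.
    xg_swing = (df["xg_for"] - df["xg_against"]).rolling(5, min_periods=1).mean()
    df["form_momentum"] = xg_swing.shift(1)

    # Travel and rest: days since the previous match, minus a penalty for
    # coming off an away European fixture. Assumes was_away_european is 0/1;
    # the penalty weight of 3 is illustrative, not tuned.
    df["rest_days"] = df["date"].diff().dt.days
    df["fatigue_proxy"] = (
        df["rest_days"].clip(upper=7)
        - 3 * df["was_away_european"].shift(1).fillna(0)
    )

    # League signal: Elo gap between the two sides, so the probabilities
    # respect who is on the other side of the pitch.
    df["elo_gap"] = df["own_elo"] - df["opponent_elo"]
    return df
```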
Those new features instantly made the training set feel richer, and retraining the model gave smoother curves: no more wild leaps when only one stat shifted.
October 31 (Afternoon) - Feature Engineering Cleanup
With the “what” upgraded, I rewired the “how.” Feature engineering needed guardrails, so I rebuilt the transforms into a single notebook-like script that:
- Locks column names and units.
- Runs sanity checks on missing data.
- Stores processed features for both the model and the API cache.
Cleaning this up meant the web app now reads the exact same signals the model used during training, which keeps the predictions consistent from notebook to production.
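Condensed a lot, the guardrails amount to something like this; the schema, thresholds, and file paths here are illustrative:

```python
import pandas as pd

# Locked column names and units; anything outside the schema is rejected.
FEATURE_SCHEMA = {
    "form_momentum": "float64",  # rolling xG swing, goals per match
    "fatigue_proxy": "float64",  # rest days minus a European-away penalty
    "elo_gap": "float64",        # Elo points, Sporting minus opponent
}

def validate_features(features: pd.DataFrame) -> pd.DataFrame:
    # Lock column names: fail fast if the transforms drifted.
    missing = set(FEATURE_SCHEMA) - set(features.columns)
    assert not missing, f"missing feature columns: {missing}"
    features = features[list(FEATURE_SCHEMA)].astype(FEATURE_SCHEMA)

    # Sanity check on missing data: refuse to ship silent gaps.
    null_rate = features.isna().mean()
    assert (null_rate < 0.05).all(), f"too many nulls:\n{null_rate}"
    return features

features = validate_features(pd.read_parquet("processed_matches.parquet"))
# One stored artifact feeds both model training and the API cache, so the
# web app reads exactly the signals the model saw during training.
features.to_parquet("features.parquet", index=False)
```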
October 31 (Evening) - The Realization
While replaying recent matches alongside the model output, I noticed a gap. The predictions still ignored context that fans and bookmakers naturally factor in: injury whispers, lineup rotations, even weather quirks. That’s why some of the probabilities still felt “robotic.” The obvious fix is to let smart, curated odds act as a teacher, lending the model the qualitative insight it can’t gather from historical stats alone.
What’s Next
The next build will weave detailed pre-match odds directly into the training data. Odds are a condensed scoreboard for everything I can’t track in real time: lineups, travel straw polls, tactical rumors, micro-injuries, and weather switches. Feeding that signal into the feature set should anchor the model to reality while still letting it surface Sporting-specific edges. Think of it as pairing intuition with instrumentation.
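To make that concrete: decimal odds can’t be fed in raw, because they include the bookmaker’s margin. A small, self-contained sketch of the conversion I have in mind (the odds here are hypothetical):

```python
# Decimal odds carry a bookmaker margin (the overround), so their inverses
# sum to slightly more than 1. Normalizing strips the margin out and leaves
# implied probabilities the model can train against.
def implied_probabilities(win: float, draw: float, loss: float) -> dict[str, float]:
    raw = {"win": 1 / win, "draw": 1 / draw, "loss": 1 / loss}
    overround = sum(raw.values())  # > 1.0; the bookmaker's cut
    return {k: round(v / overround, 3) for k, v in raw.items()}

# Hypothetical odds of 1.80 / 3.60 / 4.50 for a Sporting match:
print(implied_probabilities(1.80, 3.60, 4.50))
# -> {'win': 0.526, 'draw': 0.263, 'loss': 0.211}
```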
Thanks for following along! If you have thoughts on other signals the model should learn from, I’m all ears.