When the Free Lunch Is Slightly Poisoned

For reasons I have mostly stopped trying to justify, I spend my spare time building little AI systems that pretend to gamble. One picks horses. One picks shares. Both run on a small computer in my living room, both bet entirely in imaginary money, and both, I am now fairly sure, were built largely so I would have something to be wrong about in public.

Buckle in.

A fortnight ago I wrote a cheerful little update on the horse-racing one, a paper-money tipster I call Twinkle Toes, and told everyone it had quietly fought its way to a draw with the market. Honest at last. Knew when to keep quiet. I was rather pleased with it, and with myself. The snag is that the scoreboard I was reading those figures off had a hole in it, and the hole happened to be exactly the shape of all the losing bets.

So consider this a gentle correction, wrapped around a slightly larger point.

Both of my systems run on free data, the kind you scrape off the internet for nothing, and free data has a way of turning on you that I had not fully appreciated until it did it to me twice in the same year, in two completely opposite directions. One system used it to flatter me. The other used it to quietly bleed me. Same cheap data, same blind spot, two very different ways of paying for it.

None of this is real money, I should say up front, in case anyone is reaching for a phone to stage an intervention. The horses are real, the prices are real, the bets are make-believe. The only thing ever at stake was whether I could trust my own results, which turned out to be the expensive thing after all.

The data that flattered me

It surfaced on a Saturday morning. The system had settled a bet on a horse called Crystal Glance and marked it VOID, stake returned, nothing happened, move along. A void is a non-event. The race ran without your horse, the money comes back, no harm done.

Except Crystal Glance had run. It had gone off at Bangor at a price of 2.10, jumped a few hurdles, and fallen. That is not a void. That is a loss. “Nothing happened” and “you lost” are not the same sentence, and the gap between them turned out to be load-bearing.

Here is the bug, and it is a small one, which is the worst kind. A racing result records where each horse finished as a number. But a horse that does not complete is filed under a letter: F for fell, PU for pulled up, UR for unseated rider, BD for brought down. The parser wanted a number, found a letter, and shrugged out a blank. A horse withdrawn before the race also has a blank where its finishing position would be. So at settlement, a faller and a non-runner looked identical. Both blank, both read as “never ran,” both quietly voided. The tell was there the whole time: a horse that runs has a starting price, and a withdrawn one has none. Crystal Glance went off at 2.10. It ran. One field, already in the data, would have told the difference, and nobody had thought to ask it.
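That one-field check can be sketched in a few lines. This is an illustration under stated assumptions, not the actual Twinkle Toes code: the function name `settle`, the parameters `position` and `starting_price`, and the win-only bet type are all hypothetical.

```python
from typing import Optional


def settle(position: Optional[str], starting_price: Optional[float]) -> str:
    """Grade a win-only bet as 'WIN', 'LOSS' or 'VOID'.

    Assumed record layout (hypothetical): `position` is the raw
    finishing-position text from the result, which may be a number,
    a letter code such as F, PU, UR or BD, or blank. `starting_price`
    is the decimal starting price, or None if the horse never ran.
    """
    # No starting price means the horse was withdrawn: a genuine void.
    if starting_price is None:
        return "VOID"

    raw = (position or "").strip().upper()
    if raw == "1":
        return "WIN"

    # Anything else with a starting price ran and did not win:
    # a numeric place, a letter code, or a blank the parser gave up on.
    return "LOSS"


print(settle("F", 2.10))   # faller with a starting price
print(settle("", None))    # withdrawn before the race
```

The design choice is to ask "did it run?" before asking "where did it finish?", so an unparseable position can never be mistaken for a non-runner.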

Then I added it up. Eight hundred and sixty-three settled bets, every one of them re-graded from void to loss. Nearly seven and a half thousand runners wrongly filed as non-runners. That week’s report had been glowing: £125 of profit across twelve settled bets, a win rate of fifty-eight per cent, a positive closing-line number that I had read as proof of a genuine edge. It even congratulated the four-way agreement bets on a perfect hundred per cent win rate and advised me to put more money on them. Once the fallers were graded as the losses they were, the glow went out. The healthy two hundred or so I thought I was up restated itself down through nothing and out the other side, into the red. The whole of that comfortable margin had been losing bets the system was filing as things that never happened.

Now sit with the direction of the error for a second, because the direction is the entire point. This bug could only ever delete losses. It was constitutionally incapable of touching a win. Winners finish first, and first is a number, so a winner always had a finishing position on file and sailed through untouched. Only the fallers and the pulled-ups dissolved into blanks. And because voids are thrown out of the win-rate maths entirely, every loss the bug erased simply ceased to exist. Not counted as a loss. Not counted at all.
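The arithmetic of a one-directional eraser is easy to see with a toy example. The numbers below are made up for illustration and are not taken from the system's records; `win_rate` is a hypothetical helper that drops voids from the sum, as described above.

```python
from collections import Counter
from typing import Iterable


def win_rate(outcomes: Iterable[str]) -> float:
    """Wins divided by settled bets; voids are left out entirely."""
    counts = Counter(outcomes)
    settled = counts["WIN"] + counts["LOSS"]
    return counts["WIN"] / settled if settled else 0.0


# Invented sample: 6 winners, 4 beaten finishers, 10 non-finishers.
true_outcomes = ["WIN"] * 6 + ["LOSS"] * 4 + ["LOSS"] * 10
# The same bets, with every non-finisher misfiled as a void.
buggy_outcomes = ["WIN"] * 6 + ["LOSS"] * 4 + ["VOID"] * 10

print(f"buggy: {win_rate(buggy_outcomes):.0%}")  # buggy: 60%
print(f"true:  {win_rate(true_outcomes):.0%}")   # true:  30%
```

The wins are identical in both lists. Only the denominator moves, and it can only ever shrink.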

A machine grading its own homework with an eraser that only rubs out the wrong answers will come top of the class every time. Mine did. Every win rate I had quoted was inflated by the exact losses the system had been quietly disappearing, and I had gone and put the inflated number in an article.

There was a second sting, and this one I deserved. Weeks earlier I had written a regression test, the little automated check that screams when behaviour changes. That test took a real non-finisher, a 2.25 favourite that had run and pulled up, and asserted that the correct thing to do was treat it as a non-runner. I had caught the bug once, decided it was a feature, and nailed it down with a test so that no future fix could sneak past me. The green tick was not protecting correct behaviour. It was guarding the wound. Fixing the bug meant overruling a decision my past self had been confident enough about to write down.

One thing came through it clean. Closing line value, the measure of whether you beat the price the market settled on, is computed purely from prices and never touches the grubby business of who finished where. While every other number was a fiction, that one stayed honest, and it has anchored everything since. Not by luck. It simply never depended on the part of the data that was broken.
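To make that independence concrete, here is one common way of expressing closing line value with decimal odds. The article does not say which formula the system uses, so treat this as an assumed definition; `closing_line_value`, `odds_taken` and `closing_odds` are illustrative names.

```python
def closing_line_value(odds_taken: float, closing_odds: float) -> float:
    """Fractional edge of the price taken over the closing price.

    Assumed definition: odds_taken / closing_odds - 1. Positive means
    the bet was struck at a better price than the market closed at.
    Note that no finishing position appears anywhere in the inputs.
    """
    if odds_taken <= 1.0 or closing_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return odds_taken / closing_odds - 1.0


# Backed at 2.20, closed at 2.00: roughly a ten per cent better price.
print(round(closing_line_value(2.20, 2.00), 4))
```

Because the function takes two prices and nothing else, a settlement bug in the results parser has no route into it.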

The data that bled me

The trading system, Gekko, had the same disease pointing the other way. Where Twinkle Toes lied to flatter me, Gekko’s data told the truth too slowly to be any use, and that cost real paper losses rather than imaginary paper gains.

Gekko stands on a ladder of price feeds: Trading 212, then Polygon, then Finnhub, then Yahoo, each catching what the one above drops. The only rung that reaches every stock I care about, US and UK alike, is Yahoo, fetched through a library that is really just a polite scraper pointed at a webpage Yahoo can rearrange any morning it likes. The paid feeds cover the US and stop at the Atlantic. All my UK coverage hangs off the scraper.

Free data does not fail loudly. It fails silently, which is much worse. When Yahoo changed something, every analysis came back with the same useless error and the scans just quietly died. The system could not tell the difference between “no opportunities today” and “the feed has evaporated,” and a system that cannot tell those apart will sit there looking calm while it goes blind. I had to build a tripwire that shouts when more than seventy per cent of analyses fail in a session, purely so I would know the silence was a fault and not a quiet day. We later cancelled the one paid feed at twenty-nine dollars a month: an audit showed it was redundant, the same fetches already covered elsewhere, and it was a daft cost against a notional three-grand pot anyway.
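A tripwire of that kind needs very little code. The sketch below assumes the session already counts attempted and failed analyses; the function name `feed_looks_dead` and the choice to treat an empty session as a fault are my assumptions, not details from the system.

```python
def feed_looks_dead(failed: int, attempted: int, threshold: float = 0.70) -> bool:
    """Return True when the failure rate suggests a broken feed.

    `threshold` defaults to the seventy per cent figure from the text.
    Assumption: a session with no attempts at all is also treated as a
    fault, since silence is exactly what the tripwire exists to catch.
    """
    if attempted < 0 or failed < 0 or failed > attempted:
        raise ValueError("counts must satisfy 0 <= failed <= attempted")
    if attempted == 0:
        return True
    return failed / attempted > threshold


if feed_looks_dead(failed=46, attempted=50):
    print("ALERT: most analyses failed; suspect the data feed, not the market")
```

The point of the check is to separate two states that otherwise look identical from outside: a scan that found nothing and a scan that could not look.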

But the data delay was not really the disease. It was a symptom of a worse decision, which was mine. I had built Gekko as a chaser. Scan the market, rank what is moving, pounce. That only pays if you act before the edge rots, and on a feed you poll, built from a page you scrape, you are structurally a beat behind everyone trading on a real feed. You are reading yesterday’s weather and dressing for it.

Lateness wore a lot of costumes. One stock dropped twelve per cent in forty-one minutes while the position sat there unprotected, because the stop could not be placed as fast as the price was moving. Pre-market orders filled hours later into collapses that were already well underway, so I had to build a routine whose only job was to cancel stale orders before the system acted on a decision the market had already torn up. The stop checks fought a lag in the broker’s own order feed and kept double-placing stops against a picture that had not caught up with itself.

The economics of being late are not subtle. The UK book ran eighty-six trades over two months and lost money at a win rate of twenty-eight per cent, with barely one trade in forty ever reaching its target. I killed UK entries entirely. A wider audit of the chasing era found something I did not want to see: the damage was not one missed crash I could shrug off as bad luck. It was hundreds of small, fast, adverse moves, scattered across nearly every trading day in the window, each one already finished by the time a free feed could show it to me. By the middle of May the portfolio was down to a single share. One. A monument.

Never give up, never surrender

Thirty-odd years of designing games teaches you one reflex above all others. When a thing will not work the way you wanted, you do not keep hammering it, and you do not give up either. You design around it. The constraint stops being the obstacle and becomes the brief.

So I did not try to buy my way out. Better data was not affordable, and more to the point it would not have been enough, because you cannot scrape your way to the front of a queue the professionals are paying to skip. The feed was the constraint. The job was to design a strategy that did not need it to be fast. So I stopped asking “what is moving right now,” which only a fast feed can answer, and started asking “what has fallen further than it usually does,” which a slow one can answer perfectly well.

The new strategy buys statistical dips, a stock that has dropped more than 1.8 standard deviations below its twenty-day average, and sizes the bet with a simulation run over a year of that stock’s daily moves. The crucial property, the one that makes the whole thing work, is that every input is a daily figure. The signal is twenty daily closes. The sizing is two hundred and fifty-two daily closes. Nothing that happens inside a trading day touches the decision at all. A bet on a stock returning to its average does not care whether you are three minutes late or three hours late, because it is waiting days for the reversion to arrive. The clock that killed the chaser simply does not apply.
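The dip signal itself can be sketched with nothing but the standard library. Several details here are assumptions the text does not settle: that the twenty-day window includes the latest close, that the sample standard deviation is used, and the name `is_statistical_dip`. The simulation-based sizing is not shown.

```python
from statistics import mean, stdev
from typing import Sequence


def is_statistical_dip(
    closes: Sequence[float], window: int = 20, threshold: float = 1.8
) -> bool:
    """True if the latest daily close sits more than `threshold`
    standard deviations below the mean of the last `window` closes.

    Assumptions: the window includes the latest close, and the
    sample standard deviation (statistics.stdev) is the measure.
    """
    if len(closes) < window:
        return False  # not enough history to judge
    recent = list(closes[-window:])
    sd = stdev(recent)
    if sd == 0:
        return False  # a flat series has no meaningful z-score
    z_score = (recent[-1] - mean(recent)) / sd
    return z_score < -threshold


# Nineteen quiet days at 100, then a sharp drop to 90.
history = [100.0] * 19 + [90.0]
print(is_statistical_dip(history))
```

Every input is a daily close, so running this three minutes or three hours after the market shuts gives the same answer.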

And then we stopped depending on the live feed for the decision at all. We built a local database: three years of daily prices, around eight hundred and fifty names, something like six hundred and forty-seven thousand rows, sitting on our own machine and refilled by one batched pull overnight instead of five thousand frantic scraper calls a day. The scan reads our own copy. If the data is stale, it skips the day rather than trading on a guess. Which is, I noticed afterwards, exactly the same instinct Twinkle Toes arrived at from the opposite direction: own the data, trust the price, and refuse to bet on the part you cannot stand behind.
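The "skip the day rather than trade on a guess" rule might look something like the following. It is a sketch under loud assumptions: the freshness rule (the newest stored row must be at least as recent as the previous weekday), the function names, and the decision to ignore market holidays are all mine, not the system's.

```python
from datetime import date, timedelta
from typing import Optional


def previous_weekday(today: date) -> date:
    """Most recent Monday-to-Friday date strictly before `today`.

    Assumption: market holidays are ignored for simplicity.
    """
    day = today - timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day


def should_scan(latest_row_date: Optional[date], today: date) -> bool:
    """Scan only if the local copy holds the last expected daily close."""
    if latest_row_date is None:
        return False  # empty database: nothing to stand behind
    return latest_row_date >= previous_weekday(today)


monday = date(2024, 5, 13)
print(should_scan(date(2024, 5, 10), monday))  # Friday's close is present
print(should_scan(date(2024, 5, 9), monday))   # overnight pull was missed
```

Returning False is the cheap outcome here: a skipped day costs nothing, whereas a trade placed on stale rows repeats the chaser's original mistake.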

What free data costs

Twinkle Toes erred on the side of optimism and erased my losses. Gekko erred the other way: it kept acting on a stale picture until the picture cost it everything. Both failed in the same place: at the hard edge the strategy most depended on. The fall, the pull-up, the unseated rider in racing. The fast gap, the catalyst, the collapse in trading. Free data is weakest exactly where your strategy needs it strongest, because the common, nothing-happening cases are cheap to get right and the rare, decisive ones are the ones that fall through the cracks.

Both systems only came good once they stopped fighting that. Racing anchored on the closing price, computed from numbers the mess could not corrupt. Gekko anchored on the end-of-day close, the one thing a free feed delivers reliably, and quietly stopped betting on everything it could not. You do not beat a data constraint by wishing for cleaner data. You redesign the question until the data you can actually get is good enough to answer it honestly.

There is a habit I have taken from all of this. When results look bad you go hunting for the bug, because bad news makes you suspicious, and suspicion is healthy. The trap is that good news does not. When the numbers come back glowing, the first question is not how much edge do we have. It is which of these numbers is a recording artefact. I learned that the expensive way, in public, in a paragraph I have now had to walk back. The market was honest with me the whole time. It was my own bookkeeping that lied.

The figures here are drawn from the systems’ own paper-trading databases. Both are paper accounts throughout, and no real money has been staked or lost. The Twinkle Toes performance figures in my previous article were inflated by the settlement bug described above and are corrected here.

The views expressed in this article are my own and do not represent the views of my employer.