The story of building a baseball betting simulator.

  • StatLabSports.com

    The story of building a baseball betting simulator.

    By Dave Basinger, StatLabSports

    Since I began gambling, one of the things I've been most fascinated with is the concept of simulations and simulators. Human beings have made some amazing machines: machines that move us around and make our lives easier, machines that help cure diseases and make our lives safer. Can man create a machine that can predict the future? Only the most egotistical and arrogant individual would even think that this was possible. It is therefore not surprising how many people have attempted such an endeavor in the quest for betting profits. The quest for easy cash and the High Roller lifestyle notwithstanding, building a sports predicting simulator is a Don Quixote-esque endeavor. So let's tilt at some windmills. (Please forgive the classic literature reference if you do not get it.)

    I can remember the first time I went on to the AccuScore site. I was really impressed with the layout, the format, and the concept. I made a point of checking out other sites such as Prediction Machine.com and some other smaller individual efforts to create a simulator. There is no “Building Sports Predictive Algorithms and Simulators for Dummies”, or at least I couldn’t find one at the major book distributors. (Hmm...that’s a good idea.) So with the help of several friends, I began to brainstorm what kind of effort would be needed to create a simulator. First we needed a mathematical algorithm. Anything that dealt with runs per game we immediately added into the formula, weighting the statistic appropriately as I saw fit. We decided that our focus would be baseball totals. The ultimate nightly question: how many runs could we expect a team to score on any given night? To determine this we would have to combine statistical performances of the past with a current assessment of the team's health, ability, energy, and winning momentum. Other factors such as the umpires, ballpark effect, weather, and travel schedules could be quantitatively broken down and correlated to a runs per game factor. It got so detailed after a while that we were assigning individual run per game numbers to individual players. This allowed for lineup-specific runs per game numbers: the nine players could be combined separately for every contest, creating a slightly different RPG number for each contest depending upon substitutions and situations.
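
    The lineup idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the spreadsheet's actual formula: the article does not say how the nine player numbers were combined, so the sketch simply adds them up, and the player names, run values, and the `default` fallback for a player with no history are all hypothetical.

```python
# Hypothetical per-player run contributions, in runs per game. Not real data.
PLAYER_RPG = {f"starter_{i}": 0.5 for i in range(1, 10)}
PLAYER_RPG["bench_1"] = 0.3


def lineup_rpg(lineup, player_rpg, default=0.45):
    """Add up the run contributions of the nine batters in a lineup.

    Summation is an assumption made for this sketch. A player missing
    from player_rpg gets the (hypothetical) default value.
    """
    if len(lineup) != 9:
        raise ValueError("a starting lineup has nine batters")
    return sum(player_rpg.get(name, default) for name in lineup)


# The regular nine, then the same lineup with one substitution.
regulars = [f"starter_{i}" for i in range(1, 10)]
with_sub = regulars[:8] + ["bench_1"]

regular_total = lineup_rpg(regulars, PLAYER_RPG)
sub_total = lineup_rpg(with_sub, PLAYER_RPG)
```

    Swapping one batter changes the lineup number for that contest only, which is the "slightly different RPG number for each contest" described above.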

    So with your basic Microsoft Excel spreadsheet we went about creating a huge database of various information focused on the statistic of runs per game. One of the questions we had early was how to value information. Last night's game has significantly more bearing on tonight's game than a game played much earlier in the season or even years before. Initially we decided on a three-tier system focusing on the last seven days, the last 30 days, and the whole season. The system worked well up until the All-Star break. After the All-Star break we used the last 7 days, last 15 days, and last 45 days for more accuracy. Factoring in the runs per game allowed by a starting pitcher was relatively easy; the bullpen, though, was trickier, with availability of players becoming a serious factor in bullpen effectiveness. By keeping track of bullpen use patterns, we could add the values of pitchers likely to play on a given day and remove those ruled out because they were used earlier in the week.
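
    A rough Python sketch of the three-tier idea follows. It rests on assumptions the article does not settle: the weights (0.5, 0.3, 0.2) are invented for illustration, the windows are counted in games rather than calendar days to keep the code short, and the function names are hypothetical.

```python
def window_mean(runs, n=None):
    """Mean of the last n values, or of every value when n is None."""
    recent = runs if n is None else runs[-n:]
    return sum(recent) / len(recent)


def blended_rpg(runs, windows=(7, 30, None), weights=(0.5, 0.3, 0.2)):
    """Weighted blend of runs per game over several look-back windows.

    runs    -- a team's runs per game, oldest first (one entry per game)
    windows -- look-back lengths in games; None means the whole season
    weights -- assumed weights, one per window; normalized before use
    """
    if not runs:
        raise ValueError("need at least one game")
    if len(windows) != len(weights):
        raise ValueError("need exactly one weight per window")
    blended = sum(w * window_mean(runs, n) for n, w in zip(windows, weights))
    return blended / sum(weights)


# Made-up season: 30 games at 4 runs, then a hot week at 6 runs.
season = [4] * 30 + [6] * 7
first_half = blended_rpg(season)                         # 7 / 30 / season
second_half = blended_rpg(season, windows=(7, 15, 45))   # post-All-Star tiers
```

    Because the most recent window carries the largest weight, the hot week pulls the blended number above the plain season average, and the shorter post-All-Star windows pull it up a little further.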

    I was amazed at the amount of information about umpires that is readily available to any bettor. I wonder how many people truly look at that information. You can be assured the sportsbooks do, though they must stay within reason when adjusting games due to the umpire’s tendencies. We also began to study the set-up positions for umpires and how they affect certain types of pitchers. (Which is a topic for another article.) Overall, the majority of major league baseball umpires are all alike. But there are small groups of umpires who consistently call games either over or under the average runs per game for a major league baseball contest. Many you have heard of. Midway through the season, guys like Jerry Davis, Sam Holbrook, and Tim McClelland were well on their way to calling significantly more over games than their counterparts. McClelland was hitting 81% on the over. Meanwhile, at the bottom end, Laz Diaz was hitting 74% on the under. Situational bettors love these guys. I also had to factor them into the simulator; their run per game numbers were easy to find on a variety of websites. Other information, such as ballpark effect, was equally easy to find. Daily weather and wind additions to the model began to become cumbersome, but were necessary, especially in Chicago and Colorado, where wind plays a dominant role in scoring. This being the first year of information that I was inputting into the system, it became quite labor-intensive. And then the September 1st call-ups happened. That was when the world turned into a living hell. You see, at this point I was trying to give every player an individual run per game factor. Add a bunch of new players for whom I had no history, and it was about September 5 when both the machine and the owner melted down. One thing could not be disputed: during the months of May and June the program was extremely effective, hitting at over 65%.
It seemed to most of us that the Microsoft Excel Predictor system could work efficiently within certain limitations of numbers and variables.  There seemed to be a point of diminishing return, where more data was not always better.   

    I could only imagine the size of the computer that was doing the work predicting games day in and day out at AccuScore. Something from one of those sleek 1960s sci-fi futuristic TV shows like Lost in Space or The Man from U.N.C.L.E., where the computers were the size of large walk-in closets with reel-to-reel tape players at the top of the machine for visual effect. As an amateur I wondered: could I, with basic software available to the majority of Americans, create a sports forecasting simulator? We concluded that with my current Gateway laptop computer and Microsoft Excel 2007 the answer is a decided NO. In the end we believe what we created had several flaws. (These do not include the fire hazard of a very hot laptop!!) Consider just the statistic of runs per game to start. At various times throughout the 2011 season the average runs per game varied anywhere between 8.2 and 8.5, depending on when and where you were looking. A 1-0 game has the fewest runs that can be scored in any contest, which is 7.2 runs below a mean of 8.2. Meanwhile, enough blowout games (about 2.1%) finish more than 7.2 runs above the mean to slowly start affecting the system around July 10th. Eventually we determined that the simulator spit out numbers that were slightly high because there was no limit to the number of runs that could be scored in a game. While there are about as many low-scoring games as high-scoring games, the effect of 16+ run blowouts began to accumulate within the system. I began to try to adjust the statistic by never adding more than 7.2 runs over the league mean score. I should not have done this. In my emotional attachment to the project I added one extra variable by trying to adjust to the high trend. Also, in the end I think what we had created was a program that determined a run per game number more aligned to the mean average. The simulator could never project or predict an anomaly or blowout game.
The numbers were tight to the center, even if slightly high. It seemed to work better as an overall predictor of average runs per game for longer periods of time.  We were not exactly sure what we created. That was when we switched software systems and retired the first program.
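
    The capping adjustment described above is easy to illustrate in Python. This is a sketch with made-up game totals, not the spreadsheet's data; the league mean of 8.2 and the 7.2-run cap come from the article, while the function and variable names are hypothetical.

```python
from statistics import mean

LEAGUE_MEAN = 8.2   # low end of the 8.2-8.5 range quoted in the article
CAP_OFFSET = 7.2    # never count more than 7.2 runs over the league mean


def capped(total, league_mean=LEAGUE_MEAN, offset=CAP_OFFSET):
    """Clip a game total at league_mean + offset; lower totals pass through."""
    return min(total, league_mean + offset)


# Made-up game totals: the low end stops at 1 run, the high end has a blowout.
totals = [1, 3, 5, 7, 8, 9, 11, 13, 22]

raw_mean = mean(totals)
capped_mean = mean(capped(t) for t in totals)
```

    The 22-run game is counted as about 15.4 runs, so the capped average comes out lower than the raw one. The floor of one run already limits the low side, which means the cap only ever pushes the number down; that is one way to see why it acts as an extra variable rather than a neutral correction.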

    The building of the second-generation system has begun and promises to be more complicated than the previous Microsoft Excel system. We are using new software that should make the data entry easier and the algorithm more accurate and predictive, allegedly. I'm hoping that we can use the system for hockey also, but I must admit that I will miss using that first spreadsheet. Despite the countless hours of inputting data, which I know will be easily streamlined in this new system, I still feel that old Microsoft spreadsheet could've been perfected. It's that first girlfriend and the “what if?” That first love.

    At the end of this year, in conjunction with the launch of our new website StatLabSports.com on January 1st, 2012, we have decided to release the old Microsoft Excel spreadsheet system of baseball totals for download, for anyone to use or abuse as they see fit.

    Next Time from StatLabSports: The building of the new baseball totals simulator... the Math teacher goes back to school, Statistics 201 Reviewed…and the early standings of the Junior High School Hilton Contest.

    Dave Basinger

    On Twitter @StatLabSports

     

  • Great read Dave, looking forward to your posts and information, forthcoming.

  • Thanks Midori...for now I'm just hoping for a good end to NFL Sunday picks.  I'm split even going into tonight.  Need Jets to cover...

    Dave Basinger

    On Twitter @StatLabSports

     

  • Hi Dave, hope 2011 was a great year for you! Where is the baseball Excel spreadsheet on statlabsports.com? I was unable to locate it. Thanks.

  • great read stats.

    Rule #1. Bankroll Management!

    Rule #2. Discipline!

  • Most any computer can build a simulator. Lots of speculation as to whether Accuscore or Prediction Machine are actually doing as they say. It's the Ole "If they say they are, they must be." Otherwise they could not say that, right?

    Nice thoughts tho SLB...

    I am a Pregame.com Director of the Boards

  • Thanks again, I will post a link to the spreadsheet and its formulas as we transition to Baseball at StatLabSports. I still need to load 2012 projections for it, which takes time. Shouldn't be too much longer, but gotta finish strong in NFL first, thus my focus. I appreciate your interest and patience...it won't be much longer.

    Dave Basinger

    On Twitter @StatLabSports

     

  • hell ya, accuscore is crap, hope you can get a better reading than that. I swear that computer gets its heart set on a team and completely skews the numbers in their favor. Accuscore does it in every sport: I notice it favors a certain team's numbers and continues to favor that team as it underperforms. I use it as a reference, but not as a tool.

    Make a bet and clinch those butt cheeks!   Ick!

     

  • Excellent material. Get 'em!

    Bruno Bets's last 7 days record in MLB
    Sport   Wins   Losses   Ties   Win %   $ Won
    MLB     66     63       2      51.16   -106.00

  • skyler,

    Sounds like they aren't changing the input as the season goes along. It's as if they use the preseason projections all season if I'm reading you correctly. I've never used it.

    Money management, line shopping and reading the betting markets are just as important as picking the right side.
