Wednesday, March 23, 2016

March Madness in a shell script

I don't really follow the sportsball. But I do like to engage with others at my office who do. So every Spring, I feel somewhat left out as everyone at my office gets wrapped up in college basketball fever. I watch them fill out their NCAA March Madness brackets, and I always think about participating, but I know nothing about the sport other than you have to dunk the orange ball in the other team's hoopy net.

I'd like to take part in the fun, maybe put my five dollars into the office pool, but I just don't know enough about the teams to make an informed decision on my own March Madness bracket. So a few years ago, I found another way: I wrote a little program to do it for me.

Computer models that predict the outcome of matches aren't a new topic. You can find lots of models, including the Pythagorean Expectation Model or other algorithms that build on previous team performance to predict future game outcomes. But that requires some research into sports statistics and following each team throughout the season. That's too much work for me. I used a different method that should be familiar to many of my fellow nerds: the "Dungeons and Dragons Model."

That's right! You can apply a simple D16 method to build a March Madness basketball bracket. How scientific is this? Probably not very. But it gives me enough of a stake to follow March Madness basketball, without so much that I feel particularly saddened if my bracket doesn't perform well. That's good enough for me.

Let me show you how to write a Bash shell script to build your own NCAA March Madness bracket. I'll use the following three simple assumptions:
  1. The NCAA March Madness basketball brackets are seeded with the NCAA's ranking of 64 college basketball teams, divided into four regions, and ranked #1 through #16.
  2. The NCAA March Madness basketball brackets are always initialized with the same contests: #1 vs #16, #8 vs #9, #5 vs #12, and so on.
  3. A #1 ranked team should perform better than a #16 team, but a #8 team should perform about the same as a #9 team.
Using these assumptions, let's examine the D16 method. In a tabletop role-playing game, you might throw a 1D16 to determine the outcome of an encounter. You would compare the value of the 1D16 to a player's statistic, such as Dexterity or Strength. This kind of throw assumes a "probability" or "likelihood" of an outcome based on the relative strength of the player. A player with a high Dexterity is more likely to dodge a blow than a player with a lower Dexterity. Usually, I see the DM compare the 1D16 to the player's statistic to determine the outcome.

Similarly, we can compare the outcome of a 1D16 to a team's NCAA ranking to predict that team's performance. A #1 team should be a strong team, so let's assume the #1 team has fifteen "chances" out of sixteen to win, and one "chance" out of sixteen to lose. Without any other inputs, the #1 team would win if the 1D16 value is two or greater, and the #1 team would lose if the 1D16 value is one.
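
To make those odds concrete, here is a minimal sketch that counts how many of the sixteen possible throws count as a win for a given seed, assuming a throw greater than the seed is a win. The `win_chances` helper is a hypothetical name used only for this illustration.

```bash
#!/bin/bash

# Illustrative helper (an assumption of this sketch, not part of the script):
# a throw "wins" when it is greater than the seed, so a team seeded at
# $rank wins on (16 - rank) of the sixteen possible throws.
function win_chances {
    local rank=$1
    echo $(( 16 - rank ))
}

for rank in 1 8 9 16 ; do
    echo "#$rank wins its throw on $(win_chances "$rank") of 16 values"
done
```

Under this rule a #8 team wins its throw half the time, while a #16 team never wins its own throw, since a 1D16 cannot come up higher than sixteen.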

Using this assumption, we can throw one 1D16 to determine if team "A" wins or loses, and another 1D16 to determine if team "B" wins or loses. If the two throws agree, so that one team wins and the other loses, we know the outcome of the game.

In Bash, you get a new random number every time you reference the $RANDOM shell variable. The variable returns a value between 0 and 32,767, but we want a number between one and sixteen. We can reduce the random number's range to sixteen values by using the modulo operator. Using modulo 16 returns a value between zero and fifteen. Adjusting that to a number between one and sixteen is simple addition:
d16=$(( ( $RANDOM % 16 ) + 1 ))
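
If you want to convince yourself that the throw behaves like a fair die, a quick sanity check is to roll it many times and tally the faces. This is only a sketch: the roll count and the `tally` array are arbitrary choices made for the example.

```bash
#!/bin/bash

# Roll the D16 many times and count how often each face comes up.
# The number of rolls (16000) is an arbitrary choice for this sketch.
rolls=16000
tally=()

for (( i = 0; i < rolls; i++ )) ; do
    d16=$(( ( RANDOM % 16 ) + 1 ))
    tally[d16]=$(( ${tally[d16]:-0} + 1 ))
done

# Print one line per face; each count should land near rolls / 16.
for face in {1..16} ; do
    echo "$face: ${tally[face]:-0}"
done
```

Because $RANDOM has 32,768 possible values and 32,768 divides evenly by 16, each face corresponds to exactly 2,048 of those values, so the modulo step does not favor any face.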
Here's a Bash function that takes the NCAA rankings of two teams, team "A" and team "B," as its two arguments. Using the D16 method, the function predicts the winner of a game and returns the winning team's ranking in the function's exit value, or zero if the two throws don't agree.
function guesswinner {
 local rankA=$1
 local rankB=$2

 # throw one D16 for each team
 local d16A=$(( ( $RANDOM % 16 ) + 1 ))
 local d16B=$(( ( $RANDOM % 16 ) + 1 ))

 if [[ $d16A -gt $rankA && $d16B -le $rankB ]] ; then
         # team A wins and team B loses
         return $rankA
 elif [[ $d16A -le $rankA && $d16B -gt $rankB ]] ; then
         # team A loses and team B wins
         return $rankB
 else
         # the throws disagree: no winner
         return 0
 fi
}
Of course, the D16 method assumes the two outcomes agree. While this works most of the time, it's possible that the two throws disagree, so neither team comes out as the winner. A simple workaround is to try again. I find the outcomes agree within one or a few throws, but for an evenly-matched game between two top seeds, such as a #1 team against a #2 team, the throws agree much less often, and you might have to give up after too many attempts.
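
How often do the throws agree? Two D16 throws give 256 equally likely pairs, so the agreeing pairs can simply be counted. The sketch below does that under the win rule used in guesswinner (a throw greater than the seed is a win); the `agree_count` helper is a hypothetical name for this illustration.

```bash
#!/bin/bash

# Illustrative helper: of the 256 equally likely pairs of throws, count
# how many produce a decision for seed A versus seed B.
function agree_count {
    local rankA=$1
    local rankB=$2
    # A's throw beats its seed while B's throw does not
    local a_wins=$(( (16 - rankA) * rankB ))
    # A's throw fails while B's throw beats its seed
    local b_wins=$(( rankA * (16 - rankB) ))
    echo $(( a_wins + b_wins ))
}

for pair in "1 16" "8 9" "1 2" ; do
    read -r a b <<< "$pair"
    echo "#$a vs #$b: $(agree_count "$a" "$b") of 256 throw pairs decide the game"
done
```

Under these assumptions, #8 vs #9 is decided on half of all attempts (128 of 256), while #1 vs #2 is decided on only 44 of 256, or about 17 percent of attempts.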

With that assumption, let's write a Bash function to repeatedly call guesswinner until the two outcomes agree. The function prints the match-up, prints the winner, and returns the winning team via the exit value.
function winner {
 teamA=$1
 teamB=$2

 echo -n "$teamA vs $teamB : "

 count=0

 # iterate and return winner, if found

 while [ $count -lt 10 ] ; do
         guesswinner $teamA $teamB
         win=$?

         if [ $win -gt 0 ] ; then
                 # winner found
                 echo $win
                 return $win
         fi

         count=$(( $count + 1 ))
 done

 # no winner found, return a default winner

 echo "=$teamA"
 return $teamA
}
The = in the last echo statement helps you see if the function was unable to determine a winner after ten attempts.

With these two functions, it's very simple to run through all the first-round games to determine winners, then iterate through those winners to build the rest of the basketball bracket. A few echo statements help us to follow each round in the bracket. The function returns the winner of the bracket via the return value.
function playbracket {
 echo -e '\nround 1\n'

 winner 1 16
 round1A=$?

 winner 8 9
 round1B=$?

 winner 5 12
 round1C=$?

 winner 4 13
 round1D=$?

 winner 6 11
 round1E=$?

 winner 3 14
 round1F=$?

 winner 7 10
 round1G=$?

 winner 2 15
 round1H=$?

 echo -e '\nround 2\n'

 winner $round1A $round1B
 round2A=$?

 winner $round1C $round1D
 round2B=$?

 winner $round1E $round1F
 round2C=$?

 winner $round1G $round1H
 round2D=$?

 echo -e '\nround 3\n'

 winner $round2A $round2B
 round3A=$?

 winner $round2C $round2D
 round3B=$?

 echo -e '\nround 4\n'

 winner $round3A $round3B

 return $?
}
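
The same bracket can also be walked with a loop instead of one variable per game. The following is a sketch of that alternative, not a replacement for the function above: `playbracket_loop` is a hypothetical name, and compact copies of guesswinner and winner are included only so the example runs on its own.

```bash
#!/bin/bash

# Compact copies of the two functions described earlier, included so that
# this sketch is self-contained.
function guesswinner {
    local d16A=$(( ( RANDOM % 16 ) + 1 ))
    local d16B=$(( ( RANDOM % 16 ) + 1 ))
    if (( d16A > $1 && d16B <= $2 )) ; then return "$1" ; fi
    if (( d16A <= $1 && d16B > $2 )) ; then return "$2" ; fi
    return 0
}

function winner {
    local count win
    echo -n "$1 vs $2 : "
    for (( count = 0; count < 10; count++ )) ; do
        guesswinner "$1" "$2"
        win=$?
        if (( win > 0 )) ; then echo "$win" ; return "$win" ; fi
    done
    # no winner after ten attempts: default to the first team
    echo "=$1"
    return "$1"
}

# Loop-based variant (hypothetical name): keep the surviving seeds in an
# array and pair them off two at a time until only one seed remains.
function playbracket_loop {
    # first-round seeds, listed in match-up order
    local teams=(1 16 8 9 5 12 4 13 6 11 3 14 7 10 2 15)
    local round=1
    local next i

    while (( ${#teams[@]} > 1 )) ; do
        echo -e "\nround $round\n"
        next=()
        for (( i = 0; i < ${#teams[@]}; i += 2 )) ; do
            winner "${teams[i]}" "${teams[i+1]}"
            next+=( "$?" )
        done
        teams=( "${next[@]}" )
        round=$(( round + 1 ))
    done

    return "${teams[0]}"
}

playbracket_loop
echo "region winner: $?"
```

The loop trades the explicit round1A, round1B, and similar variables for an array, which makes the code shorter, though the written-out version is arguably easier to follow when you compare it against a paper bracket.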
Finally, we need only call the playbracket function for each of the four regions. That leaves us with the "Final Four," the winners of each region's bracket, but I'll leave the final determination of those contests for you to resolve on your own.
#!/bin/bash

function guesswinner {
 …
}
function winner {
 …
}
function playbracket {
 …
}

echo -e '\n___ MIDWEST ___'

playbracket

echo -e '\n___ EAST ___'

playbracket

echo -e '\n___ WEST ___'

playbracket

echo -e '\n___ SOUTH ___'

playbracket
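
If you would rather not scroll back through the output to find each region's winner, one option is to save playbracket's return value as the script goes. This is only a sketch: the `playbracket` below is a stand-in that returns a random seed so the example runs by itself, and the `regions` and `final_four` arrays are hypothetical names. In the real script you would keep the full playbracket function instead of the stand-in.

```bash
#!/bin/bash

# Stand-in for the real playbracket function, used only to keep this
# sketch self-contained: it returns a random seed between 1 and 16.
function playbracket {
    return $(( ( RANDOM % 16 ) + 1 ))
}

regions=(MIDWEST EAST WEST SOUTH)
final_four=()   # winning seed of each region, in the same order as regions

for region in "${regions[@]}" ; do
    echo -e "\n___ $region ___"
    playbracket
    final_four+=( "$?" )
done

# Summarize the Final Four after all four regions have played.
echo
for i in "${!regions[@]}" ; do
    echo "${regions[i]} #${final_four[i]}"
done
```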
Every time you run the script, you will generate a fresh NCAA March Madness basketball bracket. It's entirely random, based on a D16 prediction similar to Dungeons and Dragons, so each iteration of the bracket will be different. In my experience, the D16 prediction works pretty well for the first few rounds, but often predicts the #1 team will make it to the fourth round. It's not a very scientific method, but I'll share that my computer-generated brackets usually fare well compared to others in my office.

The point of using a script to build your NCAA March Madness basketball bracket isn't to take away the fun of the game. On the contrary, since I don't have much familiarity with basketball, building my bracket programmatically allows me to participate in the office basketball pool. It's entertaining without requiring much familiarity with sports statistics. My script gives me a stake to follow the games, but without the emotional investment if my bracket doesn't perform well. And that's good enough for me!
Curious to see my brackets? The output isn't in "bracket" format, but you can see my bracket below, as predicted by my script (at least, as of today):
$ ./basketball.sh

___ MIDWEST ___

round 1

1 vs 16 : 1
8 vs 9 : 8
5 vs 12 : 5
4 vs 13 : 4
6 vs 11 : 6
3 vs 14 : 3
7 vs 10 : 10
2 vs 15 : 2

round 2

1 vs 8 : 1
5 vs 4 : 4
6 vs 3 : 3
10 vs 2 : 2

round 3

1 vs 4 : 1
3 vs 2 : 3

round 4

1 vs 3 : 1

___ EAST ___

round 1

1 vs 16 : 1
8 vs 9 : 9
5 vs 12 : 5
4 vs 13 : 4
6 vs 11 : 6
3 vs 14 : 3
7 vs 10 : 7
2 vs 15 : 2

round 2

1 vs 9 : 9
5 vs 4 : 5
6 vs 3 : 3
7 vs 2 : 2

round 3

9 vs 5 : 5
3 vs 2 : 3

round 4

5 vs 3 : =5

___ WEST ___

round 1

1 vs 16 : 1
8 vs 9 : 9
5 vs 12 : 5
4 vs 13 : 4
6 vs 11 : 11
3 vs 14 : 3
7 vs 10 : 10
2 vs 15 : 2

round 2

1 vs 9 : 1
5 vs 4 : 4
11 vs 3 : 3
10 vs 2 : 2

round 3

1 vs 4 : 4
3 vs 2 : 2

round 4

4 vs 2 : 4

___ SOUTH ___

round 1

1 vs 16 : 1
8 vs 9 : 9
5 vs 12 : 5
4 vs 13 : 4
6 vs 11 : 6
3 vs 14 : 3
7 vs 10 : 7
2 vs 15 : 2

round 2

1 vs 9 : 1
5 vs 4 : 4
6 vs 3 : 6
7 vs 2 : 2

round 3

1 vs 4 : 1
6 vs 2 : 6

round 4

1 vs 6 : 1
So for my Final Four, I'm left with Midwest #1, East #5, West #4, and South #1. I wonder how I'll do in my office pool?

2 comments:

  1. The next round of Sweet Sixteen kicks off tomorrow (March 24) but I originally wrote my "March Madness in a shell script" article before the first round started. (It's been queued up.)

    I'll probably write a wrap-up article when the NCAA championship is over, to compare how my brackets did to the actual games. But with Sweet Sixteen yet to play, here's how my brackets stand in first and second rounds:

     First round → Second round:

     South: 6 contests correct, 2 not → 2 contests correct, 2 not.
     West: 6 correct, 2 not → 4 correct.
     East: 7 correct, 1 not → 1 correct, 3 not.
     Midwest: 4 correct, 4 not → 2 correct, 2 not.

    Following the standard method for scoring March Madness brackets, each round has 320 possible points. In round one, assign ten points for each correctly selected outcome. In round two, assign twenty points for each correct outcome. And so on.

    So by my math:

     Round 1: 23 × 10 = 230 pts.
     Round 2: 9 × 20 = 180 pts.

    That's 410 pts so far.

    Going into the Sweet Sixteen, 5 of my teams are still "alive" (3 are not). So far, all of the teams I predicted for the semifinals, Final Four, and final game are still in play.

    If the rest of my games go as predicted by my shell script, my bracket can be a maximum of 1580 pts.

    Did you run the shell script for yourself? How are your brackets doing?

  2. If you're following my brackets, you probably noticed my brackets have wiped out. I'll post a wrap-up after the championships are over, if you want to compare how your brackets fared vs mine.
