Board Game Geek Data - Sliced and Diced

Inspired by the discovery of the API for the BoardGameGeek website, I decided to play around with some of its data using a bit of Ruby and JavaScript scripting. One thing I wanted to look at was the interconnectedness of game mechanics, but first I wanted to find alternative ways to measure one's "use" of one's game collection, and the associated question of "can a board game collection ever be too big?".


H-index and play per game

For example, the concept of the H-index is being discussed quite a bit on the ’geek. In short, this entails sorting your games by number of plays in descending order and finding the last entry in the list whose number of plays is higher than (or equal to) its position in the list. Its rank is your H-index. (This does, of course, require one to meticulously log one’s plays, but that is another story… For example, my overall BGG H-index for the last 12 months is (or seems to be) a measly 5.) Another concept used is to take the number of games owned and divide that by the total number of plays (in, for example, the last 12 months).
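As a rough illustration of the H-index calculation described above, here is a minimal Ruby sketch. It is not the code from my gist; the method name `h_index` and the play counts are made up for the example.

```ruby
# Compute an H-index from a list of play counts (one entry per game).
# Hypothetical helper for illustration only.
def h_index(play_counts)
  sorted = play_counts.sort.reverse # most-played game first
  # Keep positions (1-based) whose play count is at least the position;
  # the number of positions kept is the H-index.
  sorted.each_with_index.take_while { |plays, i| plays >= i + 1 }.size
end

# Made-up play counts, not real logged data.
plays = { "Game A" => 9, "Game B" => 7, "Game C" => 7, "Game D" => 2, "Game E" => 1 }
puts h_index(plays.values) # prints 3
```

Here three games have at least 3 plays each, but there are not four games with at least 4 plays, so the H-index is 3.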

Introducing play time

The problem with both of those measures is, obviously, that they disregard play time. A game that takes 10 minutes to play would (probably) rack up plays more easily than a game that takes 4+ hours. So, my first experiment entailed sorting games by play time. To do this I just used the, admittedly rather crude, notion of “play time” in the BGG database and multiplied that by the logged number of plays.
Here’s my top 7 by play time (in minutes):
  1. [“Pandemic”, 405],
  2. [“The Settlers of Catan”, 270],
  3. [“Puerto Rico”, 270],
  4. [“Nations”, 240], 
  5. [“Coloretto”, 210], 
  6. [“Terra Mystica”, 200], 
  7. [“Dominant Species”, 180]

Compared to my top 7 by number of plays: 
  1. [“Pandemic”,9], 
  2. [“Coloretto”,7], 
  3. [“Martian Dice”,7], 
  4. [“7 Wonders”,6], 
  5. [“King of Tokyo”,6], 
  6. [“Love Letter”,6], 
  7. [“Dominion”,5]

Nothing shocking. (Except that it does highlight how flaky my BGG logging has been at times…)
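The play-time ranking above boils down to one multiplication per game. Here is a small Ruby sketch of the idea, assuming each game is a hash with hypothetical keys `:name`, `:playing_time` and `:plays`. The per-play minutes are back-calculated from the totals and play counts in the lists above, so treat them as assumptions.

```ruby
# Rank games by total logged play time (minutes per play * number of plays).
# The hash keys are made up for this sketch.
def rank_by_play_time(games)
  games.map { |g| [g[:name], g[:playing_time] * g[:plays]] }
       .sort_by { |_, minutes| -minutes } # longest total first
end

games = [
  { name: "Coloretto", playing_time: 30, plays: 7 }, # 30 min is an assumption
  { name: "Pandemic",  playing_time: 45, plays: 9 }  # 45 min is an assumption
]

rank_by_play_time(games).each { |name, minutes| puts "#{name}: #{minutes}" }
```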

And now, mechanics

OK, next I explored the game mechanics elements in the BGG database. Each game can be assigned many game mechanics, so I wanted to break down playtime of my own games by mechanics, instead of by game. The formula I came up with is simply the following:
time for a mechanic = the sum, over every game containing that mechanic, of that game's total play time divided by the number of mechanics in that game
That gives me this list of top 7 game mechanics by play time: 
  1. [“Variable Player Powers”, 517], 
  2. [“Hand Management”, 468], 
  3. [“Card Drafting”, 397], 
  4. [“Set Collection”, 346], 
  5. [“Route/Network Building”, 308], 
  6. [“Variable Phase Order”, 295], 
  7. [“Area Control / Area Influence”, 242]
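The formula for play time per mechanic can be sketched in Ruby like this. It is an illustration under assumptions rather than the gist code: the hash keys (`:playing_time`, `:plays`, `:mechanics`) and the sample games are invented.

```ruby
# Split each game's total play time evenly across its mechanics,
# then sum the shares per mechanic.
def play_time_by_mechanic(games)
  totals = Hash.new(0.0)
  games.each do |game|
    mechanics = game[:mechanics]
    next if mechanics.empty? # nothing to attribute the time to
    share = game[:playing_time] * game[:plays] / mechanics.size.to_f
    mechanics.each { |mechanic| totals[mechanic] += share }
  end
  totals.sort_by { |_, minutes| -minutes }
end

# Made-up games for illustration.
games = [
  { playing_time: 60, plays: 2, mechanics: ["Hand Management", "Set Collection"] },
  { playing_time: 30, plays: 3, mechanics: ["Hand Management", "Card Drafting", "Dice Rolling"] }
]

play_time_by_mechanic(games).each { |mechanic, minutes| puts "#{mechanic}: #{minutes}" }
```

Dividing by the number of mechanics means a game with many mechanics contributes only a small slice of its time to each one, so the per-mechanic totals still add up to the total time played.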

The rating of mechanics

One of the main attractions of BGG is being able to rate games, I guess, so I extended my code to generate a list of mechanics by rating. To do this I basically used the same formula as above, but replaced play time with rating, and normalized it afterwards. So:
rating for a mechanic = the sum, over every game containing that mechanic, of that game's rating divided by its number of mechanics, all divided by the sum of one over the number of mechanics in each of those games.
So, my top 7 mechanics by rating are:
  1. [“Variable Phase Order”, 9.46, “# of games: 5 (Weight: 2.92)”], 
  2. [“Commodity Speculation”, 9.0, “# of games: 1 (Weight: 0.33)”], 
  3. [“Worker Placement”, 9.0, “# of games: 2 (Weight: 1.17)”], 
  4. [“Area Enclosure”, 8.75, “# of games: 5 (Weight: 2.67)”], 
  5. [“Voting”, 8.5, “# of games: 2 (Weight: 0.67)”], 
  6. [“Deck / Pool Building”, 8.5, “# of games: 4 (Weight: 1.33)”], 
  7. [“Auction/Bidding”, 8.32, “# of games: 4 (Weight: 1.48)”]

The weight is simply the sum of 1/(number of mechanics) over each game containing that mechanic, and gives an indication of the “trustworthiness” of the rating, I guess.
You can also look at the bottom 7 by rating, of course: 
  1. [“Dice Rolling”, 6.06, “# of games: 8 (Weight: 2.76)”], 
  2. [“Pattern Recognition”, 5.75, “# of games: 2 (Weight: 1.33)”], 
  3. [“Hex-and-Counter”, 5.0, “# of games: 1 (Weight: 0.5)”], 
  4. [“Roll / Spin and Move”, 2.0, “# of games: 3 (Weight: 1.64)”], 
  5. [“Betting/Wagering”, 2.0, “# of games: 1 (Weight: 0.33)”], 
  6. [“Paper-and-Pencil”, 2.0, “# of games: 1 (Weight: 1.0)”], 
  7. [“Stock Holding”, 2.0, “# of games: 1 (Weight: 0.14)”]
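The rating calculation is a weighted average, where each game's weight for a mechanic is 1/(number of mechanics in that game). A minimal Ruby sketch, with invented hash keys (`:rating`, `:mechanics`) and invented sample games, might look like this:

```ruby
# Weighted average rating per mechanic.
# Returns rows of [mechanic, rating, number of games, weight].
def rating_by_mechanic(games)
  weighted_sum = Hash.new(0.0) # sum of rating * weight
  weight_sum   = Hash.new(0.0) # sum of weights (the "Weight" column)
  game_count   = Hash.new(0)
  games.each do |game|
    mechanics = game[:mechanics]
    next if mechanics.empty? || game[:rating].nil? # skip unrated games
    weight = 1.0 / mechanics.size
    mechanics.each do |mechanic|
      weighted_sum[mechanic] += game[:rating] * weight
      weight_sum[mechanic]   += weight
      game_count[mechanic]   += 1
    end
  end
  rows = weight_sum.keys.map do |m|
    [m, (weighted_sum[m] / weight_sum[m]).round(2), game_count[m], weight_sum[m].round(2)]
  end
  rows.sort_by { |_, rating, _, _| -rating }
end

# Made-up games for illustration.
games = [
  { rating: 9, mechanics: ["Worker Placement", "Hand Management"] },
  { rating: 6, mechanics: ["Hand Management", "Dice Rolling", "Set Collection"] }
]

rating_by_mechanic(games).each { |row| p row }
```

With a single game, the weights cancel and the mechanic simply inherits that game's rating, which is why a low weight suggests the number should be taken with a pinch of salt.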

Combos

But these mechanics don’t operate on their own, so I wanted to see which combinations of mechanics appear often (or never) in my games, and generated the following lists.
Top 7 game mechanics combined with “Hand Management”: 
  1. [“Set Collection”, 20], 
  2. [“Card Drafting”, 15], 
  3. [“Variable Player Powers”, 10], 
  4. [“Area Control / Area Influence”, 9], 
  5. [“Simultaneous Action Selection”, 9], 
  6. [“Deck / Pool Building”, 8], 
  7. [“Route/Network Building”, 7]

Top 7 game mechanics combined with “Set Collection”: 
  1. [“Hand Management”, 20], 
  2. [“Card Drafting”, 15], 
  3. [“Tile Placement”, 10], 
  4. [“Variable Player Powers”, 9], 
  5. [“Area Control / Area Influence”, 8], 
  6. [“Worker Placement”, 7], 
  7. [“Simultaneous Action Selection”, 6]

Top 7 game mechanics combined with “Card Drafting”: 
  1. [“Set Collection”, 15], 
  2. [“Hand Management”, 15], 
  3. [“Variable Player Powers”, 7], 
  4. [“Route/Network Building”, 6], 
  5. [“Tile Placement”, 5], 
  6. [“Deck / Pool Building”, 5], 
  7. [“Simultaneous Action Selection”, 4]

And so on… 
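Counting combinations like these amounts to tallying every pair of mechanics that shares a game. The following Ruby sketch shows one way to do it, assuming each game is a hash with a hypothetical `:mechanics` key; the method names and the sample games are made up.

```ruby
# Count how often each unordered pair of mechanics appears in the same game.
def mechanic_pair_counts(games)
  counts = Hash.new(0)
  games.each do |game|
    # Sorting makes ["A", "B"] and ["B", "A"] the same hash key.
    game[:mechanics].uniq.sort.combination(2) { |pair| counts[pair] += 1 }
  end
  counts
end

# List the mechanics most often combined with the given one.
def combined_with(pair_counts, mechanic)
  pair_counts.select { |pair, _| pair.include?(mechanic) }
             .map { |pair, n| [(pair - [mechanic]).first, n] }
             .sort_by { |_, n| -n }
end

# Made-up games for illustration.
games = [
  { mechanics: ["Hand Management", "Set Collection", "Card Drafting"] },
  { mechanics: ["Hand Management", "Set Collection"] },
  { mechanics: ["Set Collection", "Tile Placement"] }
]

pairs = mechanic_pair_counts(games)
combined_with(pairs, "Hand Management").each { |row| p row }
```

The same pair counts can double as edge weights between mechanics, so they only need to be computed once.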

And (inter)action

Fascinating. However, it is always fun to visualise data. To get an interactive visual appreciation of the interconnectedness of these mechanics, I decided to try to visualise them using D3. First with a force graph. Each node represents a game mechanic, its colour reflects the rating, its size the weight, and each edge in the graph is weighted by the number of times the two mechanics appear together in the same game.
This is what that entangled mess looks like:
Next I wanted to see if a matrix view of it would make more sense, so I ended up with this:

The Possible Futures

Simplifications

There's still too much going on in these visualisations, so some simplification could be interesting to look into. Like PCA (principal component analysis), why not?

Look at arbitrary user data

For now my scripts are locked to my own user data, but there is of course nothing stopping me from turning this into a kind of web service where users can explore their own data.

Improving the visualisations

There are many ways to improve the visualisations. The ones I presented above are not-very-heavily-modified versions of Mike Bostock's D3 example code.

Conclusion

The moral of the story is that I'll try to be better at logging my board game plays...
Also, if anyone wants to improve upon my "work", I've left all my Ruby code as a gist on GitHub.