Friday, July 17, 2009

SC Engine: Part 5 - Linking the Video to the Game

- Introduction
- Part One: System Overview
- Part Two: System Overview: Messages and Applications
- Part Three: Screen Scraping
- Part Four: YouTube Parsing
- Part Five: Linking the Video to the Game
- Part Six: Messaging Middleware
- Part Seven: The Console
- Part Eight: The Site

The core of SC Engine is the ability to take the scheduling data (the names of players and teams and the games they've played) and link it with the YouTube data. The app that does this is called "Video Game Linking". It lets me specify which YouTube id corresponds to which game in the schedule.

Let's just jump right into the code:

Video Game Linking Module

First off, some notes on the strange stuff. "repo" stands for "repository", and it's basically my generic dictionary wrapper. The dictionary is wrapped so that once I'm done using it, it can be saved to a file. It's not wrapped in the proxy-object sense; rather, it is stored so that the proper way to access it is within a context. On its own, a repository could work like this...


store = PickleFileStore()
repo = Repository(store)

with repo.use() as data:
    data['a'] = 1


When repo.use() is called, it creates a context manager that, on closing, saves off the data. The use decorator that I've utilized on my app just wraps this functionality and adds the resulting data as arguments to the function. The proleague_match_id_announcement method, without the decorator, would look like this...


def proleague_match_id_announcement(self, msg):
    with self.repo.use('games') as games:
        games.add_proleague_match(msg.match_id, msg.team_one, msg.team_two)



...so really, all it helps in doing is making the function use one less tab, which is always a plus in my book.
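For anyone curious what could sit behind repo.use() and the use decorator, here is a minimal sketch. It is not the actual SC Engine code: DictStore is a made-up in-memory stand-in for PickleFileStore, the load/save method names are assumptions, and the App class and Msg tuple exist only to exercise the decorator.

```python
import collections
import contextlib
import functools


class DictStore(object):
    """Hypothetical in-memory stand-in for PickleFileStore."""

    def __init__(self):
        self.saved = {}

    def load(self, name):
        # Hand back a copy so unsaved changes never leak into the store.
        return dict(self.saved.get(name, {}))

    def save(self, name, data):
        self.saved[name] = dict(data)


class Repository(object):
    def __init__(self, store):
        self.store = store

    @contextlib.contextmanager
    def use(self, name='default'):
        data = self.store.load(name)
        yield data
        # Reached only if the with-block finished without raising.
        self.store.save(name, data)


def use(*names):
    """Open each named repository and pass it to the method as an extra argument."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, msg):
            with contextlib.ExitStack() as stack:
                opened = [stack.enter_context(self.repo.use(n)) for n in names]
                return func(self, msg, *opened)
        return wrapper
    return decorator


class App(object):
    """Hypothetical app, only here to show the decorator in action."""

    def __init__(self, repo):
        self.repo = repo

    @use('games')
    def proleague_match_id_announcement(self, msg, games):
        games[msg.match_id] = (msg.team_one, msg.team_two)


Msg = collections.namedtuple('Msg', 'match_id team_one team_two')
store = DictStore()
app = App(Repository(store))
app.proleague_match_id_announcement(Msg(1, 'team A', 'team B'))
print(store.saved)  # {'games': {1: ('team A', 'team B')}}
```

In this sketch the save happens after the yield, so an exception inside the with-block skips it and the stored data stays as it was.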

Also, the "add_repository" method just allows for custom types to be used as the object you're working with, where without this your data would just be a dictionary. By allowing for a custom type (that takes in the dictionary as the argument to the constructor), I can easily wrap complex logic into other objects while still utilizing the implicit saving of the store. I probably did a horrible job explaining this, but I think the important part is what the application is doing anyway.
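A rough sketch of the custom-type idea, under a few assumptions: here add_repository lives on the repository itself, the store is just a plain dict of name to dictionary, and GameSearch is a made-up wrapper type. The real classes may be arranged differently.

```python
import contextlib


class GameSearch(object):
    """Hypothetical custom type: wraps the raw dictionary it is given."""

    def __init__(self, data):
        self.data = data

    def add_proleague_match(self, match_id, team_one, team_two):
        self.data.setdefault('matches', {})[match_id] = (team_one, team_two)


class Repository(object):
    def __init__(self, store):
        self.store = store   # assumed here: a plain dict of name -> dict
        self.types = {}      # name -> type that wraps the raw dictionary

    def add_repository(self, name, wrapper_type):
        self.types[name] = wrapper_type

    @contextlib.contextmanager
    def use(self, name):
        data = self.store.setdefault(name, {})
        # With no registered type, hand out the raw dictionary.
        wrap = self.types.get(name, lambda raw: raw)
        yield wrap(data)
        # A file-backed store would write `data` to disk at this point.


repo = Repository({})
repo.add_repository('games', GameSearch)

with repo.use('games') as games:
    games.add_proleague_match(1, 'team A', 'team B')

print(repo.store['games'])  # {'matches': {1: ('team A', 'team B')}}
```

The wrapper only ever mutates the dictionary it was handed, which is what lets the store keep saving the same plain data no matter how much logic the custom type adds.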

The typical workflow for this app is as follows:


  1. Somewhere else, a new match is found for the schedule, and an id is given to it. The ProleagueMatchIdAnnouncement is sent out, which gives info regarding the match and the id given to it. The app saves the information that it needs.

  2. A little while after, the GameDataAnnouncement message arrives. It has the match id, and the game number to determine what game in the match it is. The rest of the data is game specifics (id of players, id of map, winner, etc.). Once again, the important data is recorded, joined with the match data.

  3. A few hours to days later, someone posts up a youtube clip of the game. We receive the YouTubeParseAnnouncement message, with data such as ids of the possible players and teams, dates found and game numbers found. We go through our collection of games and find any that match. If more than one does, we send out a "Partial Parse" message (which will eventually allow me to look at these videos manually), but if we only find one, we're (mostly) sure it's it, so we send out a VideoGameLinkAnnouncement to signify this.
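The three steps above can be sketched roughly as follows. This is an illustration rather than the real module: GameLinker, its method signatures, the `send` callable and the dictionary layout are all assumptions, and matching is simplified to plain equality on each field of the spec. What happens when no game matches isn't described above, so the sketch simply sends nothing.

```python
class GameLinker(object):
    """Hypothetical stand-in for the app; `send` is any callable(name, payload)."""

    def __init__(self, send):
        self.send = send
        self.matches = {}   # match_id -> match info
        self.games = []     # game info joined with its match info

    def proleague_match_id_announcement(self, match_id, team_one, team_two):
        # Step 1: remember the match so later game data can be joined to it.
        self.matches[match_id] = {'team_one': team_one, 'team_two': team_two}

    def game_data_announcement(self, match_id, game_number, **details):
        # Step 2: combine the game specifics with the stored match data.
        game = dict(self.matches[match_id])
        game.update(details, match_id=match_id, game_number=game_number)
        self.games.append(game)

    def youtube_parse_announcement(self, youtube_id, spec):
        # Step 3: keep every game whose fields agree with the parsed spec.
        found = [game for game in self.games
                 if all(game.get(key) == value for key, value in spec.items())]
        if len(found) == 1:
            self.send('VideoGameLinkAnnouncement',
                      {'youtube_id': youtube_id, 'game': found[0]})
        elif len(found) > 1:
            self.send('PartialParse',
                      {'youtube_id': youtube_id, 'games': found})
        # No match at all: nothing is sent in this sketch.
        return found
```

A quick way to try it is to pass a list's append-style callback as `send`, feed in one match and a couple of games, and check which message comes out for a narrow spec versus a broad one.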



You'll also notice that there is this idea of a "manual link". Sometimes I stumble across a video that has something different or wrong with it that makes it very difficult for the engine to find. For example...


This video's title reads "Siz)KaL vs Midas [30 November, 2008] 32set @ Proleague". I can gather from the title that Siz)KaL and Midas are the players involved, and that the game took place on November 30, 2008. The game in the video is the 3rd game of the match, but a typo (32set instead of 3set) means the parser sees it as the 32nd game. A future goal is to have the search algorithm handle such problems better, but in the meantime, and for truly extraordinary situations, I can manually choose the game.
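To see why the typo trips things up, here is a small illustration. The regular expression is a stand-in invented for this example, not the parser from Part Four; it just grabs the number written directly before "set".

```python
import re

TITLE = "Siz)KaL vs Midas [30 November, 2008] 32set @ Proleague"


def parse_game_number(title):
    """Return the number written directly before 'set', or None if there is none."""
    found = re.search(r'(\d+)\s*set\b', title, re.IGNORECASE)
    return int(found.group(1)) if found else None


print(parse_game_number(TITLE))  # 32
```

Nothing in the title says the "2" is a stray keystroke, so a pattern like this has no way to tell game 32 from game 3 without extra knowledge, such as the fact that a match never has that many games.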

As for the game search repository itself...

Game Search Repository



As you can see, the GameSearchRepository takes the store, which is simply a dictionary object, in its constructor. It saves info about matches and games, and when asked to find games, it runs the spec through all the known games and returns the results. The repository needs to save the match data so that when the game data arrives, it can combine the two. The GameData object looks like this:

Game Data

Really, there's not much to say here. The game search consists of finding all the game data objects that return True from matches_spec, meaning that all the info in the spec agrees with the data it has on the game. The search is O(n) in the number of known games, so there is room for optimization, but with an entire season's worth of games in memory I haven't noticed enough of a slowdown to start messing with it now.
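As a rough picture of that linear search, here is a minimal sketch. Only the names GameData and matches_spec come from the text above; the keyword-argument constructor, the find_games helper, and the idea that a spec maps each field to a set of acceptable values are assumptions made for the example.

```python
class GameData(object):
    def __init__(self, **fields):
        self.fields = fields

    def matches_spec(self, spec):
        """True when every field named in the spec holds one of its allowed values."""
        for key, allowed in spec.items():
            if self.fields.get(key) not in allowed:
                return False
        return True


def find_games(games, spec):
    # One pass over every known game: O(n) in the number of games.
    return [game for game in games if game.matches_spec(spec)]


games = [
    GameData(date='2008-11-30', game_number=1),
    GameData(date='2008-11-30', game_number=3),
]
print(len(find_games(games, {'date': {'2008-11-30'}})))  # 2
print(len(find_games(games, {'game_number': {3, 32}})))  # 1
```

If the scan ever did become a problem, one option would be to group games by a field that almost every spec carries, such as the date, and only run matches_spec over that group.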
