The first assignment applies Value Iteration and Policy Iteration to the Gridworld problem. The PDF describes the solutions and the resulting policies in detail.
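For readers unfamiliar with the technique, here is a minimal sketch of Value Iteration on a small gridworld. It is not the assignment's code. The 4x4 grid, the two terminal corners, the reward of -1 per move, the deterministic transitions, and the names `value_iteration`, `step` and `greedy_policy` are all assumptions made for illustration; the actual problem setup is in the PDF.

```python
# Illustrative gridworld (assumed, not the assignment's): n x n grid,
# terminal states in two opposite corners, reward -1 for every move,
# deterministic transitions, moves off the grid leave the state unchanged.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action, n):
    """Return the next state after taking `action` from `state`."""
    r, c = state[0] + action[0], state[1] + action[1]
    return (r, c) if 0 <= r < n and 0 <= c < n else state


def value_iteration(n=4, gamma=1.0, theta=1e-9):
    """Repeat Bellman optimality backups until the largest change is below theta."""
    terminals = {(0, 0), (n - 1, n - 1)}
    V = {(r, c): 0.0 for r in range(n) for c in range(n)}
    while True:
        delta = 0.0
        for s in V:
            if s in terminals:
                continue  # terminal values stay at 0
            best = max(-1.0 + gamma * V[step(s, a, n)] for a in ACTIONS.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            return V


def greedy_policy(V, n=4):
    """Pick, for each non-terminal state, an action that maximizes the backup."""
    terminals = {(0, 0), (n - 1, n - 1)}
    return {
        s: max(ACTIONS, key=lambda name: V[step(s, ACTIONS[name], n)])
        for s in V
        if s not in terminals
    }


if __name__ == "__main__":
    V = value_iteration()
    for r in range(4):
        print([V[(r, c)] for c in range(4)])
    print(greedy_policy(V)[(0, 1)])
```

Under these assumptions each state's value converges to minus the number of moves needed to reach the nearest terminal corner. Policy Iteration reaches the same optimal policy by alternating full policy evaluation with greedy improvement instead of applying the max at every sweep.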
The second assignment is on game theory. The goal is to design a player for the three-player prisoner's dilemma. A variety of bots are provided to test against, and the PDF describes in detail the strategy I used to top the leaderboard. Although non-cooperation is a dominant strategy, the agent has to use its own history and that of the other agents to win. This is because at times we don't know whether we are playing against a copy of our own player. The detailed architecture and the rules the agent follows are on page two of the PDF.
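To illustrate why history matters even though defecting is dominant, here is a rough sketch of a three-player iterated prisoner's dilemma. It is not the agent described in the PDF. The payoff table, the `tit_for_tat_3p` rule (a simple generalization of tit-for-tat), and all function names are assumptions chosen for illustration.

```python
COOPERATE, DEFECT = 0, 1

# Hypothetical payoff table, indexed as PAYOFF[my_move][opp1_move][opp2_move].
# The assignment's real payoffs may differ.
PAYOFF = [
    [[6, 3], [3, 0]],  # I cooperate
    [[8, 5], [5, 2]],  # I defect
]


def payoff(me, opp1, opp2):
    return PAYOFF[me][opp1][opp2]


def defection_dominates():
    """True if defecting pays strictly more whatever the two opponents do."""
    return all(
        payoff(DEFECT, a, b) > payoff(COOPERATE, a, b)
        for a in (COOPERATE, DEFECT)
        for b in (COOPERATE, DEFECT)
    )


def always_defect(my_history, opp1_history, opp2_history):
    return DEFECT


def tit_for_tat_3p(my_history, opp1_history, opp2_history):
    """Cooperate first, then cooperate only if both opponents just cooperated."""
    if not opp1_history:
        return COOPERATE
    if opp1_history[-1] == COOPERATE and opp2_history[-1] == COOPERATE:
        return COOPERATE
    return DEFECT


def play(rounds, strategies):
    """Run an iterated match between three strategies and return total scores."""
    histories = [[], [], []]
    scores = [0, 0, 0]
    for _ in range(rounds):
        moves = [
            strategies[i](histories[i], histories[(i + 1) % 3], histories[(i + 2) % 3])
            for i in range(3)
        ]
        for i in range(3):
            scores[i] += payoff(moves[i], moves[(i + 1) % 3], moves[(i + 2) % 3])
        for i in range(3):
            histories[i].append(moves[i])
    return scores


if __name__ == "__main__":
    print(defection_dominates())
    print(play(10, [tit_for_tat_3p] * 3))
    print(play(10, [always_defect] * 3))
```

With this assumed table, three copies of the history-based player score 60 each over 10 rounds, while three pure defectors score only 20 each. That gap is why an agent that may face copies of itself can benefit from conditioning on history instead of always playing the dominant move.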