Date of Award

Spring 1-1-2017

Document Type


Degree Name

Master of Science (MS)

First Advisor

Jason R. Marden

Second Advisor

Michael C. Mozer

Third Advisor

Behrouz Touri


Markov Decision Processes (MDPs) are discrete-time random processes that provide a framework for modeling sequential decision problems in stochastic environments. However, the use of MDPs to model real-world decision problems is restricted, since such problems often have continuous variables as part of their state space. Common approaches to extending MDPs to these problems include discretization, which suffers from inefficiency and inaccuracy. Here, we solve MDPs with both continuous and discrete state variables by assuming the reward to be piecewise linear. We do, however, allow the continuous variable to have an infinite and continuous transition function. We then use our approach to solve an MDP that models human behaviour in a specific task called delayed gratification. Simulation results are presented to analyze the model predictions, which are fit post hoc to synthetic as well as human data, in order to justify both the approach to solving the MDP and the modeling of behaviour.
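
To make the remark about discretization concrete, the following is a minimal sketch of the baseline approach the abstract contrasts against, not the method developed in the thesis. It runs value iteration on a toy MDP whose single continuous state is snapped to a uniform grid. The function name, the two drift actions, the step size of 0.1, the discount factor, and the particular piecewise-linear reward are all assumptions chosen purely for illustration.

```python
import numpy as np

def discretized_value_iteration(n_bins, gamma=0.95, tol=1e-8, max_iter=10_000):
    """Value iteration on a toy 1-D MDP whose continuous state x in [0, 1]
    is approximated by n_bins grid points (all modeling choices are illustrative)."""
    xs = np.linspace(0.0, 1.0, n_bins)
    # Hypothetical piecewise-linear reward: 0 up to x = 0.5, then rising to 1 at x = 1.
    reward = np.maximum(0.0, 2.0 * xs - 1.0)
    # Two hypothetical deterministic actions: drift left or right by 0.1, clipped to [0, 1].
    shifts = (-0.1, 0.1)
    # Snap every successor state to its nearest grid point; this is where
    # the discretization error enters.
    next_idx = [
        np.abs(np.clip(xs + s, 0.0, 1.0)[:, None] - xs[None, :]).argmin(axis=1)
        for s in shifts
    ]
    values = np.zeros(n_bins)
    for _ in range(max_iter):
        # Bellman backup: immediate reward plus discounted value of the successor.
        q = np.stack([reward + gamma * values[idx] for idx in next_idx])
        new_values = q.max(axis=0)
        converged = np.max(np.abs(new_values - values)) < tol
        values = new_values
        if converged:
            break
    return xs, values

if __name__ == "__main__":
    for n in (5, 11):
        xs, values = discretized_value_iteration(n)
        print(f"{n:>2} bins: estimated V(0) = {values[0]:.3f}")
```

In this toy problem a 5-point grid has spacing 0.25, so a step of 0.1 always snaps back to the starting grid point and the estimated value at x = 0 is zero, whereas an 11-point grid can represent the step and yields a clearly positive value. Refining the grid reduces such errors at the cost of more states to sweep, which is the inefficiency and inaccuracy trade-off mentioned above.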