How to choose the state variables in a dynamic programming problem

jpfeifer, thank you very much! I put the question here.

The question is how to choose the state variables in a dynamic programming problem.

For example, on page 100 of Walsh’s Monetary Theory and Policy (3e), the budget constraint is given by (3.18). When setting up the dynamic programming problem, one can define a new variable such as “wealth”, which is a combination of k_{t-1}, m_{t-1}, and b_{t-1}, as the state variable.

w_t := f(k_{t-1}) + (1-delta) k_{t-1} + t_t + m_{t-1} + (1+i_{t-1}) b_{t-1} = c_t + m_t + b_t + k_t

When I set up the Bellman equation, I have two choices:

  1. define a single variable w_t as above, so that the state is “one” variable, w_t,
    or
  2. use k_{t-1}, m_{t-1}, and b_{t-1} as state variables, so that I have “three” state variables.

The questions are

  1. Which is the ‘correct’ choice, and why? Do the two give the same results in general?
  2. What is the general principle (if such a principle really exists) for choosing the state variables in
    a dynamic programming problem?

A state variable is everything that cannot be changed at time t. For example, the capital stock k_{t-1} in your end-of-period stock notation is a state, because it is used for production at time t but cannot be changed anymore at time t: that would require changing yesterday’s investment (i_{t-1}).

This is the general rule: everything that is relevant at time t, but cannot be changed anymore is a state.

The example you describe is special in that the authors are only interested in the evolution of total wealth, not in that of its subcomponents. In this case, a composite state variable can be defined. That would not be possible if you wanted to know the evolution of the variables separately.
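A small numerical illustration of this point, as a sketch only: it is not Walsh's model but a toy cake-eating problem, assumed here purely for illustration, with two perfectly substitutable assets a and b, no returns, and utility sqrt(c). The function names and parameter values are hypothetical. Solving once with the composite state w = a + b and once with the two separate states gives the same value function, because only the sum matters for the feasible set and the payoff.

```python
import numpy as np


def solve_composite(M, beta=0.9, n_iter=200):
    """Value iteration with ONE state: total wealth w in {0, ..., 2M}."""
    W = 2 * M
    V = np.zeros(W + 1)
    for _ in range(n_iter):
        V_new = np.empty_like(V)
        for w in range(W + 1):
            w_next = np.arange(w + 1)           # feasible next-period wealth
            c = w - w_next                      # implied consumption, c >= 0
            V_new[w] = np.max(np.sqrt(c) + beta * V[w_next])
        V = V_new
    return V


def solve_separate(M, beta=0.9, n_iter=200):
    """Value iteration with TWO states: assets a and b, each in {0, ..., M}."""
    V = np.zeros((M + 1, M + 1))
    grid = np.arange(M + 1)
    a_next, b_next = np.meshgrid(grid, grid, indexing="ij")
    for _ in range(n_iter):
        V_new = np.empty_like(V)
        for a in range(M + 1):
            for b in range(M + 1):
                c = a + b - a_next - b_next     # only the sum a + b matters
                feasible = c >= 0
                values = np.sqrt(c[feasible]) + beta * V[feasible]
                V_new[a, b] = values.max()
        V = V_new
    return V


if __name__ == "__main__":
    M = 4
    V1 = solve_composite(M)
    V2 = solve_separate(M)
    gap = max(abs(V2[a, b] - V1[a + b])
              for a in range(M + 1) for b in range(M + 1))
    print("max |V2(a, b) - V1(a + b)| =", gap)
```

The composite formulation loops over 2M + 1 states instead of (M + 1)^2, which is the practical payoff of collapsing the state. What it cannot deliver is the split of wealth between a and b, which is indeterminate in this toy problem.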

Thank you very much!

Dear jpfeifer, here is another question about the choice of state variables.
In a small open economy model as described in Coeurdacier, Rey and Winant’s “The Risky Steady-State”,
the Euler equation for consumption is
u’(c_t) = beta E_t u’(c_{t+1}) r_{t+1}
If we choose end-of-period-t net wealth w_t as the state variable, we get the budget constraint as in the paper,
w_t = w_{t-1} r_t + y_t - c_t
and the period-t state variable is w_{t-1}.
I can put c_t = w_{t-1} r_t + y_t - w_t into the Euler equation and get
u’(w_{t-1} r_t + y_t - w_t) = beta E_t u’(w_{t} r_{t+1} + y_{t+1} - w_{t+1}) r_{t+1}
This is the key equation for finding the linear solution w_{t} = w0 + w1*w_{t-1} + w2*y_t + w3*r_t.

If I instead choose wealth after the agent receives the endowment y_t in period t (b_t) as the state variable, I get the budget constraint
b_{t+1} = (b_t - c_t) r_{t+1} + y_{t+1}
and the period-t state variable is b_t.
Using the dynamic programming method, one can get the same Euler equation as above,
which is u’(c_t) = beta E_t u’(c_{t+1}) r_{t+1}

The question is: can I put c_t = b_t - (b_{t+1} - y_{t+1})/r_{t+1} into the Euler equation to get
u’( b_t - (b_{t+1} - y_{t+1})/r_{t+1} ) = beta E_t u’( b_{t+1} - (b_{t+2} - y_{t+2})/r_{t+2} ) r_{t+1}?

And how can I get the value of the left-hand side without the expectation operator E_t (y_{t+1} is a stochastic variable)?
Did I do something wrong?

Again there is one unique timing that is spelled out in the paper.

w = w(-1)*r + y - c

is the correct timing. w(-1) is the state inherited from last period. r and y realize and then w is chosen, which is clearly not predetermined.

What’s the main difference between the two types of budget constraint?

w = w(-1)*r + y - c
b = (b(-1)-c(-1))*r + y 

The second one was also used by Winant in the paper “Dynamic Portfolios in DSGE Models”.
‘w’ is end-of-period-t wealth; ‘b’ is wealth after the agent receives the endowment y in period t.
It seems strange to use the second one to put “c = b - (b(+1)-y(+1))/r(+1)” and “c(+1) = …” into the Euler equation.

Yes, the strange part comes from y being stochastic and not known at the time when the consumption decision takes place. But that does not matter when plugging in, because the equation simply defines an algebraic relationship that has to hold for all occurrences of the variables. It does not tell you what the actual solution looks like.
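To make the algebra concrete, the two budget constraints can be linked step by step. This is a sketch that assumes b_t is defined as wealth right after the endowment arrives, b_t = w_{t-1} r_t + y_t; that definition is inferred from the verbal description in this thread, not quoted from either paper.

```latex
\begin{align*}
b_t     &= w_{t-1} r_t + y_t \\
w_t     &= w_{t-1} r_t + y_t - c_t = b_t - c_t \\
b_{t+1} &= w_t r_{t+1} + y_{t+1} = (b_t - c_t)\, r_{t+1} + y_{t+1} \\
c_t     &= b_t - \frac{b_{t+1} - y_{t+1}}{r_{t+1}} = b_t - w_t
\end{align*}
```

Under this assumption, the ratio (b_{t+1} - y_{t+1})/r_{t+1} is simply w_t, which is chosen at time t. The left-hand side of the Euler equation therefore contains only time-t information, even though b_{t+1} and y_{t+1} are individually unknown at t.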

Thank you very much!

In the case of a one-period lag, what will be the state variable and the control variable?
For example, in parts b and c of the photo attached below.

The control variable is always the contemporaneous choice c_t. The relevant state variable is the past value c_{t-1} in the first case. In the second case, typically both c_{t-1} and c_{t-2} are states.
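As a minimal sketch of the second case, the state can be written as a pair of lags that shifts forward once c_t is chosen. The problem in the photo is not reproduced here, so the names `State` and `next_state` are hypothetical and only illustrate the bookkeeping.

```python
from typing import NamedTuple


class State(NamedTuple):
    """Predetermined variables at time t (hypothetical container)."""
    c_lag1: float  # c_{t-1}
    c_lag2: float  # c_{t-2}


def next_state(state: State, c_t: float) -> State:
    """Shift the lags: today's control c_t becomes tomorrow's c_{t-1}."""
    return State(c_lag1=c_t, c_lag2=state.c_lag1)
```

With only one lag, the second field drops out and the state is just c_{t-1}.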


Thank you very much