Should we use the posterior mean or the posterior mode?

I estimate some medium-scale DSGE models and find that the posterior mode does not coincide with the posterior mean. Is this common in Bayesian estimation? If not, how should I improve the model? If yes, I am not sure which set of parameter values I should use for further quantitative exercises, because they tell me different things about the estimated model. Some important moments are quite different between simulations based on the two sets of numbers. Which one is more trustworthy?

Hi, the mean and the mode of the posterior distribution are just two measures of the central tendency of the distribution. If the posterior distribution of a parameter is normal, the mean, the mode, and the median coincide. However, the exact distribution of the posterior is typically not known, and then you have to pick one measure. With irregular posteriors I would usually go with the mode. To see the problem, consider a Bernoulli-distributed posterior with p = 0.6: 60% of the draws will be 1 and 40% will be 0, so the mode is 1 while the mean is 0.6. I would argue that the mode represents the distribution better than the mean (particularly as the latter is not guaranteed to fall on a value actually observed in the sample).
However, I am sure you can construct cases where the mean is the better measure. I would recommend plotting the posterior distributions, seeing where the mean and the mode fall for each parameter, and then choosing the one that better represents the sample.
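To see how far apart the two measures can be on your own chains, here is a hedged MATLAB sketch (not from the thread; the lognormal sample merely stands in for one parameter's MH draws, and ksdensity requires the Statistics Toolbox):

```matlab
% Illustration only: mean vs. mode of a right-skewed "posterior" sample.
rng(1);
draws = exp(0.5*randn(10000,1));   % stand-in for one column of MH draws
post_mean = mean(draws);           % pulled up by the long right tail
[f, xi]   = ksdensity(draws);      % kernel density estimate of the posterior
[~, imax] = max(f);
post_mode = xi(imax);              % peak of the estimated density
fprintf('mean = %.2f, mode = %.2f\n', post_mean, post_mode);
```

For a right-skewed sample like this, the mean exceeds the mode, so moments computed at the two points will generally differ, which is exactly the situation described in the question.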


Thanks. The explanation is very clear. Your answers to my other questions are much appreciated as well.

Thanks for the answers. Here is a follow-up question. By default, the simulation after the estimation command is at the posterior mode. What if I want to use the posterior mean? I guess I have to write another Dynare code. However, I have trouble specifying the shock standard deviations, because the keyword ‘stderr’ cannot be followed by variable names. Here is what I do:

I write a code identical to the estimation code except that the parameter values are left undetermined. In the code, I read the values from the posterior mean as follows (the estimation file is named ABC):

load ABC_mode.mat;
load ABC_results.mat;

estim_param = char(parameter_names); % character matrix containing the names of the estimated parameters; with multiple parameters a loop is needed

for iter = 1:n_num
    eval([estim_param(iter,:) ' = oo_.posterior_mean.parameters.' estim_param(iter,:) ';']);
end

This works fine for parameters other than the standard deviations of the shocks. Say I need to pick the std of shock1; I wrote

var shock1; stderr oo_.posterior_mean.shocks_std.shock1;

Dynare will give the error message:

character unrecognized by lexer

How should I deal with it?

Thank you for the attention.

You could specify your shock processes as

e = rho*e(-1) + sigma_e*epsilon;

where sigma_e is an estimated parameter and

shocks; var epsilon; stderr 1; end;

This way, the estimated standard deviation sigma_e is a normal parameter, and you can use the procedure described above.
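Putting the pieces together, a minimal hedged sketch of this normalization as a `.mod` fragment (all names, priors, and the AR(1) form are placeholders, not taken from the original files):

```
parameters rho sigma_e;

model;
// ... other model equations ...
e = rho*e(-1) + sigma_e*epsilon;   // shock process with an explicit scale parameter
end;

shocks;
var epsilon; stderr 1;             // innovation variance normalized to 1
end;

estimated_params;
rho, beta_pdf, 0.5, 0.2;
sigma_e, inv_gamma_pdf, 0.01, inf;
end;
```

Because sigma_e is now an ordinary parameter, its posterior mean can be read from oo_.posterior_mean.parameters.sigma_e with the same eval loop as for every other parameter.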

Thanks. However, I cannot do the same thing in the estimation code, can I? I mean, I just fix the standard deviation of the shock at 1 and estimate the size of the multiplier in front of it.

Attached is a simple attempt of mine; Dynare gives me the error message:

??? Error using ==> kalman_filter at 83
The variance of the forecast error remains singular until the end of the sample

The file xydata.mat is generated in the following way:

save xydata x y;
estxy.mod (527 Bytes)

Hi, from my experience, this should work. I think you are just missing the shocks block that standardizes the variance to 1. If I add it, I get exactly the same results as with the standard estimation approach. See the attached files.
estxy2.mod (601 Bytes)
estxy.mod (595 Bytes)

Thanks! That really helps.

Thanks for all the help with my question on this topic. However, I find the simulation after estimation is still a mystery: I am not sure what parameter values Dynare uses. The simulation outcome right after estimation is far from the results simulated at the posterior mean, the posterior mode, or even the mode before the MH step.

I hereby upload an example with real data.

tryhard_est5.mod estimates the model, simulates a sample and computes certain moments.

tryhard_cal5_mean uses the posterior mean to simulate and compute the same moments.

tryhard_cal5_mode uses the posterior mode to simulate and compute the same moments.

tryhard_cal5_premh uses the posterior mode before the MH step to simulate and compute the same moments.

tryhard_cal_mean and tryhard_cal_mode load the results from tryhard_est5.mod automatically, but tryhard_cal5_premh uses the pre-MH mode (mean?) values, entered manually by copying and pasting.

What I find is that the results based on the posterior mean, the posterior mode, and the mode before MH are quite similar, but the result right after estimation is not. What causes the difference? Does Dynare use a different set of parameter values that we don’t know about? (59.9 KB)


I am running a rather complex model that has already produced the results below. Nevertheless, I have some concerns about the adequacy of the prior and posterior distributions. The convergence diagnostics also indicate that the chains have not converged.

Additionally Dynare posted the following warning:
Warning: Matrix is singular to working precision.

In missing_DiffuseKalmanSmootherH1_Z at 249
In DsgeSmoother at 197
In evaluate_smoother at 90
In shock_decomposition at 60
In nonlinearomega at 2283
In dynare at 180

In contrast, the acceptance ratio and the identification analysis seem quite good.
I would like to know whether the overall results are acceptable or, on the contrary, whether the model has some kind of misspecification related to the prior distributions, etc.
I would be most thankful for experienced advice on this matter.
Shock (98.9 KB)
nonlinear_JP_29 August Bodeinstein 1 copy.0059 (635 KB) (164 KB)
convergence diagonostic .zip (234 KB) (261 KB)

Your posterior does not look good at all. Non-convergence seems to be a big issue. Definitely use a longer chain and have a look at the trace_plots to see how the MCMC behaves.
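For reference, a longer chain and additional blocks can be requested directly in the estimation command; a hedged sketch (the datafile name and all numbers are placeholders, not from the original model):

```
estimation(datafile=mydata, mode_compute=6, mh_replic=200000,
           mh_nblocks=2, mh_jscale=0.3, mh_drop=0.5);
```

Lowering mh_jscale raises the acceptance ratio; a common target is roughly 0.2 to 0.3.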

Dear Pfeifer,
I am very thankful for your previous advice. I already increased the chain to 50,000 draws; however, I did not reach convergence. The convergence diagnostics plots are posted below.
Unfortunately, I clumsily overwrote the trace plots. Nevertheless, they also showed that only very few parameters had converged.
Should the chain be longer, or is there probably some misspecification issue with my model?

As a matter of fact, I must mention that, initially, it was very hard to find the posterior mode. Even with only very few parameters to estimate, I simply got the error: Error using chol - The matrix must be positive definite.

Therefore, following your previous advice (Saving the estimated shocks), although you did not generally recommend it, I replaced the line chol(hh) in the try-catch statement of dynare_estimation_1.m with hh = 1e-4*eye(size(hh)).

Today, I tried to redo the calculation with the original code of dynare_estimation_1.m. I found that it is not out of the question that some model misspecification is related to the match between my model and the observable variables, namely gov_share_jp_obs, oilprod_jp_deltalog_obs, oilprod_for_deltalog_obs, and oilprod_jp_deltalog_obs_quarterly.

Furthermore, with the original code of dynare_estimation_1.m, the output from mode_compute = 9 is the following:
(minus) the hessian matrix at the “mode” is not positive definite!
=> posterior variance of the estimated parameters are not positive.
You should try to change the initial values of the parameters using
the estimated_params_init block, or use another optimization routine.
If you have some time to give me a better insight on this matter I shall be extremely obliged to you.
nonlinear_JP_31 August Bodeinstein 1 copy.0059 long chain copy (641 KB)

Following up on the previous post, below are the trace plots and additional results for 100,000 draws (with the small-identity-matrix code).
After a full check-up of the model I got approximately the same results with the standard Dynare code and with the Hessian set to a small identity matrix.
However, estimating all parameters with the standard Dynare code proved once again almost impossible (the console output is in nonlinearomega consola output .txt in the shock decompositions & folder). How trustworthy are these results?

Regarding the concerns expressed about the posteriors, I note that all prior means are the optimal values (estimated by the model’s author with maximum likelihood), which may explain the vertical line of the posteriors, the convergence diagnostics, and the trace plots of all parameters.
If so, should I change the prior means to different, but nearby, values?
Any suggestion shall be welcome!
shock decompositions & (637 KB) (530 KB)
trace_plots (1.02 MB) (812 KB) (757 KB)

Added to the previous post.
nonlinear_USA_3 Sep_ 100000 draws_Code Bodenstein 1970 (329 KB)

I look forward to some feedback, please!

This looks like a problem with the convergence of the MCMC and the initial jumping distribution. Please download the unstable version and try the use_tarb option.

Thank you. After getting a trial license for the MATLAB Statistics Toolbox, I finally ran the model with the use_tarb option.
Nevertheless, the computation with the Tailored Randomized Block MCMC algorithm proved to be much slower, with a very low acceptance ratio. To avoid the low acceptance ratio I tried mode_compute = 6, and after a few hours I got the following error: Error using ==> chol
Matrix must be positive definite.

Note that I had also replaced the line chol(hh) in the try-catch statement of dynare_estimation_1.m with hh = 1e-4*eye(size(hh)) in the unstable version.

What can possibly be done about this?

Even if it takes longer, stay with the TaRB sampler and lower mh_jscale to get a better acceptance rate. Each draw will take longer, but you will need fewer draws. The reason is that the TaRB algorithm recomputes the Hessian, so the “Matrix must be positive definite” error will not be an issue. Use, say, 1000 draws in a first step and then use the trace_plot to check the movement of the MCMC. Check whether there is still a drift; if yes, restart from the most recent values.
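In command form, the suggested first step might look like the following hedged sketch (the datafile name and the scale are placeholders; use_tarb is the option name for the unstable version discussed in this thread):

```
estimation(datafile=mydata, use_tarb, mh_replic=1000, mh_jscale=0.2);
```

After the run, inspect the trace plots; if the chain still drifts, the load_mh_file option lets a subsequent run continue from the previously stored draws instead of starting over.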

Borrowing this thread to ask a question (because I don’t want to create redundant topics):

Should I consistently use the mode or the mean of the posterior distribution, or can I mix them? (For instance, I am currently mostly using the mode, but for one of the parameters the mean is much more plausible than the mode.)

You cannot mix them, as that would not make sense. The mode provides the single most likely parameter vector, while the mean summarizes the full distribution of the parameters.