# Outliers in data

Hi everyone,

I have a question regarding the effect of the financial crisis on the correlations between variables. During the financial crisis, some variables changed significantly while others did not change much. As a result, the correlations between some of the series change over the full sample only because of the huge movements we observed during the crisis. That is, these series are positively correlated before and after the financial crisis, but adding just a few observations from 2008 or 2009 turns the correlation negative or zero. If you do not have enough data to split the sample into two periods, what would be the best strategy to deal with this problem and remove the effect of the financial crisis?
As you can see in the attached image, the cycles of the rent price and the house price co-move from 1995 to 2008 and from 2009 to 2018, and if we remove the year 2009, the correlation for the entire period becomes positive (instead of negative). How would you deal with this problem if both theory and the model results confirm that the house price and the rent price move together?
P.S. The series are in real terms and in logs; a one-sided HP filter was then applied to extract the cycle of each.
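
To make the sensitivity concrete, here is a minimal Python sketch of the check described above: compute the correlation on the full sample and again with the crisis year excluded. The series are synthetic and deliberately exaggerated (they are not the data in the image), and the helper name `corr_with_and_without` is just something made up for this illustration.

```python
import numpy as np

def corr_with_and_without(x, y, drop):
    """Return (full-sample correlation, correlation excluding the `drop` observations)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    drop = np.asarray(drop, dtype=bool)
    full = np.corrcoef(x, y)[0, 1]
    trimmed = np.corrcoef(x[~drop], y[~drop])[0, 1]
    return full, trimmed

# Synthetic annual "cycles" for 1995-2018 (assumption: purely illustrative numbers).
years = np.arange(1995, 2019)
rng = np.random.default_rng(0)
house = np.sin(2 * np.pi * (years - 1995) / 8)
rent = 0.5 * house + 0.1 * rng.normal(size=years.size)  # co-moves with house

# One crisis-sized observation: house collapses while rent stays above trend.
crisis = years == 2009
house[crisis] = -6.0
rent[crisis] = 2.0

full, trimmed = corr_with_and_without(house, rent, crisis)
print(f"full sample: {full:.2f}, excluding 2009: {trimmed:.2f}")
```

With only 24 annual observations, one point far from both means dominates the cross-product term, which is why the sign of the full-sample correlation can differ from the sign in every subsample.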


This is a tough problem. You can only approach this with theory. What do you think explains the movements in 2009? Often, it's a particular shock, which you may want to include in the model.

That's true. It happened because of the financial crisis: the rent price was sticky and did not drop as much as house prices.
My problem is that if I consider the entire period and want to match the results of my model with the actual data, I should get a negative or close-to-zero correlation between the rent price and the house price (only because of the year 2009, whose huge negative effect pushes the correlation to negative or close to zero), whereas the model generates a positive correlation that is close to the actual data only if I do not include 2009. If I split the data, I lose many observations. In a VAR or another reduced-form model, I can account for the financial crisis by adding a dummy variable, but I do not know how to deal with it in a DSGE model.

Are you estimating the model? Or just matching data moments?

Thank you so much for your time. I don't know if I understand the question correctly, but I tried both Bayesian and simple DSGE models. From both, I get positive correlations between the house price and the rent price. Then I compared the correlations and second moments of the model variables with the actual data to see whether both models can replicate the data. I have several shocks in the model, but I don't know how to add a specific shock for that period to the model. I can attach my code if it helps.
I've seen some researchers compare the pattern of the model's IRFs with those of a VAR, and if the patterns are similar, they claim their model works properly. Do you think this approach still makes sense these days? If yes, I might be able to take out the effect of the financial crisis by adding a dummy variable to a VAR model and then check whether the IRF patterns from the model and the VAR match.
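
To see what a one-period crisis dummy does mechanically in a reduced-form regression, here is a minimal sketch using plain least squares on synthetic data. It is a static regression, not a VAR (in a VAR the crisis observation would still enter as a lagged regressor in later periods), and all names and numbers are assumptions made up for the example.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y = X b + e."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic data (assumption: illustrative only), with one crisis-sized outlier.
rng = np.random.default_rng(0)
n = 24
crisis = 14
x = rng.normal(size=n)
y = 0.2 + 0.8 * x + 0.1 * rng.normal(size=n)
y[crisis] -= 5.0

const = np.ones(n)
dummy = np.zeros(n)
dummy[crisis] = 1.0  # impulse dummy: 1 in the crisis period, 0 elsewhere

# (a) Keep all observations and add the impulse dummy.
b_dummy = ols(np.column_stack([const, x, dummy]), y)

# (b) Drop the crisis observation and estimate without a dummy.
keep = np.arange(n) != crisis
b_drop = ols(np.column_stack([const[keep], x[keep]]), y[keep])

print("intercept, slope with dummy:", b_dummy[:2])
print("intercept, slope dropping  :", b_drop)
```

The dummy absorbs the crisis residual entirely, so the intercept and slope in (a) coincide with those in (b) up to numerical precision.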

If you are doing Bayesian estimation, you could simply set the financial crisis observations to NaN and see whether that makes a big difference for your model. The VAR model would only help if you could cleanly identify the shock you are interested in. Having a dummy is essentially throwing those periods out.
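
One way to prepare such a dataset, sketched in Python with pandas: blank out the crisis rows of the observables before passing the file to the estimation. The column names, the 2008-2009 window, and the helper `mask_periods` are assumptions for illustration, and the values are placeholders, not actual data. Whether CSV is the right file format depends on your setup.

```python
import numpy as np
import pandas as pd

def mask_periods(data, first, last, columns=None):
    """Return a float copy of `data` with index labels first..last (inclusive) set to NaN."""
    masked = data.astype(float)  # astype returns a new frame, the input is untouched
    cols = list(masked.columns) if columns is None else list(columns)
    masked.loc[first:last, cols] = np.nan
    return masked

# Placeholder annual observables for 1995-2018 (hypothetical names and values).
years = pd.Index(range(1995, 2019), name="year")
data = pd.DataFrame(
    {
        "house_price_cycle": np.linspace(-1.0, 1.0, len(years)),
        "rent_price_cycle": np.linspace(-0.5, 0.5, len(years)),
    },
    index=years,
)

# Treat the crisis years as missing observations.
masked = mask_periods(data, 2008, 2009)

# Serialize with explicit NaN markers; pass a path to to_csv to write a file instead.
csv_text = masked.to_csv(na_rep="NaN")
print(masked.loc[2006:2011])
```

Comparing the estimates from the original and the masked dataset then shows how much the crisis observations drive the results.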