Friday, March 15, 2013

Session # 8: Panel Data analysis with R




Assignment

Estimate all three panel models (pooling, fixed effects and random effects) and decide which model best fits the data set.

Panel Data:


Panel data is a combination of time series data and cross-sectional data. It contains observations on several entities (for example, states), each observed over several time periods.

R for Panel Data:

The basic function used for panel data estimation in R is plm(), from the "plm" package.

The data set we have used in this session is "Produc".

Its description is as follows. It contains these variables:

- state : the state
- year : the year
- pcap: public capital stock
- hwy : highway and streets
- water: water and sewer facilities
- util: other public buildings and structures
- pc: private capital stock
- gsp: gross state product
- emp: labor input measured by the employment in non-agricultural payrolls
- unemp: state unemployment rate


Download and load the "plm" package.
Use the data set "Produc", a panel data set included in the plm package, for the panel estimations.
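
The two steps above can be sketched as follows. The install line is commented out because it only needs to be run once, and head() and str() are just one possible way to inspect the data.

```r
# install.packages("plm")   # run once if the package is not yet installed
library(plm)

# Load the Produc panel data set that ships with plm
data("Produc", package = "plm")

# Quick look at the first rows and the column types
head(Produc)
str(Produc)
```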










Step 1: Estimating the pooling model
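
A minimal sketch of this step. The formula is an assumption, since the post does not show the one used in the session: it is a log-linear production function that is commonly fitted to "Produc". The object name pooled is the one used in the tests later in this post.

```r
library(plm)
data("Produc", package = "plm")

# Pooling model: ordinary least squares on the stacked data,
# ignoring the state and year structure.
# The formula is an assumed specification, not taken from the session.
pooled <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
              data = Produc,
              index = c("state", "year"),
              model = "pooling")

summary(pooled)
```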

















Step 2: Estimating the fixed model
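
A minimal sketch of this step, using the same assumed formula as for the pooling model. In plm, the fixed effects estimator is requested with model = "within". The object name fixed1 matches the name used in the tests later in this post.

```r
library(plm)
data("Produc", package = "plm")

# Fixed effects ("within") model: each state gets its own intercept,
# so only variation within a state over time is used.
# The formula is an assumed specification, not taken from the session.
fixed1 <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
              data = Produc,
              index = c("state", "year"),
              model = "within")

summary(fixed1)

# The estimated state-specific intercepts can be extracted with fixef()
head(fixef(fixed1))
```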








Step 3: Estimating the random model
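
A minimal sketch of this step, again with the assumed formula. The object name random1 matches the name used in the Hausman test later in this post.

```r
library(plm)
data("Produc", package = "plm")

# Random effects model: state effects are treated as random draws
# that are uncorrelated with the regressors.
# The formula is an assumed specification, not taken from the session.
random1 <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
               data = Produc,
               index = c("state", "year"),
               model = "random")

summary(random1)
```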




Now, to choose the model that best fits the "Produc" data set, we run pairwise hypothesis tests among the three models and select the best fit at the end.


Test 1: Between pooling and fixed model


pFtest(fixed1, pooled)




Test details :

H0 (null hypothesis): the individual (and, if included, time) effects are all zero : Pooling Model
Alternative hypothesis: at least one of these effects is non-zero : Fixed Model

This is an F test comparing the fixed model with the pooling model.
As the p-value is very low, the null hypothesis is rejected: the effects are significant.

Hence the fixed model is better than the pooling model.
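
For reference, the whole of Test 1 can be run as below. This is a sketch under the assumed formula from the estimation steps. The name f_result is only illustrative.

```r
library(plm)
data("Produc", package = "plm")

# Assumed specification (not shown in the session)
form <- log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp

pooled <- plm(form, data = Produc, index = c("state", "year"), model = "pooling")
fixed1 <- plm(form, data = Produc, index = c("state", "year"), model = "within")

# F test for individual effects: within model against pooling model
f_result <- pFtest(fixed1, pooled)
print(f_result)

# The p-value can be read directly from the result object
f_result$p.value
```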


Test 2: Between pooling and random model

Command used :
plmtest(pooled)



Test details :

H0 (null hypothesis): the variance of the individual (and, if requested, time) effects is zero, i.e. there are no such effects : Pooling Model
Alternative hypothesis: the effects are present : Random Model

This is a Lagrange multiplier test computed from the pooling model.
As the p-value is very low, the null hypothesis is rejected: the effects are significant.

Hence the random model is better than the pooling model.
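
Test 2 can be reproduced with the sketch below, again under the assumed formula. With no further arguments, plmtest() runs the Honda version of the test for individual effects. The type argument selects other variants, and the name lm_result is only illustrative.

```r
library(plm)
data("Produc", package = "plm")

# Assumed specification (not shown in the session)
pooled <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
              data = Produc,
              index = c("state", "year"),
              model = "pooling")

# Lagrange multiplier test on the pooling model (default: Honda, individual effects)
lm_result <- plmtest(pooled)
print(lm_result)

# Breusch-Pagan variant of the same test
plmtest(pooled, type = "bp")
```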


Test 3: Between fixed and random model


We use the Hausman test:

phtest(random1, fixed1)




Test details :

H0 (null hypothesis): the individual effects are not correlated with any regressor : Random Model
Alternative hypothesis: the individual effects are correlated with the regressors : Fixed Model

Under the alternative, one of the models (the random model) is inconsistent.
As the p-value is below the significance level, the null hypothesis is rejected.

Hence the fixed model is better than the random model.
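
Test 3 can be run as in the sketch below, under the same assumed formula. The resulting p-value depends on the specification, so it may differ from the one seen in the session. The name h_result is only illustrative.

```r
library(plm)
data("Produc", package = "plm")

# Assumed specification (not shown in the session)
form <- log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp

fixed1  <- plm(form, data = Produc, index = c("state", "year"), model = "within")
random1 <- plm(form, data = Produc, index = c("state", "year"), model = "random")

# Hausman test: compares the random and fixed effects coefficient estimates
h_result <- phtest(random1, fixed1)
print(h_result)

# Compare the p-value with the chosen significance level (for example 0.05)
h_result$p.value
```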


Conclusion:

After this series of tests, we can conclude that the fixed model best fits the "Produc" data set for panel estimation, i.e. the individual (state) effects are significant and are correlated with the regressor variables.
Hence we would choose the "Fixed" model to estimate the panel data in the "Produc" data set.

