-------------------------------------------------------------------------------------------------------
      name:
       log:  C:\Users\Andrew\Dropbox\Teaching 2018\Assignment3\Assignment3log.log
  log type:  text
 opened on:  11 Oct 2018, 04:36:06

.
. * Import data
. use "qog_bas_ts_jan18.dta", clear
(Quality of Government Basic dataset 2018 - Time-Series)

.
. * Merge Acemoglu Data
. merge m:1 ccodealp using "colonialorigins.dta"
(note: variable ccodealp was str3, now str22 to accommodate using data's values)

    Result                           # of obs.
    -----------------------------------------
    not matched                         3,099
        from master                     3,096  (_merge==1)
        from using                          3  (_merge==2)

    matched                            12,096  (_merge==3)
    -----------------------------------------

.
. tabulate cname if _merge == 1

                 Country Name |      Freq.     Percent        Cum.
------------------------------+-----------------------------------
                      Albania |         72        2.33        2.33
                      Andorra |         72        2.33        4.65
          Antigua and Barbuda |         72        2.33        6.98
                       Brunei |         72        2.33        9.30
                     Cambodia |         72        2.33       11.63
   Congo, Democratic Republic |         72        2.33       13.95
                         Cuba |         72        2.33       16.28
               Cyprus (-1974) |         72        2.33       18.60
               Cyprus (1975-) |         72        2.33       20.93
               Czechoslovakia |         72        2.33       23.26
                     Dominica |         72        2.33       25.58
            Equatorial Guinea |         72        2.33       27.91
                Germany, East |         72        2.33       30.23
                      Grenada |         72        2.33       32.56
                     Kiribati |         72        2.33       34.88
                      Lebanon |         72        2.33       37.21
                Liechtenstein |         72        2.33       39.53
                     Maldives |         72        2.33       41.86
             Marshall Islands |         72        2.33       44.19
                   Micronesia |         72        2.33       46.51
                       Monaco |         72        2.33       48.84
                   Montenegro |         72        2.33       51.16
                        Nauru |         72        2.33       53.49
                        Palau |         72        2.33       55.81
                      Romania |         72        2.33       58.14
                        Samoa |         72        2.33       60.47
                   San Marino |         72        2.33       62.79
                       Serbia |         72        2.33       65.12
        Serbia and Montenegro |         72        2.33       67.44
                   Seychelles |         72        2.33       69.77
              Solomon Islands |         72        2.33       72.09
                  South Sudan |         72        2.33       74.42
           St Kitts and Nevis |         72        2.33       76.74
                     St Lucia |         72        2.33       79.07
St Vincent and the Grenadines |         72        2.33       81.40
                        Tibet |         72        2.33       83.72
                  Timor-Leste |         72        2.33       86.05
                        Tonga |         72        2.33       88.37
                       Tuvalu |         72        2.33       90.70
                         USSR |         72        2.33       93.02
                      Vanuatu |         72        2.33       95.35
               Vietnam, South |         72        2.33       97.67
                 Yemen, South |         72        2.33      100.00
------------------------------+-----------------------------------
                        Total |      3,096      100.00

.
. ********** Data Preparation
.
. /*
> Note: most potential controls are undesirable because of potential endogeneity
> with quality of government (and I don't want multiple proxy variables
> for quality of government).
>
> There are a lot of variables that may influence quality of government
> (fractionalization, colonial legacy, etc.) but may have persistent effects as
> well. Others, like quality of human capital, infrastructure, etc., may be
> intermediate outcomes of quality of government. Neither is desirable as a
> control, although their omission may likewise be problematic. We should
> prefer an econometric approach which isolates the effect of quality of
> government while avoiding bias from direct effects of determinants of
> quality of government, as well as effects of intermediate outcomes of
> quality of government.
>
> */
.
. label variable ccode "Countries"
. label variable year "Year"
.
. * Rename Variables
. rename vdem_corr corruption
. rename cname country
. rename ht_region region
. rename ccodealp country_abbrev
.
. * Reformat Time-Invariant Controls
. rename lat_abst latitude
. label var latitude "Latitude of capital (absolute value)"
. replace latitude = latitude *100
(12,026 real changes made)

. rename f_brit BritishColony
. label var BritishColony "Former British colony"
. rename f_french FrenchColony
. label var FrenchColony "Former French colony"
. rename malfal94 malaria
. label var malaria "Malaria index in 1994"
. label var africa "Africa indicator"
. label var yellow "Yellow fever present today"
. label var meantemp "Mean temperature"
.
. * Reformat Factor Input Controls: Oil & Natural Gas Production, Population
. rename ross_gas_value_2014 gas
. replace gas = gas / 10000000000
(4,047 real changes made)

. label var gas "National gas production (in 2014 dollars, 10 billions)"
. rename ross_oil_value_2014 oil
. replace oil = oil / 10000000000
(4,707 real changes made)

. label var oil "National oil production (in 2014 dollars, 10 billions)"
. rename imf_pop pop
. replace pop = pop / 100
(5,350 real changes made)

. label variable pop "Population (Lagged, 100 millions)"
.
. * Transform GDP
. gen logGDP = log(wdi_gdpcappppcon2011)
(10,444 missing values generated)

. label variable logGDP "Log of GDP (2011 Constant Dollars PPP)"
.
. * Save Data for Assignment 4
. save "C:/Users/AN.4271/Dropbox/Teaching 2018/Assignment4/assignments3to4data.dta", replace
file C:/Users/AN.4271/Dropbox/Teaching 2018/Assignment4/assignments3to4data.dta saved

. *save "/home/andrew/Dropbox/Teaching 2018/Assignment4/assignments3to4data.dta", replace
.
. * Keep relevant variables for Assignment 4
. keep corruption logGDP gle_rgdpc wdi_gdpcappppcon2011 ccode ///
> year country pop country_abbrev region gas ///
> oil wdi_refori latitude BritishColony FrenchColony ///
> malaria africa yellow meantemp
.
. * Create mean corruption by year
. egen mean_corruption = mean(corruption), by(year)
(214 missing values generated)

.
. * Select only countries with at least 10 observations for all variables
. gen obs_nonmissing = .
(15,195 missing values generated)

. replace obs_nonmissing = 1 if !missing(logGDP, corruption, ///
> pop,oil,gas) & year >= 1991
(3,183 real changes made)

. egen counter = total(obs_nonmissing), by(ccode)
. keep if counter >= 10
(4,395 observations deleted)

.
. ********** Data Exploration
.
. set scheme sj // Set Color Scheme of Stata Graphs
.
. eststo clear
. * Create Summary Statistics Table
. estpost summarize year logGDP corruption pop oil gas ///
> latitude meantemp yellow malaria BritishColony FrenchColony ///
> if inrange(year, 1991,2016)

             |   e(count)   e(sum_w)    e(mean)     e(Var)      e(sd)     e(min)     e(max)     e(sum)
-------------+----------------------------------------------------------------------------------------
        year |       3900       3900     2003.5   56.26443   7.500962       1991       2016    7813650
      logGDP |       3772       3772   8.910477   1.515713   1.231143   5.511154   11.77028   33610.32
  corruption |       3856       3856    .512592   .0839443   .2897315   .0094875   .9690049   1976.555
         pop |       3269       3269   .2561809   .1938137   .4402428     .00222     3.2108   837.4552
         oil |       3557       3557   .8832079    10.6414   3.262116          0   41.77396    3141.57
         gas |       3479       3479   .3928103   2.870296   1.694195          0   29.64167   1366.587
    latitude |       3666       3666   30.20035   368.7088   19.20179   1.111111   72.22222   110714.5
    meantemp |       1378       1378   22.88165   26.59751   5.157278        -.2       29.3   31530.91
      yellow |       3666       3666   .4751773   .2494519   .4994516          0          1       1742
     malaria |       3614       3614   .2835887   .1550619    .393779          0        .95    1024.89
BritishCol~y |       3666       3666   .2907801   .2062833   .4541842          0          1       1066
FrenchColony |       3666       3666    .141844   .1217575   .3489376          0          1        520

. esttab using "tables/summarystats", ///
> cells("count(fmt(a2)) mean(fmt(%9.2fc)) sd(fmt(%9.2fc)) min(fmt(%9.2fc)) max(fmt(%9.2
> fc))") ///
> title("Summary Statistics of Key Variables") ///
> label rtf replace
(output written to tables/summarystats.rtf)

.
. * Set Panel
. xtset ccode year
       panel variable:  ccode (strongly balanced)
        time variable:  year, 1946 to 2017
                delta:  1 unit

.
. * Summary Statistics
. summarize logGDP corruption if inrange(year,1991,2016)

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
      logGDP |      3,772    8.910477    1.231143   5.511154   11.77028
  corruption |      3,856     .512592    .2897315   .0094875   .9690049

.
. * Show countries with highest and lowest GDP values (start and end year)
. count if year == 2016 & !missing(logGDP, corruption)
  140

. scalar obs_ranks = r(N)
.
. gen nGDP = logGDP * -1
(6,910 missing values generated)
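The negated copy nGDP created above is needed only so that rank 1 goes to the country with the highest GDP in the ranking commands that follow. As a hedged sketch, the same ordering can be obtained without a helper variable, because egen's rank() function accepts an expression. The names rk_GDP_alt and rk_GDP_field are hypothetical and are not part of the original do-file.

```stata
* Sketch only: rank 2016 GDP from highest (1) to lowest without nGDP.
* rk_GDP_alt and rk_GDP_field are hypothetical variable names.
egen rk_GDP_alt = rank(-logGDP) if year == 2016 & !missing(logGDP, corruption)

* Alternative: the field option also assigns rank 1 to the highest value.
* It treats ties differently (tied ranks are not averaged).
egen rk_GDP_field = rank(logGDP) if year == 2016 & !missing(logGDP, corruption), field
```

When no two countries share the same logGDP value, both variables should agree with a rank built from the negated copy.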
. egen rk_GDP = rank(nGDP) if year == 2016 & !missing(logGDP, corruption)
(10660 missing values generated)

. egen rk_Corr = rank(corruption) if year == 2016 & !missing(logGDP, corruption)
(10660 missing values generated)

.
. sort rk_Corr
. list country rk_Corr rk_GDP region if year == 2016 & ///
> (rk_Corr <= 10 | rk_Corr >= (obs_ranks - 9)) & ///
> !missing(logGDP, corruption)

     +----------------------------------------------------------------------------+
     |        country   rk_Corr   rk_GDP                                   region |
     |----------------------------------------------------------------------------|
  1. |         Sweden         1        9      5. Western Europe and North America |
  2. |         Norway         2        3      5. Western Europe and North America |
  3. |    New Zealand         3       21      5. Western Europe and North America |
  4. |        Denmark         4       10      5. Western Europe and North America |
  5. |    Netherlands         5        8      5. Western Europe and North America |
     |----------------------------------------------------------------------------|
  6. |    Switzerland         6        5      5. Western Europe and North America |
  7. |      Singapore         7        2                       7. South-East Asia |
  8. |         Canada         8       15      5. Western Europe and North America |
  9. |        Iceland         9       11      5. Western Europe and North America |
 10. |        Belgium        10       16      5. Western Europe and North America |
     |----------------------------------------------------------------------------|
131. |     Uzbekistan       131       95  1. Eastern Europe and post Soviet Union |
132. |          Egypt       132       77        3. North Africa & the Middle East |
133. |     Madagascar       133      132                    4. Sub-Saharan Africa |
134. |    Afghanistan       134      125                            8. South Asia |
135. |     Tajikistan       135      114  1. Eastern Europe and post Soviet Union |
     |----------------------------------------------------------------------------|
136. |  Guinea-Bissau       136      130                    4. Sub-Saharan Africa |
137. |   Turkmenistan       137       56  1. Eastern Europe and post Soviet Union |
138. |       Cameroon       138      111                    4. Sub-Saharan Africa |
139. |     Azerbaijan       139       54  1. Eastern Europe and post Soviet Union |
140. |           Chad       140      123                    4. Sub-Saharan Africa |
     +----------------------------------------------------------------------------+

.
. * Within- and Between- Variation
. xtsum corruption if inrange(year,1991,2016)

Variable         |      Mean   Std. Dev.       Min        Max |    Observations
-----------------+--------------------------------------------+----------------
corrup~n overall |   .512592   .2897315   .0094875   .9690049 |     N =    3856
         between |             .2824454   .0105427   .9608864 |     n =     150
         within  |             .0658783    .026089   .8961137 |     T = 25.7067

.
. * Graph of mean corruption by year
. graph twoway line mean_corruption year if inrange(year,1991,2016), sort ///
> title("Mean corruption by Year") ///
> ytitle("Mean corruption ") ///
> xtitle("Year")
. graph export "scatterplot.emf", as(emf) replace
(file C:\Users\AN.4271\Dropbox\Teaching 2018\Assignment3\scatterplot.emf written in Enhanced Metafile f
> ormat)

.
. codebook if inrange(year,1991,2016)

-------------------------------------------------------------------------------------------------------
ccode                   Countries
-------------------------------------------------------------------------------------------------------

                  type:  numeric (int)

                 range:  [4,894]                      units:  1
         unique values:  150                      missing .:  0/3,900

                  mean:   427.92
              std. dev:   254.088

           percentiles:        10%       25%       50%       75%       90%
                                71       208       420       634       790

-------------------------------------------------------------------------------------------------------
country                 Country Name
-------------------------------------------------------------------------------------------------------

                  type:  string (str29), but longest is str24

         unique values:  150                      missing "":  0/3,900

              examples:  "Comoros"
                         "Iceland"
                         "Mexico"
                         "Singapore"

               warning:  variable has embedded blanks

-------------------------------------------------------------------------------------------------------
year                    Year
-------------------------------------------------------------------------------------------------------

                  type:  numeric (int)

                 range:  [1991,2016]                  units:  1
         unique values:  26                       missing .:  0/3,900

                  mean:   2003.5
              std. dev:   7.50096

           percentiles:        10%       25%       50%       75%       90%
                              1993      1997    2003.5      2010      2014

-------------------------------------------------------------------------------------------------------
country_abbrev          3-letter Country Code
-------------------------------------------------------------------------------------------------------

                  type:  string (str22), but longest is str3

         unique values:  150                      missing "":  0/3,900

              examples:  "CPV"
                         "IDN"
                         "MLI"
                         "SDN"

-------------------------------------------------------------------------------------------------------
gle_rgdpc               Real GDP per Capita (2005)
-------------------------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [132.82,102804.82]           units:  .01
         unique values:  3,107                    missing .:  789/3,900

                  mean:   9909.97
              std. dev:   12157.7

           percentiles:        10%       25%       50%       75%       90%
                            847.98    1764.3   4650.62   13135.1   28431.3

-------------------------------------------------------------------------------------------------------
region                  The Region of the Country
-------------------------------------------------------------------------------------------------------

                  type:  numeric (byte)
                 label:  ht_region

                 range:  [1,10]                       units:  1
         unique values:  10                       missing .:  38/3,900

              examples:  2     2. Latin America
                         3     3. North Africa & the Middle East
                         4     4. Sub-Saharan Africa
                         5     5. Western Europe and North America

-------------------------------------------------------------------------------------------------------
pop                     Population (Lagged, 100 millions)
-------------------------------------------------------------------------------------------------------

                  type:  numeric (double)

                 range:  [.00222,3.2108]              units:  .00001
         unique values:  3,044                    missing .:  631/3,900

                  mean:   .256181
              std. dev:   .440243

           percentiles:        10%       25%       50%       75%       90%
                             .0097    .03413    .08833    .27064    .65301

-------------------------------------------------------------------------------------------------------
gas                     National gas production (in 2014 dollars, 10 billions)
-------------------------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [0,29.64167]                 units:  1.000e-13
         unique values:  1,785                    missing .:  421/3,900

                  mean:   .39281
              std. dev:   1.69419

           percentiles:        10%       25%       50%       75%       90%
                                 0         0   .000221   .118509   .781116

-------------------------------------------------------------------------------------------------------
oil                     National oil production (in 2014 dollars, 10 billions)
-------------------------------------------------------------------------------------------------------

                  type:  numeric (double)

                 range:  [0,41.77396]                 units:  1.000e-14
         unique values:  1,900                    missing .:  343/3,900

                  mean:   .883208
              std. dev:   3.26212

           percentiles:        10%       25%       50%       75%       90%
                                 0         0   .001321   .163094   2.07351

-------------------------------------------------------------------------------------------------------
corruption              Political corruption index
-------------------------------------------------------------------------------------------------------

                  type:  numeric (double)

                 range:  [.00948755,.96900491]        units:  1.000e-12
         unique values:  1,608                    missing .:  44/3,900

                  mean:   .512592
              std. dev:   .289731

           percentiles:        10%       25%       50%       75%       90%
                            .04926   .240688   .575249   .765845   .853967

-------------------------------------------------------------------------------------------------------
wdi_gdpcappppcon2011    GDP per capita, PPP (constant 2011 international $)
-------------------------------------------------------------------------------------------------------

                  type:  numeric (double)

                 range:  [247.43654,129349.92]        units:  1.000e-08
         unique values:  3,772                    missing .:  128/3,900

                  mean:   14300.8
              std. dev:   16435.5

           percentiles:        10%       25%       50%       75%       90%
                           1385.94   2638.29   8098.51   20290.6   37457.3

-------------------------------------------------------------------------------------------------------
wdi_refori              Refugee population by country or territory of origin
-------------------------------------------------------------------------------------------------------

                  type:  numeric (long)

                 range:  [1,6306301]                  units:  1
         unique values:  2,377                    missing .:  172/3,900

                  mean:   57436.9
              std. dev:   285602

           percentiles:        10%       25%       50%       75%       90%
                                14        75       856     11117     93993

-------------------------------------------------------------------------------------------------------
latitude                Latitude of capital (absolute value)
-------------------------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [1.1111112,72.222221]        units:  1.000e-07
         unique values:  87                       missing .:  234/3,900

                  mean:   30.2003
              std. dev:   19.2018

           percentiles:        10%       25%       50%       75%       90%
                                 7   14.4444   27.7778   45.5556   57.7778

-------------------------------------------------------------------------------------------------------
BritishColony           Former British colony
-------------------------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [0,1]                        units:  1
         unique values:  2                        missing .:  234/3,900

            tabulation:  Freq.  Value
                         2,600  0
                         1,066  1
                           234  .

-------------------------------------------------------------------------------------------------------
FrenchColony            Former French colony
-------------------------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [0,1]                        units:  1
         unique values:  2                        missing .:  234/3,900

            tabulation:  Freq.  Value
                         3,146  0
                           520  1
                           234  .

-------------------------------------------------------------------------------------------------------
africa                  Africa indicator
-------------------------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [0,1]                        units:  1
         unique values:  2                        missing .:  234/3,900

            tabulation:  Freq.  Value
                         2,496  0
                         1,170  1
                           234  .

-------------------------------------------------------------------------------------------------------
malaria                 Malaria index in 1994
-------------------------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [0,.94999999]                units:  1.000e-10
         unique values:  45                       missing .:  286/3,900

                  mean:   .283589
              std. dev:   .393779

           percentiles:        10%       25%       50%       75%       90%
                                 0         0         0    .65436       .95

-------------------------------------------------------------------------------------------------------
yellow                  Yellow fever present today
-------------------------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [0,1]                        units:  1
         unique values:  2                        missing .:  234/3,900

            tabulation:  Freq.  Value
                         1,924  0
                         1,742  1
                           234  .

-------------------------------------------------------------------------------------------------------
meantemp                Mean temperature
-------------------------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [-.2,29.3]                   units:  .00001
         unique values:  51                       missing .:  2,522/3,900

                  mean:   22.8817
              std. dev:   5.15728

           percentiles:        10%       25%       50%       75%       90%
                              17.7      20.9   24.4333      26.5      27.5

-------------------------------------------------------------------------------------------------------
logGDP                  Log of GDP (2011 Constant Dollars PPP)
-------------------------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [5.5111542,11.770276]        units:  1.000e-07
         unique values:  3,772                    missing .:  128/3,900

                  mean:   8.91048
              std. dev:   1.23114

           percentiles:        10%       25%       50%       75%       90%
                           7.23414   7.87789   8.99943   9.91791    10.531

-------------------------------------------------------------------------------------------------------
mean_corruption         (unlabeled)
-------------------------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [.4963347,.53363556]         units:  1.000e-08
         unique values:  26                       missing .:  0/3,900

                  mean:   .522656
              std. dev:   .010434

           percentiles:        10%       25%       50%       75%       90%
                           .504797   .514952    .52783   .530893   .531509

-------------------------------------------------------------------------------------------------------
obs_nonmissing          (unlabeled)
-------------------------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [1,1]                        units:  1
         unique values:  1                        missing .:  744/3,900

            tabulation:  Freq.  Value
                         3,156  1
                           744  .

-------------------------------------------------------------------------------------------------------
counter                 (unlabeled)
-------------------------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [10,24]                      units:  1
         unique values:  15                       missing .:  0/3,900

                  mean:   21.04
              std. dev:   3.58911

           percentiles:        10%       25%       50%       75%       90%
                                15        20        23        24        24

-------------------------------------------------------------------------------------------------------
nGDP                    (unlabeled)
-------------------------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [-11.770276,-5.5111542]      units:  1.000e-07
         unique values:  3,772                    missing .:  128/3,900

                  mean:   -8.91048
              std. dev:   1.23114

           percentiles:        10%       25%       50%       75%       90%
                           -10.531  -9.91791  -8.99943  -7.87789  -7.23414

-------------------------------------------------------------------------------------------------------
rk_GDP                  rank of (nGDP)
-------------------------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [1,140]                      units:  1
         unique values:  140                      missing .:  3,760/3,900

                  mean:   70.5
              std. dev:   40.5586

           percentiles:        10%       25%       50%       75%       90%
                              14.5      35.5      70.5     105.5     126.5

-------------------------------------------------------------------------------------------------------
rk_Corr                 rank of (corruption)
-------------------------------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [1,140]                      units:  1
         unique values:  140                      missing .:  3,760/3,900

                  mean:   70.5
              std. dev:   40.5586

           percentiles:        10%       25%       50%       75%       90%
                              14.5      35.5      70.5     105.5     126.5

.
. ***** Histograms
. * Histogram of GDP
. histogram logGDP if inrange(year,1991,2016), ///
> xscale(range(5 12)) ///
> yscale(range(0 0.4)) ///
> title("Distribution of GDP") ///
> subtitle("(all years)") ///
> xtitle("Log GDP")
(bin=35, start=5.5111542, width=.17883205)

. graph export "histogram_gdp.emf", as(emf) replace
(file C:\Users\AN.4271\Dropbox\Teaching 2018\Assignment3\histogram_gdp.emf written in Enhanced Metafile
> format)

.
. * Histogram of GDP for 1991
. histogram logGDP if year == 1991, ///
> xscale(range(5 12)) ///
> yscale(range(0 0.4)) ///
> title("Distribution of GDP") ///
> subtitle("in 1991") ///
> xtitle("Log GDP")
(bin=10, start=5.9584846, width=.48049965)

. graph export "histogram_gdp1991.emf", as(emf) replace
(file C:\Users\AN.4271\Dropbox\Teaching 2018\Assignment3\histogram_gdp1991.emf written in Enhanced Meta
> file format)

.
. * Histogram of GDP for 2016
. histogram logGDP if year == 2016, ///
> xscale(range(5 12)) ///
> yscale(range(0 0.4)) ///
> title("Distribution of GDP") ///
> subtitle("in 2016") ///
> xtitle("Log GDP")
(bin=11, start=6.4737062, width=.4733233)

. graph export "graphs/histogram_gdp2016.emf", as(emf) replace
(file C:\Users\AN.4271\Dropbox\Teaching 2018\Assignment3\graphs/histogram_gdp2016.emf written in Enhanc
> ed Metafile format)

.
. * Histogram of Corruption
. histogram corruption if inrange(year,1991,2016), ///
> title("Distribution of Political corruption Index") ///
> subtitle("(all years)") ///
> xtitle("Political corruption Index")
(bin=35, start=.00948755, width=.02741478)

. graph export "graphs/histogram_corrupt.emf", as(emf) replace
(file C:\Users\AN.4271\Dropbox\Teaching 2018\Assignment3\graphs/histogram_corrupt.emf written in Enhanc
> ed Metafile format)

.
. * Histogram of corruption for 1991
. histogram corruption if year == 1991, ///
> title("Distribution of Political corruption Index") ///
> subtitle("in 1991") ///
> xtitle("Political corruption Index")
(bin=11, start=.00948755, width=.0856991)

. graph export "graphs/histogram_corrupt1991.emf", as(emf) replace
(file C:\Users\AN.4271\Dropbox\Teaching 2018\Assignment3\graphs/histogram_corrupt1991.emf written in En
> hanced Metafile format)

.
. * Histogram of corruption for 2016
. histogram corruption if year == 2016, ///
> title("Distribution of Political corruption Index") ///
> subtitle("in 2016") ///
> xtitle("Political corruption Index")
(bin=12, start=.01675996, width=.07933182)

. graph export "graphs/histogram_corrupt2016.emf", as(emf) replace
(file C:\Users\AN.4271\Dropbox\Teaching 2018\Assignment3\graphs/histogram_corrupt2016.emf written in En
> hanced Metafile format)

.
. xtset ccode year
       panel variable:  ccode (strongly balanced)
        time variable:  year, 1946 to 2017
                delta:  1 unit
. histogram corruption if year == 2016 & !missing(L16.corruption), ///
> title("Distribution of Political corruption Index") ///
> subtitle("in 2016") ///
> xtitle("Political corruption Index")
(bin=12, start=.01675996, width=.07933182)

. graph export "graphs/histogram_corrupt2016_sampleconsist.emf", as(emf) replac
> e
(file C:\Users\AN.4271\Dropbox\Teaching 2018\Assignment3\graphs/histogram_corrupt2016_sampleconsist.emf
> written in Enhanced Metafile format)

.
. ***** Scatterplots
. * 1991-2016
. graph twoway scatter logGDP corruption if inrange(year,1991,2016), ///
> title("Scatterplot of corruption and GDP") ///
> subtitle("(all years)") ///
> ytitle("Log GDP") ///
> xtitle("Political corruption Index")
. graph export "graphs/scatter_GDPcorrupt.emf", as(emf) replace
(file C:\Users\AN.4271\Dropbox\Teaching 2018\Assignment3\graphs/scatter_GDPcorrupt.emf written in Enhan
> ced Metafile format)

.
. * Scatterplot for 1991
. graph twoway scatter logGDP corruption if year == 1991, ///
> title("Scatterplot of corruption and GDP") ///
> subtitle("in 1991") ///
> ytitle("Log GDP") ///
> xtitle("Political corruption Index") ///
> mlabel(country_abbrev) m(i) mlabsize(vsmall)
. graph export "graphs/scatter_GDPcorrupt1991.emf", as(emf) replace
(file C:\Users\AN.4271\Dropbox\Teaching 2018\Assignment3\graphs/scatter_GDPcorrupt1991.emf written in E
> nhanced Metafile format)

.
. * Scatterplot for 2016
. graph twoway scatter logGDP corruption if year == 2016, ///
> title("Scatterplot of corruption and GDP") ///
> subtitle("in 2016") ///
> ytitle("Log GDP") ///
> xtitle("Political corruption Index") ///
> mlabel(country_abbrev) m(i) mlabsize(vsmall)
. graph export "graphs/scatter_GDPcorrupt2016.emf", as(emf) replace
(file C:\Users\AN.4271\Dropbox\Teaching 2018\Assignment3\graphs/scatter_GDPcorrupt2016.emf written in E
> nhanced Metafile format)

.
. * Missingness
. gen is_missing = .
(10,800 missing values generated)
. replace is_missing = 0 if !missing(logGDP,corruption)
(3,890 real changes made)

. replace is_missing = 1 if missing(logGDP,corruption)
(6,910 real changes made)

. label var is_missing "Missing data (=1)"
.
. gen corruption_missing = .
(10,800 missing values generated)

. replace corruption_missing = 0 if !missing(corruption)
(8,235 real changes made)

. replace corruption_missing = 1 if missing(corruption)
(2,565 real changes made)

. label var corruption_missing "Corruption data missing (=1)"
.
. gen GDP_missing = .
(10,800 missing values generated)

. replace GDP_missing = 0 if !missing(logGDP)
(3,890 real changes made)

. replace GDP_missing = 1 if missing(logGDP)
(6,910 real changes made)

. label var GDP_missing "GDP data missing (=1)"
.
. label define GDP_missing_lab 0 "GDP data missing" 1 "GDP data non-missing"
. label define corruption_missing_lab 0 "Corruption data missing" 1 "Corruption data non-missin
> g"
. label values GDP_missing GDP_missing_lab
. label values corruption_missing corruption_missing_lab
.
. summarize is_missing

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
  is_missing |     10,800    .6398148    .4800762          0          1

. summarize is_missing if !missing(logGDP)

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
  is_missing |      3,890           0           0          0          0

.
. egen mean_corruption_country = mean(corruption) if inrange(year,1991,2016), by(ccode)
(6900 missing values generated)

. label var mean_corruption_country "Average corruption index in country, 1991-2016"
. egen mean_GDP_country = mean(logGDP) if inrange(year,1991,2016), by(ccode)
(6900 missing values generated)

. label var mean_GDP_country "Average GDP per capita in country, 1991-2016"
.
. eststo clear
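The three indicators above are each built with a gen/replace/replace sequence and are coded 1 when the data are missing. The value labels defined above, however, attach "missing" to 0 and "non-missing" to 1. The following is a minimal sketch of a one-step construction with labels that follow the 1 = missing coding. It assumes that coding is the intended one, and the *_alt variable names and the label name miss_lab are hypothetical, not part of the original do-file.

```stata
* Sketch only: one-step missingness indicators (1 = missing, 0 = observed).
* The *_alt variables and the value label miss_lab are hypothetical names.
gen byte GDP_missing_alt        = missing(logGDP)
gen byte corruption_missing_alt = missing(corruption)
gen byte is_missing_alt         = missing(logGDP, corruption)

* Labels that follow the coding of the indicators
label define miss_lab 0 "Observed" 1 "Missing"
label values GDP_missing_alt corruption_missing_alt is_missing_alt miss_lab

* Cross-check against the indicators built step by step above
assert GDP_missing_alt        == GDP_missing
assert corruption_missing_alt == corruption_missing
assert is_missing_alt         == is_missing
```

Because missing() returns 1 when any of its arguments is missing and 0 otherwise, a single gen statement replaces each three-line sequence, and the assert lines stop with an error if the two constructions ever disagree.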
. eststo: reg is_missing mean_corruption_country mean_GDP_country, robust

Linear regression                               Number of obs     =      3,900
                                                F(2, 3897)        =       8.37
                                                Prob > F          =     0.0002
                                                R-squared         =     0.0064
                                                Root MSE          =     .17766

-----------------------------------------------------------------------------------------
                        |               Robust
             is_missing |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------+----------------------------------------------------------------
mean_corruption_country |   .0664205   .0178986     3.71   0.000     .0313291     .101512
       mean_GDP_country |   .0124594   .0048545     2.57   0.010     .0029418    .0219769
                  _cons |  -.1123193   .0511427    -2.20   0.028    -.2125883   -.0120503
-----------------------------------------------------------------------------------------
(est1 stored)

. eststo: reg corruption_missing logGDP i.year if inrange(year,1991,2016)

      Source |       SS           df       MS      Number of obs   =     3,772
-------------+----------------------------------   F(26, 3745)     =         .
       Model |           0        26           0   Prob > F        =         .
    Residual |           0     3,745           0   R-squared       =         .
-------------+----------------------------------   Adj R-squared   =         .
       Total |           0     3,771           0   Root MSE        =         0

------------------------------------------------------------------------------
corruption~g |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      logGDP |          0  (omitted)
             |
        year |
       1992  |          0  (omitted)
       1993  |          0  (omitted)
       1994  |          0  (omitted)
       1995  |          0  (omitted)
       1996  |          0  (omitted)
       1997  |          0  (omitted)
       1998  |          0  (omitted)
       1999  |          0  (omitted)
       2000  |          0  (omitted)
       2001  |          0  (omitted)
       2002  |          0  (omitted)
       2003  |          0  (omitted)
       2004  |          0  (omitted)
       2005  |          0  (omitted)
       2006  |          0  (omitted)
       2007  |          0  (omitted)
       2008  |          0  (omitted)
       2009  |          0  (omitted)
       2010  |          0  (omitted)
       2011  |          0  (omitted)
       2012  |          0  (omitted)
       2013  |          0  (omitted)
       2014  |          0  (omitted)
       2015  |          0  (omitted)
       2016  |          0  (omitted)
             |
       _cons |          0  (omitted)
------------------------------------------------------------------------------
(est2 stored)
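The second regression above reports zero sums of squares and omits every coefficient, which is what happens when the dependent variable takes a single value in the estimation sample. Here corruption_missing appears to be constant wherever logGDP is observed. A hedged sketch of a check that could be run before estimating such a model follows. It uses only variables already in the data; the message text is illustrative.

```stata
* Sketch only: confirm the outcome varies in the estimation sample
* before regressing it on logGDP and year dummies.
tabulate corruption_missing if !missing(logGDP) & inrange(year, 1991, 2016)

quietly summarize corruption_missing if !missing(logGDP) & inrange(year, 1991, 2016)
if r(sd) == 0 {
    display as text "corruption_missing is constant in this sample; nothing can be estimated"
}
```

The condition on logGDP mirrors the sample the regression actually uses, since observations with a missing regressor are dropped from estimation.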
. eststo: xtreg GDP_missing corruption i.year if inrange(year,1991,2016), fe

Fixed-effects (within) regression               Number of obs     =      3,856
Group variable: ccode                           Number of groups  =        150

R-sq:                                           Obs per group:
     within  = 0.0449                                         min =         15
     between = 0.0064                                         avg =       25.7
     overall = 0.0341                                         max =         26

                                                F(26,3680)        =       6.65
corr(u_i, Xb)  = -0.0256                        Prob > F          =     0.0000

------------------------------------------------------------------------------
 GDP_missing |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  corruption |   .0399829   .0311158     1.28   0.199    -.0210231    .1009889
             |
        year |
       1992  |   .0264544   .0153085     1.73   0.084    -.0035595    .0564683
       1993  |   .0220295   .0151694     1.45   0.147    -.0077119    .0517708
       1994  |   .0150464   .0151724     0.99   0.321    -.0147007    .0447935
       1995  |  -.0321446   .0151767    -2.12   0.034    -.0619002    -.002389
       1996  |  -.0322206   .0151787    -2.12   0.034    -.0619801   -.0024611
       1997  |  -.0323887   .0151839    -2.13   0.033    -.0621584    -.002619
       1998  |  -.0324957   .0151878    -2.14   0.032    -.0622731   -.0027183
       1999  |  -.0391724   .0151865    -2.58   0.010    -.0689472   -.0093976
       2000  |  -.0458087   .0151838    -3.02   0.003    -.0755782   -.0160391
       2001  |  -.0525284   .0151841    -3.46   0.001    -.0822985   -.0227583
       2002  |  -.0596779   .0151643    -3.94   0.000    -.0894091   -.0299467
       2003  |  -.0596568   .0151635    -3.93   0.000    -.0893865   -.0299271
       2004  |  -.0595495   .0151599    -3.93   0.000    -.0892721   -.0298269
       2005  |  -.0594332   .0151565    -3.92   0.000    -.0891492   -.0297173
       2006  |  -.0594297   .0151564    -3.92   0.000    -.0891454   -.0297139
       2007  |   -.059398   .0151556    -3.92   0.000    -.0891122   -.0296839
       2008  |  -.0594019   .0151557    -3.92   0.000    -.0891162   -.0296876
       2009  |  -.0593859   .0151553    -3.92   0.000    -.0890994   -.0296724
       2010  |  -.0593607   .0151546    -3.92   0.000     -.089073   -.0296484
       2011  |  -.0590534   .0151491    -3.90   0.000    -.0887549   -.0293519
       2012  |  -.0457841    .015174    -3.02   0.003    -.0755344   -.0160339
       2013  |  -.0452095   .0151736    -2.98   0.003    -.0749591     -.01546
       2014  |  -.0384142   .0151747    -2.53   0.011    -.0681658   -.0086626
       2015  |  -.0315726   .0151768    -2.08   0.038    -.0613284   -.0018167
       2016  |   .0019945    .015177     0.13   0.895    -.0277618    .0317507
             |
       _cons |   .0384472   .0191728     2.01   0.045     .0008567    .0760377
-------------+----------------------------------------------------------------
     sigma_u |  .07485542
     sigma_e |  .12540102
         rho |  .26271275   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(149, 3680) = 9.16                   Prob > F = 0.0000
(est3 stored)

.
. esttab using "tables/missingness", title("Regression of missingness by variable") ///
> drop(19* 20*) ///
> se label wrap noabbrev rtf star(* 0.10 ** 0.05 *** 0.01) ///
> b(%9.2fc) compress one replace
(output written to tables/missingness.rtf)

. eststo clear
.
. eststo clear
. estpost tabulate corruption_missing GDP_missing

GDP_missing  |
corruption_m |       e(b)     e(pct)  e(colpct)  e(rowpct)
-------------+--------------------------------------------
GDP data m~g |
Corruption~g |       3890   36.01852        100    47.2374
Corruption~g |          0          0          0          0
       Total |       3890   36.01852        100   36.01852
-------------+--------------------------------------------
GDP data n~g |
Corruption~g |       4345   40.23148   62.87988    52.7626
Corruption~g |       2565      23.75   37.12012        100
       Total |       6910   63.98148        100   63.98148
-------------+--------------------------------------------
Total        |
Corruption~g |       8235      76.25      76.25        100
Corruption~g |       2565      23.75      23.75        100
       Total |      10800        100        100        100

. esttab using "tables/missing_cross_tab", cell(b(fmt(%11.0g)) b(fmt(%11.0g) par keep(Total)))
> ///
> collabels(none) noabbrev unstack noobs nonumber nomtitle ///
> title("Cross-Tabulation Table of Missing GDP and Corruption Data") ///
> rtf replace
(output written to tables/missing_cross_tab.rtf)

.
. ********** Regression Analysis
.
. **** Basic Pooled OLS
. eststo clear
.
. * Set Panel
. xtset ccode year
       panel variable:  ccode (strongly balanced)
        time variable:  year, 1946 to 2017
                delta:  1 unit

.
. * Without controls
eststo: regress logGDP corruption if inrange(year,1991,2016), robust Linear regression Number of obs = 3,772 F(1, 3770) = 4408.74 Prob > F = 0.0000 R-squared = 0.4297 Root MSE = .92987 ------------------------------------------------------------------------------ | Robust logGDP | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- corruption | -2.770672 .041728 -66.40 0.000 -2.852484 -2.688861 _cons | 10.32494 .0223286 462.41 0.000 10.28117 10.36872 ------------------------------------------------------------------------------ (est1 stored) . . * Insitutions and Climate/Disease Environment Controls . eststo: regress logGDP corruption latitude meantemp yellow malaria /// > BritishColony FrenchColony, robust Linear regression Number of obs = 1,423 F(7, 1415) = 972.58 Prob > F = 0.0000 R-squared = 0.7699 Root MSE = .51684 ------------------------------------------------------------------------------- | Robust logGDP | Coef. Std. Err. t P>|t| [95% Conf. Interval] --------------+---------------------------------------------------------------- corruption | -1.237814 .0638915 -19.37 0.000 -1.363147 -1.112482 latitude | -.0162882 .0020891 -7.80 0.000 -.0203863 -.0121902 meantemp | -.0657867 .0047551 -13.83 0.000 -.0751146 -.0564589 yellow | .2204066 .032542 6.77 0.000 .1565709 .2842423 malaria | -1.629469 .0491218 -33.17 0.000 -1.725829 -1.53311 BritishColony | .23732 .0398434 5.96 0.000 .1591614 .3154786 FrenchColony | .1195648 .0464008 2.58 0.010 .028543 .2105867 _cons | 11.47377 .1347097 85.17 0.000 11.20951 11.73802 ------------------------------------------------------------------------------- (est2 stored) . . * Domestic controls . 
eststo: regress logGDP corruption L5.pop oil gas, robust Linear regression Number of obs = 3,267 F(4, 3262) = 1148.31 Prob > F = 0.0000 R-squared = 0.5053 Root MSE = .86603 ------------------------------------------------------------------------------ | Robust logGDP | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- corruption | -2.800022 .0423538 -66.11 0.000 -2.883065 -2.716979 | pop | L5. | .1468383 .0354904 4.14 0.000 .0772526 .2164239 | oil | .0903522 .0087733 10.30 0.000 .0731505 .1075539 gas | -.0009952 .0117837 -0.08 0.933 -.0240994 .022109 _cons | 10.22152 .0239932 426.02 0.000 10.17447 10.26856 ------------------------------------------------------------------------------ (est3 stored) . . * Create tables . esttab using "tables/pooled", /// > title("Pooled OLS Regression of GDP and corruption ") /// > se label wrap noabbrev rtf star(* 0.10 ** 0.05 *** 0.01) /// > b(%9.2fc) compress one replace (output written to tables/pooled.rtf) . . **** Basic Pooled OLS . eststo clear . . * Set Panel . xtset ccode year panel variable: ccode (strongly balanced) time variable: year, 1946 to 2017 delta: 1 unit . . * Without controls . eststo: regress logGDP corruption if inrange(year,1991,2016), cluster(ccode) Linear regression Number of obs = 3,772 F(1, 149) = 217.21 Prob > F = 0.0000 R-squared = 0.4297 Root MSE = .92987 (Std. Err. adjusted for 150 clusters in ccode) ------------------------------------------------------------------------------ | Robust logGDP | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- corruption | -2.770672 .1879964 -14.74 0.000 -3.142156 -2.399189 _cons | 10.32494 .1015097 101.71 0.000 10.12436 10.52553 ------------------------------------------------------------------------------ (est1 stored) . . * Insitutions and Climate/Disease Environment Controls . 
eststo: regress logGDP corruption latitude meantemp yellow malaria /// > BritishColony FrenchColony , cluster(ccode) Linear regression Number of obs = 1,423 F(7, 52) = 42.67 Prob > F = 0.0000 R-squared = 0.7699 Root MSE = .51684 (Std. Err. adjusted for 53 clusters in ccode) ------------------------------------------------------------------------------- | Robust logGDP | Coef. Std. Err. t P>|t| [95% Conf. Interval] --------------+---------------------------------------------------------------- corruption | -1.237814 .2998248 -4.13 0.000 -1.839457 -.636172 latitude | -.0162882 .010361 -1.57 0.122 -.0370791 .0045026 meantemp | -.0657867 .0233126 -2.82 0.007 -.1125668 -.0190066 yellow | .2204066 .1536675 1.43 0.157 -.0879498 .528763 malaria | -1.629469 .2337516 -6.97 0.000 -2.098526 -1.160413 BritishColony | .23732 .1921887 1.23 0.222 -.1483349 .6229749 FrenchColony | .1195648 .2240056 0.53 0.596 -.3299352 .5690649 _cons | 11.47377 .6669085 17.20 0.000 10.13552 12.81202 ------------------------------------------------------------------------------- (est2 stored) . . * Domestic controls . eststo: regress logGDP corruption L5.pop oil gas, cluster(ccode) Linear regression Number of obs = 3,267 F(4, 149) = 69.17 Prob > F = 0.0000 R-squared = 0.5053 Root MSE = .86603 (Std. Err. adjusted for 150 clusters in ccode) ------------------------------------------------------------------------------ | Robust logGDP | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- corruption | -2.800022 .1749585 -16.00 0.000 -3.145742 -2.454302 | pop | L5. | .1468383 .1376211 1.07 0.288 -.1251029 .4187794 | oil | .0903522 .0198348 4.56 0.000 .0511583 .1295461 gas | -.0009952 .0281431 -0.04 0.972 -.0566063 .0546159 _cons | 10.22152 .1028838 99.35 0.000 10.01822 10.42482 ------------------------------------------------------------------------------ (est3 stored) . . * Create tables . 
esttab using "tables/pooled_andclustered", /// > title("Pooled OLS Regression of GDP and corruption ") /// > se label wrap noabbrev rtf star(* 0.10 ** 0.05 *** 0.01) /// > b(%9.2fc) compress one replace (output written to tables/pooled_andclustered.rtf) . . . **** Fixed Effects Regression . . * Without controls . eststo: xtreg logGDP corruption i.year if inrange(year,1991,2016), fe robust Fixed-effects (within) regression Number of obs = 3,772 Group variable: ccode Number of groups = 150 R-sq: Obs per group: within = 0.5230 min = 13 between = 0.4360 avg = 25.1 overall = 0.1682 max = 26 F(26,149) = 21.93 corr(u_i, Xb) = 0.2630 Prob > F = 0.0000 (Std. Err. adjusted for 150 clusters in ccode) ------------------------------------------------------------------------------ | Robust logGDP | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- corruption | -.2846926 .1695606 -1.68 0.095 -.6197465 .0503614 | year | 1992 | .0139316 .0107871 1.29 0.199 -.0073838 .035247 1993 | -.0003826 .01444 -0.03 0.979 -.0289163 .0281512 1994 | -.0073671 .0194946 -0.38 0.706 -.0458887 .0311545 1995 | .01138 .0207402 0.55 0.584 -.0296028 .0523629 1996 | .0478039 .0215618 2.22 0.028 .0051976 .0904103 1997 | .0871907 .0242911 3.59 0.000 .0391912 .1351903 1998 | .1064757 .025386 4.19 0.000 .0563126 .1566387 1999 | .1259661 .0270532 4.66 0.000 .0725087 .1794236 2000 | .1531321 .0278558 5.50 0.000 .0980885 .2081756 2001 | .1731967 .0302618 5.72 0.000 .113399 .2329943 2002 | .1904906 .0312325 6.10 0.000 .1287748 .2522064 2003 | .2138183 .0317481 6.73 0.000 .1510835 .276553 2004 | .2596149 .0336896 7.71 0.000 .1930438 .3261859 2005 | .2961781 .0346005 8.56 0.000 .2278071 .3645492 2006 | .3406337 .0354468 9.61 0.000 .2705904 .410677 2007 | .3857981 .0366745 10.52 0.000 .3133288 .4582674 2008 | .411455 .0378987 10.86 0.000 .3365667 .4863433 2009 | .3970261 .0379999 10.45 0.000 .3219378 .4721143 2010 | .4268225 .0376663 11.33 
0.000 .3523935 .5012516 2011 | .4451508 .03855 11.55 0.000 .3689755 .5213261 2012 | .4727286 .0387518 12.20 0.000 .3961545 .5493026 2013 | .4860748 .0393256 12.36 0.000 .4083668 .5637827 2014 | .5056896 .0397408 12.72 0.000 .4271613 .5842178 2015 | .5161562 .0401744 12.85 0.000 .4367711 .5955413 2016 | .5324918 .0404236 13.17 0.000 .4526142 .6123695 | _cons | 8.797818 .0891766 98.66 0.000 8.621604 8.974032 -------------+---------------------------------------------------------------- sigma_u | 1.161453 sigma_e | .18071416 rho | .976363 (fraction of variance due to u_i) ------------------------------------------------------------------------------ (est4 stored) . . * Factor input controls . eststo: xtreg logGDP corruption L5.pop oil gas i.year /// > if inrange(year,1991,2016), fe robust Fixed-effects (within) regression Number of obs = 3,160 Group variable: ccode Number of groups = 150 R-sq: Obs per group: within = 0.4970 min = 5 between = 0.2063 avg = 21.1 overall = 0.1335 max = 24 F(27,149) = 18.14 corr(u_i, Xb) = 0.2258 Prob > F = 0.0000 (Std. Err. adjusted for 150 clusters in ccode) ------------------------------------------------------------------------------ | Robust logGDP | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- corruption | -.2478302 .142734 -1.74 0.085 -.5298745 .034214 | pop | L5. 
| -.1493188 .1957645 -0.76 0.447 -.5361521 .2375144 | oil | .0000155 .0047528 0.00 0.997 -.009376 .0094071 gas | .0059005 .0027106 2.18 0.031 .0005444 .0112567 | year | 1992 | .0097965 .005776 1.70 0.092 -.001617 .02121 1993 | .0136561 .0097119 1.41 0.162 -.0055347 .0328469 1994 | .0292822 .0129593 2.26 0.025 .0036744 .05489 1995 | .0650439 .0150663 4.32 0.000 .0352726 .0948151 1996 | .1010466 .0188788 5.35 0.000 .0637418 .1383513 1997 | .1035396 .0279316 3.71 0.000 .0483464 .1587327 1998 | .1178366 .0298289 3.95 0.000 .0588944 .1767789 1999 | .134266 .0321808 4.17 0.000 .0706763 .1978558 2000 | .1572589 .0337619 4.66 0.000 .090545 .2239728 2001 | .1764189 .0371698 4.75 0.000 .1029708 .2498669 2002 | .1949781 .0387455 5.03 0.000 .1184165 .2715397 2003 | .2209483 .0401143 5.51 0.000 .1416819 .3002147 2004 | .2658193 .0419949 6.33 0.000 .1828369 .3488018 2005 | .3007463 .0435056 6.91 0.000 .2147786 .386714 2006 | .3490634 .0444212 7.86 0.000 .2612864 .4368403 2007 | .3920565 .0461232 8.50 0.000 .3009163 .4831966 2008 | .4165023 .0477705 8.72 0.000 .3221071 .5108974 2009 | .4070636 .0482946 8.43 0.000 .3116328 .5024943 2010 | .4355468 .0481245 9.05 0.000 .3404522 .5306415 2011 | .4579594 .0496986 9.21 0.000 .3597544 .5561645 2012 | .4824159 .0500574 9.64 0.000 .3835018 .58133 2013 | .5016652 .0510824 9.82 0.000 .4007257 .6026047 2014 | .5138187 .0561156 9.16 0.000 .4029335 .6247039 | _cons | 8.860585 .0806939 109.80 0.000 8.701133 9.020037 -------------+---------------------------------------------------------------- sigma_u | 1.1748574 sigma_e | .16357886 rho | .98098285 (fraction of variance due to u_i) ------------------------------------------------------------------------------ (est5 stored) . . * Create tables . esttab using "tables/fixeff", /// > title("Fixed Effects Regression of GDP and corruption") /// > drop(19* 20*) se label wrap noabbrev rtf /// > star(* 0.10 ** 0.05 *** 0.01) b(%8.2g) compress one replace (output written to tables/fixeff.rtf) . 
eststo clear . . /* > Without lag: > xtreg logGDP corruption pop oil gas i.year /// > if inrange(year,1991,2016), fe robust > */ . . **** Random Effects Regression . eststo clear . . * Without controls . eststo: xtreg logGDP corruption i.year if inrange(year,1991,2016), re theta robust Random-effects GLS regression Number of obs = 3,772 Group variable: ccode Number of groups = 150 R-sq: Obs per group: within = 0.5225 min = 13 between = 0.4442 avg = 25.1 overall = 0.2192 max = 26 Wald chi2(27) = 19141.78 corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000 ------------------- theta -------------------- min 5% median 95% max 0.9398 0.9526 0.9574 0.9574 0.9574 (Std. Err. adjusted for 150 clusters in ccode) ------------------------------------------------------------------------------ | Robust logGDP | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- corruption | -.3771114 .1570673 -2.40 0.016 -.6849577 -.0692652 | year | 1990 | 0 (empty) 1991 | 8.847011 .1222808 72.35 0.000 8.607345 9.086677 1992 | 8.860795 .1242757 71.30 0.000 8.617219 9.104371 1993 | 8.846256 .1243768 71.12 0.000 8.602482 9.09003 1994 | 8.83981 .1272419 69.47 0.000 8.59042 9.089199 1995 | 8.8595 .1273789 69.55 0.000 8.609842 9.109158 1996 | 8.896123 .1266072 70.27 0.000 8.647978 9.144269 1997 | 8.935998 .1264179 70.69 0.000 8.688224 9.183773 1998 | 8.955534 .126363 70.87 0.000 8.707867 9.203201 1999 | 8.974971 .1261952 71.12 0.000 8.727633 9.222309 2000 | 9.002001 .1265789 71.12 0.000 8.75391 9.250091 2001 | 9.022108 .1265541 71.29 0.000 8.774067 9.27015 2002 | 9.039319 .1261503 71.66 0.000 8.792069 9.286569 2003 | 9.062598 .1262449 71.79 0.000 8.815163 9.310034 2004 | 9.108147 .1258611 72.37 0.000 8.861463 9.35483 2005 | 9.144441 .1253595 72.95 0.000 8.898741 9.390141 2006 | 9.188889 .1255203 73.21 0.000 8.942873 9.434904 2007 | 9.23398 .1251881 73.76 0.000 8.988616 9.479344 2008 | 9.259646 .1241626 74.58 0.000 9.016292 9.503 
2009 | 9.24518 .1226711 75.37 0.000 9.004749 9.485611 2010 | 9.274918 .1223928 75.78 0.000 9.035032 9.514803 2011 | 9.292536 .1214598 76.51 0.000 9.054479 9.530593 2012 | 9.320143 .1209357 77.07 0.000 9.083113 9.557173 2013 | 9.332115 .1192361 78.27 0.000 9.098417 9.565814 2014 | 9.351434 .1186816 78.79 0.000 9.118822 9.584045 2015 | 9.361607 .1185819 78.95 0.000 9.129191 9.594023 2016 | 9.377884 .1191133 78.73 0.000 9.144427 9.611342 | _cons | 0 (omitted) -------------+---------------------------------------------------------------- sigma_u | .8308469 sigma_e | .18071416 rho | .95482818 (fraction of variance due to u_i) ------------------------------------------------------------------------------ (est1 stored) . . * Insitutions and Climate/Disease Environment Controls . eststo: xtreg logGDP corruption latitude meantemp yellow malaria /// > BritishColony FrenchColony if inrange(year,1991,2016), re theta robust Random-effects GLS regression Number of obs = 1,370 Group variable: ccode Number of groups = 53 R-sq: Obs per group: within = 0.0287 min = 21 between = 0.7746 avg = 25.8 overall = 0.7531 max = 26 Wald chi2(7) = 333.66 corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000 ------------------- theta -------------------- min 5% median 95% max 0.9222 0.9300 0.9300 0.9300 0.9300 (Std. Err. adjusted for 53 clusters in ccode) ------------------------------------------------------------------------------- | Robust logGDP | Coef. Std. Err. z P>|z| [95% Conf. 
Interval] --------------+---------------------------------------------------------------- corruption | -.5482486 .1610688 -3.40 0.001 -.8639376 -.2325595 latitude | -.0163834 .0114574 -1.43 0.153 -.0388395 .0060728 meantemp | -.0787527 .0248313 -3.17 0.002 -.1274212 -.0300841 yellow | .2531058 .161039 1.57 0.116 -.0625248 .5687365 malaria | -1.779672 .248567 -7.16 0.000 -2.266855 -1.29249 BritishColony | .3346041 .2079323 1.61 0.108 -.0729356 .7421438 FrenchColony | .1301055 .223867 0.58 0.561 -.3086658 .5688768 _cons | 11.37757 .7036674 16.17 0.000 9.99841 12.75674 --------------+---------------------------------------------------------------- sigma_u | .5185985 sigma_e | .18551804 rho | .88654803 (fraction of variance due to u_i) ------------------------------------------------------------------------------- (est2 stored) . . * Factor input controls . eststo: xtreg logGDP corruption L5.pop oil gas i.year /// > if inrange(year,1991,2016), re theta robust Random-effects GLS regression Number of obs = 3,160 Group variable: ccode Number of groups = 150 R-sq: Obs per group: within = 0.4951 min = 5 between = 0.4068 avg = 21.1 overall = 0.2632 max = 24 Wald chi2(28) = 21864.02 corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000 ------------------- theta -------------------- min 5% median 95% max 0.8978 0.9364 0.9522 0.9532 0.9532 (Std. Err. adjusted for 150 clusters in ccode) ------------------------------------------------------------------------------ | Robust logGDP | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- corruption | -.411214 .1288626 -3.19 0.001 -.66378 -.158648 | pop | L5. 
| -.0693919 .1526073 -0.45 0.649 -.3684967 .2297129 | oil | .0006875 .0046833 0.15 0.883 -.0084915 .0098665 gas | .006658 .0030263 2.20 0.028 .0007266 .0125895 | year | 1990 | 0 (empty) 1991 | 8.876484 .1212082 73.23 0.000 8.638921 9.114048 1992 | 8.886808 .1220848 72.79 0.000 8.647527 9.12609 1993 | 8.889481 .1220159 72.86 0.000 8.650334 9.128628 1994 | 8.905905 .1236383 72.03 0.000 8.663579 9.148232 1995 | 8.941346 .1236533 72.31 0.000 8.69899 9.183702 1996 | 8.977162 .1236619 72.59 0.000 8.734789 9.219535 1997 | 8.979474 .1255174 71.54 0.000 8.733464 9.225483 1998 | 8.993978 .1260373 71.36 0.000 8.74695 9.241007 1999 | 9.01057 .1266569 71.14 0.000 8.762328 9.258813 2000 | 9.032845 .1275502 70.82 0.000 8.782851 9.282839 2001 | 9.051869 .1283181 70.54 0.000 8.80037 9.303368 2002 | 9.070323 .1283348 70.68 0.000 8.818792 9.321855 2003 | 9.095781 .1281027 71.00 0.000 8.844705 9.346858 2004 | 9.13989 .1281135 71.34 0.000 8.888792 9.390988 2005 | 9.173562 .1280611 71.63 0.000 8.922567 9.424557 2006 | 9.221749 .1284335 71.80 0.000 8.970024 9.473474 2007 | 9.264067 .1287213 71.97 0.000 9.011778 9.516356 2008 | 9.287866 .1278115 72.67 0.000 9.03736 9.538372 2009 | 9.278617 .1262092 73.52 0.000 9.031251 9.525982 2010 | 9.306521 .1257866 73.99 0.000 9.059983 9.553058 2011 | 9.326824 .1250968 74.56 0.000 9.081639 9.57201 2012 | 9.352372 .1251953 74.70 0.000 9.106994 9.59775 2013 | 9.368365 .1236801 75.75 0.000 9.125956 9.610773 2014 | 9.379452 .1242661 75.48 0.000 9.135895 9.623009 | _cons | 0 (omitted) -------------+---------------------------------------------------------------- sigma_u | .71220365 sigma_e | .16357886 rho | .94989056 (fraction of variance due to u_i) ------------------------------------------------------------------------------ (est3 stored) . . * Create tables . 
esttab using "tables/randomeff", ///
> title("Random Effects Regression of GDP and corruption") ///
> drop(19* 20*) se label wrap noabbrev rtf ///
> star(* 0.10 ** 0.05 *** 0.01) b(%9.2fc) compress one replace
(output written to tables/randomeff.rtf)

. eststo clear
.
.
.
. *****************************************************************************
. ************* Extra Analysis *****
> ********
. *****************************************************************************
.
. **** LDV Regression
.
. eststo: reg logGDP L(1/3).logGDP corruption i.year, vce(cluster ccode)

Linear regression                               Number of obs     =      3,440
                                                F(27, 149)        >   99999.00
                                                Prob > F          =     0.0000
                                                R-squared         =     0.9983
                                                Root MSE          =     .05038

                                (Std. Err. adjusted for 150 clusters in ccode)
------------------------------------------------------------------------------
             |               Robust
      logGDP |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      logGDP |
         L1. |   1.322746   .0469859    28.15   0.000     1.229902    1.415591
         L2. |  -.2498257   .0499299    -5.00   0.000    -.3484879   -.1511636
         L3. |  -.0757641   .0274899    -2.76   0.007    -.1300844   -.0214438
             |
  corruption |  -.0040037   .0047925    -0.84   0.405    -.0134738    .0054664
             |
        year |
       1994  |   .0081755   .0092484     0.88   0.378    -.0100994    .0264504
       1995  |   .0185729    .007793     2.38   0.018     .0031739     .033972
       1996  |   .0229965   .0066722     3.45   0.001     .0098121    .0361809
       1997  |   .0216741   .0099343     2.18   0.031     .0020438    .0413043
       1998  |   .0017634   .0068266     0.26   0.797     -.011726    .0152528
       1999  |   .0086188   .0055379     1.56   0.122    -.0023242    .0195619
       2000  |   .0175977   .0064157     2.74   0.007     .0049202    .0302753
       2001  |   .0096414   .0065701     1.47   0.144    -.0033412    .0226241
       2002  |   .0070609   .0059521     1.19   0.237    -.0047006    .0188224
       2003  |   .0148143   .0073261     2.02   0.045     .0003379    .0292907
       2004  |   .0372223   .0072209     5.15   0.000     .0229537     .051491
       2005  |   .0192385   .0065463     2.94   0.004      .006303     .032174
       2006  |   .0276703   .0060102     4.60   0.000     .0157941    .0395464
       2007  |    .027089   .0059496     4.55   0.000     .0153325    .0388455
       2008  |    .006625   .0062376     1.06   0.290    -.0057005    .0189505
       2009  |  -.0268832   .0066719    -4.03   0.000     -.040067   -.0136994
       2010  |   .0318675   .0060314     5.28   0.000     .0199493    .0437856
       2011  |   .0110674   .0087024     1.27   0.205    -.0061287    .0282634
       2012  |   .0087668   .0058807     1.49   0.138    -.0028534    .0203871
       2013  |   .0082379   .0067126     1.23   0.222    -.0050263    .0215021
       2014  |   .0115548   .0057635     2.00   0.047      .000166    .0229436
       2015  |   .0032716   .0066333     0.49   0.623    -.0098358     .016379
       2016  |   .0054863   .0049794     1.10   0.272     -.004353    .0153257
             |
       _cons |   .0286302    .012068     2.37   0.019     .0047837    .0524767
------------------------------------------------------------------------------
(est1 stored)

.
.
. /* Here, I use a pooled regression specification including lags of GDP. These
> lags are useful in two ways:
>
> First, output is often considered autoregressive, such that a primary determinant
> of GDP growth is growth in previous periods.
>
> Moreover, given that controlling for the lag of GDP is tantamount to controlling
> for the persistent effect of lagged determinants of GDP, the lag variable may
> partially control for omitted variable bias (OVB) in a manner somewhat similar
> to fixed effects. Lags should not be expected to capture time-invariant effects
> as well as FE, but FE by construction will not capture time-varying omitted
> variables, while lags can partially capture time-varying effects that are common
> to the lagged and current periods.
>
> Since lagged corruption is presumed to be a determinant of lagged GDP, it is
> worth noting that the explanatory variable becomes a measure of the effect of
> contemporaneous corruption conditional on lagged corruption.
> */
.
.
. /*
> --Note 1--
>
> Originally, I ran:
> reg logGDP L(1/5).logGDP corruption pop i.year, vce(cluster ccode)
> But note the insignificance of the later lags. Reducing it to 3 lags, I get
> consistent significance and higher F-stats. This may or may not be
> better.
> */
.
. /*
> --Note 2--
>
> I could also include lags of the corruption measure, but under the
> assumption of linear additivity of variables in logGDP, this is
> already at least partially captured. Running the following regression,
> I get no significance for additional corruption lags.
>
> reg logGDP L(1/3).logGDP L(0/3).corruption i.year, vce(cluster ccode)
> */
.
. /* As a reminder: Should we trust any regression here with no real case for
> exogeneity? No. The regression below looks somewhat plausible, and would seem
> to yield a very different interpretation of the effects of corruption than the
> preceding LDV models.
> reg logGDP L(1/5).logGDP L(0/5).corruption i.year, vce(cluster ccode)
> */
.
. *** Deep Lags as an instrument
. eststo: reg logGDP L(20).logGDP L(20).corruption i.year, vce(cluster ccode)

Linear regression                               Number of obs     =        916
                                                F(8, 143)         =     141.62
                                                Prob > F          =     0.0000
                                                R-squared         =     0.8826
                                                Root MSE          =     .40904

                                (Std. Err. adjusted for 144 clusters in ccode)
------------------------------------------------------------------------------
             |               Robust
      logGDP |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      logGDP |
        L20. |   .9225919   .0348018    26.51   0.000     .8537994    .9913843
             |
  corruption |
        L20. |  -.0420387   .1217095    -0.35   0.730    -.2826209    .1985435
             |
        year |
       2011  |   .0265972   .0105458     2.52   0.013     .0057514    .0474431
       2012  |   .0510926   .0173054     2.95   0.004     .0168851       .0853
       2013  |   .0849834   .0215163     3.95   0.000     .0424522    .1275145
       2014  |   .1176754   .0269095     4.37   0.000     .0644836    .1708673
       2015  |   .1208891   .0284426     4.25   0.000     .0646668    .1771115
       2016  |    .107422   .0293497     3.66   0.000     .0494066    .1654374
             |
       _cons |   1.088817   .3560613     3.06   0.003     .3849936    1.792641
------------------------------------------------------------------------------
(est2 stored)

. /* Assumes 20-year lagged corruption is only related to current GDP
> through its relationship with current corruption, and 20-year lagged
> GDP is only related to current GDP through, e.g., time-invariant
> determinants of production (unrelated to current corruption
> conditional on lagged corruption). */
.
. /*
> Alternately:
> reg logGDP L(20).corruption i.year, vce(cluster ccode)
> /* Instrument for current corruption using a deep lag. */
> reg logGDP L(20).logGDP corruption i.year, vce(cluster ccode)
> /* Control for time-invariant effects through a deep lag. */
> */
.
. *** Reverse Causality?
. eststo: xtreg logGDP F(0/3).corruption i.year, fe robust

Fixed-effects (within) regression               Number of obs     =      3,456
Group variable: ccode                           Number of groups  =        150

R-sq:                                           Obs per group:
     within  = 0.4850                                         min =         12
     between = 0.4557                                         avg =       23.0
     overall = 0.2483                                         max =         24

                                                F(27,149)         =      19.21
corr(u_i, Xb)  = 0.3606                         Prob > F          =     0.0000

                                (Std. Err. adjusted for 150 clusters in ccode)
------------------------------------------------------------------------------
             |               Robust
      logGDP |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  corruption |
         --. |   -.167645   .1487533    -1.13   0.262    -.4615836    .1262936
         F1. |   .0904085   .0915758     0.99   0.325    -.0905464    .2713635
         F2. |   -.045818   .0347635    -1.32   0.190    -.1145111    .0228751
         F3. |  -.2823386   .1418924    -1.99   0.048    -.5627199   -.0019573
             |
        year |
       1991  |  -.0032022   .0104994    -0.30   0.761    -.0239492    .0175448
       1992  |   .0156724   .0144659     1.08   0.280    -.0129125    .0442572
       1993  |   .0025382   .0164146     0.15   0.877    -.0298974    .0349737
       1994  |  -.0038946   .0206735    -0.19   0.851    -.0447456    .0369564
       1995  |   .0154389   .0217961     0.71   0.480    -.0276305    .0585082
       1996  |   .0510052   .0227395     2.24   0.026     .0060716    .0959388
       1997  |   .0889423   .0250265     3.55   0.001     .0394896     .138395
       1998  |   .1079393   .0253856     4.25   0.000      .057777    .1581015
       1999  |   .1279088   .0266435     4.80   0.000     .0752609    .1805567
       2000  |   .1549428   .0274081     5.65   0.000     .1007841    .2091016
       2001  |   .1743067   .0297941     5.85   0.000     .1154332    .2331802
       2002  |   .1908151   .0309546     6.16   0.000     .1296484    .2519818
       2003  |   .2142887   .0324318     6.61   0.000     .1502032    .2783743
       2004  |   .2604348   .0337661     7.71   0.000     .1937125    .3271571
       2005  |   .2973377   .0346783     8.57   0.000     .2288129    .3658624
       2006  |   .3417666   .0354027     9.65   0.000     .2718104    .4117229
       2007  |   .3868183   .0367232    10.53   0.000     .3142528    .4593839
       2008  |   .4103014   .0377707    10.86   0.000      .335666    .4849368
       2009  |   .3948576   .0380135    10.39   0.000     .3197423    .4699728
       2010  |   .4215536   .0377782    11.16   0.000     .3469033    .4962039
       2011  |   .4390584   .0388459    11.30   0.000     .3622984    .5158184
       2012  |   .4677141   .0387661    12.07   0.000     .3911118    .5443164
       2013  |   .4828387   .0392157    12.31   0.000     .4053479    .5603295
             |
       _cons |   8.858039   .0898457    98.59   0.000     8.680503    9.035575
-------------+----------------------------------------------------------------
     sigma_u |  1.1424536
     sigma_e |  .18006869
         rho |  .97575947   (fraction of variance due to u_i)
------------------------------------------------------------------------------
(est3 stored)

.
. /*
> We find that *future* corruption values are estimated to affect current GDP
> (the three-year lead, F3, is significant at the 5% level), which is probably
> not reasonable as a causal effect. Hence, it appears likely that there is
> simultaneity between GDP and the quality of government measure.
> */
.
. esttab using "tables/leadsandlags", ///
> title("Further Analysis: LDV Regression and Reverse Causality") ///
> mtitles("LDV" "Deep Lags as Instruments" "Reverse Causality") ///
> drop(19* 20*) se label wrap noabbrev rtf ///
> star(* 0.10 ** 0.05 *** 0.01) compress one replace
(output written to tables/leadsandlags.rtf)

. eststo clear
.
.
.
.
. ********** End do-file
.
. log close _all
      name:
       log:  C:\Users\Andrew\Dropbox\Teaching 2018\Assignment3\Assignment3log.log
  log type:  text
 closed on:  11 Oct 2018, 04:36:24
-------------------------------------------------------------------------------------------------------