Part II: Building the Post-Soviet Dataset

Part II moves from research design to an analysis-ready gravity panel. The reference dataset is data/ExSoviet_balanced_clean.csv, a directed Post-Soviet country-pair-year panel with 5,253 observations and 20 columns, covering 15 exporters and 15 importers over 1992-2020.

This part teaches students how CEPII gravity variables become a replication dataset, how Python validates the data structure, and why descriptive evidence must come before regression tables.
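
As a rough illustration of what validating the data structure can look like, the sketch below counts rows, columns, exporters, importers, and the year range of a directed pair-year panel, and flags duplicated exporter-importer-year keys. The column names iso3_o, iso3_d, and year are assumptions borrowed from CEPII naming conventions and may not match the actual file header; the helper names summarize_panel and mismatches are hypothetical, and the four-row sample is synthetic, not real trade data.

```python
import csv
import io
from collections import Counter

# Dimensions reported for the reference dataset in the text above.
EXPECTED = {"rows": 5253, "columns": 20, "exporters": 15, "importers": 15,
            "first_year": 1992, "last_year": 2020}


def summarize_panel(handle, exporter="iso3_o", importer="iso3_d", year="year"):
    """Summarize a directed country-pair-year panel read from a CSV handle.

    The default column names are assumptions; adjust them to the real header.
    """
    reader = csv.DictReader(handle)
    keys = Counter()
    exporters, importers, years = set(), set(), set()
    rows = 0
    for row in reader:
        rows += 1
        exporters.add(row[exporter])
        importers.add(row[importer])
        years.add(int(row[year]))
        # Each exporter-importer-year key should appear exactly once.
        keys[(row[exporter], row[importer], int(row[year]))] += 1
    return {
        "rows": rows,
        "columns": len(reader.fieldnames or []),
        "exporters": len(exporters),
        "importers": len(importers),
        "first_year": min(years) if years else None,
        "last_year": max(years) if years else None,
        "duplicate_keys": sum(1 for n in keys.values() if n > 1),
    }


def mismatches(summary, expected):
    """Return {field: (found, expected)} for every field that disagrees."""
    return {k: (summary[k], v) for k, v in expected.items() if summary[k] != v}


# Tiny synthetic sample (not real data) with one deliberately duplicated key.
SAMPLE = """iso3_o,iso3_d,year,trade
RUS,KAZ,1992,1.0
KAZ,RUS,1992,2.0
RUS,KAZ,1993,1.5
RUS,KAZ,1993,1.5
"""

sample_summary = summarize_panel(io.StringIO(SAMPLE))

# To check the real file (path taken from the text; column names assumed):
# with open("data/ExSoviet_balanced_clean.csv", newline="") as f:
#     print(mismatches(summarize_panel(f), EXPECTED))
```

An empty result from mismatches would mean the file agrees with the reported dimensions. A check like this is worth running rather than assuming: 15 x 14 directed pairs over 29 years would give 6,090 rows, more than the 5,253 reported, so some pair-years appear to be absent from the panel.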

Chapter  Topic                 Main Output
04       CEPII data            CEPII variable map
05       Dataset building      Cleaned Post-Soviet panel
06       Descriptive analysis  Descriptive statistics and network evidence

This part contributes to the final publication-ready paper by documenting the empirical sample before estimation begins. A credible gravity paper must explain where the data come from, how the variables are constructed, which observations are included, and what the raw patterns suggest before any coefficients are interpreted.