Data Setup¶
DisruptSC keeps model code and large input datasets separate. The public
repository contains the model code, configuration, documentation, and a small
bundled Testkistan dataset for smoke tests and examples. Full country and
regional datasets should live outside the code repository.
Data Location Priority¶
DisruptSC resolves data per scope:
- If DISRUPT_SC_DATA_PATH is set, use that data root for all scopes.
- Otherwise, if the requested scope is Testkistan, use the bundled example data at ./examples/data/Testkistan.
- Otherwise, use the sibling private data repository at ../disrupt-sc-data.
If DISRUPT_SC_DATA_PATH is set but points to a folder that does not exist,
DisruptSC raises a clear error instead of silently falling back to bundled data.
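The resolution order can be sketched in Python as follows. This is an illustration of the rules above, not DisruptSC's actual implementation: the function name `resolve_data_root`, the `repo_root` argument, and the choice of `FileNotFoundError` are assumptions made for the example.

```python
import os
from pathlib import Path

ENV_VAR = "DISRUPT_SC_DATA_PATH"


def resolve_data_root(scope: str, repo_root: Path) -> Path:
    """Return the data root for `scope` (hypothetical helper, see lead-in)."""
    override = os.environ.get(ENV_VAR)
    if override:
        root = Path(override).expanduser()
        if not root.is_dir():
            # A bad override is an error; there is no silent fallback.
            raise FileNotFoundError(f"{ENV_VAR} points to a missing folder: {root}")
        return root
    if scope == "Testkistan":
        # Assumption: the bundled scope sits at <repo>/examples/data/Testkistan,
        # so the data root is its parent folder.
        return repo_root / "examples" / "data"
    # Sibling private data repository, next to the code repository.
    return repo_root.parent / "disrupt-sc-data"
```

Calling `resolve_data_root("Cambodia", Path.cwd())` from the repository root with no environment variable set would return the sibling `disrupt-sc-data` folder, whether or not it exists; only the environment-variable branch checks for existence in this sketch.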
Recommended Setup: Sibling Data Repository¶
If you have access to the private data repository, clone it next to
disrupt-sc:
cd DisruptSC
git clone https://github.com/ccolon/disrupt-sc.git
git clone <private-data-repo-url> disrupt-sc-data
cd disrupt-sc
With this layout, no environment variable is needed. DisruptSC automatically
uses ../disrupt-sc-data.
Custom Data Location¶
If your data repository is elsewhere, set DISRUPT_SC_DATA_PATH to the folder
that contains scope folders such as Cambodia, Ecuador, or Testkistan.
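PowerShell:
$env:DISRUPT_SC_DATA_PATH = "C:\path\to\disrupt-sc-data"
bash/zsh:
export DISRUPT_SC_DATA_PATH="/path/to/disrupt-sc-data"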
For a persistent user-level setting on Windows:
[Environment]::SetEnvironmentVariable(
"DISRUPT_SC_DATA_PATH",
"C:\path\to\disrupt-sc-data",
"User"
)
Open a new terminal after setting a persistent environment variable.
Bundled Test Data¶
If DISRUPT_SC_DATA_PATH is not set and the requested scope is Testkistan,
DisruptSC uses the synthetic dataset bundled at ./examples/data/Testkistan.
Required Data Structure¶
Input data must be organized by scope inside the resolved data root:
<data-root>/
+-- <scope>/           # e.g. Cambodia, Ecuador, Testkistan
    +-- Economic/      # MRIO tables, sector definitions, firm data
    +-- Transport/     # Infrastructure GeoPackage files
    +-- Spatial/       # Geographic disaggregation files
    +-- Disruption/    # Optional scenario files
The exact filenames are configured in
config/user_defined_<scope>.yaml (or config/user_defined_<scope>.local.yaml
for an untracked personal version) under filepaths.
Scope Configuration¶
Each runnable scope needs:
- A data folder at <data-root>/<scope>/.
- A parameter file at config/user_defined_<scope>.yaml (committed) or config/user_defined_<scope>.local.yaml (gitignored, for personal tweaks).
Only the bundled Testkistan scope ships with a committed parameter file. For
any other scope you are expected to create a .local.yaml file pointing at
your own data folder.
For example, with the sibling data repository:
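As a rough sketch, the two requirements can be checked from Python before a run. The helper names `find_parameter_file` and `scope_is_runnable` are hypothetical, and the rule that a `.local.yaml` file takes precedence over the committed file is an assumption, not something this page states.

```python
from pathlib import Path
from typing import Optional


def find_parameter_file(scope: str, config_dir: Path) -> Optional[Path]:
    """Return the parameter file for `scope`, or None if neither variant exists."""
    # Assumption: a personal .local.yaml wins over the committed .yaml.
    candidates = [
        config_dir / f"user_defined_{scope}.local.yaml",
        config_dir / f"user_defined_{scope}.yaml",
    ]
    return next((p for p in candidates if p.is_file()), None)


def scope_is_runnable(scope: str, data_root: Path, config_dir: Path) -> bool:
    """True when both the data folder and a parameter file are present."""
    has_data = (data_root / scope).is_dir()
    return has_data and find_parameter_file(scope, config_dir) is not None
```

From the root of disrupt-sc, `scope_is_runnable("Cambodia", Path("../disrupt-sc-data"), Path("config"))` would tell you whether both pieces are in place for the Cambodia scope.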
File Requirements¶
Essential Files¶
- Economic/mrio.csv - Input-output table.
- Economic/sector_table.csv - Sector definitions.
- Transport/transport.gpkg - Transport network GeoPackage.
- Spatial/households.geojson - Household locations.
Transport Networks¶
At minimum, roads are required. Additional transport modes are optional:
- Maritime networks for international shipping.
- Railways for freight transport.
- Airways for high-value goods.
- Waterways for inland navigation.
- Pipelines for energy and chemicals.
Data Modes¶
MRIO mode is the default and uses:
- Economic/mrio.csv
- Economic/sector_table.csv
- Spatial/*.geojson
Supplier-buyer network mode additionally uses:
- Economic/firm_table.csv
- Economic/location_table.csv
- Economic/transaction_table.csv
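The per-mode file lists lend themselves to a quick presence check. The sketch below uses the default filenames listed above; real filenames come from the parameter file, so treat the constants and the `missing_inputs` helper as illustrative assumptions.

```python
from pathlib import Path

# Filenames as listed on this page; the parameter file may override them.
MRIO_FILES = ["Economic/mrio.csv", "Economic/sector_table.csv", "Spatial/*.geojson"]
NETWORK_FILES = [
    "Economic/firm_table.csv",
    "Economic/location_table.csv",
    "Economic/transaction_table.csv",
]


def missing_inputs(scope_dir: Path, network_mode: bool = False) -> list[str]:
    """Return the patterns that match no file inside `scope_dir`."""
    patterns = MRIO_FILES + (NETWORK_FILES if network_mode else [])
    # A pattern counts as satisfied if at least one file matches it.
    return [p for p in patterns if not any(scope_dir.glob(p))]
```

An empty list means every expected input was found; in supplier-buyer network mode, pass `network_mode=True` to include the three additional tables.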
Verification¶
Check which data path DisruptSC resolves:
Then run a smoke test with bundled data:
Troubleshooting¶
Data Path Not Found¶
Check the resolved data root:
If using DISRUPT_SC_DATA_PATH, verify that it points to the data root, not to
an individual scope folder.
PowerShell:
bash/zsh:
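Whether a path is a data root or an individual scope folder can also be checked from Python. This is a heuristic sketch, assuming a scope folder is recognizable by its Economic, Transport, and Spatial subfolders; `looks_like_scope_folder` is a hypothetical helper, not part of DisruptSC.

```python
from pathlib import Path

# Subfolders every scope is expected to contain (Disruption/ is optional).
SCOPE_SUBFOLDERS = {"Economic", "Transport", "Spatial"}


def looks_like_scope_folder(path: Path) -> bool:
    """Heuristic: a scope folder directly contains the required subfolders."""
    children = {p.name for p in path.iterdir() if p.is_dir()}
    return SCOPE_SUBFOLDERS <= children
```

If this returns True for the value of DISRUPT_SC_DATA_PATH, the variable most likely points one level too deep and should be set to the parent folder instead.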
Missing Scope¶
If DisruptSC resolves the data root correctly but a scope fails to load, verify that the scope folder exists:
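One way to do this is to list the scope folders under the data root. A minimal sketch, assuming every non-hidden subdirectory of the data root is a scope; `available_scopes` is a hypothetical helper name.

```python
from pathlib import Path


def available_scopes(data_root: Path) -> list[str]:
    """List scope folders: non-hidden immediate subdirectories of the data root."""
    return sorted(
        p.name
        for p in data_root.iterdir()
        # Skip plain files and hidden folders such as .git.
        if p.is_dir() and not p.name.startswith(".")
    )
```

Scope names are compared as written, so a folder named `cambodia` would not satisfy a request for `Cambodia` on a case-sensitive file system.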
If you are using only the public repository, use Testkistan unless you have
provided another dataset.
Invalid File Formats¶
- CSV files should use UTF-8 encoding.
- Transport edges must use LineString geometries.
- Spatial agent files must use Point geometries.
- Required columns must match the parameter file references.
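Two of these checks can be approximated with the Python standard library alone. The sketch below covers UTF-8 encoding and GeoJSON geometry types only; GeoPackage files need a geospatial library and are out of scope here. Both helper names are hypothetical.

```python
import json
from pathlib import Path


def is_utf8(path: Path) -> bool:
    """True if the whole file decodes as UTF-8."""
    try:
        path.read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def geometry_types(geojson_path: Path) -> set[str]:
    """Collect the geometry types found in a GeoJSON FeatureCollection."""
    data = json.loads(geojson_path.read_text(encoding="utf-8"))
    return {
        feature["geometry"]["type"]
        for feature in data.get("features", [])
        if feature.get("geometry")  # skip features with a null geometry
    }
```

For a spatial agent file, `geometry_types(path)` should come back as `{"Point"}`; any other member of the set points to a feature that needs fixing.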
What's Next?¶
After setting up data:
- Read the Quick Start.
- Review Input Validation.
- Customize Parameters.