Climate Maps
This pipeline maps annual climate indicators onto Rwanda’s sectors. It cleans ERA5-Land
climate values keyed by health facility, aligns their names to the admin3 boundaries,
joins the two, and renders a binned choropleth. The full source is
analysis/spatial-analysis/Climate_maps.ipynb.
The end result is a two-panel choropleth of annual mean temperature and total precipitation.

The data
The climate workbook (Climate data.xls) holds ERA5-Land variables per health facility,
with the facility’s location described by the DHIS2 org-unit hierarchy. Only five variables
are carried forward:
| Source column | Mapped to |
|---|---|
| RF001 Relative humidity (ERA5-Land) | humidity |
| TMP001 Air temperature (ERA5-Land) | Average_temperature |
| TMP002 Max air temperature (ERA5-Land) | Max_temperature |
| TMP005 Min temperature (ERA5-Land) | Min_temperature |
| RF003 Precipitation (ERA5-Land) | Rainfall |
The org-unit levels map onto the administrative hierarchy — orgunitlevel3 is the district
and orgunitlevel5 is the sector, which is the join key for admin3.
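The mapping in the table can also be written as a dictionary, which makes it easy to check that a workbook contains every expected column before anything is renamed. This is only a sketch on a toy frame: `COLUMN_MAP` and `missing_columns` are illustrative names, not identifiers from the notebook, and the values are made up.

```python
import pandas as pd

# Source column -> short name, copied from the table above.
COLUMN_MAP = {
    "RF001 Relative humidity (ERA5-Land)": "humidity",
    "TMP001 Air temperature (ERA5-Land)": "Average_temperature",
    "TMP002 Max air temperature (ERA5-Land)": "Max_temperature",
    "TMP005 Min temperature (ERA5-Land)": "Min_temperature",
    "RF003 Precipitation (ERA5-Land)": "Rainfall",
}


def missing_columns(df: pd.DataFrame) -> list:
    """Return the expected source columns that are absent from df (hypothetical helper)."""
    return [col for col in COLUMN_MAP if col not in df.columns]


# Toy frame with made-up values; three expected columns are deliberately left out.
toy = pd.DataFrame({
    "TMP001 Air temperature (ERA5-Land)": [19.4],
    "RF003 Precipitation (ERA5-Land)": [112.0],
})

absent = missing_columns(toy)              # columns the workbook would still need
renamed = toy.rename(columns=COLUMN_MAP)   # columns not in the frame are ignored
print(absent)
print(list(renamed.columns))
```

Running a check like this right after loading turns a renamed or dropped export column into an explicit list rather than a `KeyError` further down the pipeline.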
Pipeline
Read the climate workbook
The header spans two rows, so the first row is skipped and the legacy .xls format needs
the xlrd engine:
import pandas as pd
cl_df = pd.read_excel("Climate data.xls", engine="xlrd", skiprows=1)
Derive the join keys
District and sector names are extracted from the org-unit columns and normalised (the sector key is later combined with the district to disambiguate duplicate sector names):
cl_df["District"] = cl_df["orgunitlevel3"].str.split(" ").str[0]
cl_df["Sector"] = cl_df["orgunitlevel5"].str.split(r"[ _-]").str[0].str.capitalize()
Rename and select
cl_df["humidity"] = cl_df["RF001 Relative humidity (ERA5-Land)"]
cl_df["Average_temperature"] = cl_df["TMP001 Air temperature (ERA5-Land)"]
cl_df["Max_temperature"] = cl_df["TMP002 Max air temperature (ERA5-Land)"]
cl_df["Min_temperature"] = cl_df["TMP005 Min temperature (ERA5-Land)"]
cl_df["Rainfall"] = cl_df["RF003 Precipitation (ERA5-Land)"]
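# .copy() makes climate_data an independent frame, so the assignment below
# does not trigger SettingWithCopyWarning
climate_data = cl_df[["periodname", "Sector", "District", "Average_temperature",
                      "Max_temperature", "Min_temperature", "Rainfall", "humidity"]].copy()
climate_data["periodname"] = pd.to_datetime(climate_data["periodname"], format="%B %Y")
Join to sectors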
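The admin3 shapefile is the geometry side. A combined District + Sector key on both
sides makes the merge unambiguous, so by the time the merge runs the climate-side Sector
column must hold the same district-plus-sector string as admin3["Sector"] (built under
Name standardization):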
merged = pd.merge(admin3, climate_data, on="Sector", how="inner")
Aggregate to a year
Monthly rows are collapsed to an annual mean temperature and total precipitation per
sector, then rebuilt as a GeoDataFrame:
import geopandas as gpd
agg = merged[merged["periodname"].dt.year == year_to_plot]
g = (agg.groupby("adm_name", as_index=False)
        .agg({"geometry": "first",
              "Average_temperature": "mean",
              "Rainfall": "sum"}))
gdf = gpd.GeoDataFrame(g, geometry="geometry", crs=merged.crs)
Name standardization
The single most error-prone step is matching climate sector names to the shapefile. Spelling drifts between systems, and an unmatched key silently drops a sector from the inner join. Align climate-data place names with the admin3 shapefile:
- Before merging any facility-keyed dataset onto admin boundaries
- Whenever an inner join returns fewer sectors than expected
Known fixes are applied explicitly:
# Shapefile side (assign the result back; inplace=True on a selected column
# is unreliable under pandas copy-on-write)
admin3["ADM3_EN"] = admin3["ADM3_EN"].replace("Mageregere", "Mageragere")
admin3["ADM3_EN"] = admin3["ADM3_EN"].replace("Shyrongi", "Shyorongi")
admin3["Sector"] = admin3["ADM2_EN"] + " " + admin3["ADM3_EN"]
# Climate side
climate_data["Sector"] = climate_data["Sector"].replace("Ririma", "Rilima")
- An inner join hides mismatches: compare row counts before and after to catch dropped sectors
- These fixes are specific to the 2006 NISR boundaries; verify against the shapefile you actually load
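One way to find dropped sectors is a left merge with pandas' `indicator` flag, which labels each row by where its key was found. The sketch below is an assumption about how such a check could look, not a cell from the notebook: `unmatched_keys` is a hypothetical helper, and the two frames are toy stand-ins with made-up key values.

```python
import pandas as pd


def unmatched_keys(left: pd.DataFrame, right: pd.DataFrame, key: str = "Sector") -> list:
    """List the keys in `left` that have no partner in `right` (hypothetical helper)."""
    right_keys = right[[key]].drop_duplicates()
    # how="left" keeps every row of `left`; `_merge` is "both" when a partner
    # exists and "left_only" when an inner join would have dropped the row.
    check = left[[key]].merge(right_keys, on=key, how="left", indicator=True)
    return check.loc[check["_merge"] == "left_only", key].tolist()


# Toy stand-ins for the shapefile side and the climate side (made-up values).
admin3_keys = pd.DataFrame(
    {"Sector": ["Bugesera Rilima", "Kicukiro Gahanga", "Nyarugenge Mageragere"]}
)
climate_keys = pd.DataFrame(
    {"Sector": ["Bugesera Ririma", "Kicukiro Gahanga", "Kicukiro Gahanga",
                "Nyarugenge Mageragere"]}
)

dropped = unmatched_keys(admin3_keys, climate_keys)
print(dropped)  # ['Bugesera Rilima']
```

An empty list means every shapefile sector found at least one climate row. Swapping the two arguments reports climate keys that have no geometry, which is usually where a new spelling variant first shows up.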
Binning and rendering
The choropleth uses fixed, meaningful class breaks rather than a continuous ramp so the map
reads at a glance. Temperature and precipitation each get their own bins and colormap
(YlOrRd for temperature, Blues for precipitation):
# float("inf") keeps the top class open-ended and the edges strictly increasing,
# even when the data maximum falls below the last break
temp_bins = [0, 15, 17, 19, 22, float("inf")]
temp_labels = ["0-15", "15-17", "17-19", "19-22", "> 22"]
precip_bins = [0, 200, 300, 500, 800, float("inf")]
precip_labels = ["0-200", "200-300", "300-500", "500-800", "> 800"]
Each panel plots the value column with its colormap, overlays the admin1–admin3
boundaries for context, and draws a patch legend from the bin colors. See the notebook’s
final cell for the complete two-panel figure.
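One way to apply class breaks like these is `pandas.cut`, which assigns each value to a labelled interval. The sketch below is an assumption about how the bins could be used, not the notebook's actual cell. The temperatures are made-up values, and the top edge is set to infinity so the last class stays open-ended.

```python
import pandas as pd

# Made-up annual mean temperatures in degrees Celsius (hypothetical values).
temps = pd.Series([14.2, 16.0, 18.5, 20.1, 23.4], name="Average_temperature")

temp_bins = [0, 15, 17, 19, 22, float("inf")]
temp_labels = ["0-15", "15-17", "17-19", "19-22", "> 22"]

# Intervals are right-closed by default, so 15.0 lands in "0-15"
# and anything just above 15 lands in "15-17".
temp_class = pd.cut(temps, bins=temp_bins, labels=temp_labels)

print(temp_class.astype(str).tolist())
# ['0-15', '15-17', '17-19', '19-22', '> 22']
```

Because the result is an ordered categorical, each class can be paired with one color sampled from the colormap, and the same label-to-color pairs can feed the patch legend.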
The two-panel figure is regenerated by analysis/spatial-analysis/generate_figures.py, which
executes Climate_maps.ipynb and writes climate_two_panel_map.png locally;
scripts/upload-spatial-to-hf.sh then publishes it to Hugging Face (nhic/rwanda-spatial/charts/).