Climate Maps

This pipeline maps annual climate indicators onto Rwanda’s sectors. It cleans ERA5-Land climate values keyed by health facility, aligns their names to the admin3 boundaries, joins the two, and renders a binned choropleth. The full source is analysis/spatial-analysis/Climate_maps.ipynb.

The end result is a two-panel choropleth of annual mean temperature and total precipitation.

The data

The climate workbook (Climate data.xls) holds ERA5-Land variables per health facility, with the facility’s location described by the DHIS2 org-unit hierarchy. Only five variables are carried forward:

Source column	Mapped to
`RF001 Relative humidity (ERA5-Land)`	`humidity`
`TMP001 Air temperature (ERA5-Land)`	`Average_temperature`
`TMP002 Max air temperature (ERA5-Land)`	`Max_temperature`
`TMP005 Min temperature (ERA5-Land)`	`Min_temperature`
`RF003 Precipitation (ERA5-Land)`	`Rainfall`

The org-unit levels map onto the administrative hierarchy — orgunitlevel3 is the district and orgunitlevel5 is the sector, which is the join key for admin3.

Pipeline

Read the climate workbook

The header spans two rows, so the first row is skipped. The workbook is in the legacy .xls format, which requires the xlrd engine:


import pandas as pd
 
cl_df = pd.read_excel("Climate data.xls", engine="xlrd", skiprows=1)
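
Because the later steps index columns by their full ERA5-Land labels, a quick check right after loading can catch a renamed or missing column early. The sketch below is only an illustration: `missing_columns` is a hypothetical helper, not part of the notebook, and `demo` is a stand-in frame rather than the real workbook.

```python
import pandas as pd

# Source columns the pipeline relies on (from the mapping table above).
REQUIRED_COLUMNS = [
    "RF001 Relative humidity (ERA5-Land)",
    "TMP001 Air temperature (ERA5-Land)",
    "TMP002 Max air temperature (ERA5-Land)",
    "TMP005 Min temperature (ERA5-Land)",
    "RF003 Precipitation (ERA5-Land)",
]


def missing_columns(df, required=REQUIRED_COLUMNS):
    """Hypothetical helper: list the required columns absent from df."""
    return [col for col in required if col not in df.columns]


# Stand-in frame with the precipitation column deliberately left out.
demo = pd.DataFrame(columns=REQUIRED_COLUMNS[:-1] + ["periodname"])
print(missing_columns(demo))
```

An empty list means every expected column is present and the rename step can proceed.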

Derive the join keys

District and sector names are extracted from the org-unit columns and normalised (the sector key is later combined with the district to disambiguate duplicate sector names):


cl_df["District"] = cl_df["orgunitlevel3"].str.split(" ").str[0]
cl_df["Sector"] = cl_df["orgunitlevel5"].str.split(r"[ _-]").str[0].str.capitalize()
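
To make the effect of the two expressions concrete, the sketch below applies them to a pair of invented org-unit labels. The sample spellings are assumptions for illustration; the real workbook may format these values differently.

```python
import pandas as pd

# Hypothetical org-unit labels; the real workbook's spelling may differ.
sample = pd.DataFrame({
    "orgunitlevel3": ["Gasabo District", "Bugesera District"],
    "orgunitlevel5": ["KIMIRONKO_Sector", "rilima sector"],
})

# District: the first space-separated token.
sample["District"] = sample["orgunitlevel3"].str.split(" ").str[0]

# Sector: the first token before a space, underscore, or hyphen, capitalised.
sample["Sector"] = (sample["orgunitlevel5"]
                    .str.split(r"[ _-]")
                    .str[0]
                    .str.capitalize())

print(sample[["District", "Sector"]])
```

Because only the first token is kept, a multi-word district or sector name would be cut to its first word, so it is worth comparing the derived names with the shapefile before joining.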

Rename and select


cl_df["humidity"] = cl_df["RF001 Relative humidity (ERA5-Land)"]
cl_df["Average_temperature"] = cl_df["TMP001 Air temperature (ERA5-Land)"]
cl_df["Max_temperature"] = cl_df["TMP002 Max air temperature (ERA5-Land)"]
cl_df["Min_temperature"] = cl_df["TMP005 Min temperature (ERA5-Land)"]
cl_df["Rainfall"] = cl_df["RF003 Precipitation (ERA5-Land)"]
 
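# .copy() returns an independent frame, so the assignment below does not
# trigger pandas' SettingWithCopyWarning
climate_data = cl_df[["periodname", "Sector", "District", "Average_temperature",
                      "Max_temperature", "Min_temperature", "Rainfall", "humidity"]].copy()
climate_data["periodname"] = pd.to_datetime(climate_data["periodname"], format="%B %Y")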
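
The five column copies could also be written as one `rename` driven by the mapping table. The sketch below shows that alternative on stand-in data. It assumes `periodname` holds labels such as "January 2023", which is what the `%B %Y` format expects; `COLUMN_MAP`, `raw`, and `tidy` are illustrative names and the values are invented.

```python
import pandas as pd

# Source label -> short name, as in the mapping table.
COLUMN_MAP = {
    "RF001 Relative humidity (ERA5-Land)": "humidity",
    "TMP001 Air temperature (ERA5-Land)": "Average_temperature",
    "TMP002 Max air temperature (ERA5-Land)": "Max_temperature",
    "TMP005 Min temperature (ERA5-Land)": "Min_temperature",
    "RF003 Precipitation (ERA5-Land)": "Rainfall",
}

# Stand-in rows; the numbers are made up for illustration only.
raw = pd.DataFrame({
    "periodname": ["January 2023", "February 2023"],
    "RF001 Relative humidity (ERA5-Land)": [70.0, 72.0],
    "TMP001 Air temperature (ERA5-Land)": [20.1, 20.4],
    "TMP002 Max air temperature (ERA5-Land)": [26.0, 26.5],
    "TMP005 Min temperature (ERA5-Land)": [15.2, 15.0],
    "RF003 Precipitation (ERA5-Land)": [95.0, 110.0],
})

# Rename in one pass, then parse "Month Year" labels to month-start timestamps.
tidy = raw.rename(columns=COLUMN_MAP)
tidy["periodname"] = pd.to_datetime(tidy["periodname"], format="%B %Y")

print(tidy.dtypes)
```

Unlike the copy-based version, `rename` does not leave the long source columns behind in the frame.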

Join to sectors

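The admin3 shapefile is the geometry side. The merge runs on the Sector column, which by this point must hold the combined District + Sector key on both sides, so that sectors sharing a name in different districts stay distinct: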


merged = pd.merge(admin3, climate_data, on="Sector", how="inner")
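
The sketch below illustrates why the combined key matters. It uses plain DataFrames as stand-ins for the shapefile attributes and the climate table, with invented values, and it assumes the climate-side key is built as District + " " + Sector to mirror the shapefile-side ADM2_EN + " " + ADM3_EN key.

```python
import pandas as pd

# Stand-in for the shapefile attributes (geometry omitted); names are illustrative.
admin3_demo = pd.DataFrame({
    "ADM2_EN": ["Gasabo", "Ngoma"],
    "ADM3_EN": ["Remera", "Remera"],
})
# Stand-in for the cleaned climate table; rainfall values are invented.
climate_demo = pd.DataFrame({
    "District": ["Gasabo", "Ngoma"],
    "Sector": ["Remera", "Remera"],
    "Rainfall": [95.0, 110.0],
})

# Joining on the bare sector name pairs every "Remera" with every other one.
naive = admin3_demo.merge(climate_demo, left_on="ADM3_EN", right_on="Sector")

# A combined "District Sector" key on both sides keeps the two sectors apart.
admin3_demo["Sector"] = admin3_demo["ADM2_EN"] + " " + admin3_demo["ADM3_EN"]
climate_demo["Sector"] = climate_demo["District"] + " " + climate_demo["Sector"]
keyed = admin3_demo.merge(climate_demo, on="Sector", how="inner")

print(len(naive), len(keyed))
```

The naive join returns four rows for two sectors, while the keyed join returns one row per sector.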

Aggregate to a year

Monthly rows are collapsed to an annual mean temperature and total precipitation per sector, then rebuilt as a GeoDataFrame:


import geopandas as gpd
 
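# year_to_plot (an int calendar year) must be defined beforehand
agg = merged[merged["periodname"].dt.year == year_to_plot]
# One row per sector: keep its geometry, average the temperature, sum the rainfall
g = (agg.groupby("adm_name", as_index=False)
        .agg({"geometry": "first",
              "Average_temperature": "mean",
              "Rainfall": "sum"}))
gdf = gpd.GeoDataFrame(g, geometry="geometry", crs=merged.crs)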
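
One thing to check here: the climate values are keyed by health facility, so a sector with several facilities may contribute several rows per month, and a direct sum would then count that month's rainfall more than once. Whether this occurs in the workbook is an assumption. If it does, a two-stage aggregation is one way to handle it, sketched below on invented rows.

```python
import pandas as pd

# Invented rows: one sector, two facilities, two months of 2023.
rows = pd.DataFrame({
    "Sector": ["Gasabo Remera"] * 4,
    "periodname": pd.to_datetime(
        ["2023-01-01", "2023-01-01", "2023-02-01", "2023-02-01"]),
    "Average_temperature": [20.0, 21.0, 22.0, 23.0],
    "Rainfall": [100.0, 110.0, 120.0, 130.0],
})

# Stage 1: one value per sector and month (mean across facilities).
monthly = (rows.groupby(["Sector", "periodname"], as_index=False)
               .agg({"Average_temperature": "mean", "Rainfall": "mean"}))

# Stage 2: annual mean temperature and annual total rainfall per sector.
annual = (monthly.groupby("Sector", as_index=False)
                 .agg({"Average_temperature": "mean", "Rainfall": "sum"}))

# For comparison: summing every row counts each month once per facility.
naive_total = rows["Rainfall"].sum()

print(annual)
print(naive_total)
```

In this toy case the two-stage total is 230.0 while the single-pass sum is 460.0, which is the size of the inflation with two facilities per sector.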

Name standardization

The single most error-prone step is matching climate sector names to the shapefile. Spelling drifts between systems, and an unmatched key silently drops a sector from the inner join. Known fixes are applied explicitly:

Sector name reconciliation (GeoPandas, Python)

Purpose

Align climate-data place names with the admin3 shapefile so no sector is lost in the join.

When to use

Before merging any facility-keyed dataset onto admin boundaries
Whenever an inner join returns fewer sectors than expected

Usage


# Shapefile side: assign the result back instead of calling replace(inplace=True)
# on a column selection, which may not update admin3 under pandas Copy-on-Write
admin3["ADM3_EN"] = admin3["ADM3_EN"].replace(
    {"Mageregere": "Mageragere", "Shyrongi": "Shyorongi"})
admin3["Sector"] = admin3["ADM2_EN"] + " " + admin3["ADM3_EN"]
 
# Climate side
climate_data["Sector"] = climate_data["Sector"].replace("Ririma", "Rilima")

Caveats & gotchas

An inner join hides mismatches — compare row counts before and after to catch dropped sectors
These fixes are specific to the 2006 NISR boundaries; verify against the shapefile you actually load
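
Beyond comparing row counts, an outer merge with pandas' `indicator=True` option lists exactly which keys fail to match on each side. The sketch below uses two small stand-in key tables with illustrative names, including one deliberately uncorrected spelling.

```python
import pandas as pd

# Stand-in key columns; "Ririma" is left uncorrected on the climate side.
shape_keys = pd.DataFrame({
    "Sector": ["Bugesera Rilima", "Gasabo Remera", "Nyarugenge Mageragere"],
})
climate_keys = pd.DataFrame({
    "Sector": ["Bugesera Ririma", "Gasabo Remera", "Gasabo Remera",
               "Nyarugenge Mageragere"],
}).drop_duplicates()  # one row per key is enough for the check

# The _merge column records whether each key is on the left, right, or both.
check = shape_keys.merge(climate_keys, on="Sector", how="outer", indicator=True)

only_shape = check.loc[check["_merge"] == "left_only", "Sector"].tolist()
only_climate = check.loc[check["_merge"] == "right_only", "Sector"].tolist()

print("In shapefile only:", only_shape)
print("In climate data only:", only_climate)
```

Each name in `only_climate` is a candidate for a new replacement rule, and each name in `only_shape` is a sector that an inner join would drop from the map.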

Binning and rendering

The choropleth uses fixed, meaningful class breaks rather than a continuous ramp so the map reads at a glance. Temperature and precipitation each get their own bins and colormap (YlOrRd for temperature, Blues for precipitation):


temp_bins = [0, 15, 17, 19, 22, gdf["Average_temperature"].max()]
temp_labels = ["0-15", "15-17", "17-19", "19-22", "> 22"]
 
precip_bins = [0, 200, 300, 500, 800, gdf["Rainfall"].max()]
precip_labels = ["0-200", "200-300", "300-500", "500-800", "> 800"]

Each panel plots the value column with its colormap, overlays the admin1–admin3 boundaries for context, and draws a patch legend from the bin colors. See the notebook’s final cell for the complete two-panel figure.
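
The class breaks above can be turned into a categorical column with `pandas.cut`. This is a sketch under stated assumptions rather than the notebook's own code: temperatures are assumed to be in degrees Celsius, the sample values are invented, and the top edge is set to infinity instead of the column maximum. `pandas.cut` requires strictly increasing bin edges, so using the maximum as the last edge would raise an error whenever that maximum is at or below the preceding break.

```python
import pandas as pd

# Same breaks as above, with an open-ended top class.
temp_edges = [0, 15, 17, 19, 22, float("inf")]
temp_labels = ["0-15", "15-17", "17-19", "19-22", "> 22"]

# Invented annual means in degrees Celsius, for illustration only.
temps = pd.Series([14.0, 15.0, 16.5, 18.2, 21.0, 23.4])

# Intervals are right-inclusive by default, so 15.0 falls in "0-15".
temp_class = pd.cut(temps, bins=temp_edges, labels=temp_labels,
                    include_lowest=True)

print(temp_class.tolist())
```

The resulting column can be passed to the plotting call as the value column, with one legend patch per label.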

The two-panel figure is regenerated by analysis/spatial-analysis/generate_figures.py, which executes Climate_maps.ipynb and writes climate_two_panel_map.png locally. scripts/upload-spatial-to-hf.sh then publishes it to Hugging Face (nhic/rwanda-spatial/charts/).