
Climate Maps

This pipeline maps annual climate indicators onto Rwanda’s sectors. It cleans ERA5-Land climate values keyed by health facility, aligns their names to the admin3 boundaries, joins the two, and renders a binned choropleth. The full source is analysis/spatial-analysis/Climate_maps.ipynb.

The end result is a two-panel choropleth of annual mean temperature and total precipitation:

charts/climate_two_panel_map.png

The data

The climate workbook (Climate data.xls) holds ERA5-Land variables per health facility, with the facility’s location described by the DHIS2 org-unit hierarchy. Only five variables are carried forward:

Source column | Mapped to
RF001 Relative humidity (ERA5-Land) | humidity
TMP001 Air temperature (ERA5-Land) | Average_temperature
TMP002 Max air temperature (ERA5-Land) | Max_temperature
TMP005 Min temperature (ERA5-Land) | Min_temperature
RF003 Precipitation (ERA5-Land) | Rainfall

The org-unit levels map onto the administrative hierarchy — orgunitlevel3 is the district and orgunitlevel5 is the sector, which is the join key for admin3.

Pipeline

Read the climate workbook

The header spans two rows, so the first row is skipped and the legacy .xls format needs the xlrd engine:

import pandas as pd

cl_df = pd.read_excel("Climate data.xls", engine="xlrd", skiprows=1)
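After loading, it can help to confirm that the five source columns from the table above survived the header skip. The following is a minimal sketch, not part of the notebook: the helper name missing_columns and the toy frame are assumptions made for illustration.

```python
import pandas as pd

# The five ERA5-Land source columns listed in the mapping table.
REQUIRED_COLUMNS = [
    "RF001 Relative humidity (ERA5-Land)",
    "TMP001 Air temperature (ERA5-Land)",
    "TMP002 Max air temperature (ERA5-Land)",
    "TMP005 Min temperature (ERA5-Land)",
    "RF003 Precipitation (ERA5-Land)",
]


def missing_columns(df: pd.DataFrame) -> list[str]:
    """Return the required source columns that are absent from df (hypothetical helper)."""
    return [col for col in REQUIRED_COLUMNS if col not in df.columns]


# Toy frame standing in for the loaded workbook; the values are made up.
toy = pd.DataFrame({
    "periodname": ["January 2023"],
    "TMP001 Air temperature (ERA5-Land)": [19.4],
    "RF003 Precipitation (ERA5-Land)": [112.0],
})

missing = missing_columns(toy)
print(missing)
```

An empty list means the header row was read as expected. A non-empty list usually points at a wrong skiprows value or a renamed export column.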

Derive the join keys

District and sector names are extracted from the org-unit columns and normalised (the sector key is later combined with the district to disambiguate duplicate sector names):

cl_df["District"] = cl_df["orgunitlevel3"].str.split(" ").str[0]
cl_df["Sector"] = cl_df["orgunitlevel5"].str.split(r"[ _-]").str[0].str.capitalize()
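To see what these two string operations produce, here is a small sketch on made-up org-unit strings. The sample values are assumptions. Real DHIS2 names may be formatted differently.

```python
import pandas as pd

# Hypothetical org-unit strings, chosen only to exercise the split rules.
toy = pd.DataFrame({
    "orgunitlevel3": ["Bugesera District", "Gasabo District"],
    "orgunitlevel5": ["RILIMA Sector", "kinyinya_sector"],
})

# District: the first space-separated token.
toy["District"] = toy["orgunitlevel3"].str.split(" ").str[0]

# Sector: the first token before a space, underscore, or hyphen, then capitalised.
# A multi-character pattern is treated as a regular expression by str.split.
toy["Sector"] = toy["orgunitlevel5"].str.split(r"[ _-]").str[0].str.capitalize()

print(toy[["District", "Sector"]])
```

Because only the first token is kept, a name made of several words would be cut down to its first word. It is worth spot-checking the derived keys against the shapefile names.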

Rename and select

cl_df["humidity"] = cl_df["RF001 Relative humidity (ERA5-Land)"]
cl_df["Average_temperature"] = cl_df["TMP001 Air temperature (ERA5-Land)"]
cl_df["Max_temperature"] = cl_df["TMP002 Max air temperature (ERA5-Land)"]
cl_df["Min_temperature"] = cl_df["TMP005 Min temperature (ERA5-Land)"]
cl_df["Rainfall"] = cl_df["RF003 Precipitation (ERA5-Land)"]

# .copy() makes climate_data independent of cl_df, so the assignment below
# does not trigger pandas' SettingWithCopyWarning.
climate_data = cl_df[["periodname", "Sector", "District", "Average_temperature",
                      "Max_temperature", "Min_temperature", "Rainfall", "humidity"]].copy()

# Parse period labels such as "January 2023" into timestamps.
climate_data["periodname"] = pd.to_datetime(climate_data["periodname"], format="%B %Y")
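The format string "%B %Y" expects a full English month name followed by a four-digit year. A minimal sketch with made-up period labels shows what the parse returns:

```python
import pandas as pd

# Hypothetical period labels in the shape the format string implies.
periods = pd.Series(["January 2023", "February 2023", "December 2023"])

parsed = pd.to_datetime(periods, format="%B %Y")

# Each label becomes the first day of its month.
years = parsed.dt.year.tolist()
months = parsed.dt.month.tolist()
days = parsed.dt.day.tolist()
print(parsed)
```

A label that does not match the format raises an error rather than being silently skipped, which is useful for catching stray rows in the export.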

Join to sectors

The admin3 shapefile is the geometry side. Both sides store the combined District + Sector key in their Sector column, so merging on that one column is unambiguous:

merged = pd.merge(admin3, climate_data, on="Sector", how="inner")
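The sketch below illustrates why the combined key matters. It uses plain DataFrames with placeholder district names instead of the real shapefile and workbook, and the climate-side key construction is an assumption that mirrors the shapefile-side pattern (district, a space, then sector).

```python
import pandas as pd

# Placeholder data: the same sector name appears in two made-up districts.
admin3 = pd.DataFrame({
    "ADM2_EN": ["Alpha", "Beta"],
    "ADM3_EN": ["Remera", "Remera"],
})
climate_data = pd.DataFrame({
    "District": ["Alpha", "Beta"],
    "Sector": ["Remera", "Remera"],
    "Rainfall": [80.0, 95.0],
})

# Joining on the bare sector name pairs every "Remera" with every other one.
naive = pd.merge(
    admin3.rename(columns={"ADM3_EN": "Sector"}),
    climate_data,
    on="Sector",
)

# Build the same "District Sector" key on both sides, then join on it.
admin3["Sector"] = admin3["ADM2_EN"] + " " + admin3["ADM3_EN"]
climate_data["Sector"] = climate_data["District"] + " " + climate_data["Sector"]
merged = pd.merge(admin3, climate_data, on="Sector", how="inner")

print(len(naive), len(merged))
```

With the bare name the join produces four rows for two sectors. With the combined key each sector matches only its own climate row.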

Aggregate to a year

Monthly rows are collapsed to an annual mean temperature and total precipitation per sector, then rebuilt as a GeoDataFrame:

import geopandas as gpd

# year_to_plot (the calendar year to map) must be defined before this cell runs.
agg = merged[merged["periodname"].dt.year == year_to_plot]
g = (
    agg.groupby("adm_name", as_index=False)
       .agg({"geometry": "first", "Average_temperature": "mean", "Rainfall": "sum"})
)
gdf = gpd.GeoDataFrame(g, geometry="geometry", crs=merged.crs)
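The aggregation logic can be checked on a toy table without any geometry. This is a sketch with invented values. The sector labels, the numbers, and the choice of 2023 for year_to_plot are all assumptions.

```python
import pandas as pd

# Hypothetical monthly rows for two sectors, plus one row from another year.
monthly = pd.DataFrame({
    "adm_name": ["A", "A", "B", "B", "A"],
    "periodname": pd.to_datetime(
        ["2023-01-01", "2023-02-01", "2023-01-01", "2023-02-01", "2022-12-01"]
    ),
    "Average_temperature": [20.0, 22.0, 16.0, 18.0, 30.0],
    "Rainfall": [100.0, 50.0, 200.0, 150.0, 999.0],
})

year_to_plot = 2023  # assumed value for illustration

# Keep one calendar year, then take the mean temperature and total rainfall.
one_year = monthly[monthly["periodname"].dt.year == year_to_plot]
annual = one_year.groupby("adm_name", as_index=False).agg(
    {"Average_temperature": "mean", "Rainfall": "sum"}
)
print(annual)
```

Since the climate rows are keyed by health facility, a sector with several facilities contributes several rows per month, and a plain sum adds them together. If that applies to the data at hand, averaging within each sector and month before summing over the year may be closer to the intended annual total.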

Name standardization

The single most error-prone step is matching climate sector names to the shapefile. Spelling drifts between systems, and an unmatched key silently drops a sector from the inner join. Known fixes are applied explicitly:

Sector name reconciliation (GeoPandas, Python)
Purpose

Align climate-data place names with the admin3 shapefile so no sector is lost in the join.

When to use
  • Before merging any facility-keyed dataset onto admin boundaries
  • Whenever an inner join returns fewer sectors than expected
Usage
# Shapefile side
# Assign the result back: replace(..., inplace=True) on a selected column is
# chained assignment and may leave the DataFrame unchanged in recent pandas.
admin3["ADM3_EN"] = admin3["ADM3_EN"].replace(
    {"Mageregere": "Mageragere", "Shyrongi": "Shyorongi"}
)
admin3["Sector"] = admin3["ADM2_EN"] + " " + admin3["ADM3_EN"]

# Climate side
climate_data["Sector"] = climate_data["Sector"].replace("Ririma", "Rilima")
Caveats & gotchas
  • An inner join hides mismatches — compare row counts before and after to catch dropped sectors
  • These fixes are specific to the 2006 NISR boundaries; verify against the shapefile you actually load
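One way to act on the first caveat is an outer merge with indicator=True, which labels every key as matched or one-sided. The sketch below uses made-up key tables as an illustration, not the notebook's own check.

```python
import pandas as pd

# Hypothetical keys: the climate side still carries the "Ririma" spelling.
shape_keys = pd.DataFrame({"Sector": ["Bugesera Rilima", "Gasabo Kinyinya"]})
climate_keys = pd.DataFrame({"Sector": ["Bugesera Ririma", "Gasabo Kinyinya"]})

# An outer merge keeps unmatched keys; the _merge column says which side they came from.
check = pd.merge(
    shape_keys.drop_duplicates(),
    climate_keys.drop_duplicates(),
    on="Sector",
    how="outer",
    indicator=True,
)

only_in_shapefile = check.loc[check["_merge"] == "left_only", "Sector"].tolist()
only_in_climate = check.loc[check["_merge"] == "right_only", "Sector"].tolist()
print(only_in_shapefile, only_in_climate)
```

Keys that appear on only one side are the ones an inner join would drop, so each pair of near-identical names is a candidate for an explicit replacement.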

Binning and rendering

The choropleth uses fixed, meaningful class breaks rather than a continuous ramp so the map reads at a glance. Temperature and precipitation each get their own bins and colormap (YlOrRd for temperature, Blues for precipitation):

temp_bins = [0, 15, 17, 19, 22, gdf["Average_temperature"].max()]
temp_labels = ["0-15", "15-17", "17-19", "19-22", "> 22"]

precip_bins = [0, 200, 300, 500, 800, gdf["Rainfall"].max()]
precip_labels = ["0-200", "200-300", "300-500", "500-800", "> 800"]
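The source does not show how values are assigned to these classes. A common approach is pd.cut, sketched here on invented temperatures. One substitution is made deliberately: the top edge is np.inf rather than the data maximum, because pd.cut rejects bin edges that are not strictly increasing, which would happen in a year whose maximum does not exceed 22.

```python
import numpy as np
import pandas as pd

# Hypothetical annual mean temperatures, one per sector.
temps = pd.Series([14.2, 16.0, 18.5, 20.1, 23.7], name="Average_temperature")

# Assumption: an open-ended top bin (np.inf) instead of the data maximum.
temp_bins = [0, 15, 17, 19, 22, np.inf]
temp_labels = ["0-15", "15-17", "17-19", "19-22", "> 22"]

# Intervals are closed on the right: (0, 15], (15, 17], and so on.
temp_class = pd.cut(temps, bins=temp_bins, labels=temp_labels, include_lowest=True)
print(temp_class.tolist())
```

The resulting categorical column can be passed to the plotting call as the value column, so every sector takes the color of its class.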

Each panel plots the value column with its colormap, overlays the admin1 and admin3 boundaries for context, and draws a patch legend from the bin colors. See the notebook’s final cell for the complete two-panel figure.
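A patch legend of this kind can be built by sampling one color per class from the colormap. The sketch below is an assumption about how such a legend might be assembled and is not the notebook's code. The even spacing of colors, the grey edge color, and the legend title are all illustrative choices.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
from matplotlib.patches import Patch

temp_labels = ["0-15", "15-17", "17-19", "19-22", "> 22"]

# Sample one evenly spaced color per class from the temperature colormap.
cmap = plt.get_cmap("YlOrRd")
n = len(temp_labels)
colors = [cmap(i / (n - 1)) for i in range(n)]

# One legend patch per class, in bin order.
handles = [
    Patch(facecolor=color, edgecolor="grey", label=label)
    for color, label in zip(colors, temp_labels)
]

fig, ax = plt.subplots()
legend = ax.legend(handles=handles, title="Mean temperature", loc="lower left")
legend_texts = [text.get_text() for text in legend.get_texts()]
```

The same pattern applies to the precipitation panel with the Blues colormap and precip_labels.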

This image is regenerated by analysis/spatial-analysis/generate_figures.py, which executes Climate_maps.ipynb and writes climate_two_panel_map.png locally; it is then published to Hugging Face (nhic/rwanda-spatial/charts/) by scripts/upload-spatial-to-hf.sh.
