Getting Started with Spark
Prerequisites
Ensure that Docker, Docker Compose, and Python 3.9 are installed.
Navigate to the Spark app directory
cd apps/analytics/spark
Start the cluster
docker-compose up -d
This starts the Spark master, workers, and Livy server.
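Containers started in detached mode may take a few seconds before their services accept connections. The following is a minimal sketch, not part of the project, of a readiness check that polls a TCP port until it opens. The helper name wait_for_port and the timeout values are hypothetical; the ports 8080 and 8998 are taken from the URLs used in the verification steps of this guide.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it only checks that something is listening,
    not that the service behind the port is healthy.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while the port is closed.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, wait_for_port("localhost", 8080) would wait for the Spark master UI and wait_for_port("localhost", 8998) for Livy. An open port only shows that a process is listening, so the verification steps are still worth running.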
Verify the Spark UI
Navigate to http://localhost:8080 to confirm the cluster is running.
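The same check can be scripted. The sketch below assumes the cluster runs a Spark standalone master, whose web UI also serves its state as JSON under /json/ on the same port. The field names it reads ("status", "workers", and each worker's "state") are assumptions about that payload and may differ between Spark versions, so the code reads them defensively. The function names are hypothetical.

```python
import json
from urllib.request import urlopen


def summarize_master(state: dict) -> dict:
    """Reduce the master's JSON state to a few health indicators.

    Assumes the payload has a "status" string and a "workers" list
    whose items carry a "state" field; missing keys count as unhealthy.
    """
    workers = state.get("workers", [])
    alive = [w for w in workers if w.get("state") == "ALIVE"]
    return {
        "master_alive": state.get("status") == "ALIVE",
        "workers_total": len(workers),
        "workers_alive": len(alive),
    }


def fetch_master_state(url: str = "http://localhost:8080/json/") -> dict:
    """Fetch the master state; the URL is an assumption based on this guide's port."""
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)
```

Calling summarize_master(fetch_master_state()) against a running cluster should show the master as alive with at least one worker alive, if the assumptions above hold.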
Verify the Livy API
curl http://localhost:8998/sessions
A successful response returns an empty sessions list.
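For a scripted version of this check, here is a minimal sketch using only the Python standard library. It assumes the response is a JSON object with a "sessions" list, as in the Livy REST API, and that each session carries "id" and "state" fields. The function names and message wording are hypothetical.

```python
import json
from urllib.request import urlopen


def describe_sessions(payload: dict) -> str:
    """Turn a Livy GET /sessions payload into a one-line summary."""
    sessions = payload.get("sessions", [])
    if not sessions:
        # On a freshly started cluster the list is expected to be empty.
        return "Livy is up: no sessions yet"
    states = ", ".join(f"{s.get('id')}={s.get('state')}" for s in sessions)
    return f"Livy is up: {len(sessions)} session(s): {states}"


def fetch_sessions(base_url: str = "http://localhost:8998") -> dict:
    """Fetch the session list from the Livy server used in this guide."""
    with urlopen(f"{base_url}/sessions", timeout=5) as resp:
        return json.load(resp)
```

Running print(describe_sessions(fetch_sessions())) right after the cluster starts should report no sessions. A connection error instead suggests that the Livy container is not ready yet.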
Navigate to the apps directory
cd apps
Install dependencies
pipenv install
In production, Spark and Livy run on Kubernetes. See Infrastructure → Kubernetes → Helm Charts for the spark chart.