
Getting Started with Spark

Prerequisites

Make sure Docker, Docker Compose, and Python 3.9 are installed. The dependency step below also needs pipenv. Then change into the Spark app directory:

cd apps/analytics/spark

Start the cluster

docker-compose up -d

This starts the Spark master, workers, and Livy server.

Verify the Spark UI

Open http://localhost:8080 in a browser to confirm the cluster is running.
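The same check can be scripted instead of done in a browser. The sketch below is a minimal example, assuming the Spark standalone master serves a JSON status page at /json/ on the UI port and that the payload carries `status` and `aliveworkers` fields. The helper names `summarize_master` and `fetch_master_status` are hypothetical and not part of this repository.

```python
import json
from urllib.request import urlopen

# Assumption: the standalone master serves a JSON status page next to the web UI.
MASTER_STATUS_URL = "http://localhost:8080/json/"


def summarize_master(status: dict) -> str:
    """Return a one-line summary of a master status payload.

    Missing fields fall back to safe defaults, so an unexpected
    payload shape does not raise.
    """
    state = status.get("status", "UNKNOWN")
    # Prefer the explicit alive count; otherwise count the listed workers.
    workers = status.get("aliveworkers", len(status.get("workers", [])))
    return f"master={state} alive_workers={workers}"


def fetch_master_status(url: str = MASTER_STATUS_URL, timeout: float = 5.0) -> dict:
    """Fetch and decode the master status JSON (requires a running cluster)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With the cluster up, `print(summarize_master(fetch_master_status()))` should print a single status line. A connection error usually means the master container is still starting.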

Verify the Livy API

curl http://localhost:8998/sessions

A successful response is a JSON body whose sessions list is empty, since no sessions have been created yet.
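For a scripted version of the curl check, the following sketch fetches the sessions endpoint and tests whether the list is empty. It assumes Livy answers with a JSON object shaped like `{"from": 0, "total": 0, "sessions": []}`, which is what a fresh server typically returns. The helper names `is_empty_session_list` and `fetch_sessions` are hypothetical.

```python
import json
from urllib.request import urlopen

# Same endpoint as the curl command above.
LIVY_SESSIONS_URL = "http://localhost:8998/sessions"


def is_empty_session_list(payload: dict) -> bool:
    """True only when the payload has a `sessions` key holding an empty list."""
    return payload.get("sessions") == []


def fetch_sessions(url: str = LIVY_SESSIONS_URL, timeout: float = 5.0) -> dict:
    """Fetch and decode the Livy sessions listing (requires a running Livy server)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With Livy running, `is_empty_session_list(fetch_sessions())` is expected to be true on a fresh cluster. A false result most likely means sessions from earlier work are still registered, not that the server is unhealthy.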

cd apps

Install dependencies

pipenv install

In production, Spark and Livy run on Kubernetes. See Infrastructure → Kubernetes → Helm Charts for the spark chart.
