Spark Mode - Local/Client/Cluster

20 Nov 2016

Refer to http://spark.apache.org/docs/latest/submitting-applications.html

Launching Applications with spark-submit

Once a user application is bundled, it can be launched using the bin/spark-submit script. This script takes care of setting up the classpath with Spark and its dependencies, and can support the different cluster managers and deploy modes that Spark supports:

./bin/spark-submit \
  --class <main-class> \
  --master <master-url> \
  --deploy-mode <deploy-mode> \
  --conf <key>=<value> \
  ... # other options
  <application-jar> \
  [application-arguments]

Some of the commonly used options are:

  • --class: The entry point for your application (e.g. org.apache.spark.examples.SparkPi)
  • --master: The master URL for the cluster (e.g. spark://23.195.26.187:7077)
  • --deploy-mode: Whether to deploy your driver on the worker nodes (cluster) or locally as an external client (client) (default: client)
  • --conf: Arbitrary Spark configuration property in key=value format. For values that contain spaces, wrap the whole “key=value” pair in quotes.
  • application-jar: Path to a bundled jar including your application and all dependencies. The URL must be globally visible inside of your cluster, for instance, an hdfs:// path or a file:// path that is present on all nodes.
  • application-arguments: Arguments passed to the main method of your main class, if any
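
To make the quoting rule for --conf concrete, here is a minimal sketch that assembles a spark-submit argument list and prints it, one argument per line, instead of running it. It needs no Spark installation. The jar path /path/to/examples.jar, the GC flags, and the function name build_submit_args are placeholders chosen for illustration, not values from this post.

```shell
#!/bin/sh
# Dry run: assemble the spark-submit arguments and print one per line.
# The jar path and the --conf value below are illustrative placeholders.
# 'local[2]' is quoted so the shell does not treat [2] as a glob pattern.
# The --conf value contains a space, so the whole key=value pair is quoted
# and reaches spark-submit as a single argument.
build_submit_args() {
  set -- ./bin/spark-submit \
    --class org.apache.spark.examples.SparkPi \
    --master 'local[2]' \
    --conf 'spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps' \
    /path/to/examples.jar \
    100
  printf '%s\n' "$@"
}

build_submit_args
```

If the quotes around the --conf value were dropped, the second GC flag would be split off as a separate argument, which is the mistake the quoting rule guards against.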

Master URLs

The master URL passed to Spark can be in one of the following formats:

Master URL           Meaning
local                Run Spark locally with one worker thread (i.e. no parallelism at all).
local[K]             Run Spark locally with K worker threads.
local[*]             Run Spark locally with as many worker threads as logical cores on your machine.
spark://HOST:PORT    Connect to the given Spark standalone cluster master.
mesos://HOST:PORT    Connect to the given Mesos cluster.
yarn                 Connect to a YARN cluster in client or cluster mode.
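
The formats in the table can be told apart with plain shell pattern matching. The sketch below is only an illustration of the table, not how Spark itself parses the value; the function name describe_master and its output strings are made up for this example.

```shell
#!/bin/sh
# Map a --master value to the meaning given in the table above.
# describe_master is a hypothetical helper, not part of Spark.
describe_master() {
  case "$1" in
    local)        echo "local, one worker thread" ;;
    'local[*]')   echo "local, one worker thread per logical core" ;;
    'local['*']') echo "local, K worker threads" ;;
    spark://*:*)  echo "Spark standalone cluster" ;;
    mesos://*:*)  echo "Mesos cluster" ;;
    yarn)         echo "YARN cluster" ;;
    *)            echo "unrecognized master URL" ;;
  esac
}

describe_master 'local[4]'                    # prints: local, K worker threads
describe_master spark://9.111.159.156:7077    # prints: Spark standalone cluster
```

The literal local[*] case is listed before the general local[K] pattern so that it is matched first.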

Spark Shell Command

The following commands can be run from either a local machine or a cluster node.

# Local Mode
./spark-shell

# Client Mode
./spark-shell --master spark://9.111.159.156:7077

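# Cluster Mode
# Not available for spark-shell: spark-submit rejects --deploy-mode cluster for
# interactive shells with "Cluster deploy mode is not applicable to Spark shells".
# Use spark-submit with --deploy-mode cluster instead (see Spark Cluster Mode).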

Spark Local Mode

Run the following command from a local laptop or a cluster node:

./spark-submit \
--class main.scala.internals.GroupByKeyTest \
--master local[2] \
/out/artifacts/GroupByKeyTest1102_jar/GroupByKeyTest1102.jar

Spark Client Mode

Run either of the following commands from a local laptop or a cluster node. The first omits --deploy-mode and so uses the default, client:

./spark-submit \
--master spark://9.111.159.156:7077 \
--class org.apache.spark.examples.GroupByTest \
../lib/spark-examples-1.6.2-hadoop2.6.0.jar

./spark-submit \
--class main.scala.internals.GroupByKeyTest \
--master spark://9.111.159.156:7077 \
--deploy-mode client \
/myhome/hadoop/upload/GroupByKeyTest1102.jar

Spark Cluster Mode

Run either of the following commands from a local laptop or a cluster node:

./spark-submit \
--master spark://9.111.159.156:7077 \
--class org.apache.spark.examples.GroupByTest \
--deploy-mode cluster \
../lib/spark-examples-1.6.2-hadoop2.6.0.jar \
2 1000 1000 2

./spark-submit \
--master spark://9.111.159.156:7077 \
--class org.apache.spark.examples.GroupByTest \
--deploy-mode cluster \
../lib/spark-examples-1.6.2-hadoop2.6.0.jar \
100 1000 10000 36
