A layman’s guide to installing and setting up Apache Spark

https://databricks.com/spark/about
Before you begin, you will need:
  • An Ubuntu system. (We developers love open source more than anything)
  • Access to a terminal or command line. (Ctrl + Alt + T)
  • A user with sudo or root permissions. (SUDO — never forget your roots)
Installing Spark also requires three packages:
  1. JDK
  2. Scala
  3. Git
sudo apt install default-jdk scala git -y
java -version; javac -version; scala -version; git --version
wget https://downloads.apache.org/spark/spark-3.0.1/spark-3.0.1-bin-hadoop2.7.tgz
tar xvf spark-*
sudo mv spark-3.0.1-bin-hadoop2.7 /opt/spark
echo 'export SPARK_HOME=/opt/spark' >> ~/.profile
echo 'export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin' >> ~/.profile
echo 'export PYSPARK_PYTHON=/usr/bin/python3' >> ~/.profile
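A subtlety with appending exports via `echo`: double quotes let the current shell expand `$PATH` and `$SPARK_HOME` immediately (before `~/.profile` is ever read), while single quotes keep them literal so they are expanded at login time. A small self-contained check, using a temporary file in place of `~/.profile` so nothing real is modified:

```shell
# Write the exports with single quotes so the variable references stay
# literal and are only expanded when the file is sourced.
profile=$(mktemp)
echo 'export SPARK_HOME=/opt/spark' >> "$profile"
echo 'export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin' >> "$profile"

# Source it and confirm the variables now resolve as intended.
. "$profile"
echo "$SPARK_HOME"      # prints /opt/spark
case ":$PATH:" in *:/opt/spark/bin:*) echo "Spark bin is on PATH";; esac
rm -f "$profile"
```

For the real setup, either log out and back in or run `source ~/.profile` so the new variables take effect in your current session.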
start-master.sh
start-slave.sh spark://master:port
stop-master.sh
stop-slave.sh
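Putting those scripts together: after `start-master.sh`, the master's URL is shown on its web UI (http://localhost:8080 by default) and normally has the form `spark://<hostname>:7077`, where 7077 is the master's default port. A typical session might look like the sketch below; `$(hostname)` is used as a stand-in for your machine's actual hostname.

```shell
# Start the standalone master; its web UI comes up on http://localhost:8080.
start-master.sh

# Attach a worker to the master (7077 is the default master port).
start-slave.sh spark://$(hostname):7077

# Open an interactive Spark shell against the cluster to confirm it works.
spark-shell --master spark://$(hostname):7077

# When finished, tear everything down.
stop-slave.sh
stop-master.sh
```

If the worker fails to connect, check the exact master URL printed at the top of the web UI and use that value verbatim.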
