Apache Spark
Apache Spark is a fast and general engine for large-scale data processing.
To gather statistics from Spark, download and enable the Jolokia JVM agent.
Jolokia
cd /usr/share/java/
wget -O jolokia-agent.jar "http://search.maven.org/remotecontent?filepath=org/jolokia/jolokia-jvm/1.3.6/jolokia-jvm-1.3.6-agent.jar"
Spark Master
Once the Jolokia JVM agent is downloaded, configure Apache Spark to load it as a Java agent for the master process and expose metrics over HTTP/JSON.
Edit spark-env.sh (it should be in /usr/local/spark/conf) and add the following parameter:
export SPARK_MASTER_OPTS="$SPARK_MASTER_OPTS -javaagent:/usr/share/java/jolokia-agent.jar=config=/usr/local/spark/conf/jolokia-master.properties"
Now create the file /usr/local/spark/conf/jolokia-master.properties with the following content (this assumes that Spark is installed in /usr/local/spark; if not, change the paths to match your installation):
host=0.0.0.0
port=7777
agentContext=/jolokia
backlog=100
policyLocation=file:///usr/local/spark/conf/jolokia.policy
historyMaxEntries=10
debug=false
debugMaxEntries=100
maxDepth=15
maxCollectionSize=1000
maxObjects=0
Now create /usr/local/spark/conf/jolokia.policy with the following content:
<?xml version="1.0" encoding="utf-8"?>
<restrict>
  <http>
    <method>get</method>
    <method>post</method>
  </http>
  <commands>
    <command>read</command>
    <command>list</command>
    <command>search</command>
  </commands>
</restrict>
Configure the Agent by adding the following to its conf/bigdata.ini file:
[Spark-Master]
stats: http://127.0.0.1:7777/jolokia/read
Restart Spark master.
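After the restart, it can help to confirm that the Jolokia agent answers on the configured port before relying on the Agent to collect from it. The sketch below is one possible way to do that from a shell. The helper names jolokia_read_url and jolokia_check are made up for this example; it assumes curl is installed and that the standard JVM MBean java.lang:type=Memory is readable under the policy shown above.

```shell
# Build a Jolokia "read" URL for a single MBean attribute.
# Arguments: host, port, MBean name, attribute name.
# The /jolokia prefix matches agentContext in the properties file.
jolokia_read_url() {
    printf 'http://%s:%s/jolokia/read/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Request the attribute and print the JSON answer.
# curl -f makes HTTP-level failures return a non-zero exit status;
# -sS hides the progress meter but still shows error messages.
jolokia_check() {
    curl -fsS "$(jolokia_read_url "$1" "$2" "$3" "$4")"
}

# Example, to be run by hand once the master is back up:
#   jolokia_check 127.0.0.1 7777 'java.lang:type=Memory' HeapMemoryUsage
```

If the agent is running, the command prints a JSON document. Jolokia typically reports request problems inside that document, so it is worth looking at its status field as well as at the exit code of curl.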
Spark Worker
Edit spark-env.sh (it should be in /usr/local/spark/conf) and add the following parameter:
export SPARK_WORKER_OPTS="$SPARK_WORKER_OPTS -javaagent:/usr/share/java/jolokia-agent.jar=config=/usr/local/spark/conf/jolokia-worker.properties"
Now create the file /usr/local/spark/conf/jolokia-worker.properties with the following content (this assumes that Spark is installed in /usr/local/spark; if not, change the paths to match your installation):
host=0.0.0.0
port=7778
agentContext=/jolokia
backlog=100
policyLocation=file:///usr/local/spark/conf/jolokia.policy
historyMaxEntries=10
debug=false
debugMaxEntries=100
maxDepth=15
maxCollectionSize=1000
maxObjects=0
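Create /usr/local/spark/conf/jolokia.policy with the following content (if the worker runs on the same host as the master, the policy file created for the master can be reused, since both properties files point to the same path):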
<?xml version="1.0" encoding="utf-8"?>
<restrict>
  <http>
    <method>get</method>
    <method>post</method>
  </http>
  <commands>
    <command>read</command>
    <command>list</command>
    <command>search</command>
  </commands>
</restrict>
Configure the Agent by adding the following to its conf/bigdata.ini file:
[Spark-Worker]
stats: http://127.0.0.1:7778/jolokia/read
Restart Spark worker.
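The port and agentContext values in the Jolokia properties file have to agree with the stats URL in bigdata.ini, and a mismatch is easy to introduce when the port is changed later. As a rough cross-check, the sketch below derives the expected URL from a properties file. The function names prop_value and stats_url_from_properties are hypothetical, and the parsing only handles plain key=value lines like the ones used in this guide.

```shell
# Print the value of one key from a simple key=value properties file.
# Arguments: file path, key name. Only the first match is used.
prop_value() {
    sed -n "s/^$2=//p" "$1" | head -n 1
}

# Print the stats URL that bigdata.ini should contain for this file.
# Assumes the Agent reaches Jolokia through 127.0.0.1, as in this guide.
stats_url_from_properties() {
    port=$(prop_value "$1" port)
    context=$(prop_value "$1" agentContext)
    printf 'http://127.0.0.1:%s%s/read\n' "$port" "$context"
}

# Example, to be run by hand:
#   stats_url_from_properties /usr/local/spark/conf/jolokia-worker.properties
```

The printed URL can then be compared with the stats line of the matching section in bigdata.ini.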