Friday, 15 April 2016

Frequent Issues Encountered During Spark Development

While coding, we face many issues, be it at compilation or at execution time. So I have tried to collate some of the issues frequently faced during Spark development here.

  •    When we run Spark on Windows, sometimes the following error is displayed:
Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwxrwxr-x
at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:529)
at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:478)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:430)
... 7 more

            Solution:
You need to give 777 permissions to this directory.
Let's say /tmp/hive is present on your D: drive; then run the following command:

D:\winutils\bin\winutils.exe chmod 777 D:\tmp\hive
For complete installation steps, you can refer to the previous post.
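To verify that the permissions were applied, you can list the directory with winutils (assuming the same D:\winutils location as above); the listing should now show the directory as world-writable:

D:\winutils\bin\winutils.exe ls D:\tmp\hive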


  •    How to launch the Master and Worker on Windows manually?
            Solution:
Open a command prompt, go to the %SPARK_HOME%\bin folder, and run the following commands:

spark-class org.apache.spark.deploy.master.Master     <= for master node
spark-class org.apache.spark.deploy.worker.Worker spark://masternode:7077  <= for worker node
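
Once both processes are up, a quick way to check the setup is to submit an application against the new master from the same %SPARK_HOME%\bin folder; the class and JAR names below are just placeholders:

spark-submit --class com.spark.example.SimpleApp --master spark://masternode:7077 simple-app.jar

The master's web UI at http://masternode:8080 should also show the worker once it has registered.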


  •    How to get rid of the “A master URL must be set in your configuration” error?
            Solution:
From the command line:

Pass -Dspark.master=spark://hostname:7077 as a JVM parameter.

From code, use the SparkConf.setMaster() method:
SparkConf conf = new SparkConf().setAppName("App_Name").setMaster("spark://hostname:7077");
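
Putting it together, here is a minimal driver sketch; the host name and app name are placeholders, and local[*] also works as the master for a quick local test:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class MasterUrlExample {
    public static void main(String[] args) {
        // Set the master explicitly so SparkContext does not complain about a missing master URL
        SparkConf conf = new SparkConf()
                .setAppName("App_Name")
                .setMaster("spark://hostname:7077");
        JavaSparkContext sc = new JavaSparkContext(conf);
        System.out.println("Default parallelism: " + sc.defaultParallelism());
        sc.stop();
    }
}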


  •    How to solve the following “Please use a larger heap size” error?
Exception in thread "main" java.lang.IllegalArgumentException: System memory 259522560 must be at least 4.718592E8. Please use a larger heap size.
at org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory(UnifiedMemoryManager.scala:193)
at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:175)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:354)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:288)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:457)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
at com.spark.example.SimpleApp.main(SimpleApp.java:18)

            Solution:
Add -Xmx1024m -Xms512m to the VM arguments.
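
For example, when the driver is launched directly with java (the classpath and main class below are placeholders), the flags go before the class name; with spark-submit, --driver-memory has the same effect:

java -Xmx1024m -Xms512m -cp <your-classpath> com.spark.example.SimpleApp
spark-submit --driver-memory 1g --class com.spark.example.SimpleApp simple-app.jar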


             Stay tuned for further updates!