Why do I get "Can not Load Library" every time I start my computer?
This problem may occur because the user does not have write permission on a directory in the system.
Spark: Could not load native gpl library
I had the following error when trying to run a very simple spark job (which uses logistic regression with SGD in mllib):
ERROR GPLNativeCodeLoader: Could not load native gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
    at java.lang.ClassLoader.loadLibrary(
    at java.lang.Runtime.loadLibrary0(
    at java.lang.System.loadLibrary(
    at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(
    at com.hadoop.compression.lzo.LzoCodec.<clinit>(
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(
    at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(
    at org.apache.hadoop.conf.Configuration.getClassByName(
    at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(
    at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(
    at org.apache.hadoop.mapred.TextInputFormat.configure(
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(
    at java.lang.reflect.Method.invoke(
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(
    at org.apache.hadoop.util.ReflectionUtils.setConf(
    at org.apache.hadoop.util.ReflectionUtils.newInstance(
    at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:155)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:187)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:181)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:93)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.rdd.FilteredRDD.compute(FilteredRDD.scala:34)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.rdd.FilteredRDD.compute(FilteredRDD.scala:34)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
    at
    at org.apache.spark.executor.Executor$
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(
    at java.util.concurrent.ThreadPoolExecutor$
    at
14/08/06 20:32:11 ERROR LzoCodec: Cannot load native-lzo without native-hadoop
This is the command I used to submit the job:
~/spark/spark-1.0.0-bin-hadoop2/bin/spark-submit \
--class com.jk.sparktest.Test \
--master yarn-cluster \
--num-executors 40 \
The actual java command is:
/usr/java/latest/bin/java -cp /apache/hadoop/share/hadoop/common/hadoop-common- \
-XX:MaxPermSize=128m \
-Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit \
--class com.jk.sparktest.Test \
--master yarn-cluster \
--num-executors 40 \
It seems that -Djava.library.path is not set. I also ran the java command above with the native lib directory supplied in java.library.path, but still got the same errors.
Any idea on what's wrong? Thanks.
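One way to confirm what the JVM actually sees is to print java.library.path from inside the process and look in each directory for the file name that System.loadLibrary would resolve. The following is a minimal sketch that uses only JDK calls (System.getProperty and System.mapLibraryName); the object NativeLibCheck and its methods are hypothetical names, not part of Spark or hadoop-lzo.

```scala
import java.io.File

// Hypothetical helper, not part of Spark or hadoop-lzo.
object NativeLibCheck {
  // Split a java.library.path-style string into its directories.
  def pathDirs(libraryPath: String): Seq[String] =
    libraryPath.split(File.pathSeparator).toSeq.filter(_.nonEmpty)

  // Directories on the given path that contain the file
  // System.loadLibrary(name) would look for on this platform.
  def dirsContaining(name: String, libraryPath: String): Seq[String] = {
    val fileName = System.mapLibraryName(name) // e.g. libgplcompression.so on Linux
    pathDirs(libraryPath).filter(dir => new File(dir, fileName).isFile)
  }

  def main(args: Array[String]): Unit = {
    val path = Option(System.getProperty("java.library.path")).getOrElse("")
    println("java.library.path = " + path)
    val hits = dirsContaining("gplcompression", path)
    if (hits.isEmpty)
      println("no " + System.mapLibraryName("gplcompression") + " found on java.library.path")
    else
      hits.foreach(dir => println("found in " + dir))
  }
}
```

Since the trace above comes from a task running in an executor, the same check would probably be most informative when run inside a task rather than only on the submitting machine.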
Re: Spark: Could not load native gpl library
Is the GPL library only available on the driver node? If that is the
case, you need to add them to `--jars` option of spark-submit.
Re: Spark: Could not load native gpl library
Hi Jikai,
It looks like you're trying to run a Spark job on data that's stored in HDFS in .lzo format.
Spark can handle this (I do it all the time), but you need to configure your Spark installation to know about the .lzo format.
There are two parts to the hadoop lzo library -- the first is the jar (hadoop-lzo.jar) and the second is the native library (libgplcompression.{a,so,la} and liblzo2.{a,so,la}).
You need the jar on the classpath across your cluster, but also the native libraries exposed as well.
In Spark 1.0.1 I modify the environment settings: I set SPARK_LIBRARY_PATH to include the path to the native library directory (e.g. /path/to/hadoop/lib/native/Linux-amd64-64) and SPARK_CLASSPATH to include the hadoop-lzo jar.
Hope that helps,
Andrew
On Thu, Aug 7, 2014 at 7:19 PM, Xiangrui Meng wrote:
> Is the GPL library only available on the driver node? If that is the
> case, you need to add them to `--jars` option of spark-submit.
Re: Spark: Could not load native gpl library
Thanks. I tried this option, but still got the same error.
Re: Spark: Could not load native gpl library
Thanks Andrew. Actually my job did not use any data in .lzo format. Here is the program itself:
import org.apache.spark._
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.mllib.classification.LogisticRegressionWithSGD
object Test {
  def main(args: Array[String]) {
    val sparkConf = new SparkConf().setAppName("SparkMLTest")
    val sc = new SparkContext(sparkConf)
    // Load the libsvm-format training data from HDFS
    val training = MLUtils.loadLibSVMFile(sc, "hdfs://url:8020/user/jilei/sparktesttraining_libsvmfmt_10k.txt")
    // Train logistic regression with SGD for 20 iterations
    val model = LogisticRegressionWithSGD.train(training, numIterations = 20)
  }
}
I copied this from a GitHub gist to try it out. The file is in libsvm format and is stored in HDFS (I removed the actual HDFS URL from the code here).
And in the file, I set the env variables:
export SPARK_LIBRARY_PATH=/apache/hadoop/lib/native/
export SPARK_CLASSPATH=/apache/hadoop/share/hadoop/common/hadoop-common-
Here is the content of the /apache/hadoop/lib/native/ folder:
ls /apache/hadoop/lib/native/
libgplcompression.a  libhadooppipes.a  libhdfs.a  libhadoop.a  libhadooputils.a
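One thing that may be worth checking here: System.loadLibrary("gplcompression") looks for a shared library (libgplcompression.so on Linux), and a static archive (.a) does not satisfy that lookup. The listing above, at least as it appears in this message, names only .a files. A small sketch of that check over a list of file names follows; the object and method names are hypothetical, and the file names are copied from the listing.

```scala
// Hypothetical helper, not part of Hadoop or Spark.
object ListingCheck {
  // True if the listing contains a Linux shared object for the library,
  // either libNAME.so or a versioned variant such as libNAME.so.0.
  def hasSharedObject(lib: String, listing: Seq[String]): Boolean =
    listing.exists(f => f == s"lib$lib.so" || f.startsWith(s"lib$lib.so."))

  def main(args: Array[String]): Unit = {
    // File names as shown in the ls output above
    val listing = Seq("libgplcompression.a", "libhadooppipes.a", "libhdfs.a",
      "libhadoop.a", "libhadooputils.a")
    println("shared object present: " + hasSharedObject("gplcompression", listing))
  }
}
```

If the directory really has no .so files, pointing java.library.path at it would not be expected to change the error.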
Re: Spark: Could not load native gpl library
Hi Jikai,
The reason I ask is because your stacktrace has this section in it:
    at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(
    at com.hadoop.compression.lzo.LzoCodec.<clinit>(
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(
Maybe you have the Lzo codec defined in the io.compression.codecs setting of your core-site.xml?
In the short run you could disable it.
In the long run, I wonder if this is an issue with YARN not propagating the setting through to the executors.
Have you tried in other cluster deployment modes?
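One way to try the short-run workaround is to drop the LZO entries from the codec list before any input format is configured. The sketch below shows only the string manipulation. It assumes the setting in question is Hadoop's io.compression.codecs property, a comma-separated list of codec class names; the object and method names are hypothetical, and the sample value is made up for illustration.

```scala
// Hypothetical helper, not part of Hadoop or Spark.
object CodecList {
  // Remove LZO codec classes from a comma-separated codec list,
  // keeping the remaining entries in their original order.
  def withoutLzo(codecs: String): String =
    codecs.split(",").map(_.trim).filter(_.nonEmpty)
      .filterNot(_.toLowerCase.contains("lzo"))
      .mkString(",")

  def main(args: Array[String]): Unit = {
    // Illustrative value only, not taken from the cluster in this thread
    val configured = "org.apache.hadoop.io.compress.DefaultCodec," +
      "com.hadoop.compression.lzo.LzoCodec," +
      "com.hadoop.compression.lzo.LzopCodec"
    println(withoutLzo(configured))
  }
}
```

In a Spark job the filtered value could then be written back with sc.hadoopConfiguration.set("io.compression.codecs", ...) before the first RDD is created, assuming the driver-side Hadoop configuration is the one the tasks end up using.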
