Spark on Windows error: ERROR Shell: Failed to locate the winutils binary in the hadoop binary path; java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

Running a Spark program from IntelliJ IDEA on Windows produced the following error:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/08/06 18:53:56 INFO SparkContext: Running Spark version 2.4.3
19/08/06 18:53:56 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/08/06 18:53:56 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
	at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:378)
	at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:393)
	at org.apache.hadoop.util.Shell.<clinit>(Shell.java:386)
	at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
	at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:116)
	at org.apache.hadoop.security.Groups.<init>(Groups.java:93)
	at org.apache.hadoop.security.Groups.<init>(Groups.java:73)
	at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:293)
	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:283)
	at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:260)
	at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:789)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:774)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
	at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2422)
	at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2422)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2422)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:293)
	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520)
	at com.au.common.Base$.main(Base.scala:17)
	at com.au.common.Base.main(Base.scala)

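Solution: provide a winutils.exe file and tell Hadoop where it is. The "null" in null\bin\winutils.exe is the Hadoop home directory, which is unset because neither the hadoop.home.dir system property nor the HADOOP_HOME environment variable has been defined.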
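To make the origin of the "null" prefix concrete, here is a minimal sketch of the lookup. It assumes the behaviour of the Hadoop 2.x Shell class seen in the stack trace: the hadoop.home.dir system property is read first, then the HADOOP_HOME environment variable, and the result is concatenated with bin and winutils.exe. The object WinutilsLookup and its method names are hypothetical helpers written for illustration and are not part of Hadoop or Spark.

```scala
import java.io.File

// Hypothetical helper (not a Hadoop API) that imitates how the Hadoop home is resolved.
object WinutilsLookup {

  // System property takes precedence; the environment variable is the fallback.
  def resolveHadoopHome(
      props: String => Option[String],
      env: String => Option[String]): Option[String] =
    props("hadoop.home.dir").orElse(env("HADOOP_HOME"))

  // Builds the path that gets probed. An unset home is rendered as "null",
  // imitating Java string concatenation with a null reference.
  def winutilsPath(home: Option[String], sep: String = File.separator): String =
    home.getOrElse("null") + sep + "bin" + sep + "winutils.exe"

  def main(args: Array[String]): Unit = {
    val home = resolveHadoopHome(sys.props.get, sys.env.get)
    println(s"Resolved Hadoop home: $home")
    println(s"Path that would be probed: ${winutilsPath(home)}")
  }
}
```

Running it in the same run configuration as the failing program shows whether either setting is visible to the JVM before any Spark code executes.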

  1. Download the bin directory I have shared.
    Baidu Netdisk link: https://pan.baidu.com/s/1422rEurIxnMr6wPJr5G8UA
    Extraction code: ymm2
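  2. Put the bin directory anywhere you like; here it is under D:/test. (winutils.exe must sit inside a directory named bin, otherwise the error persists.)
  3. In your code, point to the parent of that bin directory, i.e. the test directory (this effectively simulates a Hadoop installation): System.setProperty("hadoop.home.dir", "D:/test")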
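Steps 2 and 3 can be combined into a small guard that checks the directory layout before setting the property. This is only a sketch: the object HadoopHome and its configure method are hypothetical names, not part of Spark or Hadoop. It uses only java.io.File and System.setProperty, and it reports a misplaced winutils.exe instead of letting the original error reappear.

```scala
import java.io.File

// Hypothetical helper: validates <home>/bin/winutils.exe, then sets hadoop.home.dir.
object HadoopHome {

  // Returns Right(the winutils.exe file) on success, Left(message) if the layout is wrong.
  def configure(home: String): Either[String, File] = {
    val winutils = new File(new File(home, "bin"), "winutils.exe")
    if (winutils.isFile) {
      // The property is read in a static initializer (Shell.<clinit> in the trace),
      // so it must be set before the first SparkContext is created.
      System.setProperty("hadoop.home.dir", home)
      Right(winutils)
    } else {
      Left(s"winutils.exe not found at ${winutils.getPath}")
    }
  }
}
```

A possible use is to call HadoopHome.configure("D:/test") as the first statement of main and print the Left message if the check fails, for example when winutils.exe was copied directly into D:/test without the enclosing bin directory.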

Test code, verified to work (word count):

import org.apache.spark.{SparkConf, SparkContext}

object Base {
  def main(args: Array[String]): Unit = {

    System.setProperty("hadoop.home.dir", "D:/test") // Add this line; the downloaded bin directory can live anywhere, here in a newly created test directory

    val conf = new SparkConf().setMaster("local[*]").setAppName("wordcount")
    val sc = SparkContext.getOrCreate(conf)

    val rdd = sc.textFile("D:/markdown note/Flume学习笔记.md")
    rdd.flatMap(_.split(" ")).groupBy(x => x).mapValues(x => x.size).foreach(println)
  }
}