Win10: Java client connecting to an HDFS cluster fails with java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries

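Background

A beginner's journey continues. I am using a Java client to connect to an HDFS cluster, and my local environment is Win10 Home Basic. The Hadoop cluster is set up, and the HADOOP_HOME and PATH environment variables are configured.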

    @Test
    public void upload() throws Exception {
        Configuration configuration = new Configuration();
        // Connect to the NameNode on hadoop01 as user "root"
        FileSystem fs = FileSystem.get(new URI("hdfs://hadoop01:9000"), configuration, "root");
        // Upload a local file to the root of HDFS
        fs.copyFromLocalFile(new Path("e:/hello.txt"), new Path("/hello.txt"));
        fs.close();
    }

Here hadoop01 is the node running the HDFS NameNode. Running the test fails with:

    java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

Analysis

The stack trace in the console log points straight to the method

    org.apache.hadoop.util.Shell.getQualifiedBinPath(String executable)

whose source is as follows:

    public static final String getQualifiedBinPath(String executable)
        throws IOException
    {
        String fullExeName = new StringBuilder().append(HADOOP_HOME_DIR).append(File.separator).append("bin").append(File.separator).append(executable).toString();

        File exeFile = new File(fullExeName);
        if (!exeFile.exists()) {
            throw new IOException(new StringBuilder().append("Could not locate executable ").append(fullExeName).append(" in the Hadoop binaries.").toString());
        }
        ...
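The literal "null" at the start of the reported path follows from this concatenation: appending a null String to a StringBuilder writes the text "null". Below is a minimal sketch that reproduces just that step. The class name WinutilsPathDemo and the method qualifiedBinPath are made up for illustration, and the parameter hadoopHomeDir stands in for Hadoop's HADOOP_HOME_DIR field; this is not Hadoop's own code.

```java
import java.io.File;

class WinutilsPathDemo {
    // Hypothetical helper mirroring the concatenation in Shell.getQualifiedBinPath.
    // hadoopHomeDir plays the role of HADOOP_HOME_DIR.
    static String qualifiedBinPath(String hadoopHomeDir, String executable) {
        return new StringBuilder()
                .append(hadoopHomeDir)      // a null String is appended as the text "null"
                .append(File.separator)
                .append("bin")
                .append(File.separator)
                .append(executable)
                .toString();
    }

    public static void main(String[] args) {
        // Home directory unknown: the path starts with "null"
        System.out.println(qualifiedBinPath(null, "winutils.exe"));
        // Home directory known (example value only)
        System.out.println(qualifiedBinPath("C:\\hadoop", "winutils.exe"));
    }
}
```

On Windows, where File.separator is a backslash, the first call yields the same null\bin\winutils.exe seen in the exception message, so the real question becomes why HADOOP_HOME_DIR is null.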

HADOOP_HOME_DIR is initialized as follows:

    private static String HADOOP_HOME_DIR = checkHadoopHome();

So the next thing to look at is the checkHadoopHome method.

    private static String checkHadoopHome()
    {
        String home = System.getProperty("hadoop.home.dir");

        if (home == null) {
            home = System.getenv("HADOOP_HOME");
        }

        try
        {
            if (home == null) {
                throw new IOException("HADOOP_HOME or hadoop.home.dir are not set.");
            }

            if ((home.startsWith("\"")) && (home.endsWith("\""))) {
                home = home.substring(1, home.length() - 1);
            }

            File homedir = new File(home);
            if ((!homedir.isAbsolute()) || (!homedir.exists()) || (!homedir.isDirectory())) {
                throw new IOException(new StringBuilder().append("Hadoop home directory ").append(homedir).append(" does not exist, is not a directory, or is not an absolute path.").toString());
            }

            home = homedir.getCanonicalPath();
        }
        catch (IOException ioe) {
            if (LOG.isDebugEnabled()) {
                LOG.debug("Failed to detect a valid hadoop home directory", ioe);
            }
            home = null;
        }

        return home;
    }
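The lookup order in checkHadoopHome can be isolated in a small standalone sketch: the hadoop.home.dir system property is consulted first, the HADOOP_HOME environment variable second, and surrounding double quotes are stripped. The class HadoopHomeLookupDemo and the method resolveHome are hypothetical names for illustration. The sketch leaves out the directory validation, and its length check before stripping quotes is an addition that the listing above does not have.

```java
class HadoopHomeLookupDemo {
    // Hypothetical helper: same precedence as checkHadoopHome
    // (system property first, environment variable second).
    static String resolveHome(String propertyValue, String envValue) {
        String home = propertyValue != null ? propertyValue : envValue;
        // Strip one pair of surrounding double quotes, if present.
        // The length guard is an addition of this sketch.
        if (home != null && home.length() >= 2
                && home.startsWith("\"") && home.endsWith("\"")) {
            home = home.substring(1, home.length() - 1);
        }
        return home; // null when neither source provides a value
    }

    public static void main(String[] args) {
        String home = resolveHome(
                System.getProperty("hadoop.home.dir"),
                System.getenv("HADOOP_HOME"));
        System.out.println("resolved Hadoop home: " + home);
    }
}
```

Going by the listing, a value passed to the JVM as -Dhadoop.home.dir=... would be taken before the environment variable is consulted at all. If both sources come back null, the method returns null, and that null is what ends up in the error message.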

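In principle, if the HADOOP_HOME environment variable is set, the System.getenv("HADOOP_HOME") call in checkHadoopHome should return a value. So I wrote a simple main method that prints System.getenv("HADOOP_HOME"), and the result was null. But I had set HADOOP_HOME! Could it be that System.getenv simply cannot read the environment variables I define? I changed the main method to print System.getenv("GRADLE_HOME"), and that value printed without any problem. That gave me the answer: I had set HADOOP_HOME but had not restarted the machine, so System.getenv could not see it. A process receives its environment when it is launched, so programs that were already running when the variable was added keep working with the old environment.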
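The check described above can be written roughly as follows. This is a sketch rather than the exact code I ran; the class name EnvCheck, the helper describe, and the placeholder text for a missing variable are made up for illustration.

```java
class EnvCheck {
    // Formats one report line; value is whatever System.getenv returned (possibly null).
    static String describe(String name, String value) {
        return name + " = " + (value == null ? "<not visible to this JVM>" : value);
    }

    public static void main(String[] args) {
        // Compare the newly added variable with one that has existed for a while
        for (String name : new String[] {"HADOOP_HOME", "GRADLE_HOME"}) {
            System.out.println(describe(name, System.getenv(name)));
        }
    }
}
```

If HADOOP_HOME shows up as not visible while an older variable such as GRADLE_HOME prints normally, the variable is most likely defined in the system settings but missing from the environment the JVM inherited.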

Resolution

Restart the machine so that newly started processes pick up HADOOP_HOME.