Accessing HDFS from H2O under RStudio

terry.stebbens (Member, Moderator, Domino)

To access HDFS from H2O under RStudio, you need to pass the Hadoop classpath to H2O through the extra_classpath argument of h2o.init(). For example, if running hadoop classpath in a terminal prints

/usr/local/hadoop-2.8.5/etc/hadoop:/usr/local/hadoop-2.8.5/share/hadoop/common/lib/*:/usr/local/hadoop-2.8.5/share/hadoop/common/*:/usr/local/hadoop-2.8.5/share/hadoop/hdfs:/usr/local/hadoop-2.8.5/share/hadoop/hdfs/lib/*:/usr/local/hadoop-2.8.5/share/hadoop/hdfs/*:/usr/local/hadoop-2.8.5/share/hadoop/yarn/lib/*:/usr/local/hadoop-2.8.5/share/hadoop/yarn/*:/usr/local/hadoop-2.8.5/share/hadoop/mapreduce/lib/*:/usr/local/hadoop-2.8.5/share/hadoop/mapreduce/*:/contrib/capacity-scheduler/*.jar:/usr/local/hadoop-2.8.5/share/hadoop/tools/lib/*.jar:/usr/local/hadoop-2.8.5/share/hadoop/tools/lib/hadoop-aws-2.8.5.jar

then you'll need to call h2o.init() like this:

h2o.init(extra_classpath = "/usr/local/hadoop-2.8.5/etc/hadoop:/usr/local/hadoop-2.8.5/share/hadoop/common/lib/*:/usr/local/hadoop-2.8.5/share/hadoop/common/*:/usr/local/hadoop-2.8.5/share/hadoop/hdfs:/usr/local/hadoop-2.8.5/share/hadoop/hdfs/lib/*:/usr/local/hadoop-2.8.5/share/hadoop/hdfs/*:/usr/local/hadoop-2.8.5/share/hadoop/yarn/lib/*:/usr/local/hadoop-2.8.5/share/hadoop/yarn/*:/usr/local/hadoop-2.8.5/share/hadoop/mapreduce/lib/*:/usr/local/hadoop-2.8.5/share/hadoop/mapreduce/*:/contrib/capacity-scheduler/*.jar:/usr/local/hadoop-2.8.5/share/hadoop/tools/lib/*.jar:/usr/local/hadoop-2.8.5/share/hadoop/tools/lib/hadoop-aws-2.8.5.jar" )
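Rather than pasting the classpath by hand, it may be possible to capture it from the command at run time. The following is only a sketch, assuming the hadoop executable is on the PATH of the RStudio session and the h2o package is installed; the helper names normalize_classpath, hadoop_classpath and init_h2o_with_hadoop are made up for illustration and are not part of H2O.

```r
# Sketch: build the extra_classpath value from the output of `hadoop classpath`.
# Assumption: the `hadoop` executable is on the PATH of the R session.

# Collapse the captured command output into one trimmed string.
normalize_classpath <- function(raw) {
  trimws(paste(raw, collapse = ""))
}

# Run `hadoop classpath` and return what it prints as a single string.
hadoop_classpath <- function(hadoop_bin = "hadoop") {
  normalize_classpath(system2(hadoop_bin, "classpath", stdout = TRUE))
}

# Start (or connect to) H2O with the Hadoop entries passed as extra_classpath.
# Any further arguments are forwarded unchanged to h2o.init().
init_h2o_with_hadoop <- function(...) {
  h2o::h2o.init(extra_classpath = hadoop_classpath(), ...)
}
```

Calling init_h2o_with_hadoop() should then have the same effect as the h2o.init() call above, with the advantage that the string follows whatever Hadoop version is installed. Whether HDFS paths resolve afterwards still depends on the Hadoop configuration found on that classpath.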