Accessing AWS S3 from on-premises Hadoop

Add AWS libraries to the classpath

The hadoop-aws-*.jar library is not on the classpath by default, but it ships with Hadoop in the $HADOOP_HOME/tools/lib folder. To fix this, edit $HADOOP_HOME/etc/hadoop/hadoop-env.sh and append the jars in that folder to HADOOP_CLASSPATH.
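A minimal sketch of what that line in hadoop-env.sh might look like. The TOOLS_LIB variable is a name introduced here for illustration, and the folder is taken from the text above; in many stock Apache Hadoop tarballs the tools jars sit under $HADOOP_HOME/share/hadoop/tools/lib instead, so check your installation before copying this.

```shell
# Sketch of an addition to $HADOOP_HOME/etc/hadoop/hadoop-env.sh.
# Assumption: the AWS jars are in the tools/lib folder named above; adjust
# TOOLS_LIB (a hypothetical helper variable) if your layout differs.
TOOLS_LIB="${HADOOP_HOME}/tools/lib"

# Append every jar in that folder to the classpath. The trailing /* is a JVM
# classpath wildcard, so it is quoted to stop the shell from expanding it.
# Any existing HADOOP_CLASSPATH value is preserved.
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+${HADOOP_CLASSPATH}:}${TOOLS_LIB}/*"
```

Running `hadoop classpath` afterwards should list the folder if the change was picked up.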

Configure AWS security in the Hadoop file system

Edit $HADOOP_HOME/etc/hadoop/hdfs-site.xml and add properties for the access key and the secret key. Note that the s3a property names follow a slightly different pattern from those of the older s3 and s3n schemes.
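As an illustration, the properties might look like the fragment below, placed inside the existing <configuration> element. The values are placeholders, not real credentials, and you only need the entries for the scheme you actually use. The s3n names are shown next to the s3a ones to make the difference in naming visible.

```xml
<!-- s3a scheme: dotted, lowercase property names -->
<property>
  <name>fs.s3a.access.key</name>
  <value>YOUR_ACCESS_KEY_ID</value> <!-- placeholder -->
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>YOUR_SECRET_ACCESS_KEY</value> <!-- placeholder -->
</property>

<!-- s3n scheme: camel-case property names -->
<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>YOUR_ACCESS_KEY_ID</value> <!-- placeholder -->
</property>
<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>YOUR_SECRET_ACCESS_KEY</value> <!-- placeholder -->
</property>
```

Since this file then holds secrets in plain text, it is worth restricting its file permissions.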

Check the access

The hadoop fs commands should now work against S3 paths.
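A quick check could look like the sketch below. The bucket name is hypothetical; replace it with a bucket your keys are allowed to read. The guard around the commands is only there so the snippet fails politely on a machine without Hadoop on the PATH.

```shell
# Hypothetical bucket name, used for illustration only.
BUCKET="my-example-bucket"
S3_URI="s3a://${BUCKET}/"

if command -v hadoop >/dev/null 2>&1; then
  # List the top level of the bucket through the s3a connector.
  hadoop fs -ls "${S3_URI}"
  # Show the space used under the bucket root in human-readable units.
  hadoop fs -du -h "${S3_URI}"
else
  echo "hadoop not found on PATH; skipping the S3 check" >&2
fi
```

If the listing fails with a ClassNotFoundException for the S3 file system class, the classpath step is the first thing to recheck; an access-denied error points at the keys instead.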

 
