Add AWS libraries to the classpath
The hadoop-aws-*.jar library is not on the classpath by default, but it ships in the $HADOOP_HOME/share/hadoop/tools/lib folder. To fix this, edit $HADOOP_HOME/etc/hadoop/hadoop-env.sh and add the following line:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HADOOP_HOME/share/hadoop/tools/lib/*
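As a quick sanity check after editing hadoop-env.sh, something like the following sketch can confirm that the tools/lib entry shows up in the effective classpath. The helper name has_tools_lib is hypothetical, and the check assumes that `hadoop classpath` prints the HADOOP_CLASSPATH entry as written above; treat it as an illustration rather than an official diagnostic.

```shell
# Hypothetical helper: succeeds if the given classpath string
# contains an entry under share/hadoop/tools/lib/.
has_tools_lib() {
  case "$1" in
    *"/share/hadoop/tools/lib/"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Only run the check when the hadoop command is available.
if command -v hadoop >/dev/null 2>&1; then
  if has_tools_lib "$(hadoop classpath)"; then
    echo "tools/lib is on the classpath"
  else
    echo "tools/lib is missing from the classpath"
  fi
fi
```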
Configure AWS credentials for the Hadoop file system
Edit $HADOOP_HOME/etc/hadoop/hdfs-site.xml and add properties for the access key and the secret key. Note that the s3a property names (fs.s3a.access.key and fs.s3a.secret.key) follow a slightly different pattern from the s3 and s3n ones.
<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>My access key id</value>
</property>
<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>My secret access key</value>
</property>
<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>My access key id</value>
</property>
<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>My secret access key</value>
</property>
<property>
  <name>fs.s3a.access.key</name>
  <value>My access key id</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>My secret access key</value>
</property>
Check the access
Hadoop fs commands should work now.
hadoop fs -ls s3://mybucket/
hadoop fs -ls s3n://mybucket/
hadoop fs -ls s3a://mybucket/
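To see at a glance which of the three schemes is configured correctly, the same listings can be wrapped in a small loop. This is only a sketch: the function name check_schemes is hypothetical, the bucket name is a placeholder, and it relies solely on the exit status of `hadoop fs -ls`.

```shell
# Hypothetical helper: try listing the bucket with each scheme
# and print one status line per scheme.
check_schemes() {
  bucket="$1"
  for scheme in s3 s3n s3a; do
    if hadoop fs -ls "${scheme}://${bucket}/" >/dev/null 2>&1; then
      echo "${scheme}: ok"
    else
      echo "${scheme}: failed"
    fi
  done
}
```

Running check_schemes mybucket prints one line per scheme. A "failed" line usually points back to either the classpath entry or the matching pair of credential properties for that scheme.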