Tag Archives: AWS

HBase shell on AWS EMR cluster quickstart

How to start the HBase client on AWS EMR and query an external HBase database

1. Create an EMR cluster with the HBase application enabled, either manually or from the command line.

2. Establish an SSH connection to the cluster.

3. To work with an external database, set the ZooKeeper quorum in /etc/hbase/conf/hbase-site.xml.

4. Start the HBase shell.

5. In the shell, run your queries.
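The steps above can be sketched as a small shell script. This is only an illustration under stated assumptions, not the exact commands from the original post: the cluster name, release label, instance settings, key pair, master DNS name, table name, and row key are all placeholders, and the function names are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch of the quickstart steps. Every value below is a placeholder.

CLUSTER_NAME="hbase-quickstart"        # hypothetical cluster name
RELEASE_LABEL="emr-5.36.0"             # assumption: use a release available to you
KEY_NAME="my-key"                      # hypothetical EC2 key pair name
KEY_FILE="$HOME/my-key.pem"            # hypothetical private key path
MASTER_DNS="master.example.com"        # replace with the cluster's master public DNS

# Create an EMR cluster with the HBase application enabled.
create_cluster() {
  aws emr create-cluster \
    --name "$CLUSTER_NAME" \
    --release-label "$RELEASE_LABEL" \
    --applications Name=HBase \
    --instance-type m5.xlarge \
    --instance-count 3 \
    --use-default-roles \
    --ec2-attributes KeyName="$KEY_NAME"
}

# Open an SSH session on the master node (EMR's login user is "hadoop").
connect() {
  ssh -i "$KEY_FILE" "hadoop@$MASTER_DNS"
}

# Print the property that goes inside <configuration> in
# /etc/hbase/conf/hbase-site.xml to point the client at an external quorum.
quorum_property() {
  local hosts="$1"   # comma-separated ZooKeeper hosts
  cat <<EOF
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>$hosts</value>
</property>
EOF
}

# Print a few sample statements for the HBase shell.
# 'mytable' and 'row1' are hypothetical names.
sample_queries() {
  cat <<'EOF'
list
describe 'mytable'
scan 'mytable', {LIMIT => 5}
get 'mytable', 'row1'
EOF
}
```

Nothing runs when the script is sourced; each function is called explicitly. On the master node, start the client with `hbase shell` and type the sample statements at the prompt, or pipe them in with `sample_queries | hbase shell`.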

 

s3cmd WARNING: Retrying failed request

I used the s3cmd cp and s3cmd sync commands to copy files from a source S3 folder to an S3 destination. Copying was very slow, and the output showed many attempts failing on timeout, each with the "WARNING: Retrying failed request" message.

In the ~/.s3cfg file I found the socket_timeout setting, which is 100 by default. Raising it to 1000 helped me.
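One way to apply that change from the command line is sketched below. The function name and the .bak backup suffix are illustrative choices, not part of s3cmd, and the sketch assumes the usual single-section config file where the setting appears as a `socket_timeout = N` line.

```shell
#!/usr/bin/env bash
# Set socket_timeout in an s3cmd config file (default: ~/.s3cfg).
# set_socket_timeout is a hypothetical helper, not an s3cmd command.
set_socket_timeout() {
  local timeout="$1"
  local cfg="${2:-$HOME/.s3cfg}"

  # Keep a backup copy before changing anything.
  cp "$cfg" "$cfg.bak"

  if grep -q '^socket_timeout' "$cfg.bak"; then
    # Rewrite the existing line, reading from the backup copy.
    sed "s/^socket_timeout.*/socket_timeout = $timeout/" "$cfg.bak" > "$cfg"
  else
    # The setting is absent: append it to the end of the file.
    echo "socket_timeout = $timeout" >> "$cfg"
  fi
}
```

Calling `set_socket_timeout 1000` updates ~/.s3cfg and leaves the previous version in ~/.s3cfg.bak, so the change is easy to roll back if a longer timeout does not help.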

 

Most used s3cmd commands

This article contains my most used s3cmd and other commands for my Ubuntu server.

Upload file to S3:
$ s3cmd put mylocalfile.ext s3://mybucket/myfolder/myfile.ext

Download file from S3:
$ s3cmd get s3://mybucket/myfolder/myfile.ext mylocalfile.ext

Download S3 folder to local file system (sync):
$ s3cmd sync s3://mybucket/myfolder mylocalfolder
