Git server on Windows

I spent some time struggling with a Git server on Windows (!). Writing it down here so I don't forget.

Init Windows git server

On the Windows git server, create a folder for the project, say c:/GitRepos/myProject.git. Enter that directory with cygwin and run the init command there.
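A minimal sketch of this step, assuming the command in question is a bare-repository init (the usual choice for a repository that is only pushed to and pulled from). The helper name init_bare_repo is hypothetical.

```shell
# Hypothetical helper: create the project folder and initialize a bare
# repository (one without a working tree) inside it.
init_bare_repo() {
    repo_dir="$1"
    mkdir -p "$repo_dir" && git init --bare "$repo_dir"
}
```

From the Cygwin shell, c:/GitRepos/myProject.git is reachable as /cygdrive/c/GitRepos/myProject.git, so the call would be `init_bare_repo /cygdrive/c/GitRepos/myProject.git`.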

Access from client

Note: the user name for the Windows account git includes the server name and a plus sign: server1+git
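As an illustration of how that user name ends up in the clone URL, here is a sketch that assumes the server is reached over ssh through Cygwin's sshd, that its host name is server1, and that the repository sits at the c:/GitRepos/myProject.git path used earlier. All of these are assumptions, so adjust them to the real setup.

```shell
# Assumptions: Windows account "git" on host "server1", reached through Cygwin sshd.
GIT_USER='server1+git'
GIT_HOST='server1'
# c:/GitRepos/myProject.git as Cygwin exposes it under /cygdrive.
REPO_PATH='/cygdrive/c/GitRepos/myProject.git'

REPO_URL="ssh://${GIT_USER}@${GIT_HOST}${REPO_PATH}"

# Print the command to run on the client machine.
echo "git clone ${REPO_URL}"
```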

 

 

Accessing AWS S3 from on-premises Hadoop

Add AWS libraries to the class path

The hadoop-aws-*.jar library is not on the classpath by default, but it ships in the $HADOOP_HOME/tools/lib folder. To fix this, edit $HADOOP_HOME/etc/hadoop/hadoop-env.sh and add a line that puts that folder on the classpath.
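A minimal sketch of such a line, assuming the jars live in the folder named above. In stock Apache Hadoop 2.x tarballs the directory is usually $HADOOP_HOME/share/hadoop/tools/lib, so use whichever one actually exists.

```shell
# Append the tools/lib wildcard to HADOOP_CLASSPATH, keeping any existing value.
# The quotes keep the * literal, so Java expands it instead of the shell.
# Assumption: the folder path is the one given in the text.
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}${HADOOP_HOME}/tools/lib/*"
```

Running `hadoop classpath` afterwards should list the new entry.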


Work with HBase from Spark shell

My software versions

Spark 1.6.1, HBase 1.2.1, running on EMR 4.7.1

Spark and HBase installation:
http://dmitrypukhov.pro/install-apache-spark-on-ubuntu/,
http://dmitrypukhov.pro/install-hbase-on-linux-dev/

Configure Spark

Edit spark-defaults.conf and make sure that spark.driver.extraClassPath and spark.executor.extraClassPath contain the path to the HBase libraries. For me it is /usr/lib/hbase/lib/*

My extra class paths:
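The actual values are not shown here, so the following is only an illustration, not the author's configuration. It sketches how the HBase entry could be appended to whatever the two properties already hold. The helper append_classpath is hypothetical, and /usr/lib/hadoop/lib/* is a stand-in for the existing value.

```shell
# Hypothetical helper: append one entry to a colon-separated class path value.
append_classpath() {
    existing="$1"
    entry="$2"
    if [ -z "$existing" ]; then
        printf '%s\n' "$entry"
    else
        printf '%s:%s\n' "$existing" "$entry"
    fi
}

HBASE_LIBS='/usr/lib/hbase/lib/*'
# Assumption: stand-in for the value already present in spark-defaults.conf.
EXISTING='/usr/lib/hadoop/lib/*'

# Print the two lines to put into spark-defaults.conf.
echo "spark.driver.extraClassPath   $(append_classpath "$EXISTING" "$HBASE_LIBS")"
echo "spark.executor.extraClassPath $(append_classpath "$EXISTING" "$HBASE_LIBS")"
```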

Install Zookeeper on Linux

Installing Zookeeper for a dev environment is quick and easy.

  1. Download zookeeper from http://zookeeper.apache.org and extract it somewhere, say /opt/zookeeper/
  2. Create a simple zoo.cfg file, i.e. copy the sample config /opt/zookeeper/conf/zoo_sample.cfg to /opt/zookeeper/conf/zoo.cfg
  3. Start zookeeper
    /opt/zookeeper/bin/zkServer.sh start
  4. Stop zookeeper
    /opt/zookeeper/bin/zkServer.sh stop
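Step 2 can be sketched as a small helper that copies the sample config only when zoo.cfg does not exist yet, so an edited config is never overwritten. The function name zk_prepare_config is hypothetical.

```shell
# Hypothetical helper for step 2: create conf/zoo.cfg from the bundled sample,
# leaving an existing zoo.cfg untouched.
zk_prepare_config() {
    zk_home="$1"
    [ -f "$zk_home/conf/zoo.cfg" ] || cp "$zk_home/conf/zoo_sample.cfg" "$zk_home/conf/zoo.cfg"
}
```

With the layout above this would be `zk_prepare_config /opt/zookeeper`, followed by the start command from step 3. `/opt/zookeeper/bin/zkServer.sh status` then reports whether the server is running.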

HBase shell on AWS EMR cluster quickstart

How to start the HBase client on AWS EMR and query an external HBase DB

  1. Create an EMR cluster with the HBase application enabled, either manually or from the command line.
  2. Establish an ssh connection to the cluster.
  3. To work with an external database, set the zookeeper quorum in /etc/hbase/conf/hbase-site.xml.
  4. Start the HBase shell.
  5. Run queries in the shell.
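The steps above can be sketched as a handful of shell helpers. This is an illustration under assumptions, not the original commands: the cluster name, release label, instance type and count, key pair, host names and table name are all placeholders, and the function names are hypothetical.

```shell
# Step 1: create a cluster with HBase enabled. $1 is an EC2 key pair name.
# Assumption: name, release label, instance type and count are placeholders.
create_cluster() {
    aws emr create-cluster \
        --name "hbase-client" \
        --release-label emr-4.7.1 \
        --applications Name=HBase \
        --instance-type m3.xlarge \
        --instance-count 1 \
        --use-default-roles \
        --ec2-attributes KeyName="$1"
}

# Step 2: log in to the master node as the "hadoop" user.
# $1 is the private key file, $2 the master's public DNS name.
connect() {
    ssh -i "$1" "hadoop@$2"
}

# Step 3: print the property to place inside <configuration> in
# /etc/hbase/conf/hbase-site.xml. $1 is a comma-separated list of zookeeper hosts.
quorum_property() {
    printf '<property>\n  <name>hbase.zookeeper.quorum</name>\n  <value>%s</value>\n</property>\n' "$1"
}

# Steps 4 and 5: start the HBase shell and feed it two standard commands.
# $1 is a table name in the external database.
run_queries() {
    printf "list\nscan '%s', {LIMIT => 5}\n" "$1" | hbase shell
}
```

For example, `quorum_property zk1.example.com` prints the XML snippet for a single-host quorum. Running `hbase shell` with no input opens the interactive prompt instead.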