Work with HBase from Spark shell

My software versions

Spark 1.6.1 and HBase 1.2.1, running on EMR 4.7.1.

Spark and HBase installation:
http://dmitrypukhov.pro/install-apache-spark-on-ubuntu/
http://dmitrypukhov.pro/install-hbase-on-linux-dev/

Configure Spark

Edit spark-defaults.conf and make sure that both spark.driver.extraClassPath and spark.executor.extraClassPath contain the path to the HBase libraries. In my case that is /usr/lib/hbase/lib/*

My extra class paths:
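The original values are not reproduced here, so the fragment below is only a sketch of what the two entries might look like. The `<existing entries>` placeholder is an assumption: it stands for whatever is already on the class path (EMR ships a long default list), which should be kept. Only the /usr/lib/hbase/lib/* part comes from the text above.

```
# spark-defaults.conf (sketch; <existing entries> is a placeholder, keep your own values)
spark.driver.extraClassPath      <existing entries>:/usr/lib/hbase/lib/*
spark.executor.extraClassPath    <existing entries>:/usr/lib/hbase/lib/*
```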

Start spark-shell and do all the following there:

Import necessary classes:
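The exact import list is not shown here. A minimal set that should cover the steps in this walkthrough, assuming the stock HBase 1.2 client and MapReduce classes are on the class path, might be:

```scala
// HBase configuration and the input format Spark uses to scan a table
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.mapreduce.TableInputFormat

// Key and value types produced by TableInputFormat
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.client.Result

// Helper for converting between byte arrays and Scala/Java types
import org.apache.hadoop.hbase.util.Bytes
```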

Configure access to HBase
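As a rough sketch, the configuration step could look like the following. The ZooKeeper host `localhost` and the table name `my_table` are placeholders chosen for illustration, not values from this post; replace them with your own.

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.mapreduce.TableInputFormat

// Loads hbase-default.xml and hbase-site.xml from the class path
val conf = HBaseConfiguration.create()

// Assumption: ZooKeeper runs on localhost; not needed if hbase-site.xml already sets it
conf.set("hbase.zookeeper.quorum", "localhost")

// Assumption: "my_table" is a hypothetical table name
conf.set(TableInputFormat.INPUT_TABLE, "my_table")
```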

Create Spark RDD on HBase table
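One common way to do this is `SparkContext.newAPIHadoopRDD` with `TableInputFormat`. The sketch below assumes it is pasted into spark-shell, where `sc` is predefined. The table name `my_table`, column family `cf`, and qualifier `col` are hypothetical. The imports and configuration are repeated so the snippet can be pasted on its own.

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.util.Bytes

val conf = HBaseConfiguration.create()
conf.set(TableInputFormat.INPUT_TABLE, "my_table") // hypothetical table name

// Each element is a (row key, Result) pair; nothing is read until an action runs
val hbaseRDD = sc.newAPIHadoopRDD(
  conf,
  classOf[TableInputFormat],
  classOf[ImmutableBytesWritable],
  classOf[Result])

// Convert to plain strings right away: Result and ImmutableBytesWritable
// are not java.io.Serializable, so they should not be collected or shuffled as-is
val rows = hbaseRDD.map { case (_, result) =>
  val rowKey = Bytes.toString(result.getRow)
  // "cf" and "col" are hypothetical family and qualifier names
  val value = Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col")))
  (rowKey, value)
}
```

Calling `rows.take(5).foreach(println)` afterwards is a quick way to check that the scan returns data.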

Use Spark SQL to query HBase
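With Spark 1.6, a simple approach is to turn the RDD into a DataFrame, register it as a temporary table, and query it through `sqlContext`. The sketch below assumes spark-shell (`sc` and `sqlContext` predefined). The table name `my_table`, family `cf`, qualifier `col`, the column names `rowkey` and `value`, and the temporary table name `hbase_rows` are all made up for illustration.

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.util.Bytes
import sqlContext.implicits._ // enables rdd.toDF(...)

val conf = HBaseConfiguration.create()
conf.set(TableInputFormat.INPUT_TABLE, "my_table") // hypothetical table name

// Scan the table and keep one hypothetical column as a string
val rows = sc.newAPIHadoopRDD(
    conf,
    classOf[TableInputFormat],
    classOf[ImmutableBytesWritable],
    classOf[Result])
  .map { case (_, result) =>
    (Bytes.toString(result.getRow),
     Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col"))))
  }

// Name the tuple fields and expose the data to Spark SQL
val df = rows.toDF("rowkey", "value")
df.registerTempTable("hbase_rows") // Spark 1.x API

// Runs a full table scan in HBase, then filters in Spark
sqlContext.sql("SELECT rowkey, value FROM hbase_rows WHERE value IS NOT NULL LIMIT 10").show()
```

Note that the filtering here happens in Spark after the scan, not inside HBase, so for large tables it may be worth narrowing the scan first (for example with `TableInputFormat.SCAN_COLUMNS`).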

 
