Monday, November 12, 2018

HBase installation on Windows 10


Apache HBase

HBase Introduction


The previous post introduced Hadoop and showed how to install it on Windows. In brief, Hadoop performs only batch processing, and data is accessed sequentially. This means a job has to scan the entire data set even for the simplest lookup.

Processing a huge data set often produces another huge data set, which must also be processed sequentially. What is needed is a way to reach any record directly, without scanning the rest of the data (random access).

Apache HBase is an open-source, non-relational (NoSQL), distributed, column-oriented database that runs on top of HDFS and provides real-time read/write access to large data sets. It is modeled after Google's Bigtable, is written primarily in Java, and is designed to give quick random access to huge amounts of data.

In brief, HBase can store massive amounts of data, from terabytes to petabytes, and allows fast random reads and writes that Hadoop on its own cannot provide. Relational databases (RDBMS) also struggle with the variety and volume of data that is growing exponentially.

HBase Installation



Here we will work through installing HBase on Windows. HBase can be installed in three modes, whose features are listed below.

[1] Standalone mode installation (No dependency on Hadoop system)
  • This is the default mode of HBase
  • It runs against the local file system
  • It doesn't use Hadoop HDFS
  • All daemons (HMaster, a RegionServer, and ZooKeeper) run in a single JVM
  • Not recommended for production environments


[2] Pseudo-Distributed mode installation ( Single node Hadoop system + HBase installation)

  • It runs on Hadoop HDFS
  • All daemons run on a single node
  • Suited to testing and prototyping, not recommended for production environments


[3] Fully Distributed mode installation ( Multi node Hadoop environment + HBase installation)

  • It runs on Hadoop HDFS
  • The daemons run across all nodes in the cluster
  • Highly recommended for production environment


Hadoop should be installed before you install HBase on Windows. If you have not installed Hadoop yet, see the previous post on installing Hadoop on Windows 10.

I used HBase 1.4.7, though any stable version should work.

Download HBase 1.4.7
  • http://www.apache.org/dyn/closer.lua/hbase/

HBase - Standalone mode installation


Here, we will go through the standalone mode installation of HBase on Windows 10.

STEP - 1: Extract the HBase file


Extract the file hbase-1.4.7-bin.tar.gz into "D:\HBase"; any other location works too.

[1] The first extraction yields a tar file:

Local Folder


[2] Go into the hbase-1.4.7-bin.tar folder and extract again:


Extract


[3] Move the innermost folder "hbase-1.4.7" up to the root folder "D:\HBase" and remove all other files and folders:


Extract Folder

Extracted files
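
The two-pass extraction above is typical of Windows archive tools. As an alternative, a minimal Python sketch can unpack the .tar.gz in one pass. The function name extract_hbase is made up for this illustration, and the archive is assumed to be the one downloaded above.

```python
import tarfile
from pathlib import Path


def extract_hbase(archive, dest):
    """Unpack a .tar.gz archive into dest and list the top-level entries created.

    Hypothetical helper for illustration; only extract archives you trust.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:gz" handles the gzip layer and the tar layer together.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    return sorted(p.name for p in dest.iterdir())
```

Calling extract_hbase(r"C:\Downloads\hbase-1.4.7-bin.tar.gz", r"D:\HBase") (the download folder is an assumption) should leave a single "hbase-1.4.7" folder under "D:\HBase".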

STEP - 2: Configure Environment variable


Set the following environment variable (under User variables) on Windows 10:
  • HBASE_HOME - D:\HBase\hbase-1.4.7

This PC -> Right Click -> Properties -> Advanced System Settings -> Advanced -> Environment Variables


Environment variable


STEP - 3: Configure System variable


Next, add the HBase bin directory to the Path system variable:


Variable: Path 

Value: 
  • D:\HBase\hbase-1.4.7\bin
System variable
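
To confirm that both variables took effect, a small check can be run from a newly opened command prompt. This is only a sketch: check_hbase_env is a hypothetical helper, and it simply compares HBASE_HOME\bin against the entries of Path.

```python
import os


def check_hbase_env(env=None):
    """Return a list of problems with HBASE_HOME and Path; an empty list means OK."""
    env = os.environ if env is None else env
    home = env.get("HBASE_HOME")
    if not home:
        return ["HBASE_HOME is not set"]

    def norm(p):
        # Normalise case and separators so equivalent paths compare equal.
        return os.path.normcase(os.path.normpath(p))

    path_value = env.get("PATH") or env.get("Path") or ""
    entries = [norm(p) for p in path_value.split(os.pathsep) if p]
    problems = []
    if norm(os.path.join(home, "bin")) not in entries:
        problems.append("HBASE_HOME bin directory is not on Path")
    return problems
```

Running print(check_hbase_env()) in a fresh prompt should print an empty list once STEP - 2 and STEP - 3 are done; prompts opened before the change keep the old values.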



STEP - 4: Create required folders


Create some dedicated folders - 
  1. Create folder "hbase" under “D:\HBase\hbase-1.4.7”.
  2. Create folder "zookeeper" under “D:\HBase\hbase-1.4.7”.

For example - 


Required folders
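
The same two folders can also be created from a script. A minimal sketch, assuming the HBase home used in this post; create_hbase_dirs is a name invented for the example.

```python
from pathlib import Path


def create_hbase_dirs(hbase_home):
    """Create the 'hbase' and 'zookeeper' data folders under hbase_home."""
    created = []
    for name in ("hbase", "zookeeper"):
        folder = Path(hbase_home) / name
        # exist_ok=True makes the call safe to repeat.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

For the layout in this post the call would be create_hbase_dirs(r"D:\HBase\hbase-1.4.7").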


STEP - 5: Configure required files


Next, configure two key files with the minimum required details:
  • hbase-env.cmd
  • hbase-site.xml

[1] Edit the file D:/HBase/hbase-1.4.7/conf/hbase-env.cmd, set JAVA_HOME to the path of your Java installation, and save the file.

@rem set JAVA_HOME=c:\apps\java

set JAVA_HOME=%JAVA_HOME%

Java Home
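
Since hbase-env.cmd relies on JAVA_HOME, it can help to confirm that the variable points at a real Java installation before starting HBase. The sketch below is an assumption-laden check, not part of HBase: java_home_problem is a hypothetical helper that only looks for a java executable under JAVA_HOME\bin.

```python
import os
from pathlib import Path


def java_home_problem(env=None):
    """Return a description of what is wrong with JAVA_HOME, or None if it looks fine."""
    env = os.environ if env is None else env
    home = env.get("JAVA_HOME")
    if not home:
        return "JAVA_HOME is not set"
    bin_dir = Path(home) / "bin"
    # java.exe on Windows, plain java elsewhere.
    if not any((bin_dir / exe).is_file() for exe in ("java.exe", "java")):
        return f"no java executable under {bin_dir}"
    return None
```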


[2] Edit the file D:/HBase/hbase-1.4.7/conf/hbase-site.xml, paste in the XML below, and save the file.

<configuration>

<property>
<name>hbase.rootdir</name>
<value>file:///D:/HBase/hbase-1.4.7/hbase</value>
</property>

<property>

<name>hbase.zookeeper.property.dataDir</name>
<value>/D:/HBase/hbase-1.4.7/zookeeper</value>
</property>

<property>
<name>hbase.zookeeper.quorum</name>
<value>127.0.0.1</value>
</property>
</configuration>
HMaster and ZooKeeper both read their settings from this hbase-site.xml.
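
A typo in this file is easy to miss, so a quick sanity check can be useful. The following sketch parses the XML with Python's standard library and reports which of the three properties used in this post are missing. The helper names are invented for the example, and HBase's own loading rules may differ in detail.

```python
import xml.etree.ElementTree as ET

# The three properties configured in this post.
REQUIRED = (
    "hbase.rootdir",
    "hbase.zookeeper.property.dataDir",
    "hbase.zookeeper.quorum",
)


def read_hbase_site(xml_text):
    """Return {name: value} for every <property> in an hbase-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            props[name] = value
    return props


def missing_properties(props):
    """List the required property names that are absent from props."""
    return [name for name in REQUIRED if name not in props]
```

Reading the file with open(...).read() and passing the text to read_hbase_site gives a dictionary that can be printed and compared against the values above.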

[3] Edit the hosts file (C:/Windows/System32/drivers/etc/hosts), add the localhost IP mapping, and save the file.

127.0.0.1       localhost


Localhost
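
To double-check that the mapping is active and not commented out, the hosts file can be scanned with a few lines of Python. This is a sketch with a hypothetical helper name; it treats everything after "#" as a comment, which is how hosts files are conventionally written.

```python
def hosts_entry_present(hosts_text, ip="127.0.0.1", name="localhost"):
    """Return True if hosts_text has an active line mapping ip to name."""
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == ip and name in fields[1:]:
            return True
    return False
```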
 

STEP - 6: Start HBase


First, start HBase:


Open a command prompt, change directory to "D:\HBase\hbase-1.4.7\bin", and run "start-hbase.cmd".


Start HBase

It will open a separate cmd window for the following process:
  • HBase Master

HBase Master


STEP - 7: Validate HBase


After HBase has started, verify the installation with the following commands:
  • hbase version
  • jps

Validate Java

Validate HBase

If jps lists HMaster as running, the installation is working.
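
Besides jps, it can be worth checking that ZooKeeper is accepting connections, since a "Connection refused" on 127.0.0.1:2181 is the error several readers report in the comments. A minimal sketch of such a probe follows; port_is_open is a hypothetical helper, and 2181 is ZooKeeper's default client port.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

If port_is_open("127.0.0.1", 2181) returns False while the HBase Master window is open, the master has probably not finished starting or has failed; the log output in that window is the place to look next.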


STEP - 8: Execute HBase Shell


Standalone mode does not require the Hadoop daemons to be running; HBase runs independently. Start the HBase shell with "hbase shell", which enters interactive mode:

HBase Shell

Shell

Congratulations, HBase is installed! 😊

STEP - 9: Some hands-on activities


[1] Create a simple table
create 'student', 'bigdata'

Create table

[2] List the tables that have been created
list

List table


[3] Insert some data into the table created above

put 'tablename', 'rowname', 'columnfamily:qualifier', 'value'
put 'student', 'row1', 'bigdata:hadoop', 'hadoop course'

Insert table


[4] List all rows in the table
scan 'student'


List all rows

List all rows result
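
To see why put takes a row key plus a "family:qualifier" column name, the toy model below mimics the addressing scheme in plain Python. It is only an illustration of the data model under simplifying assumptions (no timestamps, no versions, no storage) and has nothing to do with the real HBase client API; ToyTable is a made-up class.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for an HBase table: row key -> {family:qualifier -> value}."""

    def __init__(self, name, *families):
        self.name = name
        self.families = set(families)
        self.rows = defaultdict(dict)

    def put(self, row, column, value):
        # Column families are fixed at table creation; qualifiers are free-form.
        family = column.split(":", 1)[0]
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row][column] = value

    def scan(self):
        # Rows come back sorted by row key, as in a real scan.
        return [
            (row, column, value)
            for row in sorted(self.rows)
            for column, value in sorted(self.rows[row].items())
        ]
```

Mirroring the shell session above: student = ToyTable("student", "bigdata"), then student.put("row1", "bigdata:hadoop", "hadoop course"), then student.scan() returns the single cell that was written.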


Stay in touch for more posts.

8 comments:

  1. Hi Team
    Am getting the below error
    2019-11-21 09:08:16,529 INFO [main-SendThread(127.0.0.1:2181)] zookeeper.ClientCnxn: Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
    2019-11-21 09:08:18,543 WARN [main-SendThread(127.0.0.1:2181)] zookeeper.ClientCnxn: Session 0x16e8b7d06800001 for server null, unexpected error, closing socket connection and attempting reconnect
    java.net.ConnectException: Connection refused: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

  2. I am also getting the same error. Please give a solution.

  3. please provide full details;

    I am getting below error


    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory
    at org.apache.hadoop.conf.Configuration.(Configuration.java:178)
    at org.apache.hadoop.hbase.util.HBaseConfTool.main(HBaseConfTool.java:39)
    Caused by: java.lang.ClassNotFoundException: org.apache.commons.logging.LogFactory
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    ... 2 more
    ERROR: Could not determine the startup mode.

    Replies
    1. Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory
      at org.apache.hadoop.conf.Configuration.(Configuration.java:187)
      at org.apache.hadoop.hbase.util.HBaseConfTool.main(HBaseConfTool.java:39)
      Caused by: java.lang.ClassNotFoundException: org.apache.commons.logging.LogFactory
      at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
      at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
      ... 2 more
      ERROR: Could not determine the startup mode.

    2. https://ro.coredump.biz/questions/58063070/hbase-shell-missing-class-name-39orgapachelog4jlevel39#58082245

  5. Am getting the below error
    2019-11-21 09:08:16,529 INFO [main-SendThread(127.0.0.1:2181)] zookeeper.ClientCnxn: Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
    2019-11-21 09:08:18,543 WARN [main-SendThread(127.0.0.1:2181)] zookeeper.ClientCnxn: Session 0x16e8b7d06800001 for server null, unexpected error, closing socket connection and attempting reconnect
    java.net.ConnectException: Connection refused: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

  6. [ERROR] Terminal initialization failed; falling back to unsupported
    java.lang.NoClassDefFoundError: Could not initialize class org.fusesource.jansi.internal.Kernel32


    I got this error for hbase shell command
