HBase Introduction
The previous post gave a short introduction to Hadoop and focused on how to install it on Windows. In brief, Hadoop performs only batch processing, and data is accessed only sequentially. This means one has to scan the entire data set even for the simplest of jobs.
A huge data set, when processed, results in another huge data set, which must also be processed sequentially. At this point, a new solution is needed that can reach any point of the data in a single unit of time (random access).
Apache HBase is an open source, non-relational (NoSQL), distributed, column-oriented database that runs on top of HDFS and provides real-time read/write access to those large data sets. It is modeled after Google's Bigtable, is written primarily in Java, and is designed to provide quick random access to huge amounts of data.
In brief, HBase can store massive amounts of data, from terabytes to petabytes, and allows fast random reads and writes that Hadoop on its own cannot handle. Even relational databases (RDBMS) struggle with the variety of data that is growing exponentially.
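To make the difference between sequential and random access concrete, here is a small Python sketch. It is only an analogy and not how Hadoop or HBase is implemented: the record list, the `scan_lookup` and `keyed_lookup` helpers and the sample row keys are all made up for illustration. HBase keeps rows sorted by row key, so the sketch uses a binary search over sorted keys as a stand-in for a keyed read.

```python
from bisect import bisect_left

# Hypothetical data set: (row_key, value) pairs kept sorted by row key.
records = [(f"row{i:05d}", f"value-{i}") for i in range(10_000)]
keys = [key for key, _ in records]


def scan_lookup(row_key):
    """Sequential access: walk the records until the key is found."""
    steps = 0
    for key, value in records:
        steps += 1
        if key == row_key:
            return value, steps
    return None, steps


def keyed_lookup(row_key):
    """Keyed access: binary search over the sorted row keys."""
    index = bisect_left(keys, row_key)
    if index < len(keys) and keys[index] == row_key:
        return records[index][1]
    return None


value, steps = scan_lookup("row09999")
print(value, "found after", steps, "sequential steps")
print(keyed_lookup("row09999"), "found by keyed lookup")
```

The sequential version has to touch every record before it reaches the last key, while the keyed version needs only a handful of comparisons.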
HBase Installation
Here we will walk through installing HBase on Windows. HBase can be installed in three modes, whose features are listed below.
[1] Standalone mode installation (No dependency on Hadoop system)
- This is the default mode of HBase
- It runs against the local file system
- It doesn't use Hadoop HDFS
- HMaster, RegionServer and ZooKeeper all run inside one process, so only HMaster shows up as a daemon
- Not recommended for production environments
- Runs in a single JVM
[2] Pseudo-Distributed mode installation (Single node Hadoop system + HBase installation)
- It runs on Hadoop HDFS
- All daemons run on a single node
- Not recommended for production environments; meant for testing and prototyping
[3] Fully Distributed mode installation (Multi node Hadoop environment + HBase installation)
- It runs on Hadoop HDFS
- The daemons run across the nodes of the cluster
- Highly recommended for production environments
Hadoop should be installed before installing HBase on Windows. If you have not installed Hadoop yet, see the previous post on installing Hadoop on Windows 10.
I used HBase 1.4.7 for this post, though any stable version should work.
Download HBase 1.4.7
- http://www.apache.org/dyn/closer.lua/hbase/
HBase - Standalone mode installation
Here, we will go through the Standalone mode installation of HBase on Windows 10.
STEP - 1: Extract the HBase file
Extract the file hbase-1.4.7-bin.tar.gz and place it under "D:\HBase" (you can use any preferred location).
[1] The first extraction gives you a tar file.
[2] Go inside the hbase-1.4.7-bin.tar folder and extract again.
[3] Move the leaf folder "hbase-1.4.7" to the root folder "D:\HBase" and remove all other files and folders.
STEP - 2: Configure Environment variable
Set the following environment variable (under User variables) on Windows 10:
- HBASE_HOME - D:\HBase\hbase-1.4.7
This PC -> Right Click -> Properties -> Advanced System Settings -> Advanced -> Environment Variables
STEP - 3: Configure System variable
Next, edit the Path system variable so that it includes the HBase bin directory:
Variable: Path
Value:
- D:\HBase\hbase-1.4.7\bin
STEP - 4: Create required folders
Create two dedicated folders:
- Create folder "hbase" under "D:\HBase\hbase-1.4.7".
- Create folder "zookeeper" under "D:\HBase\hbase-1.4.7".
STEP - 5: Configure required files
Next, configure two key files with the minimum required details:
- hbase-env.cmd
- hbase-site.xml
[1] Edit the file D:/HBase/hbase-1.4.7/conf/hbase-env.cmd, set the JAVA_HOME path in it and save the file.
@rem set JAVA_HOME=c:\apps\java
set JAVA_HOME=%JAVA_HOME%
[2] Edit the file D:/HBase/hbase-1.4.7/conf/hbase-site.xml, paste the XML below and save the file.
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///D:/HBase/hbase-1.4.7/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/D:/HBase/hbase-1.4.7/zookeeper</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>127.0.0.1</value>
  </property>
</configuration>
HMaster and ZooKeeper both read their settings from this hbase-site.xml.
[3] Edit the hosts file (C:/Windows/System32/drivers/etc/hosts), add the localhost IP and save the file.
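A quick way to catch typos in this file is to parse it and list which of the three properties are present. The Python sketch below is only an illustration: `read_hbase_site` and `missing_keys` are hypothetical helper names, and the sample string deliberately leaves out one property to show what the check reports. To check a real file, read its text and pass it to `read_hbase_site` in place of `sample`.

```python
import xml.etree.ElementTree as ET


def read_hbase_site(xml_text):
    """Return the <property> entries of an hbase-site.xml as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        # Strip stray whitespace around names and values.
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        props[name] = value
    return props


def missing_keys(props):
    """List the keys this post sets that are absent from props."""
    expected = (
        "hbase.rootdir",
        "hbase.zookeeper.property.dataDir",
        "hbase.zookeeper.quorum",
    )
    return [key for key in expected if key not in props]


# Sample with one property left out on purpose.
sample = """
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///D:/HBase/hbase-1.4.7/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>127.0.0.1</value>
  </property>
</configuration>
"""

props = read_hbase_site(sample)
print(missing_keys(props))
```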
127.0.0.1 localhost
STEP - 6: Start HBase
Now start HBase:
Open a command prompt, change directory to "D:\HBase\hbase-1.4.7\bin" and type "start-hbase.cmd" to start HBase.
It will open a separate cmd window for the following task:
- HBase Master
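Once the HBase Master window has been up for a few seconds, one extra sanity check is to see whether anything is listening on the ZooKeeper client port. The Python sketch below assumes the default port 2181 and the 127.0.0.1 quorum address configured earlier; `port_is_open` is a made-up helper name.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 2181 is ZooKeeper's default client port; adjust if yours differs.
if port_is_open("127.0.0.1", 2181):
    print("ZooKeeper port 2181 is accepting connections")
else:
    print("Nothing is listening on 127.0.0.1:2181 yet")
```

A "Connection refused" message from the HBase client usually corresponds to the second branch here.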
STEP - 7: Validate HBase
Once HBase has started, verify the installation using the following commands:
- hbase version
- jps
If HMaster appears in the jps output, the installation is working.
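The jps check can also be automated. The sketch below parses jps-style output ("<pid> <name>" per line) and looks for HMaster. The sample text and its PIDs are invented for illustration, and `running_java_processes` is a hypothetical helper name.

```python
def running_java_processes(jps_output):
    """Parse `jps` output lines of the form '<pid> <name>' into a dict."""
    processes = {}
    for line in jps_output.splitlines():
        parts = line.split(maxsplit=1)
        # Skip blank lines and lines that carry a PID but no name.
        if len(parts) == 2 and parts[0].isdigit():
            processes[parts[1].strip()] = int(parts[0])
    return processes


# Made-up sample output; real PIDs will differ.
sample_output = """\
4712 HMaster
9120 Jps
"""

processes = running_java_processes(sample_output)
print("HMaster running:", "HMaster" in processes)
```

On a real machine the text could come from `subprocess.run(["jps"], capture_output=True, text=True).stdout`, assuming jps is on the Path.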
STEP - 8: Execute HBase Shell
Standalone mode does not require the Hadoop daemons to be started; HBase can run independently. Start the HBase shell by typing "hbase shell", which drops you into interactive shell mode.
Congratulations, HBase installed !! 😊
STEP - 9: Some hands-on activities
[1] Create a simple table
create 'student', 'bigdata'
[2] List the tables to confirm that it has been created
list
[3] Insert some data into the table created above
put 'tablename', 'rowname', 'columnfamily:qualifier', 'value'
put 'student', 'row1', 'bigdata:hadoop', 'hadoop course'
[4] List all rows in the table
scan 'student'
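To see how the pieces of these commands fit together (table, row key, column family, qualifier, value), here is a toy in-memory model in Python. It is a sketch of the data layout only and not the HBase client API: the `ToyTable` class and the second `bigdata:hbase` cell are invented for illustration.

```python
from collections import defaultdict


class ToyTable:
    """In-memory toy table: row key -> {'family:qualifier': value}."""

    def __init__(self, name, *families):
        self.name = name
        self.families = set(families)
        self.rows = defaultdict(dict)

    def put(self, row, column, value):
        # A column is addressed as 'family:qualifier'.
        family, _, _qualifier = column.partition(":")
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row][column] = value

    def scan(self):
        # Rows come back ordered by row key, as in an HBase scan.
        return [(row, dict(self.rows[row])) for row in sorted(self.rows)]


student = ToyTable("student", "bigdata")  # like: create 'student', 'bigdata'
student.put("row1", "bigdata:hadoop", "hadoop course")
student.put("row1", "bigdata:hbase", "hbase course")
for row, cells in student.scan():
    print(row, cells)
```

Column families are fixed when the table is created, while qualifiers such as `hadoop` can be added freely with each put, which is why the toy `put` rejects only unknown families.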
Stay in touch for more posts.
Hi Team
Am getting the below error
2019-11-21 09:08:16,529 INFO [main-SendThread(127.0.0.1:2181)] zookeeper.ClientCnxn: Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2019-11-21 09:08:18,543 WARN [main-SendThread(127.0.0.1:2181)] zookeeper.ClientCnxn: Session 0x16e8b7d06800001 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
I am also getting the same error. Please give a solution.
please provide full details;
I am getting below error
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory
at org.apache.hadoop.conf.Configuration.(Configuration.java:178)
at org.apache.hadoop.hbase.util.HBaseConfTool.main(HBaseConfTool.java:39)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.logging.LogFactory
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 2 more
ERROR: Could not determine the startup mode.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory
at org.apache.hadoop.conf.Configuration.(Configuration.java:187)
at org.apache.hadoop.hbase.util.HBaseConfTool.main(HBaseConfTool.java:39)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.logging.LogFactory
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 2 more
ERROR: Could not determine the startup mode.
https://ro.coredump.biz/questions/58063070/hbase-shell-missing-class-name-39orgapachelog4jlevel39#58082245
[ERROR] Terminal initialization failed; falling back to unsupported
java.lang.NoClassDefFoundError: Could not initialize class org.fusesource.jansi.internal.Kernel32
I got this error for hbase shell command