How to Get Started with Hive on Cloudera

Apache Hive is a data warehousing package built on top of Hadoop that provides data summarization, querying, and analysis. Hive was initially developed by Facebook and later contributed to the open source community. It is aimed mainly at users who are comfortable with SQL: its query language, HiveQL, is similar to SQL.

Step 1:

Cloudera CDH3 Setup:

CDH is Cloudera's open source distribution of Apache Hadoop and related projects. CDH delivers the core elements of Hadoop (scalable storage and distributed computing) along with additional components such as a user interface, plus enterprise capabilities such as security and integration with a broad range of hardware and software solutions.

You can download the CDH3 VM file from this link.

Extract the zip file and open the extracted virtual machine in VMware Player.

Step 2:

Click "Play virtual machine" and log in to the Cloudera VM as explained below:

Login as:

Username – cloudera
Password – cloudera


Step 3:

Create a folder with any name on the Cloudera VM desktop. For this example, I have named it himanshuHive.


Step 4:

  • Open a terminal and execute the command:

cloudera@cloudera-vm:/home/cloudera$ sudo su

  • If it prompts for a password, type: cloudera
  • Now run these commands to open the Hive configuration file:

root@cloudera-vm:/home/cloudera# cd /usr/lib/hive/conf/

root@cloudera-vm:/usr/lib/hive/conf# sudo gedit hive-site.xml

Now copy the path of the folder you created (i.e., himanshuHive) into the ConnectionURL property (javax.jdo.option.ConnectionURL) of the Hive configuration file.
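As a rough sketch of what the edited value might look like: with Hive's embedded Derby metastore, the stock value of javax.jdo.option.ConnectionURL is jdbc:derby:;databaseName=metastore_db;create=true, and the folder path goes in front of metastore_db. The snippet below only prints a candidate property element to paste into hive-site.xml. The helper name print_connection_url and the folder location /home/cloudera/Desktop/himanshuHive are assumptions for illustration, so adjust them to match your VM.

```shell
#!/bin/sh
# Hypothetical helper: prints a ConnectionURL property for hive-site.xml.
# Assumes the metastore is the embedded Derby database and that it should
# live in a metastore_db directory inside the given folder.
print_connection_url() {
    folder="$1"
    cat <<EOF
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=${folder}/metastore_db;create=true</value>
</property>
EOF
}

# Assumed location of the folder created in Step 3.
print_connection_url "/home/cloudera/Desktop/himanshuHive"
```

If your hive-site.xml already contains a property with this name, edit its value in place rather than adding a second copy.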


Step 5:

Type this command to enter the Hive shell: sudo hive

Now you are all set to execute Hive commands and run Hive queries in the Hive shell. Interested in learning more about Apache Hive and Cloudera? See the Apache Hive documentation on the Apache Hive home page.
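For a first check that the setup works, something along these lines may help. It writes a few basic HiveQL statements to a file and runs them non-interactively with hive -f when the hive command is on the PATH. The table name demo_pokes, its columns, and the file path /tmp/first_queries.hql are made up for illustration.

```shell
#!/bin/sh
# Hypothetical smoke test: the table name, columns, and file path are
# illustrative assumptions, not part of the original walkthrough.
QUERY_FILE="/tmp/first_queries.hql"

# Write a few basic HiveQL statements to a file.
cat > "$QUERY_FILE" <<'EOF'
SHOW TABLES;
CREATE TABLE IF NOT EXISTS demo_pokes (id INT, name STRING);
DESCRIBE demo_pokes;
DROP TABLE demo_pokes;
EOF

# Run the file if Hive is installed; otherwise just report where it was saved.
if command -v hive >/dev/null 2>&1; then
    hive -f "$QUERY_FILE"
else
    echo "hive not found on PATH; statements saved to $QUERY_FILE"
fi
```

The same statements can also be typed one at a time at the hive> prompt.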

Himanshu Agrawal

Sr. Technical Lead

Himanshu Agrawal is a Senior Technical Lead at 3Pillar Global. He brings with him rich experience in designing and developing enterprise-wide web applications and platforms. He has expertise in the complete J2EE stack and open source technologies and has been heavily involved in designing and developing products for Healthcare, Finance, Global Trade Management, and Content Management Applications. He likes to be driven by challenges and is passionate about learning new technologies and domains, with web platforms as his prime area of interest. Prior to joining 3Pillar Global, he worked for various product and service based companies including RSystems, Syntel and Metacube.

7 Responses to “How to Get Started with Hive on Cloudera”
  1. Arghya Polley

    I am facing issues while creating a folder. Error: “bash: cd: class3: No such file or directory”
    What should I do to sort out the issue?

  2. Himanshu

    @Arghya
    It seems that you are trying to enter a directory for which you don’t have permission. Try creating the folder in your home directory with your user login.
    If you have root access, then it should work for you, but I would recommend using your user account.

  3. kumar77

    Hi Himanshu,

    I am getting a Hive query error through HUE and also from the command line. The error is:

    java.io.IOException: org.apache.hadoop.security.AccessControlException: Permission denied: user=cloudera, access=EXECUTE, inode="/user/hive":hive:hive:drwxr-x---

    I have set hive.server2.enable.impersonation on all hive-site.xml files. Please help to resolve

    Kumar77

  4. Vinay Singh

    This is excellent… Great article, buddy 🙂

  5. Vivek

    Nice article 🙂, this was something I was looking for!!

  6. Gaurav Pandey

    There is no such property as Connection URL in the xml.

  7. Ada

    When I typed “sudo hive”, I’m getting the error “java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient”

    I followed the preceding instructions and I’m not sure what’s going on. How do I fix that?
