Monthly archive: July 2012

HBase Storage and PIG

We’ve been using PIG for analytics and for processing data for use in our site for some time now. PIG is a high-level language for building data analysis programs that can run across a distributed Hadoop cluster. It has allowed us to scale up our data processing while decreasing the amount of time it takes to run jobs.[……]

Posted in Uncategorized | Comments Off on HBase Storage and PIG

Getting Started with Apache Pig, Part 1: Introduction, Basic Architecture, and Comparison with Hive

Posted in Uncategorized | Comments Off on Getting Started with Apache Pig, Part 1: Introduction, Basic Architecture, and Comparison with Hive

Processing Data with Apache Pig

Posted in Uncategorized | Comments Off on Processing Data with Apache Pig

Achieving High Availability and Scalability – ARR and NLB

Posted in Uncategorized | Comments Off on Achieving High Availability and Scalability – ARR and NLB

Hive HBase Integration

Version information: As of Hive 0.9.0, the HBase integration requires at least HBase 0.92; earlier versions of Hive worked with HBase 0.89/0.90.[……]

Posted in Uncategorized | Comments Off on Hive HBase Integration

cloudera manager free edition install notes

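[Notice] This article is an original work by AdamsLee. When reposting, please credit 围炉网 and keep a valid link to this article: cloudera manager free edition install notes. Please retain this notice when reposting!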

Update /etc/hosts to remove the 127.0.1.1 entry.
The install may take a long time.
$ sudo -u hdfs hadoop fs -mkdir /tmp
$ sudo -u hdfs hadoop fs -chmod -R 1777 /tmp
$ sudo -u hdfs hadoop fs -mkdir /tmp/mapred
$ sudo -u hdfs hadoop fs -chmod -R 1777 /tmp/mapred
$ sudo -u hdfs hadoop fs -chmod -R 1777 /
Data node cannot register to namenode:
<property><name>dfs.hosts</name><value></value><final>true</final></property>
java.io.IOException: Cannot create an instance of InputSplit class = org.apache.hadoop.hive.hbase.HBaseSplit:org.apache.hadoop.hive.hbase.HBaseSplit
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///home/hadoop/cdh3/hive-0.7.0-cdh3u0/lib/hive-hbase-handler-0.7.0-cdh3u0.jar,file:///home/hadoop/cdh3/hive-0.7.0-cdh3u0/lib/hbase-0.90.1-cdh3u0.jar,file:///home/hadoop/cdh3/hive-0.7.0-cdh3u0/lib/zookeeper-3.3.1.jar</value>
</property>
The storage handler is built as an independent module, hive-hbase-handler-x.y.z.jar, which must be available on the Hive client auxpath, along with HBase, Guava and ZooKeeper jars. It also requires the correct configuration property to be set in order to connect to the right HBase master. See the HBase documentation for how to set up an HBase cluster.
Here's an example using CLI from a source build environment, targeting a single-node HBase server. (Note that the jar locations and names have changed in Hive 0.9.0, so for earlier releases, some changes are needed.)
$HIVE_SRC/build/dist/bin/hive --auxpath $HIVE_SRC/build/dist/lib/hive-hbase-handler-0.9.0.jar,$HIVE_SRC/build/dist/lib/hbase-0.92.0.jar,$HIVE_SRC/build/dist/lib/zookeeper-3.3.4.jar,$HIVE_SRC/build/dist/lib/guava-r09.jar -hiveconf hbase.master=hbase.yoyodyne.com:60000
Here's an example which instead targets a distributed HBase cluster where a quorum of 3 zookeepers is used to elect the HBase master:
$HIVE_SRC/build/dist/bin/hive --auxpath $HIVE_SRC/build/dist/lib/hive-hbase-handler-0.9.0.jar,$HIVE_SRC/build/dist/lib/hbase-0.92.0.jar,$HIVE_SRC/build/dist/lib/zookeeper-3.3.4.jar,$HIVE_SRC/build/dist/lib/guava-r09.jar -hiveconf hbase.zookeeper.quorum=zk1.yoyodyne.com,zk2.yoyodyne.com,zk3.yoyodyne.com
The handler requires Hadoop 0.20 or higher, and has only been tested with dependency versions hadoop-0.20.x, hbase-0.92.0 and zookeeper-3.3.4. If you are not using hbase-0.92.0, you will need to rebuild the handler with the HBase jar matching your version, and change the --auxpath above accordingly. [……]
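
The auxpath values in the commands above are long comma-separated lists that are easy to mistype when a jar version changes. As a minimal sketch, the helper below builds such a list from one lib directory and a set of jar names. The function name build_auxpath is hypothetical and not part of Hive; the jar names in the usage comment are simply the ones quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive): join jar names found under one
# lib directory into the comma-separated list that --auxpath expects.
build_auxpath() {
    lib_dir=$1
    shift
    auxpath=""
    for jar in "$@"; do
        # Add a comma separator before every entry except the first.
        auxpath="${auxpath:+$auxpath,}$lib_dir/$jar"
    done
    printf '%s\n' "$auxpath"
}

# Example call, reusing the jar names from the commands above:
# AUXPATH=$(build_auxpath "$HIVE_SRC/build/dist/lib" \
#     hive-hbase-handler-0.9.0.jar hbase-0.92.0.jar \
#     zookeeper-3.3.4.jar guava-r09.jar)
# "$HIVE_SRC/build/dist/bin/hive" --auxpath "$AUXPATH" \
#     -hiveconf hbase.master=hbase.yoyodyne.com:60000
```

When rebuilding the handler against a different HBase version, only the jar names passed to the helper would need to change.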

Posted in Uncategorized | Comments Off on cloudera manager free edition install notes

Reading HBase Data with Hive

[Note] This article is reposted from: http://victorzhzh.iteye.com/blog/972406

Posted in Uncategorized | Comments Off on Reading HBase Data with Hive

HBase Analysis Report

[Note] This article is reposted from: http://rockecsn.iteye.com/blog/1538194

Posted in Uncategorized | Comments Off on HBase Analysis Report

Collecting and analyzing log data via Flume and Hive

August 15th, 2010 by aphadke[……]

Posted in Uncategorized | Comments Off on Collecting and analyzing log data via Flume and Hive

Introduction to HBase Technology

莫问[……]

Posted in Uncategorized | Comments Off on Introduction to HBase Technology