Category: Hadoop

  • How to get started with Hadoop – Hadoop basics

    One of the most painful jobs of a system engineer is building a whole system by installing multiple packages one by one. We all worry about incompatibilities and dependencies. With Hadoop, you can get a lot of help from HDP (Hortonworks Data Platform). Great tutorials and documentation can be found here. The order of methods you […]

  • About the Chukwa released versions

    I was working with some log collection and aggregation tools from the Apache project. When it came to Chukwa, I read the project’s introduction and release notes and didn’t know what to do, because it seemed like Chukwa had been in and out for a while and was a bit obsolete. So I decided to email the […]

  • Hadoop 2.2 and Flume 1.4 Protobuf Problem and Solution

    I have to say a big THANK YOU to the author of “Hadoop in Practice”, Alex Holmes. Source: The problem you may encounter while trying to integrate Hadoop 2.2 and Flume 1.4 is the incompatibility between protobuf versions: 2014-04-15 13:56:23,251 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR – org.apache.flume.sink.hdfs.HDFSEventSink.process(] process failed java.lang.VerifyError: class org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$RecoverLeaseRequestProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet; […]

  • Hadoop 2.2 Single Node Installation on CentOS 6.5

    By far the best tutorial for getting started with Hadoop installation. Source: Introduction. This HOWTO covers Hadoop 2.2 installation on CentOS 6.5. My series of tutorials is meant as just that: tutorials. The intent is to allow the user to gain familiarity with the application and should not be construed […]