Submitted by heartin on Sat, 02/04/2017 - 23:23
Get started learning about Hadoop and its ecosystem components through simple theory and hands-on exercises.
Submitted by heartin on Sat, 02/04/2017 - 21:27
Here I will include notes on Big Data and Data Science concepts in general. There will be separate books on specific technologies like Hadoop.
Submitted by heartin on Mon, 01/30/2017 - 03:35
Apache ZooKeeper is a software project of the Apache Software Foundation, providing an open source distributed configuration service, synchronization service, and naming registry for large distributed systems.
Submitted by heartin on Sun, 01/29/2017 - 02:30
Apache Kafka is an open source, publish-subscribe-based distributed messaging system. From an architecture perspective, Kafka is closer to traditional messaging systems such as ActiveMQ or RabbitMQ. However, from a Big Data and Hadoop perspective, Kafka can be compared with Scribe or Flume, as it is useful for processing activity stream data.
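To make the publish-subscribe idea concrete, here is a minimal in-memory sketch in Python. It is not Kafka's API: the `TinyBroker` class, its `publish` and `poll` methods, and the `page-views` topic are hypothetical names chosen for illustration. It only shows the general pattern of publishers appending activity events to a named topic while each subscriber reads the same stream independently at its own position.

```python
from collections import defaultdict


class TinyBroker:
    """Toy in-memory broker (hypothetical, not Kafka's API)."""

    def __init__(self):
        # topic name -> append-only list of messages
        self.topics = defaultdict(list)
        # (subscriber, topic) -> index of the next message to read
        self.offsets = defaultdict(int)

    def publish(self, topic, message):
        """Append a message to the end of the topic's log."""
        self.topics[topic].append(message)

    def poll(self, subscriber, topic):
        """Return messages this subscriber has not yet read, then advance its offset."""
        log = self.topics[topic]
        start = self.offsets[(subscriber, topic)]
        self.offsets[(subscriber, topic)] = len(log)
        return log[start:]


broker = TinyBroker()

# Publishers write activity events to a topic.
broker.publish("page-views", {"user": "alice", "url": "/home"})
broker.publish("page-views", {"user": "bob", "url": "/docs"})

# Two independent subscribers each receive the full stream.
analytics_batch = broker.poll("analytics", "page-views")
audit_batch = broker.poll("audit", "page-views")
```

Because each subscriber keeps its own offset, a second call to `broker.poll("analytics", "page-views")` returns only messages published since the previous call. A real distributed system adds persistence, partitioning, and replication on top of this basic idea.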