Unlocking the Basics: Key Concepts and Practical Applications - Part III

Photo by Alex wong on Unsplash


Week 1 - Continuation

Article Outline:

  • HDFS Architecture - Basic Overview

  • Week 1 Summary


HDFS Architecture - Basic Overview:

HDFS, short for Hadoop Distributed File System, is designed for applications that work with large datasets, where a typical file ranges from gigabytes to terabytes in size.

The architecture follows a master-slave model, comprising one name node (master) and multiple data nodes (slaves).

Name Node:

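Maintains a hashtable-like, in-memory structure containing the file system metadata, including which data nodes hold each block. Despite its name, the secondary name node is not a hot standby: it periodically merges the name node's edit log into a checkpoint of the file system image, which keeps the edit log small and speeds up restarts. Automatic takeover on primary name node failure requires a separate high-availability setup with a standby name node.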
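To make the hashtable-like structure more concrete, here is a minimal Python sketch of the two lookups the name node has to answer: which blocks make up a file, and which data nodes hold each block. The class, method names, paths, and block IDs are all hypothetical and chosen for illustration; the real name node is far more involved.

```python
from dataclasses import dataclass, field


@dataclass
class NameNodeMetadata:
    """Toy stand-in for the name node's in-memory metadata (hypothetical names)."""

    # file path -> ordered list of block IDs
    file_blocks: dict = field(default_factory=dict)
    # block ID -> set of data node names holding a replica
    block_locations: dict = field(default_factory=dict)

    def add_block(self, path, block_id, datanodes):
        """Record a new block for a file and the data nodes that store it."""
        self.file_blocks.setdefault(path, []).append(block_id)
        self.block_locations.setdefault(block_id, set()).update(datanodes)

    def locate(self, path):
        """Return (block_id, sorted data nodes) for each block of the file, in order."""
        return [
            (block_id, sorted(self.block_locations[block_id]))
            for block_id in self.file_blocks.get(path, [])
        ]


meta = NameNodeMetadata()
meta.add_block("/logs/app.log", "blk_1", ["dn1", "dn2", "dn3"])
meta.add_block("/logs/app.log", "blk_2", ["dn2", "dn3", "dn4"])
print(meta.locate("/logs/app.log"))
```

A client reading a file would first ask for this list and then fetch each block directly from one of the listed data nodes.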

Name Node Federation:

To scale the name service horizontally, federation uses multiple independent name nodes, each managing its own namespace. The name nodes operate independently and do not need to coordinate with each other. Data nodes serve as common block storage for all name nodes.
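One way to picture federation from the client side is a mount table that maps each namespace prefix to the name node that owns it. The sketch below is only an illustration under that assumption: the prefixes, the name node names, and the `resolve_namenode` helper are hypothetical, not part of any Hadoop API.

```python
# Hypothetical mount table: namespace prefix -> name node that owns it.
MOUNT_TABLE = {
    "/user": "namenode-a",
    "/data": "namenode-b",
    "/tmp": "namenode-c",
}


def resolve_namenode(path, mount_table=MOUNT_TABLE):
    """Pick the name node whose prefix is the longest match for the path."""
    matches = [
        prefix
        for prefix in mount_table
        if path == prefix or path.startswith(prefix + "/")
    ]
    if not matches:
        raise KeyError(f"no namespace is mounted for {path}")
    return mount_table[max(matches, key=len)]


print(resolve_namenode("/user/alice/report.csv"))
print(resolve_namenode("/data/sales/2023"))
```

Because each prefix belongs to exactly one name node, the name nodes never have to talk to each other to answer a request.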

We can log in to the name node through the gateway node.

Apache Hadoop 3.3.6 – HDFS Architecture

Data Node:

  • Stores actual data in the form of blocks, with a default size of 128 MB.

  • For example, a 1 GB file (1024 MB) would be divided into 8 blocks of 128 MB each, distributed across the data nodes.

  • Blocks are replicated for fault tolerance, with a default replication factor of 3.

  • Each data node sends periodic heartbeat messages to the name node, which uses them to check that the data node is still reachable.
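The arithmetic behind the points above can be sketched in a few lines of Python. The block size and replication factor are the defaults mentioned in the list; the function names, the sample heartbeat timestamps, and the 30-second timeout are assumptions made for illustration and do not reflect actual HDFS settings.

```python
import math

BLOCK_SIZE_MB = 128      # default block size mentioned above
REPLICATION_FACTOR = 3   # default replication factor mentioned above


def block_count(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Number of blocks needed; the last block may be only partly filled."""
    return math.ceil(file_size_mb / block_size_mb)


def stored_replicas(file_size_mb, replication=REPLICATION_FACTOR):
    """Total block replicas kept across the cluster for one file."""
    return block_count(file_size_mb) * replication


def dead_nodes(last_heartbeat, now, timeout_s):
    """Data nodes whose last heartbeat is older than timeout_s seconds."""
    return sorted(
        node for node, seen in last_heartbeat.items() if now - seen > timeout_s
    )


print(block_count(1024))      # blocks for a 1 GB file -> 8
print(stored_replicas(1024))  # replicas with factor 3 -> 24
# Hypothetical timestamps in seconds: dn2 has been silent for 65 s.
print(dead_nodes({"dn1": 100.0, "dn2": 40.0}, now=105.0, timeout_s=30.0))
```

Note that a file slightly larger than a multiple of the block size still needs one extra block, which is why the count is rounded up.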


Week 1 Summary:

In Week 1, we delved into fundamental concepts crucial for navigating the realm of big data. Topics covered included:

  • Definition of Big Data

  • Monolithic vs Distributed Systems

  • Overview of Hadoop

  • All About Cloud

  • Delving into D's - Db vs Data Warehouse vs Data Lake

  • Big Data - The Big Picture

  • Overview of HDFS Architecture

  • Common Linux and Hadoop Commands

This foundation sets the stage for a deeper exploration into the technical aspects of big data in the coming weeks.


Further Readings:


The resources I consulted for reference are credited in the above section, and they contributed valuable insights to the content presented.

Image Credits: I do not claim credit for the image; all acknowledgment and appreciation go to the original creator.


If you found this article helpful or learned something new, please show your support by liking it and following me for updates on future posts.

Till next time, happy coding!