What is rack awareness in HDFS?
What is rack awareness in HDFS, and how can I optimize network bandwidth with a replication factor of 2?

ASKED: July 21, 2009  7:41 PM
UPDATED: July 21, 2009  8:18 PM

Answer Wiki:
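"...Large HDFS instances run on a cluster of computers that commonly spread across many racks. Communication between two nodes in different racks has to go through switches. In most cases, network bandwidth between machines in the same rack is greater than network bandwidth between machines in different racks...". Rack awareness means that HDFS knows the rack id of every data node (the name node resolves it from the cluster's configured topology) and uses it when deciding where to place the replicas of each block, so that it can balance reliability against inter-rack traffic. With a replication factor of 2, the only choice a placement policy has is whether the second replica stays in the writer's rack (cheaper writes, but no protection if the whole rack fails) or goes to a different rack (one inter-rack transfer per block, but the data survives the loss of a rack). For more information, have a look at the <a href="http://hadoop.apache.org/common/docs/current/hdfs_design.html#Data+Replication">Data Replication</a> section of the documentation.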
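To make the "rack id" part more concrete: one common way to give Hadoop this information is a topology script, which Hadoop calls with one or more host names or IP addresses and which prints one rack path per argument. The following is only a minimal sketch of such a script. The addresses and rack names in `RACK_BY_HOST` are made up for illustration, and the helper `same_rack` is not part of Hadoop; it is there only to show how a rack id lets you tell a cheap in-rack transfer from one that crosses a switch.

```python
#!/usr/bin/env python3
"""Sketch of a rack topology script: maps each host argument to a rack path."""
import sys

# Hypothetical host-to-rack table (assumption for illustration only);
# a real cluster would fill this in from its own inventory.
RACK_BY_HOST = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",
}

# Rack path returned for hosts that are not in the table.
DEFAULT_RACK = "/default-rack"


def resolve_rack(host):
    """Return the rack path for a host, or the default rack if unknown."""
    return RACK_BY_HOST.get(host, DEFAULT_RACK)


def same_rack(host_a, host_b):
    """Illustrative helper (not a Hadoop API): True if both hosts share a rack."""
    return resolve_rack(host_a) == resolve_rack(host_b)


def main(hosts):
    # One rack path per argument, in the same order, separated by spaces.
    print(" ".join(resolve_rack(h) for h in hosts))


if __name__ == "__main__":
    main(sys.argv[1:])
```

If the script is saved as, say, `topology.py` and made executable, running `./topology.py 10.0.1.11 10.0.2.21` prints the two rack paths on one line. In recent Hadoop releases the script is registered through the `net.topology.script.file.name` property; older releases used a differently named property, so check the documentation for the version you run.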
Last Wiki Answer Submitted:  July 21, 2009  8:18 pm  by  carlosdl   63,535 pts.
All Answer Wiki Contributors:  carlosdl   63,535 pts.