What is rack awareness in HDFS?

Tags:
Database optimization
HDFS
Network performance
What is rack awareness in HDFS, and how can I optimize network bandwidth when using a replication factor of 2?

Answer Wiki

“…Large HDFS instances run on a cluster of computers that commonly spread across many racks. Communication between two nodes in different racks has to go through switches. In most cases, network bandwidth between machines in the same rack is greater than network bandwidth between machines in different racks…”.

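The idea is that the NameNode determines the rack ID of each DataNode (through the process described in the Hadoop rack awareness documentation) and uses it when applying the replica placement policy. The purpose of a rack-aware placement policy is to improve data reliability, availability, and network bandwidth utilization.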
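As a rough illustration of how rack IDs can be supplied, here is a minimal sketch of a topology script. It assumes a recent Hadoop release in which the `net.topology.script.file.name` property in core-site.xml points at an executable that receives host addresses as arguments and prints one rack path for each. The addresses, rack names, and helper function names below are made up for the example; a real cluster would use its own mapping.

```python
#!/usr/bin/env python3
"""Illustrative rack topology script (addresses and rack names are made up)."""
import sys

# Assumption: a hand-maintained table mapping DataNode addresses to rack paths.
RACK_BY_HOST = {
    "10.0.1.11": "/rack1",
    "10.0.1.12": "/rack1",
    "10.0.2.21": "/rack2",
}

# Fallback rack for hosts that are missing from the table.
DEFAULT_RACK = "/default-rack"


def resolve_rack(host):
    """Return the rack path for one host, or the default rack if unknown."""
    return RACK_BY_HOST.get(host, DEFAULT_RACK)


def resolve_all(hosts):
    """Return rack paths in the same order as the hosts that were passed in."""
    return [resolve_rack(h) for h in hosts]


if __name__ == "__main__":
    # Print one rack path per argument, in argument order.
    for rack in resolve_all(sys.argv[1:]):
        print(rack)
```

Running it as `python3 topology.py 10.0.1.11 10.0.2.21` prints `/rack1` and `/rack2` on separate lines. Hosts that are not in the table fall back to `/default-rack`, so an incomplete mapping still yields an answer for every host.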

For more information, have a look at the <a href="http://hadoop.apache.org/common/docs/current/hdfs_design.html#Data+Replication">Data Replication</a> section of the documentation.
