Checking the replication factor on files:
hadoop fsck /user/analytics/staging_intellitxt/ -files -blocks
Setting the replication factor, where -R applies the change recursively and 2 is the new replication factor:
hadoop fs -setrep -R 2 /
This takes approximately 5 minutes to complete and, if the files were at the default replication factor of 3, saves about 30% of disk space, since one copy in three is removed.
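The saving figure and the commands above can be wrapped in a few small helpers. This is only a sketch: the function names are hypothetical (they are not part of Hadoop), and it assumes the hadoop client is on the PATH. The -w flag of setrep and the %r specifier of stat are standard options, but check them against your Hadoop version before relying on them.

```shell
#!/bin/sh
# Sketch only: the function names below are hypothetical helpers,
# not Hadoop commands.

# Percentage of raw disk space freed when the replication factor
# drops from $1 to $2 (integer arithmetic, rounded down).
replication_saving_pct() {
    old=$1
    new=$2
    echo $(( (old - new) * 100 / old ))
}

# Print the replication factor recorded for a single HDFS file.
# %r is the replication specifier of `hadoop fs -stat`.
current_replication() {
    hadoop fs -stat %r "$1"
}

# Set the replication factor ($1) recursively on a path ($2) and
# wait (-w) until the change has been applied. Waiting can take a
# long time on a large tree.
set_replication_and_wait() {
    hadoop fs -setrep -R -w "$1" "$2"
}
```

For example, replication_saving_pct 3 2 prints 33, which matches the rough 30% figure when moving from three copies to two. Running set_replication_and_wait 2 on one directory first, rather than on /, is a cautious way to try the change.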