I need to process a huge file and I'm looking at using Hadoop for it. From what a friend has told me, the file would get split across several nodes. But if the file is compressed, it won't be split and would have to be processed on a single node (so I wouldn't be able to use MapReduce). Would it be possible to split the large file into fixed-size chunks, compress each chunk, and then run a MapReduce job over them?
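To make the question concrete, here is a rough sketch of what I had in mind, assuming the input is plain line-oriented text and that the chunks get uploaded to HDFS afterwards. The class name, the 128 MB chunk size, and the .partNNNN.gz naming are just placeholders I made up:

import java.io.BufferedReader;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.util.zip.GZIPOutputStream;

// Sketch: split a large local text file into fixed-size chunks, gzipping each
// chunk, cutting only at line boundaries so no record is broken across chunks.
public class ChunkAndCompress {
    public static void main(String[] args) throws IOException {
        String inputPath = args[0];              // e.g. huge.txt (assumed line-oriented text)
        long chunkBytes = 128L * 1024 * 1024;    // ~128 MB of uncompressed text per chunk

        int chunkIndex = 0;
        long written = 0;                        // approximate size of the current chunk (chars, not bytes)
        Writer out = newChunk(inputPath, chunkIndex);
        try (BufferedReader in = new BufferedReader(new FileReader(inputPath))) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(line);
                out.write('\n');
                written += line.length() + 1;
                if (written >= chunkBytes) {     // size limit reached: close this chunk, start the next one
                    out.close();
                    chunkIndex++;
                    written = 0;
                    out = newChunk(inputPath, chunkIndex);
                }
            }
        } finally {
            out.close();
        }
    }

    // Opens the next gzip-compressed chunk file, e.g. huge.txt.part0003.gz
    private static Writer newChunk(String inputPath, int index) throws IOException {
        String name = String.format("%s.part%04d.gz", inputPath, index);
        return new OutputStreamWriter(new GZIPOutputStream(new FileOutputStream(name)), "UTF-8");
    }
}

My understanding is that each resulting .gz file would then be its own input file for the job, so each chunk could be handled by a separate mapper even though gzip itself isn't splittable. Is that a reasonable approach? Thanks!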