We need to log millions of small documents every week. Our requirements include:
Ad hoc queries for data mining
Joining and filtering values
Full-text search with Python
We considered using HBase and running Hadoop jobs to generate the statistics, but the results have to be available in real time, which batch jobs cannot provide. So now we're thinking of using ElasticSearch instead. Is that a good idea?
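To make the requirements concrete, here is a rough sketch of the kind of request we have in mind, written as a plain Python dict in the shape of an ElasticSearch search body (a full-text `match`, two filters, and a `terms` aggregation for stats). The field names `message`, `level`, `timestamp`, and `host`, and the helper `build_log_query`, are made up for illustration; our real log schema may differ, and this only builds the request body without contacting a cluster.

```python
import json


def build_log_query(text, level, since):
    """Build a search body: full-text match plus filters and a per-host count.

    Field names are hypothetical placeholders for our log schema.
    """
    return {
        "query": {
            "bool": {
                # Full-text search on the log message
                "must": [{"match": {"message": text}}],
                # Exact-value and range filtering (not scored)
                "filter": [
                    {"term": {"level": level}},
                    {"range": {"timestamp": {"gte": since}}},
                ],
            }
        },
        # Ad hoc stats: number of matching documents per host
        "aggs": {"per_host": {"terms": {"field": "host"}}},
        "size": 10,
    }


body = build_log_query("timeout", "error", "now-7d")
print(json.dumps(body, indent=2))
```

A body like this would be sent to the index's search endpoint; whether it stays fast enough at our volume is exactly what we are unsure about.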