We have to log millions of small log documents every week, and we need to support the following on that data:
Ad hoc queries for data mining
Joining and filtering values
Full-text search with Python
We considered using HBase and running Hadoop jobs to generate the statistics, but the problem is that the results have to be available in real time. So now we're thinking of using ElasticSearch instead. Is that a good idea?
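To make the requirements concrete, here is a rough sketch of the kind of request we would want to run from Python. It only builds the request body as a plain dict using the ElasticSearch Query DSL (a `bool` query with a full-text `match`, `term` and `range` filters, and a `terms` aggregation for the statistics). The function name `build_log_query` and the field names `message`, `level`, `timestamp`, and `service` are hypothetical and do not come from our real schema. The joining requirement is not covered by this sketch.

```python
import json


def build_log_query(text, level, since="now-7d/d"):
    """Build an ElasticSearch Query DSL body as a plain dict.

    All field names below (message, level, timestamp, service) are
    hypothetical placeholders, not a real schema.
    """
    return {
        "query": {
            "bool": {
                # Full-text search over the (hypothetical) message field.
                "must": [{"match": {"message": text}}],
                # Exact-value and date-range filtering.
                "filter": [
                    {"term": {"level": level}},
                    {"range": {"timestamp": {"gte": since}}},
                ],
            }
        },
        # Statistics: count matching documents per (hypothetical) service.
        "aggs": {"per_service": {"terms": {"field": "service"}}},
        "size": 10,
    }


if __name__ == "__main__":
    body = build_log_query("connection timeout", "error")
    # The JSON printed here is what would be sent as the search request body.
    print(json.dumps(body, indent=2))
```

Nothing here talks to a cluster; it only shows the shape of the query we have in mind, which could then be sent with whatever client is used.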