My partner is working on a project in which he's writing a data processing library in Python that reads data from different sources into memory, manipulates it, and exports it into different formats. I'm helping him load the data, but some of the datasets are really large (over 4 GB).
So we're looking for an open source library to serve as the in-memory storage backend that can handle datasets this large. It also needs to support restructuring the data (adding and renaming columns, etc.), fast iteration over rows, and filling in missing values. Does anyone have a good suggestion?
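To make the requirements concrete, here's a rough sketch of the kinds of operations we need. It uses pandas syntax purely as an illustration of the workflow (rename, derive, fill missing, iterate), not as a settled choice, and a tiny in-memory CSV stands in for one of the large files:

```python
import io
import pandas as pd

# Tiny in-memory CSV standing in for one of the large source files
csv_data = io.StringIO("id,val\n1,10\n2,\n3,30\n")

df = pd.read_csv(csv_data)

# Restructure: rename a column and add a derived one
df = df.rename(columns={"val": "value"})
df["doubled"] = df["value"] * 2

# Fill in missing values (row 2 has an empty 'value' cell)
df["value"] = df["value"].fillna(0)

# Fast iteration: itertuples is much faster than iterrows
total = sum(row.value for row in df.itertuples())
print(total)
```

For files over 4 GB, plain pandas may not fit everything in RAM at once, which is part of why we're asking; chunked reading (e.g. `pd.read_csv(..., chunksize=...)`) or an out-of-core library might be needed instead.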