Posted by: Beth Pariseau
I was intrigued when a colleague sent me a link to an article by Henry Newman referring to a “firestorm” touched off by remarks he made in an earlier article. That earlier article addressed the scalability of standalone Linux file systems vs. standalone symmetric multiprocessing (SMP) file systems such as IBM’s GPFS or Sun’s ZFS. His point was that in high-performance computing environments requiring a single file system to handle large files and provide streaming performance, an SMP file system that pools CPU and memory components yields the best performance.
Newman begins his follow-up piece by writing, “My article three weeks ago on Linux file systems set off a firestorm unlike any other I’ve written in the decade I’ve been writing on storage and technology issues.” He refers later on to “emotional responses and personal attacks.” I’m no stranger to such responses myself, so it’s not that I doubt they occurred, but in poking around on message boards and the various places Newman’s article was syndicated, I haven’t been able to uncover any of that controversy in a public place. And I’m not sure why there would be any firestorm.
I hit up StorageMojo blogger and Data Mobility Group analyst Robin Harris yesterday for an opinion on whether what Newman wrote was really that incendiary. Harris answered that while he disagreed with Newman’s contention that Linux was invented as a desktop replacement for Windows, he didn’t see what was so crazy about Newman’s ultimate point: a single, standalone Linux file system (Newman is explicit in the article that he is not referring to file systems clustered by another application) does not offer the characteristics ideal for a high-performance computing environment. “It seems he made a reasonable statement about a particular use case,” was Harris’s take. “I’m kind of surprised at the response that he says he got.”
That said, how do you define the use case Newman is referring to? What exactly is HPC, and how do you draw the line between HPC and high-end OLTP environments in the enterprise? Harris conceded that those lines are blurring, and that, moreover, image processing in general is being taken up by more and more companies in fields, such as medicine, that didn’t consider such applications 15 years ago. So isn’t the problem Newman is describing headed for the enterprise anyway?
“Not necessarily,” Harris said, because Newman is also referring to applications requiring a single standalone large file system. “The business of aggregating individual bricks with individual file systems is a fine way to build reliable systems,” he said.
But what about another point Newman raised: that general-purpose Linux file systems often have difficulty with large numbers of file requests? Just a little while ago I was speaking with a user who was looking for streaming performance from a file system, and an overload of small random requests brought an XFS system down. “Well, someone worried about small files does have a problem,” Harris said, though that is tangential to Newman’s original point. “But everybody has this problem–there is no single instance file system that does everything for everybody.” He added, “this may be an area where Flash drives have a particular impact going forward.”