By James Kobielus (@jameskobielus)
Data science is a substantial body of skills and practices. It is certainly a profession in the broadest sense of the term, what Merriam-Webster defines as a “calling requiring specialized knowledge and often long and intensive academic preparation.”
But is data science, or should it become, a profession in the narrower sense debated in this recent article? That sense, as stated in the article’s first paragraph, involves having “a code of professional conduct and self-regulation.” In other words, the debaters define a “profession” as essentially what doctors, lawyers, and certified public accountants belong to in most advanced societies: if you don’t certify in some formal way in your jurisdiction, you can’t legally practice your chosen profession.
I really don’t think we should regulate data science as a profession. The primary difference between data science and the aforementioned regulated professions is that data scientists, like practitioners in the fields to which data science is akin, rarely provide personal services to the general public. In other words, data scientists seldom render the kind of services for which the consumer-protection safeguards associated with strict certification might be necessary. Data scientists are usually either paid employees who work only for their employers or contractors who serve private or public-sector clients.
Yes, of course, organizations trust that their data scientists know their field and apply its practices with integrity. But the standard free-market mechanisms (e.g., employment contracts and firing for cause) are usually sufficient to weed out non-performing or dishonest data scientists.
Those who argue for data science to become a certified, self-regulating profession with an official code of conduct tend to seize on privacy protection as the compelling issue. But none of them can ever point to examples of rogue, dishonest, or unscrupulous data scientists who run amok and abuse some ill-defined public trust. Usually, the main culprits are not the data scientists themselves but the business decision makers who direct them to engage in intrusive target marketing and other practices that may step over a privacy line.
Formal certification of data scientists would be regulatory overkill. Also, as someone noted in the article, legal certification requirements tend to create artificial entry barriers, inflate prices, and stifle innovation in the regulated industry. None of that would be in the public’s best interest.