Responsible Data Science
On this week’s Wednesday in the Woods, we enjoy a conversation about responsible data science and algorithms as you join me for my completely normal human morning routine.
If you felt that our previous discussions of data science and data strategy missed pieces of the big picture, you’d be right. That’s because we’ve been focused on some of the traditional management perspectives around “digital” or “AI.”
Today, we’re taking a moment to unpack the word “responsible” in this context. Specifically, we propose that responsible data science considers three factors:
Technical factors are the ones you’d expect an organization to focus on first. Are we using a clustering algorithm for a regression problem? Are we using the right data store or production implementation architecture? Is our model overfit or underfit?
Legal factors are often next for an organization to consider. Is our use of data within the rights and obligations of our contracts? Are there any statutory frameworks or treaties that obligate us to notify a party or prohibit a behavior? Has a court set a precedent for damages to consider?
Lastly, and most relevant to recent conversation in society, is whether such use of data science and algorithms is “ethical.” Data science projects can be technically sound and not in violation of public laws or contracts, yet still violate ethical principles or make people feel off.
Not-so-coincidentally, we recently released a Responsible Data Science Policy framework. If you’re interested in more “real” content on the topic, head on over to our article introducing the framework.