Episode 28: (Data Bites) Creating Checklists for Practical Data Ethics

On Monday’s episode, I scratched the surface of Ethics and Ethics in the Data Profession. During the show, I remained uncommitted to any particular normative ethical theory as it applies to the data profession.

In this episode we focus on a quickly growing practical suggestion for how to implement whichever normative theory you like. By focusing on checklists, we are able to routinely implement best ethical practices without having to constantly consider whether we are living up to our best ethical selves.

To keep up with the podcast be sure to visit our website at datacouture.org, follow us on twitter @datacouturepod, and on instagram @datacouturepodcast. And, if you’d like to help support future episodes, then consider becoming a patron at patreon.com/datacouture!

Music for the show: Foolish Game / God Don’t Work On Commission by spinmeister (c) copyright 2014 Licensed under a Creative Commons Attribution (3.0) license. http://dig.ccmixter.org/files/spinmeister/46822 Ft: Snowflake

Show Notes:

Welcome to Data Couture, the podcast about data culture at work, at home, and on the go. I’m your host, Jordan Bohall. To stay up to date with everything Data Couture, be sure to like and subscribe down below. Furthermore, be sure to follow us around the internet at data couture pod on Twitter, at data couture podcast on Instagram, and at data couture pod on Facebook. Of course, if you’d like to help keep the show going, then consider becoming a patron at patreon.com forward slash data couture. Now, on to the show.

I’m your host Jordan, and on today’s Data Bites, we’re going to be looking at one particular approach to solving the problem of ethics and data science. But before we do that, I’d like to remind my listeners that I’m attempting to get Data Couture to be the number one podcast of the month over at podcast land. And to do that I need you. Yeah, you. I need you to go vote for the podcast over at podcast land. And the way to do that is to visit datacouture.org forward slash podcast land. Again, that is datacouture.org forward slash podcast land. I appreciate everyone’s focus and attention for this endeavor. Podcast land will allow me to be seen by a broader audience and, at the end of the day, get more listeners.

Now, today’s episode is following up on Monday’s episode, where we talked about ethics and data science. And you’ll recall that in that episode, I was extremely cagey about what I meant by ethics. Of course, that’s only because there are so many options out there. Just like data literacy, just like digital literacy, any of these sorts of things are fraught with so many different pitfalls. However, there is a popular piece going around by DJ Patil and his co-writers that talks about a practical way to actually implement an ethical standard in our data practices. And I won’t do a deep dive into their particular solution on this Data Bite. But I will take away one piece from their writings, and that is having a checklist: an ethical checklist to make sure that you cover everything you need to do so that your data products, the way you treat your customers’ data, and everything in that vein are done in an ethical manner. And I’d like to specifically talk about the brilliance of having this kind of checklist. So I don’t know about you, but I’m a list maker. I love making lists. Specifically, I love crossing items off the list when I accomplish them each day, each week, each month. And this ethics checklist is no different. The idea is to come up with a standard operating procedure, so to speak, of 10 to 20 items, whatever suits your particular development cycle needs.

And make sure that when you’re producing something, when you’re handling data, when you are sending data out to various people, that data or that data product satisfies every one of the items on the checklist. What are some examples of checklist items? Well, one item might be to make sure that you remove any personally identifiable information. Now, this one is particularly near and dear to my heart, because in my industry and in my particular job role, we handle a lot of customer data that contains quite a bit of sensitive, personally identifiable information, for example, social security numbers or TINs (tax ID numbers), addresses, account numbers, phone numbers, all these sorts of things. And so whenever we get a request for data to be handed over to somebody internally or to some third party, we’re very careful to make clear that we’re not going to be giving out socials, phone numbers, email addresses, or whatever else the person requesting the data called for, unless there is a very, very good business case for why they might need that.
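As a rough illustration of that checklist item, here is a minimal Python sketch of stripping sensitive fields from a record before it is handed over, with an explicit exception for fields that have an approved business case. The field names, the `redact_record` helper, and the sample member record are all hypothetical; a real list of sensitive fields would come from your own data governance policy.

```python
# Hypothetical field names; a real list would come from your own
# data governance policy, not from this sketch.
SENSITIVE_FIELDS = {"ssn", "tax_id", "address", "account_number", "phone", "email"}


def redact_record(record, approved_fields=frozenset()):
    """Return a copy of record without sensitive fields.

    Fields named in approved_fields are kept, standing in for the
    "very, very good business case" exception.
    """
    blocked = SENSITIVE_FIELDS - set(approved_fields)
    return {key: value for key, value in record.items() if key not in blocked}


# Made-up sample record for illustration only.
member = {"member_id": 101, "ssn": "000-00-0000", "phone": "555-0100", "balance": 2500.0}

print(redact_record(member))
# {'member_id': 101, 'balance': 2500.0}

print(redact_record(member, approved_fields={"ssn"}))
# {'member_id': 101, 'ssn': '000-00-0000', 'balance': 2500.0}
```

The default is to drop everything sensitive, so a requester has to ask for an exception rather than ask for a removal.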

And one reason we do this is because we don’t know the downstream effects of that data. What do I mean? I mean, okay, somebody requested the data, and they requested it for perhaps a very good reason. Maybe it’s vital to their particular job function, or it’s vital to the business as a whole. Well, we don’t know where that data is going to end up after that person gets it. So say one of our vice presidents asked for a data set, or they asked for a data product that includes, say, social security numbers. Now, there are lots of reasons why we might need to give out our members’ social security numbers to somebody internally within the organization, say to satisfy various federal requirements, for example.

However, if we have even the slightest hint, if we don’t know whether that VP is going to pass it on to somebody else, who’s going to pass it on to somebody else, and then somebody comes to that fourth or fifth party and says, “Oh, this is some great data, this is a great data set, I need this for my sales function, can you throw me that data set?”, all of a sudden you’re in a position where that data might be seen out in the world, because it’s no longer kept internally, it’s no longer protected by the data governance within our own organization. So to get back to data ethics as a checklist, it’s a great practice to have items like this on it. That way, you make sure that every single time you produce something, be that in your QA process, or if you do QA on the fly, like sometimes we have to do, you at least have a level of certainty that you’re treating that data, or that data product, or that machine learning algorithm in the most ethically sound way possible.
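To make the idea of running the checklist every time you produce something a bit more concrete, here is a minimal Python sketch of a checklist as a list of named checks applied to a data product during QA. Everything in it is an assumption for illustration: the `ChecklistItem` class, the three example items, and the shape of the `product` dictionary are hypothetical and are not taken from DJ Patil and his co-writers’ piece.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ChecklistItem:
    description: str
    check: Callable[[Dict], bool]  # returns True when the product passes


def no_sensitive_fields(product):
    # Hypothetical banned field names; substitute your own policy.
    banned = {"ssn", "phone", "email"}
    return all(banned.isdisjoint(row) for row in product["rows"])


def has_stated_purpose(product):
    return bool(product.get("purpose"))


def has_named_recipient(product):
    return bool(product.get("recipient"))


# Three illustrative items; a real checklist would grow over time.
CHECKLIST = [
    ChecklistItem("No personally identifiable information", no_sensitive_fields),
    ChecklistItem("Business purpose is written down", has_stated_purpose),
    ChecklistItem("Recipient is named", has_named_recipient),
]


def run_checklist(product, checklist=CHECKLIST):
    """Return the descriptions of every item the product fails."""
    return [item.description for item in checklist if not item.check(product)]


# Made-up data product for illustration only.
product = {
    "rows": [{"member_id": 101, "ssn": "000-00-0000"}],
    "purpose": "Quarterly retention report",
    "recipient": "",
}

for failure in run_checklist(product):
    print("FAILED:", failure)
# FAILED: No personally identifiable information
# FAILED: Recipient is named
```

Because each item is just a description plus a function, adding a new item as the checklist develops means appending one more entry to the list.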

And of course, that checklist will grow and develop over time. So don’t expect to have a perfect checklist in the first instantiation of it. But it’s a good habit to get into, to make sure that you check these things off, which becomes very, very satisfying, if I do say so myself. And in that way, you have some level of security: you mitigate any sort of risk that you might fall into accidentally, like most of us tend to do if we aren’t very careful with how we handle our data. And so, I know I was cagey about which ethical theory I was going to subscribe to in the first episode on Monday, and frankly, I prefer a bit more nuanced approach than this checklist approach, but it’s a fantastic way to start implementing ethical practices in your data profession.

Talk to you soon. That’s it for the show. Thank you for listening, and if you liked what you’ve heard, then consider leaving a comment or like down below. To stay up to date on everything Data Couture, be sure to follow us on Twitter at data couture pod, and consider becoming a patron at patreon.com forward slash data couture. Music for the podcast is called Foolish Game / God Don’t Work On Commission by the artist spinmeister, used under a Creative Commons Attribution 3.0 license. Writing, editing, and production of the podcast are by your host, Jordan Bohall.
