NOEL: Hello and welcome to Episode 64 of the Tech Done Right podcast, Table XI's podcast about building better software, careers, companies, and communities. I'm Noel Rappin. Our guest this week is Bärí A. Williams. Bärí is the VP of Legal, Business, and Policy Affairs at All Turtles, where she provides legal guidance to startups working with artificial intelligence. We talk about writing ethical Terms of Service, how to collect and use data properly, and some facial recognition, data mining, and machine learning topics in the news. Bärí talks about how having a diverse user and testing base can prevent your AI company from making damaging mistakes. I hope you enjoy the conversation. Before we start the show, one quick message. Table XI offers training for developer and product teams. If you want me to come to your place of business and run an interactive, hands-on workshop, I would very much like to do that. We can help your developer team learn topics like Rails and JavaScript integration, or testing, or managing legacy code, or we can help your entire product team improve their agile process. Also, if you're in the Chicago area, be on the lookout for our new public workshops, including our How To Buy Custom Software workshop. For more information, email us at workshops@TableXI.com or find us on the web at TableXI.com/workshops. This week on the show, we have Bärí Williams. Bärí, would you like to introduce yourself? BARI: Sure. Thanks for having me. I'm Bärí A. Williams. I am an attorney. I'm an Oakland native. I am a great lover of tech. My life was changed by being introduced to an Apple computer in 1987, and that's just kind of influenced everything I've done from there. And I have two kids that keep me busy when I'm not doing all those other things. NOEL: Today, we are going to talk about the intersection of law and tech. So, you are a legal advisor to software companies, correct? BARI: To a couple, yes. But my actual day job is that I head legal and policy for an AI startup. I don't like to use the word incubator. The founder hates the word incubator. So I say we nurture AI-focused companies. So, everything that people don't read - Terms of Service, Privacy Policies, contractual issues, all the stuff that no one reads - I write all of that fun stuff. NOEL: [Laughs] Do you read it? BARI: I do have to read it. NOEL: Do you read them for software products that you don't actually write them for? BARI: Oh yeah, totally. And the reason is that I have probably stopped more people than I can count from downloading apps when they didn't necessarily understand how those apps were collecting and using their data. I would read the policy and then also ask, "Well, what's your use case? Why are you using this app?" And if you're using it, let's say, just to order food, why does it need access to your camera and your photos and your contact list? NOEL: And your microphone. [Laughs] BARI: Well, right. [Laughs] To me, that's just asinine. Why would I give you access to that? There's no reason why you need this, other than to harvest data and then do something I probably don't want you to do with it. NOEL: As the person who is responsible for writing these legal documents, I would imagine there's some tension between what the business wants to do for business reasons - presumably it is financially advantageous for companies to collect that data - versus what is the ethical or responsible thing to do. How do you navigate that kind of tension?
BARI: For me, the answer is to always draft in plain English. And I know that sounds simplistic, but for the one in ten thousand lucky people who actually read something that I've drafted, you'll understand exactly what your data is being used for and why. And so if you don't agree with it, you can choose not to download it, to stop using the product if you've already started, or to deactivate your account if the terms have changed and you don't like where it's going now. But the reason why that's important to me is I don't like the idea of people being sold a bill of goods. And I also don't like the idea of playing hide the ball. So, if we're collecting your data, and yes, we may be using it to run analytics and see how we can improve the service, but we also might be using it to sell to third parties who would be interested in seeing your use habits - if you write it just as I said it, then I know what you're doing with it. If you're hiding it deep down in the language in some small little section where it says, "Oh, we may elect at our discretion to yada...yada..." those are too many qualifiers, and people's eyes are going to glaze over. It looks like you may do something or you may not, and most people are optimistic, so they may just lean towards, "Oh, they may not." That's usually not the case. NOEL: Yeah. So it sounds like one of the first ethical pieces here is writing something that is clear, versus writing something that is designed to make people's eyes glaze over so they don't pay too much attention to it. BARI: Yeah. That's definitely step one. I've said this to people many times before, but oftentimes when I'm drafting something, I'll show a copy of it to my son, who's 8, or I'll show it to my mom, who's 68. My son is a bit more tech savvy, but he, of course, is not schooled in the world of contracts to the extent that a 68-year-old who's lived her entire life having to sign them is. But she's also not tech savvy. So if both of them understand it, then that means the document is good. NOEL: Yeah. I work with a publisher who makes a point of having very, very human-readable contracts, which is great. I very much appreciate it. Beyond the clarity of the request for data, what do you consider the issues in the legal and ethical collection of data? BARI: I would say it always comes down to two things: what are you collecting this for, and how are you going to be using it? And if that isn't spelled out clearly, then you need to really comb through and see how this company is going to use your information. That, to me, is the biggest issue. The ethical questions are: What are you collecting? Why are you collecting it? How are you using it? Who are you sharing it with? Can the person opt out totally and completely? If they cannot opt out totally and completely, are there elements they can opt out of - maybe they don't want you to have access to their contact list, but they're fine if you are tracking their cookies. So, it can come in different flavors. But those are essentially the big six things that I think are issues around the ethical collection of data. NOEL: Does that make it harder to collect data? I think you said in your RailsConf talk that there isn't as much of a tension between the ethics and the profitability as we might think. Am I remembering that correctly?
BARI: Yeah, it was a slide which had a greater-than sign: principles over profit. You don't have to forsake financial success just because you're doing things the right way. I think people often collect too much data, or they collect data that they don't need. But I think that's done in anticipation that at some point in time - whether they're scaling, or they develop new products, or they enhance the one that the person they're collecting all this data from may be using - there will be a use case for it in the future. And that may be so, but I also don't think that you need to overreach in order to make a good profit. And I think if you do one thing and you do it well and you do it ethically, people appreciate that and will actually be more brand loyal. So, I think the larger issue is people collecting data that they don't need just so they can have it and find a way to monetize it later on. NOEL: If I came to you and I'm creating a startup that is using predictive analytics or AI, and I've come to you for your legal advice, is there a mistake that I'm probably making already? Am I already doing something illegal and I don't know it yet? BARI: You may not be doing something illegal, but you could be doing something unethical. And I think that's also a distinction that people don't necessarily make, and maybe they should. Just because something is legal doesn't mean it's ethical. The very easy example of that is that slavery used to be legal. Does that make it ethical? No. It just means that you could do it. But I was also raised in a household where my grandparents would remind me all the time that just because you can do something doesn't mean you should. So, that's another tension point. NOEL: Some professions have a professional code of ethics that practitioners are expected to follow, but software developers generally don't. BARI: They don't, yeah. NOEL: And despite the fact that there are occasional calls for it. What sorts of things -- I guess we've sort of covered some of this -- what would you say would be the kinds of principles you would put in that kind of ethical code of conduct for software in general, and particularly for the kinds of things we're capable of doing using AI? BARI: I would probably start with: all Terms of Service and Privacy Policies should be in plain English. And it's even better if you can write them in a Q-and-A format. I did a warranty for a client and I wrote it in a Q-and-A style because that's conversational, it's easy to read, your eyes don't glaze over, and it was written in plain English. So, there was no legalese in it. Something simple like: this broke, what are my options? [Chuckles] But I mean, that's what people want when something does break. NOEL: Sure. BARI: You just want to know the fastest way to resolution. Like, I don't want to walk through indemnity language. Just don't hide the ball. Another thing to think about is to ensure that AI products don't have a disparate impact on marginalized populations. And I think the way to do that is to ensure that you have as complete a data set as possible. Companies should do beta testing with people from marginalized groups and also ensure their privacy and protection by making sure any data collected is limited. And if you can anonymize it, fantastic. If it can be encrypted when you're storing and collecting it, fantastic. But we saw a WhatsApp hack earlier this week, so I don't know.
[Laughs] I'm not sure about it. I'm not sure about anything anymore. NOEL: It's all very reassuring. We've talked about this on the show in the past, when we had Carina Zona as a guest: the algorithmic bias that can come purely from the technology side. And I guess then, on the ethical side, the obligation is to try to work around that or mitigate it as best as possible. And there are clear benefits to doing that. When you don't, your AI program makes embarrassing and very harmful mistakes. What do you see as a developer's responsibility if they're inside a larger company where they don't necessarily have access to the person who's writing the Terms of Service or making these kinds of decisions? BARI: For the developer, I think the job is to make as complete and ethical a product as you can. And I do understand that sometimes that's hard, because we don't know what we don't know, and sometimes it can also be embarrassing to ask for help because you don't want people to know that you don't know something, or you don't want people to think that you are prioritizing one thing over another, or that something isn't as important as something else. And none of that may be true. But the problem is, if you don't know what you don't know and you're not asking for help and you continue to just push forward anyway, you're going to make something that has holes, that has gaps in it. So, if you are the developer, it could be something as simple as really trying to get as complete a data set as you can. I would also say, before you even send something off for up-the-chain approval: I don't think a product should be embedded or integrated into other products without testing for inclusion. What is the use case, or how would someone who's differently abled use this? Does this have any disparate impact on people of color? Does this have any detriment to people of lower income? All of that is incredibly important, particularly when you think about how much of this technology is being used to look at creditworthiness and insurance and medical decisions and predictive policing. All of those things use historical data, and historical data is already pretty biased. If you're thinking about using an AI system for creditworthiness, or the ability to buy a home, or where one could afford to buy a home, you have redlining and restrictive covenants, so people of color are automatically not going to be shown to have lived in certain neighborhoods and will already have been deemed to have lower creditworthiness. So, you've already built in this bias as to what these people can or can't afford. Everyone wants to say, "Now, it's the technology, because it's nameless and faceless." No. Humans are building it. And so, we need to be thinking about that when we are building these types of products. And beyond the housing data and redlining, predictive policing includes historical crime data that disproportionately targets black and brown people and people in low-income areas. All of those things could be harnessed in a very, very bad way. NOEL: Right. You need to make sure that you're not using the data set to reinforce the biases of the past. You have all those very fraught issues, and you even see it in something like image filtering, where a lot of image filter data is based on test subjects, dating back to the beginning of color photography, that were not inclusive in terms of what kinds of people were used for test images. BARI: Yeah.
And even when you look at it from that standpoint, it's also interesting because the test images are probably going to be inherently bad, because the lighting was not designed for people of darker hue. So, it's six of one, half a dozen of the other. NOEL: Right. And that even moves into facial recognition software, which sort of inherits those same biases in terms of the amount of information it gathers about different people to make those facial recognition choices. BARI: Yeah. That's a can of worms. NOEL: Well, that sounds interesting. BARI: It's a huge challenge. I mean, facial recognition technology is, I guess, marginally better than when we first started really hearing about it a lot in 2014. But it's hard for me to get rid of the memory of -- I think it was Google's facial recognition technology. I think it was a young man in Best Buy who was an employee, and he was... I don't know. He was playing around or testing something. And it basically said he was a gorilla. It was a younger black guy. And just last summer, I believe it was Amazon's facial recognition technology that took 28 members of Congress, including members of the Congressional Black Caucus, and confused them with mugshots of convicted felons. [Chuckles] I guess that's better, at least. It shows some improvement. You've moved from animals to people, even if it's the wrong people. NOEL: Yeah, that definitely seems like a lack of one of the two things we've already talked about -- it's either a lack of diversity in the training set, or an algorithm that has a harder time getting differential information from different skin tones. And in neither case is that a good... BARI: Either way, those are both poor outcomes. [Laughs] NOEL: Right. And then you just wonder what kind of testing these products... BARI: That's exactly where I was going. How are you testing this? Who is part of the beta test round that you're doing this with? That's what I was saying before: it's imperative for people to test products, before they are shipped widely, with people of all different stripes. You cannot build for a global population with a homogenous group of people. It's just not going to yield anything that is widely useful. NOEL: It's kind of amazing to me at this point that products like that can get out of testing. BARI: [Laughs] NOEL: Somebody should have a checklist of like... BARI: Doesn't turn people into animals. Are we mixing up felons with congresspeople? NOEL: Yeah. There's a known range of skin tones. Can we recognize all of them? It seems like a basic thing. BARI: I mean, you would think. And I try not to put too much blame on the people who are making this stuff, because the truth of the matter is, we all look at the world through our own lens. And the lens of the majority of the people who are building this technology is, statistically, going to be that of a straight, white, able-bodied man who ranges from about, I don't know, 23 to 40. And that's a very different world view than my world view. What I would build would be different. And so, I understand that. But I also don't want to demonize those folks, because I don't think this is necessarily intentional. I think it's just not being aware of blind spots. And I think addressing that -- going back again to saying, 'I don't know what I don't know. Can I get a range of people to help test this before we do anything else with it?' -- would just go a long way. NOEL: Yeah.
That key question of: what do I think I know that isn't so? What am I just assuming about the world that's really just my own experience? A lot more software developers could ask that question. I mean, I've been a software developer for a long time. I've been in situations where I've not asked those questions when I should have, which is why it's good to have a team around you with a bunch of people who will ask them. It also seems like, as we're recording this, there's been a lot of recent legal activity around facial recognition. We were talking in the pre-show about Amazon. There's a lot of discussion about developers trying to prevent, or deal with, their companies selling facial recognition technology to customers who they think aren't going to use it well. BARI: Yes. NOEL: Google has had some issues with that. Amazon has had some issues with that. What does that look like to you? How do you think companies should address that sort of issue? BARI: I think one thing that's interesting is the employee activism that's happening internally. You have people who work at these companies and are not necessarily huge fans of what they're building, and they're saying, "We don't want to build this." Or, "If we're going to build this, we don't want to sell it to XYZ groups, organizations, countries," what have you. And I think that's pretty powerful. Now, the first thing I think of from a legal standpoint is how that plays out in terms of HR and retaliation, because the company may feel that you're prohibiting them from making a profit because of some principled reasoning or activity. I don't know that Amazon would argue that they're in the principles business. And that's not to say that they aren't. But that's not what they're doing right now. The goal is to maximize shareholder value. And that's just the truth. So again, it goes back to: just because it's legal doesn't mean it's ethical. And we've seen Amazon's investors this week trying to stop sales of facial recognition technology to governments at the shareholders meeting. And Amazon's lawyers' retort was essentially to tell the SEC that any of those risks with facial recognition technology are just conjecture at this point, because Amazon hasn't heard of any abuses yet. Like, if no one's complaining, it should be fine. NOEL: Yes. Nobody in the potentially restrictive governments that they're selling it to is complaining yet. BARI: [Laughs] Yeah. No one's going to complain about that. Not at all. NOEL: Oh, man! BARI: That is just such a huge minefield. It makes me think of China and how they're using surveillance and facial recognition to create these social scores. That is just the worst slippery-slope thing I've ever heard of. So now, you can decide that because I jaywalked five years ago, my credit score should go down 100 points. Or because I put a water bottle in the trash and not the recycling bin, now it's going to be harder for me to get a visa. That whole idea is crazy to me. NOEL: It definitely sounds like we're getting to the point where that technology is just getting creepier, to use a technical term, creepier and creepier. And there is a story going around about, I think, JetBlue. BARI: Oh, yeah. The woman on the international flight. NOEL: She didn't need a boarding pass. They just scanned her face, and she had no idea how they had the recognition data to match it against. Which is convenient and creepy, I don't know. That's definitely disturbing.
BARI: But that's the hard part, because you have to discern: are we all willing to give away our biometric data, for one? It's like, "No, thank you." But you're giving away biometric data to get through the CLEAR lane. You're doing it for TSA PreCheck. So at this point, what's the difference? I mean, what I thought was interesting about that story is that she was tweeting back and forth with JetBlue about this. And when she said she didn't understand how they had the ability to recognize her, to match her face with her boarding pass or her name or whatever, JetBlue said, "Well, the Department of Homeland Security. We transmitted that quickly and they gave the green light." And she's like, "Hold up. The Department of Homeland Security now? Say that again?" NOEL: This doesn't make it better, honestly. BARI: No, it makes it worse. It makes it worse, and in myriad ways. NOEL: Right. Because it's one thing for me to say my phone can recognize my face, but I have some assurance that that data doesn't go outside of that device. And it is another thing to say I have given this airline permission to use my face for some reason, because I don't want to pull something out of my pocket to board a plane. Or like, I go to Disney World and I put this bracelet on so that when I start to walk into a restaurant, they can greet me by name, which is also a little creepy, but I've sort of consented to it. But to go to the airport and have them just go, "Oh, hi Noel," because it's a government database, that seems to me to be a whole nother level. BARI: Yeah. And I think what was interesting about that, and I talked about this a bit at the Ruby on Rails conference, is the same thing: the idea of the Department of Housing and Urban Development suing Facebook. And the only reason why they decided to pursue that is because Facebook would not give them unfettered access to user data. And my thing is, OK, so we've already seen the JetBlue situation happen, and now Ben Carson, who doesn't even know the correct acronyms for the department that he's running -- REO somehow equals Oreo in his head -- I don't want that guy having unfettered access to anything that has to do with me. But that to me was a very interesting intersection point: you have this woman finding out that the Department of Homeland Security can verify her face and identity, and then the only reason why another department of the government decided to sue a social media company is because they would not allow them to have unfettered access. And so, the first place I go is: what are you going to do with that? NOEL: The Facebook thing. One thing that was interesting about Facebook -- so Facebook has recently gotten into some legal trouble for the way that they have allowed real estate advertisers to limit ads for housing. I think that's what we're talking about here, right? BARI: Yes. There was a self-serve ad tool; I believe they tweaked it last summer or fall in order to get rid of that ability. But essentially, for years, you could get on and create your own ad. And then, because they're targeted ads, you could choose who you wanted to see it. That is a perfect example of building what the customer wants. That's what all businesses would want. The problem is, without some checks and balances in place, you're enabling them to break the law and discriminate. NOEL: Right.
Because there are specific laws about housing, in particular, that prevent you from advertising certain housing listings to one racial group and not another, or to one set of people and not another. And you could totally imagine the developers at Facebook not even being aware of this. They may also have had malicious intent; let's not give them too much credit. But you could totally imagine them not being aware that targeted advertising is perfectly fine for most things. I mean, it's not perfectly fine. It is perfectly legal for most things. But here's the boundary where this particular kind of targeted advertising is suddenly illegal, for good reason. And then suddenly, that causes all kinds of problems. BARI: Right. In a situation like that, I would say -- I don't presume that the developers who were building that tool knew that. I don't think they had some deep, vast knowledge of employment law or housing discrimination law. So, they probably were really just building what they thought the customers wanted. The problem is that there should have been some type of testing on the back end that was able to catch this. Now, another thing that's interesting is that a product counsel or someone somehow didn't identify that. So, that is where -- sometimes when you're developing certain things, it might make sense to involve your lawyer earlier rather than later, because a lawyer would have seen that and said, "No, you can't do that." NOEL: You can imagine where a scrappy startup like Facebook wouldn't have the resources to bring a lawyer to bear on that kind of issue. BARI: [Laughs] NOEL: In general, I would recommend that if you are a developer and you are doing stuff that seems to interact strongly with finance, I would definitely make sure that somebody like that was around, because you can actually get yourself into a fair amount of legal trouble pretty quickly with financial data. BARI: Yes. NOEL: I wanted to talk about -- I don't know much about this, but we're talking about facial recognition, and San Francisco took steps last week to limit the use of facial recognition on the part of the government. BARI: Just in the city, which is great. But part of me also kind of feels like it's a little too late, and that's not their fault. But it's what we talked about. The government has your biometric data and is able to scan your face to allow you to board a plane. So, whether or not someone is doing surveillance using facial recognition technology in San Francisco almost doesn't matter. They're doing it everywhere else. They're doing it in the airport. They're doing it when you voluntarily, willingly give it to them. You're doing it when you're handing over your DNA for an Ancestry.com test. NOEL: [Chuckles] I was just going to ask if you had ever done that. BARI: Oh, dear. That's a very interesting field to me. I think it's still sort of -- no, not sort of -- in its infancy. Well, I guess maybe now it's a toddler. [Laughs] NOEL: Yeah. BARI: But it's one of those things where you definitely have privacy concerns, because on another genetic history database -- not Ancestry itself, but one that was open source and free to use -- the police were able to find the Golden State Killer through DNA that somebody related to him had apparently uploaded, which to me is crazy. NOEL: That's a whole nother level of privacy concern because it's not even... BARI: It's not even you. NOEL: Right.
Somebody else is uploading their data to these sites and it's impacting my privacy. And while I think we can all get behind the arresting of serial killers, that does have a lot of implications for other things. BARI: Yeah. I mean, there is, to me, nothing more private than your genetic information and what you choose to do with your body. Well, that's a different topic, particularly this week. But seriously, that is a privacy issue. So, when we talk about the right to privacy, it's bodily autonomy, it's genetic information. It's all of those things. And people don't understand that you're mailing away this information. And if you haven't read the privacy policies and gone through them, there are ways that you can opt out of certain information being shared or retained. But people aren't reading that stuff. And I'm sure Ancestry and the other companies are hoping that you're not reading it, because they want to see if they can learn something else and monetize that information later. NOEL: The idea of what kind of bad machine learning algorithms are running over something like 23andMe's genetic data set is a little disconcerting. BARI: The other thing is you're literally giving away your genetic information and other people are profiting off of it. You can opt in to research or not. But you don't know what happens once that research is conducted, or who's conducting it, or where it's going. I wouldn't touch that. But that's me. NOEL: That's reminding me of the story of Henrietta Lacks. Is that the...? BARI: Yes. Yes. NOEL: A woman in the 40s or 50s. There's a very good book about it, which we'll put in the show notes. But a woman who happened to have cells that were very easy to get to divide in a lab situation. For some reason, her cells enabled that, and they therefore became the basis of a ton of biological research. And she never knew, and her family never knew for decades. BARI: She never knew. The family didn't know. Nobody was compensated for that. And the important thing about that is that she was this relatively lower-to-middle-income black woman who was just sick. And then essentially, you become a guinea pig, literally a guinea pig. When I think of some of these tests, it harkens back to that story. It harkens back to the Tuskegee syphilis study, where you deliberately withhold treatment from people with syphilis just to kind of see what happens. You're literally using humans as test tubes, without their knowledge or consent or compensation. It would be different if people had this information and opted in, or even opted in and were compensated for their participation, versus not being told at all. That, to me, is a huge issue, because you're using people as crash test dummies and they don't know. And that goes back again to -- at this point, these are private companies -- writing things in plain English and not burying the lead. NOEL: Right. It's giving people enough information to meaningfully opt in or opt out. And this circles back to Facebook, which intentionally -- I shouldn't say intentionally -- appears to obfuscate their privacy controls in the hope that people will just not deal with it. I should probably say allegedly there. BARI: [Chuckles] Well, I will say they've been better about changing some things and giving you the privacy checkup updates and the ability to opt in and opt out of certain things. And that's good. But the larger issue is that people have already given all of this information since, at this point, when was that? 2004?
So, for 15 years. I mean, there's nothing you can do about the 15 years' worth of information they've already collected. I guess the only answer would be if some regulator said, "You have to come in and wipe everything completely clean," and essentially adopted the EU's notion of the right to be forgotten; then you could allow people to start fresh and opt in to all of these things if they like. Now, do I believe that will ever happen? No. And even if it could happen, I don't think it's even possible for them to delete all the information that they've already saved on people. That's just not going to happen. NOEL: Yeah, it does seem unlikely. Bärí, where can people reach you if they want to talk to you more on a privacy-invading social network of some form or another? BARI: [Laughs] Just come harvest all my data. I'm giving it away. I am giving it away on Twitter. My Twitter handle is my name, so it's @BariAWilliams. And I write a ton of things. If anyone is having insomnia and wants to be bored to sleep by privacy articles and whatnot, you can find those at www.bariawilliams.com. NOEL: And yeah, coming to a Terms of Service agreement near you. BARI: That's right. In plain English. NOEL: Great. Thank you very much for being here. This was a great conversation. I'm really glad we got a chance to do it. BARI: Thank you. It was fun. NOEL: Tech Done Right is available on the web at TechDoneRight.io, where you can learn more about our guests and comment on our episodes. Find us wherever you listen to podcasts. And if you like the show, tell a friend, or your social media network, or your boss, or your pet, or your boss's pet, or your pet's boss; any of those would be great. Also, leaving a review on Apple Podcasts helps people find the show. Tech Done Right is hosted by me, Noel Rappin. Our editor is Mandy Moore. You can find us on Twitter at @NoelRap and @TheRubyRep. Tech Done Right is produced by Table XI. Table XI is a custom design and software company in Chicago. We've been named one of Inc. Magazine's Best Workplaces, and we are a top-rated custom software development company on clutch.co. You can learn more about working with us or working for us at TableXI.com, or follow us on Twitter @TableXI. And we'll be back in a couple of weeks with the next episode of Tech Done Right.