Every week, we talk about important data and analytics topics with data science leaders from around the world on Facebook Live. You can subscribe to the DataTalk podcast on iTunes, Google Play, Stitcher, SoundCloud and Spotify.
This data science video and podcast series is part of Experian’s effort to help people understand how data-powered decisions can help organizations develop innovative solutions and drive more business.
To keep up with upcoming events, join our Data Science Community on Facebook or check out the archive of recent data science videos. To suggest future data science topics or guests, please contact Mike Delgado.
In our recent #DataTalk, we had a chance to talk with Jenna Lake about how computer vision and machine learning are helping sports teams understand, evaluate and improve their performance.
Mike Delgado: Hello and welcome to Experian’s Weekly Data Talk, the show featuring some of the smartest people working in data science. Today we’re excited to talk with Jenna Lake, who is a computer vision engineer at Second Spectrum. Jenna earned her master’s degree in computer vision from Carnegie Mellon University, and bachelor of science degrees in computer science and biometric systems at West Virginia University. Jenna, thank you so much for being part of our show today.
Jenna Lake: Thank you for having me.
Mike Delgado: I always ask our guests to share a little bit about their journey. What led you into data science, and particularly computer vision?
Jenna Lake: Well, for me, I just had a hard time choosing. I loved writing code, but with biometric systems as one of my majors, I could do a lot of image processing. Then when I graduated, I was lucky enough to work with a company where I had a very supportive boss. That just pushed me into more computer vision and more data processing techniques, and it was a great experience that led to me getting my master’s in computer vision.
Mike Delgado: That’s awesome. So, when you were in high school, were you always interested in math and science? Were those pretty much your focus?
Jenna Lake: Yes, I was in the honors math society at my school and I was horrible at history.
Mike Delgado: I was the complete opposite. I loved reading and I loved writing, and I stayed away from math when I was in high school. When I went to college I was an English major, I stayed far away from engineering. I took one computer science class. This was back in like 1999 so it was C++, Visual Basic and I just now have huge respect for anybody in the computer sciences after that one class.
So, Jenna, computer vision is a very hot topic these days. I was just reading about Google Lens now allowing people to take photos of an object so AI can identify what that object is, what type of flower, so there are a lot of cool things happening in computer vision. And what I love about the work that you’re doing is you’re actively using computer vision to help with sports analysis, and I’m wondering if you can share a little bit about how you’re helping computers understand basketball in the NBA.
Jenna Lake: Yes. Well, it’s actually really interesting. We take in a bunch of different data about where the players are and where the ball is. At the beginning, it’s where is the basketball in 3D, for every frame, and then you have a completely separate team that can also go look at that and say, “What does this mean?” What does it mean when the basketball’s here and maybe the opposite team is close to it? Is an offensive or defensive play happening? And that’s quite exciting, because that’s typically something that was done in a very manual fashion. People would just re-watch, re-watch, re-watch. And, you know, now there are options if you want to do those types of things in a very easy manner. Don’t watch the entire game; here are just the interesting bits you want.
Mike Delgado: I’ve never been involved with sports so I can only imagine what it’s like inside of a locker room. So, when I see movies of coaches with their teams re-watching plays over and over again, you’re now aiding in that process by teaching computers to understand the relationship between the players, the ball and what’s happening?
Jenna Lake: Exactly. So, we’ve been focused in on the people that we had. We said, “How is this player in this position with these other people?” And you can watch every single person, so you can say, “I would like to see the defensive moves that this person can do. I’d like to see where on the court this person is most effective.” You can look at maps and say, “This player, on the court, is more successful on the left than the right. Or he’s more successful if he works with these people or those people.” So, it’s a very interesting way of exposing the data even more for these coaches, for them to understand in even greater detail what’s happening.
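As a rough illustration of the "more successful on the left than the right" idea, here is a minimal Python sketch. The shot log, the `player_a` name, the lateral coordinate convention (negative means left of the basket) and the `success_by_side` helper are all assumptions made up for this example, not Second Spectrum's data format or method.

```python
from collections import defaultdict

# Hypothetical shot log: (player, lateral position in feet, shot made?).
# Assumed convention: negative lateral position = left side of the court.
shots = [
    ("player_a", -12.0, True),
    ("player_a", -8.5, True),
    ("player_a", -3.0, False),
    ("player_a", 6.0, False),
    ("player_a", 10.0, True),
    ("player_a", 14.0, False),
]


def success_by_side(shots, player):
    """Return the fraction of made shots per court side for one player."""
    made = defaultdict(int)
    attempts = defaultdict(int)
    for name, lateral, was_made in shots:
        if name != player:
            continue
        side = "left" if lateral < 0 else "right"
        attempts[side] += 1
        made[side] += int(was_made)
    return {side: made[side] / attempts[side] for side in attempts}


rates = success_by_side(shots, "player_a")
print(rates)
```

With this toy data the player makes two of three shots from the left and one of three from the right, which is the kind of split a coach might then look into further.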
Mike Delgado: Yeah, so is it like the part of the data that you’re examining is how the players interact with other players?
Jenna Lake: Well, we do know where every player is in every frame in 3D. We know where the ball is. Our output says, “Here are the coordinates,” so we can annotate on top of it, and we can even pull out the exact time stamps when things happen. You can then query and put that together and say, “This is what’s happening. This is my in-depth analysis of the game.” And what we try to do is enable coaches to do this as fast as possible. We would love it if you could just turn it on and say, “That’s what I wanted. That’s interesting. I hadn’t thought about that.” Just think about re-watching games: in the NBA season, there are over a thousand games. I mean, fans know all the teams, but sometimes it’s hard to understand how your team or your opponents work, so it’s just a lot of data for humans to manually take in.
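To make the idea of querying per-frame 3D coordinates for time stamps a bit more concrete, here is a small Python sketch. The frame layout, the player names `p1` and `p2`, the 1.5-unit distance threshold and the `timestamps_near_ball` helper are all hypothetical and chosen only for illustration; the interview does not describe the actual data format.

```python
import math

# Hypothetical tracking output: one record per video frame, with a time stamp
# and 3D coordinates (x, y, z) for the ball and for each player.
frames = [
    {"t": 0.00, "ball": (10.0, 5.0, 1.0),
     "players": {"p1": (10.5, 5.0, 0.0), "p2": (20.0, 8.0, 0.0)}},
    {"t": 0.04, "ball": (12.0, 5.0, 1.2),
     "players": {"p1": (10.8, 5.0, 0.0), "p2": (20.0, 8.0, 0.0)}},
    {"t": 0.08, "ball": (19.5, 8.0, 1.0),
     "players": {"p1": (11.0, 5.0, 0.0), "p2": (20.0, 8.0, 0.0)}},
]


def timestamps_near_ball(frames, player, max_dist=1.5):
    """Return the time stamps of frames where `player` is within
    `max_dist` (Euclidean, 3D) of the ball."""
    hits = []
    for frame in frames:
        position = frame["players"].get(player)
        if position is None:
            continue  # player not tracked in this frame
        if math.dist(position, frame["ball"]) <= max_dist:
            hits.append(frame["t"])
    return hits


print(timestamps_near_ball(frames, "p1"))
print(timestamps_near_ball(frames, "p2"))
```

A query like this returns only the moments of interest, which is the "don't watch the entire game" idea in miniature; a real system would work with far richer events than simple proximity.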
Mike Delgado: Yeah, no doubt, and I think I read on your website that you’re working with, was it a third of NBA teams right now?
Jenna Lake: We actually have the exclusive contract with the entire NBA for tracking, and the kinds of statistics that you would see on a fantasy basketball site or any other site, that’s generated by us. In addition, we have coaching software and many of the teams subscribe to a premium level of that software as well.
Mike Delgado: That’s awesome. So, you know, for deep learning, machine learning, you need a lot of training data. I was curious about what types of basketball data you’re pulling in and analyzing.
Jenna Lake: Well, we’re very fortunate in the fact that we have our own cameras, so we can collect what we want. However, we have the same data problems every single person has. Labels, labels, labels, labels. If you want to train anything, you need to know what’s happening there, and it’s wonderful to have this huge amount of data, but at the end of the day we need to be able to make sense of it. And it’s completely inefficient to label every single frame, every single player and the ball. I mean, that would just take forever. And so, you must sort of walk that fine line. I think almost everyone in machine learning and computer vision right now must understand how to train a model in an effective manner with some data, or what data is most important to represent. Because you also want to represent your normal cases as well as your edge cases effectively.
Mike Delgado: Now one of our comments just came in about why the focus is on NBA and that’s because basketball’s just starting up so we’re excited about NBA. But obviously, the work that you’re doing expands beyond basketball, right?
Jenna Lake: Yes. I’m working with some soccer teams in the MLS, so that’s been very exciting, and it’s interesting to see the differences in the sports. Just very simple things, like in basketball you play indoors, but MLS is all outdoors. So suddenly now we have all those things. It’s no longer a laboratory setting, and outdoors you have weather. You can just imagine how that’s so different. And not only that, but suddenly you have a very different number of people on the field. So, it’s a very interesting, exciting time for us.
Mike Delgado: Yeah, no doubt. You guys do analysis of soccer, the National Hockey League. Just amazing the amount of video footage you must be analyzing and sensor data that you must be pulling in.
Jenna Lake: Yes, so we primarily work with the NBA and we do some MLS as well. And it’s interesting because sometimes I talk about this with people and they say, “Do you just watch basketball?” I was like, “Yes, but I think of it in a very different way than someone simply watching it.”
Mike Delgado: Yeah, I imagine it must be funny sitting down watching the Super Bowl with you, because you’re probably watching the game in a completely different way.
Jenna Lake: Yeah.
Mike Delgado: So, I’m curious about what sorts of models do you work with, or models that you build to help you identify basketball players and how they’re performing.
Jenna Lake: Yeah, so we do have a number of different models. Of course, we need to have models for the players and we’re very lucky in sports that unlike faces in the wild, people wear jerseys with their names on them, colors, and numbers. But you know that itself has its own challenges. You could imagine that if you have two of the same number on the court, you know that’s a little bit different, but what if you have some numbers that kind of look the same?
Mike Delgado: Yeah.
Jenna Lake: It’s hard for us. It’s hard for a computer as well. We also have models for the ball. I think the one really interesting thing about analyzing sports is that nobody wants to say at halftime, “We wish we could talk about this, but the statistics are going to take another half hour.” So, it needs to be practical. You must be able to say, “I can understand who each of these individual people is. I know exactly who they are and I know where the ball is.” And you must do that in an efficient manner so that it can be consumed and used by the people both on and off the court. Maybe it’s a sports broadcast, or maybe you’re just sitting at home, like, “Oh man, fantasy basketball time.” Yeah. It’s a hard challenge, but an exciting one at the same time.
Mike Delgado: It must be exciting for the coaches to have so much more data to pull from than just what they’re seeing on the court or wherever they’re at.
Jenna Lake: I think we’re enabling them to enjoy their job. The fun part of their job is not sitting and watching the same footage reel four times. Their exciting thing is, “Aha, I see this is going on. Do we change this now? Or do we improve on this?” Or, you know, “I see that this player, he’s doing hard shots, and he’s succeeding some of the time.” And then maybe you have another player, he just does all the easy ones, all day long, right? Which sounds better? So, look at that and quantify that. What is more important for my team right now or in the future?
Mike Delgado: So, I’m kind of curious, Jenna. Are you playing fantasy basketball?
Jenna Lake: Oh, yes.
Mike Delgado: I have a feeling you’d be dominating.
Jenna Lake: We have some people in the office who are extremely serious about it. The bracket is very intense.
Mike Delgado: Can you share a little bit about different ways that NBA coaches and teams could be using your data in real-time, either before, after or during the game?
Jenna Lake: Yeah, say we’re going to face a very hard team where it’s been neck and neck. You could think about going to the playoffs, you know, how are they strategizing? So, one thing you could do is prepare by studying the strengths and talents of your opponent’s team, as well as your own. And you can always look at the games you’ve previously seen and say, “Oh, this is where we collapsed. This is where we really succeeded.” And that’s a really interesting thing on just a strategy level.
But if you look deeper than that, you know we can talk about this with like trades and free agency, right? I know a lot of things happen in the off-season of the NBA with this. And, you know, they have the tools in place to say, “I know this is what my team has, but this is what my team needs. And who do I look for, for that.” I think that’s just an interesting thing I secretly hope for that when these things happen that somebody just sat at the computer, looked at our software and said, “That’s the person I need.” You know? So, I think that’s a cool thing.
Another cool application that’s kind of on the forefront, that we’re working on and planning on doing, and it’s been spoken about at Recode by Steve Ballmer, is this: we’re all different, and we all have different needs when we watch sports. Is there a way I could augment sports for me that makes it more enjoyable? And so, we’re looking at all these different ways we can use virtual reality to customize this viewing experience. Maybe you want to really impress your friends in your fantasy basketball league. Is there a way that we can show some of our things to you and you can say, “Ha! I learned something new today.” I think that’s something exciting that we’re starting to be able to do.
Mike Delgado: Yeah, that’s super cool and I’m just thinking about the future how you know just in the last year we’re seeing leaps and bounds with virtual reality and watching possibly sports in the future in virtual reality where we can be down on the court, you know, possibly watching a game and then overlaying it with data that you guys provide over the players that we’re interested in. That would just be phenomenal to be able to have access to that in real-time. This is very, very exciting for anyone who’s a sports fan or into virtual reality because the data that you guys are providing in real time is just fascinating and definitely going to be helping improve coaches and teams as they make their decisions.
Jenna, what are some of the biggest challenges you notice right now for those working in computer vision?
Jenna Lake: Well, I think in computer vision you always have this curse now. You can look at so many things, but it’s slow, you know? You could sit there and we’re like, “We could look at everything that ever happens and come up with this really interesting analysis, but you’d wait for four weeks for it.” If that’s what you need, if you’re doing like predictions of storms in twenty years, then you can spare four weeks. But I think we’re all running into this real performance failure. How do I make this accessible and useful because you know I think everything when it comes to data, there’s an expiration date on it. If you’re like working with stock trading, you know, you want to know what’s happening soon. You don’t want to know what’s happening after it’s happened.
I used to work with detecting IEDs for the military. Nobody wants to know they ran over an IED ten minutes ago, because they’ve run over it at this point. And so, it’s interesting how we balance performance and how we balance accuracy, so that’s usually the trade-off you must make. You know, can I get you something really, really fast? But is the result good enough, and is that something we care about? So, I think overall with computer vision, it’s something we’re tackling, and what I’ve liked is that when I went to the big computer vision conference, CVPR, I saw that pretty much all facets of the industry that deal with computer vision are now coming out with GPUs that are optimized for these kinds of, you know, processing techniques and things like that. You can see there’s a real shift in the entire industry and academia that would say, “Okay, what we thought wasn’t possible is now possible. How do we make it?” You know?
Mike Delgado: Yeah. I mean, it’s just amazing to see all the growth that’s happening in computer vision, and when I originally thought about computer vision, I was thinking about automobiles. I think the very first time I heard about it was with Google developing a self-driving car, and how computer vision was helping that car navigate and recognize whether that’s a piece of wood on the road or a paper bag. Like, there are so many technicalities to computer vision, but it’s so cool to see now how it’s being applied to help with sports and help teams make better decisions, so I’m just very, very impressed by the work that you’re doing.
You shared some of the challenges with computer vision. What are some of the most impressive applications that you’ve seen with computer vision? And where do you see things headed?
Jenna Lake: Well, that’s a good question. I mean, of course I think everything we do here is very cool, but other than that, there are other things where I’ve just been like, “Oh, I haven’t thought about that before.” And you know, I hate to say it, everyone’s working on self-driving cars, everyone’s working on VR, and it is all very cool, but I have seen some very novel things where I’m like, “Wait, what?”
When I was at CMU, I was lucky enough to be working in the Illumination and Imaging Laboratory. And in there, they’re working on a smart headlight. So, if you imagine, you could drive with your high beams on all the time and not worry about blinding anyone. They have the ability to turn off a part of the headlight so it won’t blind the other car. I thought that was the coolest thing as someone who grew up on the East Coast. I thought that was just great. There are also things where they can turn off the light in those pixels or areas where there are a lot of reflections from, like, rain or snow. Another interesting and novel application of computer vision that I’ve seen recently comes out of MIT. They amplify subtle changes in color in your face that you can barely see with the human eye, and get your heart rate from a regular webcam video.
Mike Delgado: That’s crazy.
Jenna Lake: Yeah, it’s so weird. And when they dial it up, you can see it in people’s faces, just red-white-red-white-red-white. It’s the weirdest thing, but it works.
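To give a feel for why a pulse can show up in video at all, here is a simplified Python sketch. It is not the MIT technique mentioned above; it swaps in a much simpler approach that assumes you already have one average face-color value per frame and then looks for the strongest frequency in a plausible heart-rate band using a plain discrete Fourier transform. The synthetic signal, the 30 fps frame rate, the 0.7 to 4.0 Hz band and the `estimate_bpm` helper are all assumptions for illustration.

```python
import cmath
import math


def estimate_bpm(signal, fps, low_hz=0.7, high_hz=4.0):
    """Estimate beats per minute from a per-frame color trace by picking
    the strongest DFT frequency inside an assumed heart-rate band."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [value - mean for value in signal]  # remove constant skin tone
    best_freq, best_power = None, -1.0
    for k in range(1, n // 2):
        freq = k * fps / n  # frequency of DFT bin k in Hz
        if not (low_hz <= freq <= high_hz):
            continue
        coeff = sum(c * cmath.exp(-2j * math.pi * k * i / n)
                    for i, c in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_freq, best_power = freq, power
    if best_freq is None:
        raise ValueError("signal too short for the requested band")
    return best_freq * 60.0


# Synthetic stand-in for a webcam trace: 10 seconds at 30 frames per second,
# a constant skin tone, a slow lighting drift, and a faint 1.2 Hz pulse.
fps = 30
trace = [
    120.0
    + 2.0 * math.sin(2 * math.pi * 0.1 * i / fps)   # slow drift, outside band
    + 0.5 * math.sin(2 * math.pi * 1.2 * i / fps)   # pulse at 72 beats/minute
    for i in range(10 * fps)
]

bpm = estimate_bpm(trace, fps)
print(round(bpm, 1))
```

The pulse component here is a quarter the size of the lighting drift and tiny compared with the base color, yet it is the clear peak once the search is limited to heart-rate frequencies. Real footage adds motion and noise that this sketch ignores.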
When I was in school, I was like, “I must make sure this works for real,” and so I played around with it. It’s one of those things that shouldn’t make sense, but it does. It’s just so interesting. And I must say one thing that I think is interesting with this eve of, like, great computer vision and machine learning that we have. Personally, I’m a computer vision person. I live to make dots and rectangles and bounding boxes and segmentation of things on images, but I will be the first one to say that my output looks hideous. It looks horrible, and it’s a lot of data. You think about, “Oh my God, we’re increasing the orders of magnitude of what you see, right?” Usually, if we take basketball for instance, we’re used to saying, “Oh, that’s the guy with the ball over there,” and then that’s all we care about. Suddenly, now, it’s every single frame, every single player. How do we distill this down until we can use it?
And I think that’s such an amazing thing and I am forever grateful for the great user experience people and the creative people and the full stack engineers who think about this every day, because I think that is impressive. How do you take my dots and these like outputs and you make them so people can use them? I’m just always impressed by that kind of thing.
Mike Delgado: Wow, well, it’s super impressive what you’re able to do to get that data over to those various scientists to help make it more meaningful. But for you, it’s like you’re not only part of creating the models and finding what’s important, but you also need to have a good understanding of the game, so it’s just … your work is fascinating and it’s very, very cool to see the developments happening in your space.
There are a lot of data scientists in our community who are looking at getting into computer vision. There are also people who are currently students in high school or college, and they’re thinking about pursuing a degree in data science. And I was wondering if you have any advice for them.
Jenna Lake: Yes. I have two bits of advice that I’ve gotten from my professors and my bosses in the past as well. First thing is stay curious. You know, I got into computer vision by basically asking, “Hey, that over there, can I play with that for a while?” I also got into GPGPU programming, which is general-purpose GPU programming. I got into that by saying, “You know, that guy left that used to do that, do you mind if I start doing that?” And they were like, “Oh yeah, sure, why not.” You know, so stay curious. Get yourself into things you think are exciting and interesting.
And then secondly, this one’s been a hard thing I think that most people struggle with, prepare to try lots of things and fail most of the time. Whether you’re in machine learning, or even software engineering. I mean, most things in life, people try to do things 10, 20, 30, 40, 50 times and you will fail. But there’ll be that one time you say, “Hey this actually works. That’s great.” So, don’t be afraid to fail. That’s completely normal and I’ve been very lucky to have talked to some of the greats in computer vision while I was at CMU, and it’s the same thing they said, you know, “I failed all the time, but nobody remembered that. They just remember the times that I actually succeeded.” And I think that’s a great bit of advice to take because it’s very easy to beat yourself up about it, you know?
Mike Delgado: Yeah, I love that. That’s wonderful advice. One last question. There’s a lot of debate about, for somebody who’s just getting into data science, which programming language to learn first. Right now, I’m seeing that Python is considered the language to learn, and I was wondering if you have any advice for those just breaking in on a language to start with.
Jenna Lake: So, I would personally go with something that interests you. The thing that’s going to keep you learning it. So, if you wake up in the morning and you said, “Hey, I am so interested in embedded systems. I think it’s so cool.” I’m one of those people. I’m going to run for the hills for C++. I’m going to be like, this is awesome! But if you’re personally, like, I have this book and they talk about this cool library and I want to learn how to do computer vision or machine learning, go use it. Whether it’s some outdated language that you just think is funky and weird and you’d love to learn. Whether it’s something new and fresh. Here in the office people are into Elm. You know, if you’re excited about it, go learn it. There’s really no wrong language to go into, because the secret is once you learn one language, it’s very easy to learn another language. And then when you’ve learned two, it’s easy to go learn a third.
And so just invest your time in what interests you. And, you know, if you’re choosing between, say, C++ and Python, or Java and C, for example, go with the one where you’d say, “I want a job in this, and what would they want?” But I would say that’s your secondary concern. Go with what excites you first.
Mike Delgado: Jenna, thank you so much for your time today. Where can everyone learn more about you?
Jenna Lake: Oh, well you can find me on LinkedIn. It is linked on the site. So if you want to learn more about Second Spectrum, we search particularly for interns, so if you guys are excited about machine learning or computer vision, check out the site for that.
Mike Delgado: Awesome. Thank you so much, and we’ll make sure to put links in the “About” section of the YouTube video as well as in the comment section of our Facebook Live video. Jenna, thank you so much for taking the time to talk with us and share your insights with us about computer vision. I want to thank everyone for watching, for all your hearts, for all your comments and we’ll talk with you all next week. Jenna, thank you again.
Jenna Lake: Thank you so much, Mike. It’s been great.
Jenna Lake is a Computer Vision Engineer at Second Spectrum.
She earned a Master of Science in Computer Vision from Carnegie Mellon University and Bachelor of Science degrees in Computer Science and Biometric Systems from West Virginia University.
She is passionate about finding creative solutions to improve the efficiency of Computer Vision algorithms. When she is not obsessing over milliseconds, she is busy geeking out over GPUs and lambda functions.
Make sure to follow Jenna on LinkedIn.