Deep Learning Demystified w/ Dr. Anima Anandkumar @Caltech (Episode 4) #DataTalk

Every week, we talk about important data and analytics topics with data science leaders from around the world on Facebook Live. You can subscribe to the DataTalk podcast on iTunes, Google Play, Stitcher, SoundCloud and Spotify.

This data science video and podcast series is part of Experian’s effort to help people understand how data-powered decisions can help organizations develop innovative solutions and drive more business.

To keep up with upcoming events, join our Data Science Community on Facebook or check out the archive of recent data science videos. To suggest future data science topics or guests, please contact Mike Delgado.

In this week’s #DataTalk, we talked with Dr. Anima Anandkumar, Principal Scientist at Amazon AI and Bren Professor at Caltech, about what data scientists need to know about deep learning and how to scale deep learning frameworks.

Here is a full transcript of the interview:

Mike Delgado: Hello, and welcome to Experian’s Weekly Data Talk, a show featuring some of the smartest people working in data science. Today we’re excited to talk about deep learning with Dr. Anima Anandkumar. Anima serves as the principal scientist at Amazon Web Services. She’s also a Bren Professor at Caltech teaching in the Department of Computing and Mathematical Sciences. Anima earned her BTech in electrical engineering from the Indian Institute of Technology. She also earned her Ph.D. in electrical engineering from Cornell University, and then after that she served as a postdoctoral researcher at MIT. She’s the recipient of dozens of awards. I can’t even name them all. They include the Alfred P. Sloan Fellowship, the Microsoft Faculty Fellowship, the Google Research Award, the IBM Fran Allen Ph.D. Fellowship. It goes on and on and on. Anima, thank you so much for being part of Data Talk. It’s an honor to have you.

Anima Anandkumar: Thank you, Mike. It’s a pleasure to be here.

Mike Delgado: So, I wonder if you could share with us your journey and what led you to begin learning about data science and eventually working in data science?

Anima Anandkumar: Since an early age, I’ve always been fascinated by math and science, and then back in undergrad I did electrical engineering, as you just said. And for my Ph.D., I started working on this problem of distributed estimation. Now we call it internet of things or IoT. Back then, it was still at the drawing board stage. Like, if we had this number of different devices, how do we connect them together and how do we learn from it? So, it could be cameras, temperature sensors, pressure sensors, all these different measurements. How do we put them together? There are battery constraints, so you cannot transmit a lot. There are bandwidth constraints.

And so I worked on that problem of distributed estimation for a few years in my Ph.D., and then went on to thinking about probabilistic models: if we had all these different kinds of data, how do we draw inferences from them? That led me to going even deeper with mathematical tools, and I came upon this technique called tensors. A lot of us have gone through … If you’ve done engineering, you’ve had a linear algebra course on matrices. Matrices are these two-dimensional arrays that have rows and columns. There is a lot that’s very well developed there, but what I started asking was how we can extend it to more dimensions. Why leave it at two dimensions when our data is so rich that it has multiple different dimensions?

If you think of an image, it’s width and height, but it also has colors so that’s like a third dimension. If you have video, then it is also time. That’s the fourth dimension. If you also have text going along with your video, then it’s another dimension. Our data is very rich, so how do we explore these multiple dimensions effectively? That got me going tracing all the research that’s been done in tensors in different areas, including quantum systems, and at the same time asking how we can develop practical techniques at scale to process such multidimensional data. And yeah, most recently I’ve been studying how we can encode more tensored architectures in our deep neural networks, and so it’s an exciting journey for me for sure.
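The dimension counting described above can be made concrete with a small sketch. This is an illustration only, not code from the interview: the helpers `zeros` and `shape` and the tiny sizes are hypothetical, and plain nested lists stand in for a real tensor library.

```python
def zeros(*dims):
    """Build a rectangular nested list of zeros with the given dimensions."""
    if len(dims) == 1:
        return [0.0] * dims[0]
    return [zeros(*dims[1:]) for _ in range(dims[0])]


def shape(tensor):
    """Return the size of each dimension of a rectangular nested list."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0]
    return tuple(dims)


matrix = zeros(4, 6)       # rows x columns: 2 dimensions
image = zeros(4, 6, 3)     # height x width x color channels: 3 dimensions
video = zeros(5, 4, 6, 3)  # time x height x width x channels: 4 dimensions

for name, data in [("matrix", matrix), ("image", image), ("video", video)]:
    print(name, "has", len(shape(data)), "dimensions with shape", shape(data))
```

Each extra kind of information (color, time, and so on) simply adds one more axis to the array.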

Mike Delgado: So what I’m hearing is you love challenges because of all this unstructured data and how to play with it. Have you always just been curious by nature and loving challenges?

Anima Anandkumar: Absolutely. One of my mentors was telling me when I was complaining too much, “You wouldn’t be doing it if it was not hard.” I guess the key about research is to develop, over time, a better intuition for which challenges can be overcome in a reasonable amount of time and which ones are hopeless. I think that’s the hardest thing to figure out.

Mike Delgado: And I think also what’s key here is you’re very creative, the way you’re thinking about data. You’re saying three dimensions, four dimensions, five … You keep digging to see all the different types of data that you can use, and then a lot of this, like you were pointing out, is unstructured. So how did you even begin to think about how to use this data that is very difficult to work with?

Anima Anandkumar: There is some amount of domain knowledge you need, because even the relevant kind of preprocessing you need to do for the data is unclear. In fact, if you talk to any practical data scientist who is working on real-world scenarios, they’ll tell you that the majority of their time, more than 70 percent of it, is spent on data wrangling: getting the data, putting it in the right form, seeing what’s missing, going back to collect more data. A lot of energy is spent there, and we still don’t have a good scientific technique to take it all the way from the extremely raw form to the finished model. Science wants to break it down into simple stages, so we can then analyze them more effectively, but in practice there is still a gap.

Mike Delgado: I was looking at your bio and you really have a heart to teach others. I mean, you worked at UCI as a professor. You’re now at Caltech. Can you talk about what you enjoy about teaching data science as well?

Anima Anandkumar: Absolutely. In fact, I’m just coming back from India, where we held the largest deep learning workshop there. There was so much hunger to learn machine learning. People had already looked up a lot of resources online, but they wanted to get a sense of the right path, how we get to the experts. I was also in South Africa a few months ago …

Mike Delgado: Oh, wow.

Anima Anandkumar: My goal was to democratize AI and make sure that everybody on the planet has access to it in one form or the other. And I indeed like teaching as a passion. I’ve done it in the universities. Now my goal is to ask how we can scale that up even more. At Amazon Web Services, we are launching this extensive set of tutorials. So if you go onto this website on MXNet, Gluon (that’s the framework I’ll talk about shortly), you’ll see that there are Jupyter Notebooks. But that’s not just code. It’s like a first-principles view of what it means to think about different machine learning concepts, with code that’s written from scratch. If you want to understand a principle, you can go through this more extensive code base to understand every step.

But of course, if you want to run the notebook then there is another piece of code that’s automated and more succinct. This goes from very basic, like without any assumptions of machine learning knowledge, to more advanced topics. This is something my team has been extensively working on, and we want to spread the knowledge as much as we can.

Mike Delgado: That’s awesome. I love that you’re going global with this, heading all over the world, teaching others, and also exciting people with your passion for the subject. I see so many articles with these buzzwords like AI, machine learning and deep learning, and they’re almost used interchangeably at times. Can you briefly summarize how they’re different and how they’re the same?

Anima Anandkumar: Right. First, let me distinguish machine learning and deep learning. Deep learning is one specific area of machine learning. When we say deep learning, it typically means you have a deep network. You have multiple layers of processing of a certain class of models called neural network models, but that’s just one way of processing. Machine learning is so much broader than that and can involve so many more kinds of algorithms and techniques. In terms of distinguishing artificial intelligence and machine learning, there’s quite a bit of back and forth, so to me the distinction is more a historical one. Back in the ’80s, there was this so-called AI winter. People had so many expectations out of AI, and when that didn’t pan out, AI in fact became a dangerous term to use. After that, machine learning became the more widely used term, and now that deep learning has come back and shown so much promise, AI is coming back again.

So at least to me, as a field, it’s practically indistinguishable. Some people argue machine learning is more focused on the algorithms and the processing, whereas AI takes a broader view of intelligence and agents. But to me, the fields have merged to such an extent that they’re indistinguishable.

Mike Delgado: Okay, so is AI kind of like the broad term and then within that machine learning and then within that deep learning? Is that a way to view it?

Anima Anandkumar: Yeah, you could say that.

Mike Delgado: I’m curious about some of your favorite examples of how deep learning is being used today to improve our world.

Anima Anandkumar: The beauty of it is it’s so broad, and we are just touching the tip of the iceberg. The most famous one, the most common use case, is computer vision: how we can automatically recognize different kinds of objects in images, recognize faces, recognize text in images. That has such broad applicability. Say it’s surveillance, where you have cameras that can automatically detect intruders, or content moderation for websites like Pinterest, where you want to automatically moderate harmful content and doing it manually at scale is impossible. Those have been the most prominent arenas that employ computer vision. The next is speech recognition, like if you’re talking to Alexa or Siri. Those systems are vastly improved from even five years ago, and that’s thanks to deep learning. We still have a ways to go in terms of making that global and having support across languages, but it’s a great start.

Text processing, or natural language processing, has seen a lot of improvements as well. Think about predictive typing on your iPhone or automatically understanding the reviews that people have given. All this used to be done manually, and now we are using deep learning. Lastly, of course, autonomous driving has seen a lot of excitement. We’ve moved faster than anybody could have imagined, and that’s thanks to deep learning.

Mike Delgado: It’s really amazing thinking about the progress just in the last couple years how deep learning with autonomous driving, computer vision has just been taking off. I was reading some articles that were saying how far AI and deep learning have come when it comes to computer vision. Computers are now better than humans at identifying objects. Would you say with that, would you … ?

Anima Anandkumar: Yeah, I would definitely have a bit of caution and …

Mike Delgado: Okay, okay.

Anima Anandkumar: … refrain from making statements like that because what does better mean? We have to be very careful. I mean, even an infant within a few months can recognize its mother and close relatives and can just do it under so many different conditions, poor lighting, different angles or different variations. We humans have this ability to understand the world around us and these images under lots of challenging conditions. On the other hand, these deep learning systems are not as robust, because they’ve been trained on a limited amount of data compared to what we consume in the world, and the nature of learning is also very different from how humans learn. To me, we have shown a lot of promise, but there is a long way to go to claim that computers can be better than humans when it comes to computer vision.

Mike Delgado: Okay, very good. Thank you for correcting me. I see these articles all the time, and I guess some of them are more like tabloid headlines that get people excited, right?

Anima Anandkumar: And that’s why I’m here.

Mike Delgado: Thank you.

Anima Anandkumar: Yeah. We separate the hype from reality, right? To give you an example of how brittle some of these systems still are: take a set of training examples. These are images that you send to an AI system to make it learn what the different categories are. If you add noise that’s imperceptible to humans, so that humans just don’t see any changes in the image at all, you can fool the AI system and you can have it … And now we can send completely garbled data and it thinks it’s a horse. So there is a lot of new research being done on what we call adversarial systems: if you’re trying to be adversarial to your AI system, then it can fail pretty miserably.
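A minimal sketch of the brittleness described above, under loud assumptions: the "classifier" here is a hypothetical 100-input linear scorer with made-up weights, not any real vision model, and the perturbation rule (nudge every input by a tiny amount against the sign of its weight) is borrowed from the spirit of the fast gradient sign method rather than taken from the interview.

```python
def score(weights, bias, x):
    """Linear classifier score: positive means 'horse', negative means 'not horse'."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def perturb(weights, x, eps):
    """Shift each input by at most eps in the direction that lowers the score."""
    return [xi - eps * sign(w) for w, xi in zip(weights, x)]


# Hypothetical model and input: 100 'pixels' with values close to 0.5.
weights = [1.0, -1.0] * 50
bias = 0.0
x = [0.51, 0.49] * 50

eps = 0.02  # each pixel changes by only 0.02, far too little to notice by eye
x_adv = perturb(weights, x, eps)

print("original score:   ", score(weights, bias, x))      # positive: 'horse'
print("adversarial score:", score(weights, bias, x_adv))  # negative: label flips
print("largest pixel change:", max(abs(a - b) for a, b in zip(x, x_adv)))
```

The point of the toy example is that many tiny, coordinated changes add up: no single input moves by more than `eps`, yet the decision flips.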

Mike Delgado: Thank you so much for explaining that. I guess it goes back to that problem with fake news going on right now, right? A lot of hype.

Anima Anandkumar: And in fact, there was also the very prominent coverage about the Tesla accident. Because there was this white truck and in the training set of examples, it was never shown this white truck. So it just didn’t know that was a dangerous scenario, and unfortunately someone died. So we still need to be very careful and to come up with better techniques. That’s why I said this is the beginning. This is definitely not the end.

Mike Delgado: I like what you said because it’s so true about human perception of things. For instance, when we’re driving down the road, if somebody were to put a fake sign on the road that said “Stop” in an area where we shouldn’t be stopping, we would know as human drivers that’s someone playing a trick. A machine would need to learn that it is also a trick, right?

Anima Anandkumar: Yeah, unless it’s being told that. And the problem is you can’t tell it every single scenario in the world. You can do that with games like AlphaGo. You can have a brilliant set of opportunities for it to play. In the beginning it would lose a lot, but it would get progressively better. But we can’t do the same when it comes to self-driving cars. We can’t simulate every possible scenario in this world. And that’s why this is still very much a challenging problem.

Mike Delgado: Some of your work as a principal scientist at Amazon is working with deep learning. Can you share a little bit about how you’re using deep learning at Amazon?

Anima Anandkumar: I’m at Amazon Web Services. Our mission is to enable machine learning and AI on the cloud, and that comes across the stack: all the way from infrastructure, getting state-of-the-art GPUs and efficient CPUs to customers, to platforms such as SageMaker that we recently launched, which makes it very easy to do large-scale machine learning, to managed services for different domains that allow customers to directly consume the results of machine learning without having to train their own models. So we are offering machine learning solutions across the stack for a range of customers, and when it comes to Amazon more broadly, AI has been used in almost every aspect.

Every team employs AI in one form or the other: when it comes to managing the supply chain and recommending products, or when it comes to Prime Air, which is trying to see how drones can be used to deliver products. Amazon Go is a very exciting venture where you can feel like a shoplifter, because you pick up the things you need from the store and keep walking without going through any cashier. The idea is that computer vision can help track and figure out which products were taken, and these are just a … And Alexa of course, a very famous example of how we can enable different kinds of interaction with users in the home scenario. That’s just the tip of the iceberg.

Mike Delgado: That’s awesome. How would you like to see, because definitely voice assistants are probably one of the most practical use cases. A lot of people are buying them for their homes, using them to check the weather, and now there’s a bunch of other capabilities. What are some things you’d like to see voice assistants do in the near future?

Anima Anandkumar: The one thing I was very happy about was recently it was in the news that Alexa is now a feminist, and …

Mike Delgado: Nice.

Anima Anandkumar: … pushing back on any … So you should try that. So there’s that.

Mike Delgado: That’s cool.

Anima Anandkumar: Yeah. So the thing I would like to see more broadly is for these voice assistants to have different kinds of personalities, to be much richer in their interaction with humans and to maintain a much longer dialogue. Right now, they are doing very useful functions in terms of getting some tasks done, but that’s always just one step. So can it now maintain a range of tasks and have more of a memory and more understanding of the human needs? I think that’s going to be a game-changer, and more on the social implications. All the voice assistants right now are female, and there are at least societal implications of the kind of personalities we endow them with, as more of a subservient female trying to satisfy the needs of the customer. And to me that’s problematic if you are bringing in the biases from the society and further amplifying them. So I hope to see a more diverse set of voice assistants that do not just mirror our current biases.

Mike Delgado: No doubt. I love that. I’d like to have a British Batman servant responding, right?

Anima Anandkumar: I would like that.

Mike Delgado: Yeah. I’d love that. So I’m curious about your personal use of Alexa or voice assistants. How do you currently use them at home?

Anima Anandkumar: I wake up and I’m half awake and say, “Alexa, what’s the weather today?” or “What’s the time?” So it’s the first person I talk to when I wake up. And then it’s just a range of tasks. Like I want to order something on Amazon, look up my meetings, call somebody. It’s just now become so much easier and seamless since Alexa has been around.

Mike Delgado: Cool. I’m curious about what excites you most about the future of deep learning?

Anima Anandkumar: To me it’s not just limited to our current deep networks, but asking how there can be convergence across different fields. Probabilistic models traditionally have been about modeling different kinds of uncertainties and understanding relationships between them. Now the question is whether we can bring more uncertainty-based modeling into our deep networks. How can we make these different neurons be randomized and not just deterministic? What kind of Bayesian modeling can we do, and at the same time how can we still retain computational efficiency and get good, practical gains out of that? To me that’s an interesting area because that’ll also help us get a better handle on uncertainty. Like when is the system uncertain and when can we trust it?

Another area is, as I said, adding in more dimensions. So our current deep networks involve matrix operations, and that’s mainly historic because we had very efficient software libraries for linear algebra and we built further on that. Now the question is whether we can extend that to tensors and what kind of interesting architectures emerge out of that. There are a lot of interesting things to be done there.
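The idea of randomized rather than deterministic neurons, raised earlier in this answer, can be sketched as follows. This is a toy illustration under stated assumptions, not a Bayesian neural network and not anything from MXNet: the single ReLU neuron, the Gaussian weight distribution, and the names `noisy_neuron` and `predict_with_uncertainty` are all hypothetical.

```python
import random
import statistics


def noisy_neuron(x, weight_mean, weight_std, rng):
    """One ReLU neuron whose weight is drawn at random on every call."""
    weight = rng.gauss(weight_mean, weight_std)
    return max(0.0, weight * x)


def predict_with_uncertainty(x, weight_mean, weight_std, samples=1000, seed=0):
    """Run the random neuron many times; return the mean and spread of its outputs."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    outputs = [noisy_neuron(x, weight_mean, weight_std, rng) for _ in range(samples)]
    return statistics.mean(outputs), statistics.stdev(outputs)


# Same input, same average weight, but different amounts of weight uncertainty.
confident = predict_with_uncertainty(2.0, weight_mean=1.0, weight_std=0.01)
uncertain = predict_with_uncertainty(2.0, weight_mean=1.0, weight_std=0.5)

print("narrow weight distribution: mean=%.3f spread=%.3f" % confident)
print("wide weight distribution:   mean=%.3f spread=%.3f" % uncertain)
```

The spread of the sampled outputs gives one rough answer to "when is the system uncertain": a wide spread suggests the prediction should be trusted less.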

Mike Delgado: Awesome. We have only four minutes remaining, so I want to quickly go through some of these questions that we end every single broadcast with. We love to ask our data scientists these questions. What is your favorite programming language and why?

Anima Anandkumar: So instead of picking one language, I would pick the framework and that’s MXNet because the name is Mixed Net Programming. There’s a whole set of languages that are imperative style and others that are declarative style. And what MXNet lets you do is mix the two. In fact, with the latest Gluon framework, you can write in simple imperative style by calling that framework, but then you can hybridize and get the symbolic programs that can be then … you can do autoparallelism, you can do memory optimization, so you can get very high-end performance and at the same time ease of programming. That’s what I would pick.
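The contrast between imperative and declarative (symbolic) style can be illustrated with a small pure-Python sketch. It deliberately does not use MXNet or Gluon: the classes `Node`, `Var` and `Op` and the function `model` are hypothetical stand-ins, and the tracing step is only loosely analogous to what hybridizing does in a real framework.

```python
class Node:
    """Symbolic value: arithmetic on it builds a graph instead of computing a number."""

    def __add__(self, other):
        return Op("add", self, other)

    def __mul__(self, other):
        return Op("mul", self, other)


class Var(Node):
    """A named placeholder whose value is supplied later."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, env):
        return env[self.name]

    def __repr__(self):
        return self.name


class Op(Node):
    """A recorded operation on two sub-expressions."""

    def __init__(self, kind, left, right):
        self.kind, self.left, self.right = kind, left, right

    def evaluate(self, env):
        a, b = self.left.evaluate(env), self.right.evaluate(env)
        return a + b if self.kind == "add" else a * b

    def __repr__(self):
        return f"{self.kind}({self.left!r}, {self.right!r})"


def model(x, y):
    """Written once, in ordinary imperative style."""
    z = x * y
    return z + x


# Imperative use: every line runs immediately on real numbers.
eager_result = model(3, 4)

# Symbolic use: the same function is traced with placeholders into a graph,
# which a framework could then optimize or parallelize before running it.
graph = model(Var("x"), Var("y"))
graph_result = graph.evaluate({"x": 3, "y": 4})

print(eager_result, graph, graph_result)
```

The appeal of mixing the two styles is that the model is written and debugged as plain step-by-step code, while the recorded graph gives the system a whole-program view to optimize.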

Mike Delgado: Very nice. Okay, you were just speaking in South Africa, in India, and obviously you teach at Caltech. For those who are interested in getting involved in data science, what would be your advice for them?

Anima Anandkumar: Go to gluon.mxnet.io. There is a whole set of tutorials that does not assume any machine learning background. You can learn from first principles, and at the same time you have working code. So if you already need to solve some practical problems, you can use them right away. That’s the best form of learning anyway, that you’re more hands-on.

Mike Delgado: What was that URL again, Anima?

Anima Anandkumar: Gluon.mxnet.io.

Mike Delgado: Okay. Very good. And just a reminder for those who are watching, if you’d like to learn more about Anima and follow her on LinkedIn, you can go to this URL on the screen. It’s ex.pn/anima, and that’s just a redirect over to her LinkedIn profile, where you can follow her. Highly encourage it so you can keep up with the things that she’s doing and working on. Okay, a couple more questions, Anima. What advice do you have for leaders who are looking to build a great data science team?

Anima Anandkumar: To focus on getting good talent, and that involves separating the hype from reality. So you will want to interview people who can also understand the shortcomings of the current machine learning. That would be the top question I would ask. What can the current systems not do? If someone doesn’t come up with the relevant answer, I would be very careful in hiring them.

Mike Delgado: That’s a good one. And then the last question is there’s a lot of unknowns about the future of AI. We talked about some of these crazy headlines we see in the news about rogue robots taking over the world. I’m curious about your guess about what the future holds.

Anima Anandkumar: I think the future in the long run will be very rosy because it will do all the heavy lifting, and there will still be some creative opportunity for humans that I don’t think AI systems in any near future will be able to do, so that’ll free up the human mind for more creative pursuits. That’s the utopian ideal, but to get there, my worry is whether that will be accessible for everybody. And that’s where we need to have social policies that ensure that some people don’t get left out.

Mike Delgado: Great. Thank you so much, Anima. I want to recommend everyone follow her on LinkedIn. You can see her profile over at ex.pn/anima, and I want to encourage you to follow her there. As a reminder, we have this Data Talk show every single week. If you’d like to learn more about upcoming episodes, past podcasts, you can always go to ex.pn/datatalk or just do a Google search for Data Talk. I want to thank everyone for tuning in today. Take care and have a great weekend.

Anima Anandkumar: Thank you.

Mike Delgado: Thanks so much, Anima.

If interested in learning more about data science, Anima recommends that you visit: http://gluon.mxnet.io/

About Dr. Anima Anandkumar

Anima Anandkumar’s research interests are in the areas of large-scale machine learning, non-convex optimization and high-dimensional statistics. In particular, she has been spearheading the development and analysis of tensor algorithms for machine learning. Tensors are multi-dimensional extensions of matrices and can encode higher order relationships in data. At Amazon Web Services, she is researching the practical aspects of deploying machine learning at scale on the cloud infrastructure.

She is the recipient of several awards such as the Alfred P. Sloan Fellowship, Microsoft Faculty Fellowship, Google Research Award, ARO and AFOSR Young Investigator Awards, NSF CAREER Award, Early Career Excellence in Research Award at UCI, Best Thesis Award from the ACM SIGMETRICS society, IBM Fran Allen PhD Fellowship, and several best paper awards.

She received her B.Tech in Electrical Engineering from IIT Madras in 2004 and her PhD from Cornell University in 2009. She was a postdoctoral researcher at MIT from 2009 to 2010, an assistant professor at U.C. Irvine between 2010 and 2016, and a visiting researcher at Microsoft Research New England in 2012 and 2014. She has been a Principal Scientist at Amazon Web Services since 2016 and a Bren Professor at Caltech since 2017.
