Every week, we talk about important data and analytics topics with data science leaders from around the world on Facebook Live. You can subscribe to the DataTalk podcast on iTunes, Google Play, Stitcher, SoundCloud and Spotify.
This data science video and podcast series is part of Experian’s effort to help people understand how data-powered decisions can help organizations develop innovative solutions and drive more business.
In this DataTalk, we spoke with Favio Vázquez about the future of data science.
To keep up with upcoming events, join our Data Science Community on Facebook or check out the archive of recent data science live videos. To suggest future data science topics or guests, please contact Mike Delgado.
Here’s a complete transcript of the conversation:
Mike Delgado: Hey, everybody. Welcome to our weekly #DataTalk, a show where we talk with data science leaders from around the world.
This is our brand new season. We took December off, and we’re excited to be back. We actually had Favio Vazquez as a guest back in season 1, and we’re super excited to have him back with us today. Favio, how’s it going, man?
Favio Vazquez: It’s awesome to be here again. Thank you for having me. A lot has changed since the last podcast, so I think it’s going to be an interesting follow-up.
Mike Delgado: You are so active right now on social media. I see your posts on LinkedIn all the time, you’re writing articles and you have your own show, which I want you to talk about today. You also have your day job. You’re a physicist, you’re a computer engineer, you’re working in data science and computational cosmology, and you’re the founder of Ciencia y Datos. Can you talk a little bit about that initiative?
Favio Vazquez: Ciencia y Datos started out as a project for expanding the knowledge of data science in Spanish. Right now, it’s a community blog where people can add their articles, their ideas, whatever they want. We have a lot of followers and people writing articles now.
In the near future, I’m planning to transform that into a bigger platform for sharing, and for courses in Spanish, data science, culture, engineering, philosophy, whatever we want, because the name of the “company” is data and science, so it’s not only restricted to data science. It’s more like a global thing.
Mike Delgado: What have you found? I love the fact that you’re actually breaking into the Spanish market to get a lot more content in Spanish. Is there a lot of data science stuff written in Spanish right now?
Favio Vazquez: There are a lot of people doing data science in Spain, in Latin America, in the United States, who are Spanish speakers. But we don’t have as many resources for learning or sharing. Right now, Coursera and edX have some Spanish courses, but the gap between Spanish and English is huge. When you go to books, like reading papers or articles, there’s no comparison at all.
So the idea for this project was to enable people who don’t know how to speak English to access information we all get from the English world. One of the other things I’m doing is contacting people and translating their articles. They’re going to put the article in their name, because they have a Medium account. I just give them the Spanish translation, and they put it there so people can read that too.
Mike Delgado: I think that’s wonderful. Aside from that, you’re also teaching, right? I saw you’re now a Data Science Instructor for Business Science University.
Favio Vazquez: Yeah. That’s one of the bigger projects we have for this year. I’m working with Matt Dancho, who’s a rock-star data scientist. He works in Business Science; he founded that company. I joined him last year to create the Python version of his courses. He has other courses, and they’re all related to data science for business. That’s the main goal of the university.
Right now I’m building one course that will be live very soon.
It’s business, with Python — I’m going to teach people how to write Python code but with a business idea in mind, like how to solve business problems, how to use the libraries, how to do all of these different things. We’re also working on methodologies for data science, like workflows, cheat sheets. We’re trying to help the community and give them the power to really use data science with a business perspective.
Mike Delgado: For those who are interested in taking one of your classes, what’s the best way to learn about that?
Favio Vazquez: Right now, you can go to University.Business-Science.io and see Matt’s courses. They’re live right now. My courses are still in progress. I think I’m going to be launching my first course there next month. Right now, there’s no course for me to teach you, but make sure you read the articles. We’re preparing for that so you can see what you can expect from the courses we’re creating for you.
Mike Delgado: What I love about you is that you have a passion for helping others learn about data science, and I also appreciate the way that you write, the way that you teach. You make it easier to understand, especially for those of us like me, who are out of the data science world. You’re able to explain concepts at my level, which is a lot of work.
Favio Vazquez: I think writing is an art in itself. You keep getting better and better when you start writing. I never thought I would be writing articles in any field. I was just a consumer of all these different things, but then I realized there are some benefits to writing. One of the first ones is you really can concentrate all your knowledge into one piece you have forever. In one year or 10 years, it’s still going to be there.
It will also help you to really understand the topic you’re talking about. It’s not the same when you talk to a colleague or a friend. That’s easy, because it’s informal talking. But when you’re writing for an audience you need to understand what you’re saying, read and reread what you’re saying, and research — read more articles and books, watch some webinars. So you learn even more.
For me, this is a way of giving back all of the things I’ve learned. I am very grateful that, in the last years, all of the great minds in the field came together and created courses and articles and blogs that really changed the way we saw the topics. When you are starting and you have an article and a book, like a paper, you are very lost. It’s hard to understand what you’re reading. When you have these blogs with a very simple explanation, then you see, “Oh, this is it,” and you can go back to the book or the paper and really understand what they’re saying.
Mike Delgado: I love that you’re constantly learning. You’re constantly challenging yourself into these … data science, and you’re just actively, continuously learning and growing. How important is it for those who are just graduating school to have that kind of growth mindset?
Favio Vazquez: I think every career needs this, not only data science or programming. I think if you’re starting on law or engineering, you need to keep studying forever, because things are changing, and they’re changing very, very fast. When you’re in the day-to-day job, you are stuck in one theory of working. When you’re working in a company, you don’t have the freedom to do whatever you want. You do what they tell you to do, and that’s fine because it works. You get experience, but outside of that company, there’s a whole world of knowledge that you should be aware of.
Favio Vazquez: One of the easiest ways to do that is to be curious and read articles, watch some videos. Going to conferences is an easy way to get knowledge very fast. When you go to a conference, if you see like 10 different talks by people active in their research field, you can say, “I know the state of the art of X,” and that was in one day. You didn’t need to read a thousand articles.
Another important thing for this — when you’re learning these things in the outside world, you can bring that into your daily job. That’s when you’re adding a lot of value to your company. If you’re aware of what’s going on in the research field or in other industries, and you can grab that and apply it to your day-to-day job, you can really change the way things are being done in that company.
Mike Delgado: You’re definitely proof of this. I love to see how invested you are in learning about data science, continuing to grow in that field, and just being a thought leader in helping other people become great data scientists. One of the things I want to talk to you about is how you’re doing that through your blog writing and the articles you’re producing.
I love how you are taking your different interests in cosmology, physics, even philosophy, and bringing them into these articles about data science. How did you start to merge all those different fields?
Favio Vazquez: If you’ve seen my latest articles — I think I’ve written three of those — this has been a change in the way I perceive data science. It’s the same definition I gave last year, or even two years ago, but now I’m trying to give a more general way to see data science and machine learning.
In my latest article, called The Data Fabric for Machine Learning, I gave a comparison between Einstein’s Theory of Relativity and machine learning and data science. That was something that came to my mind, and I said, “I’m going to write it, and let’s see what people think.”
The main concept in those articles I’m writing is that we may need to change the way we do data science right now. I’m writing an article right now, which is going to be posted very soon, that says that I’m really exhausted by having a thousand tables and writing a million lines of code, of SQL joins, to get the number of customers that bought the X product between X and Y in this state.
This shouldn’t be this hard, and I know it’s not that hard, but we take so much time to try to understand these relational databases. We’re very advanced in computer science, but right now, I’m taking a shift into graph databases. I think they can add something very different to what we’re seeing in the world.
Relationships in graph databases are first-class citizens, and that’s important because it’s one of the things we don’t have in relational databases. It’s kind of ironic that relations in relational databases are not first-class citizens. If we want to understand what’s in common between two tables to get an average or create two different tables, we need to add a lot of code to be able to join those tables. So that was one of the first parts.
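To make the contrast concrete, here is a minimal sketch of the kind of question Favio describes ("how many customers bought product X between two dates in this state"), answered first with SQL joins and then over explicit relationship edges. The schema, names and data are invented for illustration, and the edge list is only a stand-in for a graph database, not the API of any particular product.

```python
import sqlite3

# Hypothetical schema and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, state TEXT);
CREATE TABLE products  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders    (id INTEGER PRIMARY KEY, customer_id INTEGER,
                        product_id INTEGER, order_date TEXT);
INSERT INTO customers VALUES (1, 'Ana', 'CA'), (2, 'Luis', 'CA'), (3, 'Eva', 'TX');
INSERT INTO products  VALUES (10, 'Widget'), (11, 'Gadget');
INSERT INTO orders    VALUES
  (100, 1, 10, '2019-01-05'),
  (101, 2, 10, '2019-03-20'),
  (102, 3, 10, '2019-01-10'),
  (103, 1, 11, '2019-01-07');
""")


def count_buyers_sql(product, state, start, end):
    """Relational version: the relationships are rebuilt with joins at query time."""
    row = conn.execute(
        """
        SELECT COUNT(DISTINCT c.id)
        FROM customers AS c
        JOIN orders    AS o ON o.customer_id = c.id
        JOIN products  AS p ON p.id = o.product_id
        WHERE p.name = ? AND c.state = ?
          AND o.order_date BETWEEN ? AND ?
        """,
        (product, state, start, end),
    ).fetchone()
    return row[0]


# Graph-style version: each relationship is stored as its own record,
# (source, relationship, target, properties), instead of being implied by keys.
edges = [
    ("Ana", "LIVES_IN", "CA", {}),
    ("Luis", "LIVES_IN", "CA", {}),
    ("Eva", "LIVES_IN", "TX", {}),
    ("Ana", "BOUGHT", "Widget", {"date": "2019-01-05"}),
    ("Luis", "BOUGHT", "Widget", {"date": "2019-03-20"}),
    ("Eva", "BOUGHT", "Widget", {"date": "2019-01-10"}),
    ("Ana", "BOUGHT", "Gadget", {"date": "2019-01-07"}),
]


def count_buyers_graph(product, state, start, end):
    """Follow LIVES_IN and BOUGHT edges directly; no join keys are involved."""
    residents = {s for s, rel, t, _ in edges if rel == "LIVES_IN" and t == state}
    buyers = {
        s for s, rel, t, props in edges
        if rel == "BOUGHT" and t == product and start <= props["date"] <= end
    }
    return len(residents & buyers)


print(count_buyers_sql("Widget", "CA", "2019-01-01", "2019-01-31"))
print(count_buyers_graph("Widget", "CA", "2019-01-01", "2019-01-31"))
```

Both functions answer the same question. The difference is where the relationship lives: in the SQL version it is reconstructed from foreign keys on every query, while in the edge list "BOUGHT" is itself a piece of stored data.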
The other part is the concept of ontology. Ontology, in computer science, means the way two entities are related to each other. When we are doing all of these different things, we can think of something called a knowledge graph. This is what Google’s doing right now. There’s a great video that you can find on YouTube — search “Knowledge Graph Google.”
There’s an explanation of how Google changed. Before they understood this theory of graphs, they were searching by keywords and images to have an idea of what you meant with the words you were searching. Now they really understand the words you’re searching, because they understand the meaningful relationship between the words you’re building. Not only Google. Facebook and Amazon are doing that too.
We’re seeing a shift in the way we do data, toward knowledge graphs, semantics and ontology. I was not aware of that until I started doing my research. So with these articles I’m trying to say to people, “Hey, something’s changing.” I don’t mean that what you know right now is worthless, but you should take a look at what’s happening in the research field, because it might interest you too.
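As a rough illustration of the idea of a knowledge graph with an ontology on top, the sketch below stores facts as (subject, predicate, object) triples and uses a small class hierarchy to infer types that were never stated directly. The entities, predicate names and helper functions are assumptions made up for this example. It does not describe how Google or any other company implements its knowledge graph.

```python
# Facts as (subject, predicate, object) triples. All names are illustrative.
triples = {
    ("Einstein", "is_a", "Physicist"),
    ("Physicist", "subclass_of", "Scientist"),
    ("Scientist", "subclass_of", "Person"),
    ("Einstein", "developed", "General Relativity"),
    ("General Relativity", "is_a", "Theory"),
}


def objects(subject, predicate):
    """All objects linked to `subject` through `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def types_of(entity):
    """Direct types plus every ancestor class reachable via subclass_of."""
    found = set()
    frontier = list(objects(entity, "is_a"))
    while frontier:
        cls = frontier.pop()
        if cls not in found:
            found.add(cls)
            frontier.extend(objects(cls, "subclass_of"))
    return found


def related(entity):
    """Every (predicate, object) pair stated about `entity`, sorted."""
    return sorted((p, o) for s, p, o in triples if s == entity)


print(types_of("Einstein"))
print(related("Einstein"))
```

The "subclass_of" triples play the role of the ontology here: they say how kinds of things relate to each other, which lets the graph conclude that Einstein is a Person even though no triple says so.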
Mike Delgado: I like how you brought in the example of the Google Knowledge Graph. You talked about how, in search or even voice assistants, they’re getting smarter as you’re beginning to look for things. For example, I might have looked for something yesterday, and if I add in a variation today, Google will remember the search I did yesterday, and will either pull up things I already looked at or bring in some other things it thinks I might be looking for.
Favio Vazquez: There are three lines in my article that are the most important. They say that, when you’re doing a query on a graph database, the query becomes part of the knowledge of the graph. That’s one of the biggest differences.
When you’re doing a SQL query on a table or three tables, you have the query, and that’s it. Nothing will change with the database. When you’re building graph databases and you’ve built ontologies on top of the data you have, when you do a new query, that query will also add information to that knowledge graph. So you’re getting data when you’re searching for data.
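One way to picture "the query becomes part of the knowledge of the graph" is sketched below. The conversation does not say how this works in practice, so the mechanism here is purely an assumption: each question is recorded as a new node linked to what was asked and what came back. The class name, method names and edge labels are all hypothetical.

```python
from collections import defaultdict


class TinyKnowledgeGraph:
    """Toy graph in which asking a question also writes to the graph."""

    def __init__(self):
        # (subject, predicate) -> set of objects
        self.edges = defaultdict(set)

    def add(self, subject, predicate, obj):
        self.edges[(subject, predicate)].add(obj)

    def ask(self, subject, predicate):
        # Copy the current answers before writing anything new.
        results = set(self.edges.get((subject, predicate), set()))
        # Assumed behavior: store the query itself as a node with its own edges.
        query_node = f"query:{subject}/{predicate}"
        self.add(query_node, "asked_about", subject)
        for item in results:
            self.add(query_node, "returned", item)
        return results

    def size(self):
        """Total number of stored edges."""
        return sum(len(targets) for targets in self.edges.values())


kg = TinyKnowledgeGraph()
kg.add("Ana", "BOUGHT", "Widget")
before = kg.size()
answer = kg.ask("Ana", "BOUGHT")
after = kg.size()
print(answer, before, after)
```

A plain SQL `SELECT`, by contrast, returns rows and leaves the tables as they were, which is the difference Favio is pointing at.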
One of the things I like about graph databases: When you’re trying to explain something to someone, normally you would just use a graph form to explain. “This is here, and this is here and here.” It would be very weird for someone to explain … Imagine your teacher trying to explain a concept using tables and joins. That’s not happening. The way we do databases right now is not the way we think. That’s one of the things that I like, because it’s closer to the way we develop our thoughts and the way we see the world.
Mike Delgado: Is this changing the process of how you’re visualizing the data afterward?
Favio Vazquez: Right now, I’m building two things. While writing this, I’m building my picture of the future of data science. I don’t have the full picture right now; I have an idea of what it could be. So I’m not exactly sure what will be the end of this research I’m doing.
If you followed me last year, my research was on deep learning, and I wrote a lot of articles on deep learning. This year, I’ll be focusing on this new topic. The next year, I don’t know.
Right now, the way we see data is just, “Throw some columns in the table.” That’s a lot of what we’re seeing right now when we talk to someone. I’ve done consulting for a lot of different companies. When you go there, you see that the main problem they have isn’t doing machine learning, or deep learning, or how to build X algorithm. It’s “What data do I have, and how can I understand my data? When do I know I can cross my data?”
Favio Vazquez: We’re on a path to changing the way we see data, because right now, people are collecting data just to have data, and that didn’t happen before. Before, we collected data to do something, like, “I want to improve this process, so I’m collecting the data to do that.” Now it’s backward. People are getting data just to have it, and then they think, “What should I do with this data?” This is happening to a lot of companies. Companies have said to me, “I have these databases, and they’re huge, and I have no idea what to do with them.” Data, now, is one of the most important things we have in companies. I think the way we see it and understand it will be changing in the near future.
Mike Delgado: I think it’s wonderful that we’re collecting more and more data to try to find out what’s useful, but then, Favio, you’re saying there’s a huge challenge around the amount of data being collected. There’s just so much, and companies need someone like you to consult with them, to come in and help advise them on, “What do we do with this data to make it useful?”
Favio Vazquez: It’s now more common that companies have their own center of excellence. They have people that know their data, they know their databases, but what they’re missing is trying to connect the business to the data.
Favio Vazquez: It’s like, “I have all these business ideas that I believe will change the way we do things, and I have all this data here, but I don’t know how to jump from the data to the business problem.” That’s one of the things we’re trying to teach people in the courses with Matt I mentioned before. I was a victim of that thinking too.
Favio Vazquez: When I was starting to learn data science and machine learning and deep learning, you get all these theories and articles, and you program all day long. When you go to your first job, you’re completely lost. You have no idea how to behave in a meeting. You have no idea what questions to ask. If you have a deadline, you don’t know how to accomplish that goal. They’re not teaching us a lot of things. For some reason, I think 50 percent of scientists right now are doing data science. They never teach us how to be business guys; we’re scientists.
Favio Vazquez: So one of the things that we’re trying to do, and I’m trying to do with my articles and my teaching, is close the gap between the data you have and the business problems you want to solve.
Mike Delgado: What would be your recommendation for those who are just getting out of school or just beginning to work in the data science field? They’re going to meetings and, like you said, they don’t know what kind of questions to ask, because they’re brand new, and they’re coming from the science world, the academic world. What are some tips you’d provide?
Favio Vazquez: If you’re interested, we have a data science live video that was a conversation with Gabriela de Queiroz. She works for IBM, and the topic was the best questions to ask.
Favio Vazquez: I’m going to just extract something from there. One of the most important things you need to understand is that if you’re in a meeting, there will be different types of knowledge. There will be business guys, and engineers, and marketing people. They don’t know what you know, and you don’t know what they know, so first you have to build a good conversation and a platform for discussing ideas, not to propose and kill other ideas.
Favio Vazquez: When you’re in a meeting, and you’re the guy who knows data science and machine learning, it’s very common to think, “I’m the best one here because I know all this math, and all the other ideas are just crap.” That’s just not the way you should be doing meetings, because they know the business. You don’t know the business. When you are starting in data science, you may know a lot of math and calculus and deep learning algorithms, but they understand their business because they’ve been working there for years. What you should do is listen, really listen to what they have to say.
Favio Vazquez: A good question is one that will lead to a good discussion on a topic, not to a problem. “What’s the goal of your area? What are your KPIs? What are you expecting from this model? Has someone else created a model before, and what were the results— do you have that report? Can you show me your data and what you understand by data in your field? What does production mean to you?” All these different things that may be obvious to you, but they’re not obvious to them. So you need to use the Socratic method to try to extract all this knowledge that they have but maybe didn’t know how to transmit.
Mike Delgado: That’s solid advice. I love that, being curious, asking lots of good questions to get to understand, “What is the business need? What are the business problems?” And then, from that, “How can I leverage this data to help provide answers to that?”
Favio Vazquez: And “What do they expect from their model?” I’ve been in meetings where, after months of working, you show the results, and they say, “This isn’t what I asked for.”
Mike Delgado: That’s painful.
Favio Vazquez: And then you’ve lost three months of your life. It’s a problem when the objectives of the data science area are not the same as the business area. You need to work for their goals, not for your goals. You have to be humble. Maybe you have an idea that is huge and that will change the world but, right now, your boss just wants to solve a problem, not change the world. If every project you build is going to change the world, you’re not going to have time to do that, so you have to focus and solve the problem. When you solve a problem and they’re happy with that, then you can go and change the world.
Mike Delgado: You just shared advice for the data scientist to ask all these very good questions. How about advice for the business leaders, the people in the room that are giving direction to the data scientist? What would be your advice to the business leaders to give solid direction?
Favio Vazquez: One of the most important things for business leaders is to be aware of what’s going on in the data science world. They don’t have to be an expert, but they have to know what machine learning is and what its limits are. Sometimes they don’t know the limits. If you don’t know the limits, you will ask for the world, when you can only get a small piece of that.
Favio Vazquez: If you’re aware of the limits of the field you’re working in, it’s much easier to have a realistic goal in mind. That’s why I believe there are a lot of CTOs and chief data officers meeting, getting together and talking about data. They’re still business guys, but there’s no harm in reading an article or understanding an algorithm. If you go to that place, it will be so much easier to speak to your data scientists. It will be a faster and more effective communication, and you will be setting goals that are more realistic and can be accomplished in a period of time. One of the other things that is important in data science is that projects should end.
Favio Vazquez: When you’re in science, you’re thinking, “I have infinite time to solve this problem,” and it’s true because the infinite time is when you die. But that’s not the case in business. If you say to your boss, “I’m going to need four years to solve this problem,” you may get fired. So you need to understand that the rules change when you’re working for someone. And when you’re solving a problem, you need to be fast, because there are more people trying to solve the same problem, and the first one who solves it is going to win the race. That’s why I really like to talk about agile development in data science, being an agile developer.
Favio Vazquez: I have an article that is a continuation of one of Matt Dancho’s articles on an agile framework for doing data science. The goal of this framework is, “Don’t make the same mistakes a lot of people are encountering because they don’t know how to ask questions or they don’t know the goal of the business.” It’s an agile framework that is strictly related to the KPIs of the area and the business components you have in one company. It’s called the Agile Business Science Problem Framework. It’s strictly related to business problems with data science in an agile way. You can search for that online too.
Mike Delgado: Awesome. I’ll find that article and put it in the comment section of this Facebook Live video and in the comments of our YouTube video when we post it there.
Mike Delgado: We have a question from somebody watching, from Sarah. She asked, “Favio, do you have any tips on dealing with the intimidation of imposter syndrome for newly graduated data scientists who have just started working in the field?”
Favio Vazquez: That’s going to happen, and you have to be aware of that. It is scary to start in any new field, but it’s a good scary feeling. You shouldn’t be like, “Oh, my God, they’re going to know that I’m not an expert. They’re expecting me to be like the Andrew Ng of this, but I really don’t know how to do these things.”
Favio Vazquez: A good way to deal with imposter syndrome, or just to deal with entering a new field, is to be honest. You need to be able to say, “I don’t know.” It’s much better to say you don’t know something than to make up some weird algorithm or idea to justify your ignorance.
Favio Vazquez: A lot of people will deny that they don’t know something forever, to never come across as the guy who doesn’t know. It’s much better to just say, “I don’t know this. I think I have an idea, I read some articles on this, so let me go and I’ll read about it. I’ll create an example, and then I’ll get back to you.” That’s much better than just making something up.
Favio Vazquez: Make sure you’re studying regularly. Make sure you’re coding, seeing what others are doing and doing projects on your own time. And you have to be honest with your boss and yourself too. When you’re lying to someone, saying that you’re an expert in the field when you’re not, you’re lying to yourself too. Because in the end, the problem’s going to be with you, not with them.
Mike Delgado: Amen. I think that honesty and humility are key not only if you’re just starting out, but throughout your career.
Mike Delgado: You were talking about how the senior leader should keep up with the basics of what’s happening in the data science world so they can communicate better and more effectively with data scientists on their team. This is where humility and asking questions, being curious, come in, whether you’re just starting out or have 20 years in the field. “I don’t know anything about this field, I just read an article on this bigger topic, but tell me, Favio, is this kind of what it’s saying, or am I off here?”
Favio Vazquez: Yeah. That’s the only way of really being aware of something and understanding a new field. You have to read the whole thing, because sometimes people are stuck in high-level articles that are just the idea, or people are stuck in hardcore papers, and neither of the two sides is good. You have to see both sides. You have to be able to read hardcore papers and simple blogs because, when you’re trying to explain something to your boss, you are not going to explain from the paper [inaudible 00:34:53]. You have to be able to explain to him like the blog, like you’re explaining to your family what you’re doing.
Favio Vazquez: So it’s important, when you start reading these different types of books and articles and blogs, that you really understand. When you really understand a topic, you’re able to talk about it like you’re talking about the weather.
Favio Vazquez: If you really want to be a successful data scientist, you need to be able to explain hard concepts in an easy way, and that’s what science has been doing forever. I mean, imagine if, in science, we only wrote papers. There would be no advances in any field. A small number of people can understand a paper because they’ve spent their lives trying to understand a field. When you write a very hardcore paper, only a few people will get it. We need people from the career of that article, or other articles, to say, “Okay, this is important. I’m going to try to write something or explain this concept in an easier way for people to see the value it has on life, or a field, or a theory.”
Mike Delgado: I think what you are doing right now with the way you teach, the way you talk about data science, the articles I’ve read that you’ve written … I’m not in the data science world, but you can articulate things to make something very complex easier to understand. Even bringing in, like you said, the theory of relativity from Einstein, to bring these very complex subjects that would make my mind explode and breaking them down is a skill that shows you’ve worked very hard to teach yourself and then thought about how to relay it to everybody else.
Favio Vazquez: One thing to keep in mind is I’m not trying to make things more complex. I’m thinking in a different position here. It’s “I just added a new complication to data science, but that complication will help people understand what machine learning is.” Sometimes you need to get a little more complex to lower down the level or the way people think about a subject.
Favio Vazquez: I gave my definition of machine learning before the data fabric, and then, after I presented the data fabric, I explained what machine learning is inside of the data fabric. You can compare both of them, and I think the newer one is easier to understand. That was, for me, a great accomplishment. There was a point when I was writing the article like, “I’m building something very complex, and I don’t know if I’m going to be able to explain what I’m thinking in simple words.” But after a lot of trial and error and reading the concept to people I knew — when they understood it, and when they understood the article, I said, “OK, I added some complexity, but in the end, it worked.”
Favio Vazquez: It’s easier to understand what machine learning is. It’s finding an insight. It’s finding that piece of hidden information in the data you already have. If you think about it in that way, and you have a graph on all of these different things I wrote in that article … I think if people just give it a chance, it’s going to be easier to understand what machine learning is. In the next article I’m writing, I’m going to try to explain in an even deeper way what I mean by machine learning in a graph and how that can help us do better, be better, in data science.
Mike Delgado: Wonderful. For those listening to the podcast, I’ll have a URL for that article. You can go to EX.PN/DataTalkFavio, and it will redirect to that article on machine learning where Favio proposes the new definition, as well as explaining these different concepts. I think everyone needs to check out that article.
Mike Delgado: I love that you don’t just stick with current definitions. As things are evolving and changing, you’re willing to propose new definitions. I think that takes a lot of guts.
Favio Vazquez: I did that last year with data science. I gave my definition of data science. I’m not doing that to say that my definition is the best one, just trying to make this a serious field of study. If you compare physics in the 1900s, in the start of modern physics, like quantum mechanics and relativity, people were just lost. “What’s going on? We were happy with Newton. Please just take me back.”
Favio Vazquez: They really thought that everything was invented, like, “This is it. We understand the world. It’s Newton’s Theory of Mechanics, it’s Maxwell’s Theory of Electrodynamics, it’s [inaudible 00:41:04] Law.” But then these guys, like Einstein and Schrodinger and Heisenberg and Planck, came together and realized that there was something missing. They tried to explain that in very different ways, and they didn’t know they were creating a new field for science. I think this is happening with data science right now. People are trying to build different things, and we really don’t know what we’re doing. We’re solving problems, we’re clustering algorithms, we’re doing linear regression and [inaudible 00:41:44] progression, we’re building deep learning algorithms and neural nets. But what’s the bigger picture?
Favio Vazquez: I’m trying to understand everything people are doing, and it’s very hard to create a bigger picture for data science. What’s data science? Is it a new field? Is it statistics on steroids? Is this just data mining, or are we creating a new thing without knowing it? I’m trying to put myself at a higher level, trying to see things from the top and understand the patterns people are trying to do in deep learning and data science and machine learning, trying to give a simple definition. That’s hard, but necessary.
Favio Vazquez: If you ask someone, “What is physics?” they will give you some definition. Maybe it’s not the best definition, but they have a definition. But if you ask people, “What’s data science?” — right now, they don’t have a definition. That’s important, because if it’s the sexiest job of the 21st century, it’s the highest-paying job in the world right now, and we don’t know what it is … That’s weird. So I’m trying to give descriptions and definitions of the field so people can tell their families, “I’m a data scientist, and this is what I do.” Sometimes you don’t know how to say that.
Mike Delgado: I’ve asked different data scientists on this show how they explain their job at a family gathering. It is very complicated because of all the work that you’re doing with mathematics and programming and working with data sets, cleaning data sets, all the math involved.
Favio Vazquez: It’s complicated, but we need a way to explain what we’re doing to ourselves and to others. Sometimes it’s going to be hard to … I mean, you don’t have to explain everything you do. If you’re writing a paper, not everyone has to understand the paper. If you said that everyone will understand the new advances in critical physics in the [inaudible 00:44:26] field … no one will understand that, but it’s important for them.
Favio Vazquez: So for us, it’s important to be able to speak the same language when you’re talking to colleagues. But when you’re talking to someone else from outside of the field, you need to speak a language that you both understand. I think that’s why people really don’t understand data science. There are some important people in the world saying, “Data science is just a new term for statistics” or “Machine learning is just a saying for …” They’re saying that these are just the same things over and over again, and in part, that’s true. But we’re really creating a new field. We’re trying to explain data in a new way.
Favio Vazquez: Someone told me once, “Everything in science is data science, because we all use data.” That may be true, but this is something different, because data is the main topic here. It’s not something we use to solve something. Sometimes you understand data just to understand data, not to solve a bigger problem. That’s one of the biggest differences. When you’re doing science, you’re collecting data to solve a problem about nature. Here, we’re trying to solve something, but sometimes we’re only solving a problem for the sake of understanding the data.
Favio Vazquez: So this is really a new field, and the name “data science” is what we have right now. I don’t think it’s going to be the same name forever, as I mentioned before. I think it’s going to evolve, but we’re creating an organizer of fields. Data science is now joining math and computer science and machine learning and deep learning and business. People with MBAs are happy to do data science now. It’s a good field to be in. You can be a physicist or a biologist or a mathematician or a lawyer and do data science.
Mike Delgado: Before we go, what excites you about the future of data science?
Favio Vazquez: I’m excited to see what’s going to happen, and to see the growth of semantics. I dream of a day when I can say to my computer, “Hey, computer, please give me the average time people are spending on my website” without writing any code. I don’t think we’re that far away from that, because there are ways to connect Alexa and Google Home to databases right now. In Google Sheets, you can write stuff in natural language, and it will create queries in Excel and give you the answer. So we’re not that far away from being able to speak to data that way.
Favio Vazquez: I think the jump to graph databases and semantics and ontology, all these different things I’m trying to build right now, is going to help the way we do data science and make it easier. The connection with auto machine learning and auto deep learning is just making the job easier and easier, and that’s awesome, because we can focus on solving problems, not on building machine learning algorithms. If you’re spending so much time trying to build a graph or a plot to understand a pattern, you’re losing time. If there’s a tool that can help you do that while you’re focusing on writing a report, understanding the financial impact of what you’re doing and how to tell the story of the problem you’re solving, that’s the value of data science, not building machine learning algorithms. That’s just the way we get there.
Mike Delgado: I think that point, too, is about how staying active, learning and growing, and finding out where certain libraries are located and what certain tools do for you will help you accelerate your work.
Favio Vazquez: Yeah. Last year, I had a project called Weekly Digest for Data Science. I did that before for myself; I had a little note saying, “This week the best library was ‘this’ on [inaudible 00:49:23].” And then I realized this could help other people too. I have a newsletter, and I’m going to continue it this month, highlighting the best libraries and packages and blogs for data science.
Mike Delgado: That’s awesome.
Favio Vazquez: There are so many things going on, and bad things are going on too, so you really need to be aware to determine what’s good and what’s bad. That’s not easy, so I’m trying to give people a good direction on what they should be seeing, trying and testing in data science. The libraries are in Python and R. I’m building for those two languages. I think I’m going to add one more language this year. I’m not sure which one, but I’m going to do it.
Mike Delgado: That is so cool. Favio, for those who want to keep up with the work you’re doing, all the educational content you’re developing or even your email list, what’s the best way to get in contact with you?
Favio Vazquez: The best way to contact me is through LinkedIn. Please write a note. Right now, sadly, I can’t accept as many people anymore because I’m reaching the 30,000 limit on LinkedIn, and I have a lot of people waiting for me to connect with them. But if you leave a message saying the purpose of your connection, I’m happy to talk with you. You can write me an email — my email is on my GitHub and my LinkedIn. And you can follow me on Twitter too. My username is FavioVaz. I’m writing some stuff there, too, and posting about my articles and things I’m finding online.
Mike Delgado: I have a short URL: EX.PN/DataTalk70. That will take you to the Experian blog posts that feature Favio, as well as links to his different social media profiles. Because he’s definitely the man to follow to keep up with what’s happening in the data science world, and also just to keep up with all the cool things he’s doing, teaching others through Business Science University, doing his own interviews with different data science leaders and writing a ton. He’s doing a lot to help our data science community. You definitely want to follow him on LinkedIn.
Mike Delgado: By the way, Favio, can you talk briefly about your interview series?
Favio Vazquez: Last year, or … I don’t know when that started. We had something called Data Science Office Hours, and there was —
Mike Delgado: That’s right.
Favio Vazquez: Right now, I think all of us are LinkedIn Top Voices for data science. Some of those guys don’t have the time to be on YouTube or talk about something. So Kristen Kehrer and I got together and said, “Hey, I have time, and I’m interested in doing this.” So we created something called Data Science Live. You can go to DataScienceLive.com to see our past webinars.
Favio Vazquez: In the beginning, we were just talking about data science as a field. And then we realized that we both have good connections, so now we’re trying to talk with them and share our experiences, listen to what they have to say, ask them questions. You can go there and ask live questions. Tomorrow, we’ll be interviewing Kirk Borne, who is our number one influencer in the data science world. Before, we had Matt Dancho. We had Gabriela, and we’re getting more and more people to talk about what they’re doing or about a specific topic that will help people to do better in data science.
Mike Delgado: Favio, thank you again for your time this morning to share with our community about the work you’re doing and your thoughts on where data science is headed. Also, thank you again for all the work that you’re doing to support and encourage and help the data science community to continue to grow and evolve. Thank you so much, and I hope that we can have you on the show again.
Favio Vazquez: Thank you, Mike. I’d be honored to be here again, and thanks, everyone, for listening. Make sure to just keep studying, keep trying and never give up.
Mike Delgado: Awesome. Thanks, Favio.
Favio Vázquez is the founder of Ciencia y Datos and data science instructor for Business Science University. He also serves as the chief data scientist at Iron AI and Raken Data Group. He has his Master of Science degree in physics, cosmology and data science for cosmology, and he also has his bachelor’s in science and computational engineering. To learn more about Favio, follow him on LinkedIn and Twitter.