In this conversation, Angela talks about her journey in Data Science and Machine Learning, her motivations behind authoring ‘The Data Resource’, advice for beginners, and much more!
AT: I grew up in the Philippines and moved to the US for college. I actually knew nothing about data science and the tech world until my senior year at university, when I went to a hackathon on a whim. Coming from a more patriarchal and traditional society, I initially found it difficult to find opportunities to gain exposure to the field. Additionally, at that time, there were few role models for young women to look up to. When I was growing up, it seemed that women were dissuaded from pursuing a career in computer science, and I only knew a handful of women who were working in tech. After that hackathon during my senior spring, it seemed like a new world had been opened up to me, and I was excited by the tangible and immediate impact people could make by utilizing advances in technology. It amazed me how quickly and efficiently we could build solutions and execute them, and from there I made it my goal to not only learn about data science but also find ways to apply this knowledge in a concrete and socially impactful way.
Shortly after graduation, I started my first job working as a risk analyst, and I got a glimpse into how machine learning models are deployed “in the wild”. At that time, I was lucky enough to meet data scientists and industry experts who inspired me, and who eventually became my mentors. Thanks to their advice, I realized that if I wanted to begin a project or pursue an initiative that I believed would make a difference in the field, I could just start. I realized that I would never truly “be ready”, in the sense of having all the tools and all the knowledge to strive for that mission, but I also realized that that was okay. Having a growth mindset and working towards incremental and consistent improvements made innovating on a product, whether it be a book or a machine learning project, a lot more efficient and impactful than aiming to produce a perfect manuscript on the first shot. This shift in mindset was incredibly helpful as I was writing my first book, and thanks to it and to the advice of my mentors, I’m grateful that opportunities like becoming a fellow at SharpestMinds and working on the editorial team at TDS have opened up for me. Since then, I have decided to pursue a master’s degree in data science full time at New York University’s Center for Data Science, where I am currently in my second semester of studies.
SDD: You recently authored a book titled ‘The Data Resource’. What inspired you to write this book?
AT: After graduation, when I was initially applying for jobs back at home, I realized that there were very few, if any, innovative engineering roles available, particularly in the space of data science. I was curious to understand why there weren’t enough data science jobs back at home, despite articles suggesting that the demand for data scientists was at an all-time high. At a time when the amount of data generated per day is skyrocketing, it is important to create and curate a discussion about how our technology and access to it have shaped history, and what policies we need in order to keep up with an increasingly globalized and data-saturated world. My hope for this project is that it opens up a discussion about the role of data science in emerging markets, as well as why diversity and representation are important in technology as an industry. Socially impactful applications of technology are crucial to the holistic growth of data science. With the field’s quick growth, it is our responsibility to ensure that innovation positively impacts more than just the one percent, and that this positive change also reaches the segments of society that could benefit from it the most.
SDD: Towards Data Science provides thousands of data science resources to data enthusiasts across the globe. What is the experience like to be an editor of Towards Data Science?
AT: It’s great! Working on the editorial team at TDS is immensely fulfilling, and we have such a collaborative team dynamic that I definitely feel like I learn something new every day. We have a small and tightly-knit team, so launching new projects and discussing specific articles is really streamlined. It’s great to be able to work with such a talented team of editors and editorial associates, and I love hearing their thoughts on articles and data science as a whole. The passion and excitement that each member on the team has for tech truly promote an environment that’s conducive to innovation. Additionally, the opportunity to read through articles written by such talented and innovative authors is always very exciting. The growth of the field in terms of article topics as well as the diversity of backgrounds the authors have is really exciting, and I’m grateful to be a part of this. Working with the editorial team at TDS has helped me grow both technically and as a writer, and I can’t think of a better team to work with.
SDD: What advice do you have for the women in our community to get started with data science and stay engaged with it?
AT: Start. I think one of the most underrated recommendations is to just start. When I was first exploring data science, I found it really intimidating to “just start”. As someone who sees herself as a “humanities concentrator”, I had pretty bad impostor syndrome, even though it is clear that the humanities/STEM dichotomy is not a dichotomy at all, and that positive innovative change happens at the intersection of disciplines. It’s easy to hear all the voices in your head saying that you can’t do it or you can’t learn something new, or that you’re not good at math and not cut out for the field, but I learned that even just attempting to begin a challenging task that initially scares you leads to a feeling of accomplishment that could motivate you to continue moving forward. I had this assumption that to get into tech I had to have been coding since I was in grade school, but clearly, that isn’t true. I wrote my first line of code after graduation, and have never looked back. Don’t be scared of having goals that are “too big”, because I think that if you work hard enough and smart enough, and have a bit of luck, one day you’ll wake up and realize that those dreams that you had are now a reality.
Find a mentor. There is a ton of information out there, most of which is freely accessible as long as you have a laptop and an internet connection. I actually am a self-taught coder, and most of the resources I used to learn how to code were either very cheap or completely free. However, because there are so many available resources and so many paths I could take, I found that having a mentor (or mentors) has really accelerated my career. It was incredible to see the difference it made to have someone who was more knowledgeable about the field help me navigate through all these resources, and evaluate which paths were a good fit for my goals and preferences. It also really helped to get constructive feedback on my projects, which made iteration faster and more efficient.
Keep building. It takes time to learn something, and even more effort to become good at it. There’s a steep learning curve, and it’s sometimes easy to get discouraged, but I think that taking the time to reflect and think critically about my work and my process has been extremely helpful in improving my output, and becoming more efficient at producing that output. I also think that while it’s important to discover what projects you like working on, it’s equally important to figure out what kind of work you don’t like doing. Building projects, playing with datasets, and experimenting with different areas of data science, like analytics and data engineering, were helpful when I was trying to decide what facet of data science I wanted to focus on. Additionally, sharing your work with others, either through blog posts on Medium or on platforms like MadewithML, is great for discussing your results, finding future projects, and getting feedback.
SDD: Are there any recent advancements in NLP or Machine Learning in general that excite you? Any papers or projects that particularly caught your interest?
AT: Although it’s a paper from 2017, a somewhat recent publication that I read earlier this year is called “Text Preprocessing For Unsupervised Learning: Why It Matters, When It Misleads, And What To Do About It”, by Denny and Spirling. I thought this paper was really cool because, in the past, I had always assumed that “text preprocessing” was a black box-type procedure, and I hadn’t really understood in detail what happened under the hood. However, this paper sheds light on how and why different preprocessing methods can impact the results of your model. It’s also really cool because the researchers of this paper (one of whom is my academic advisor!) created software that assesses the effects of text preprocessing decisions. (For reference, the package is called preText; it’s an R package with a vignette. The GitHub repo can be found here https://github.com/matthewjdenny/preText and the paper can be found here https://www.nyu.edu/projects/spirling/documents/preprocessing.pdf).
SDD: What inspires you every day?
AT: I’m really inspired by the positive impact that data science can have in emerging countries. The democratization of data, information, and cloud computing has a lot of potential, and I’m excited to see that barriers to entry into the field are slowly diminishing. Seeing technologists who push through those barriers motivates and inspires me because they give hope that countries could have the ability to leverage their own data and create sustainable economic opportunities. I’m also very excited by the potential that data science and tech could have in the social good space, and am optimistic that greater access to the industry will encourage more socially impactful applications of the tech.
Join our She Drives Data community on SHEROES (You can also download the app on iOS and Android) to connect with thousands of women data enthusiasts across the world, hear from data science experts and get updated on the latest tech news every day ❤️️