Leader’s Voice: Parul Pandey, Data Evangelist at H2O.ai, Kaggle Grandmaster

We are extremely excited to present our conversation with Parul Pandey, a Data Evangelist at H2O.ai and a Kaggle Grandmaster.

In this conversation, Parul talks about developing a career in data science by enhancing one’s skills and gaining visibility in the field, her journey with Kaggle and how to get started with it, and much more.

SDD: Parul, currently you are a Data Evangelist at H2O. What was your first encounter with data science like and can you tell us a bit about your journey in data science and machine learning?

PP: I have always been fascinated with numbers, but my real tryst with Data Science occurred when I started my first job. I had been inducted into a department that was responsible for the analysis and planning of the Power Distribution network. There I learnt how to crunch numbers, analyse them to get insights, and perform predictive analysis. This got me hooked on the nuances of data science and machine learning.

However, all these analyses were being done using proprietary tools. I would sometimes wonder if the processes could be replicated using open source tools and the current best practices in data analysis, predictive analysis, etc. Sadly, due to a hectic schedule, I couldn’t give much time to these thoughts. Then I went on maternity leave, which gave me a chance to reflect on my life and career, and on whether I was actually enjoying what I was doing. In a way, I reinvented my whole career during my maternity leave!

SDD: Apart from your work at H2O, you are also a Kaggle Grandmaster. How did you get started with Kaggle? Do you have any advice for the women in our community to get started with Kaggle?

PP: I first came to know about Kaggle through one of the online courses that I was doing at that time. Initially, I had the misconception that Kaggle was only a data science competition platform. When I look back now, I realize that I was so wrong about that. Kaggle is a platform that has something to offer to everyone, be it a beginner or an expert in the field.

There are a number of free courses on Kaggle which approach machine learning problems in an applied way. Kaggle also provides open datasets that you can download and use for free. Apart from this, Kaggle provides access to Notebooks, which are hosted Jupyter notebooks that run in the cloud, so you don’t need to install anything locally. And they’re free of charge!

There is a complete Kaggle Forum tab where you can ask for advice from other Data Scientists, and people are more than willing to help. Finally, there are Competitions. After you’ve spent some time with Kaggle Datasets and Notebooks, you can move on to the competitions. Kaggle Competitions are a great way to test your knowledge and see where you stand in the world of data science. They are also a great way to collaborate, since you can team up with others to participate.

When starting out, your major emphasis should be on learning and not on medals, and for this the playground competitions are great. Look for top-voted notebooks to see how others have approached the problem, and then try to solve the problem yourself. When stuck, ask for help in the forums. This way you will not only learn a lot but also enjoy your Kaggle journey.

SDD: H2O focuses on open source automatic machine learning. Can you tell us a bit about H2O and what inspires your work as a Data Evangelist there?

PP: As an evangelist, my job is to interact with the community and people to spread the word about data science in general and H2O.ai’s products in particular. I help people understand the use of products like Driverless AI and H2O’s open-source offerings. As Guy Kawasaki put it, “Evangelism isn’t a job title; it’s a way of life.” My work requires a lot of self-awareness and willingness to stretch and grow in the role.

SDD: Most women in our community enroll in online courses to learn data science and complete them. However, once the course is over, they are unsure about how to apply those skills. Apart from finishing these online courses, what else should they be doing so that they can develop skills that can be readily applied on projects?

PP: Application is a very important aspect of Data Science. Online courses can help you get to know a topic, but real understanding is achieved only when you apply the concepts to real problems. Here are some of the ways I put what I learned into practice:

  • Start writing a blog
  • Try answering questions on forums
  • Volunteer to speak at Meetups
  • Use GitHub to host and share all your analysis

These activities will not only help enhance your skills but also give you a lot of visibility in this field.

SDD: Are there any recent advancements in Machine Learning that excited you? Any papers or projects that particularly caught your interest?

PP: Machine Learning is a dynamic field that is undergoing advancements at a rapid pace. I have been particularly fascinated by the significant breakthroughs in the Natural Language Processing (NLP) area.

Thanks to the recent landmark breakthroughs in NLP architecture, in particular the Attention technique and Transformer models, techniques that were once mainly restricted to research are now becoming much more mainstream and translating into real-world business applications.

SDD: What inspires you every day?

PP: I enjoy what I do, and getting to interact with our community each day is something I look forward to and that inspires me.


Join our She Drives Data community on SHEROES (you can also download the app on iOS and Android) to connect with thousands of women data enthusiasts across the world, hear from data science experts, and get updated on the latest tech news every day!