14

Finding New Professional Opportunities

Having covered in the previous chapter how to better highlight your work and achievements in competitions, we now conclude our overview of how Kaggle can positively affect your career. This last chapter discusses the best ways to leverage all your efforts to find new professional opportunities. We assume you now have all the previously described assets in place (your Kaggle Discussions, Notebooks, and Datasets, and a GitHub account presenting a good number of projects derived from Kaggle), so this chapter moves on to softer aspects: how to network and how to present your Kaggle experience to recruiters and companies.

It is common knowledge that networking opens up many possibilities, from being contacted about new job opportunities that do not appear on public boards to having someone to rely on for data science problems you are not an expert in. Networking on Kaggle is principally related to team collaboration during competitions and connections built during meetups and other events organized by Kagglers.

When it comes to job opportunities, as we have mentioned before, Kaggle is not a widely recognized source used by human resources departments and hiring managers for selecting candidates. Some companies do take your Kaggle rankings and achievements into serious consideration, but that is the exception, not the rule. Typically, you should expect your Kaggle experience to be ignored or sometimes even criticized. Our experience tells us, however, that what you learn and practice on Kaggle is highly valuable, and you can promote it by showcasing your coding and modeling efforts, and also by being able to talk about your experiences working alone or in a team.

Here, we will cover:

  • Building connections with other competition data scientists
  • Participating in Kaggle Days and other Kaggle meetups
  • Getting spotted and other job opportunities

Building connections with other competition data scientists

Connections are essential for finding a job, as they help you get into contact with people who may know about an opportunity before it becomes public and the search for potential candidates begins. In recent years, Kaggle has increasingly become a place where you can connect with other data scientists, collaborate, and make friends. In the past, competitions did not give rise to many exchanges on forums, and teams were heavily penalized in the global rankings because competition points were split equally among the team members. Improved rankings (see https://www.kaggle.com/general/14196) helped many Kagglers see teaming up in a more favorable light.

Teaming up in a Kaggle competition works fine if you already know the other team members and have an established approach to assigning tasks and collaborating remotely. In these situations, each team member already knows how to collaborate by:

  • Taking on part of the experimentation agreed by the team members
  • Collaborating with another team member to build a solution
  • Exploring new solutions based on their skills and experience
  • Preparing models and submissions so they are easily stacked or blended

If you are new to teaming up, however, you will find it difficult either to enter a team or to organize one yourself. Unless you have contacts, it will be hard to get in touch with other people on the leaderboard. First of all, not everyone will want to team up; many prefer to compete alone. Furthermore, some of the other competitors might be interested in teaming up but will be too wary to accept your proposal. When forming a team with Kagglers you don’t know, there are a few common concerns:

  • The person entering the team won’t bring any value to the team
  • The person entering the team won’t actually collaborate but just be a freeloader
  • The person entering the team has infringed (or will infringe) Kaggle rules, which will lead to the disqualification of the entire team
  • The person entering the team is actually interested in spying and leaking information to other teams

Most of these situations would harm a team in a competition, and you should be aware that these are common considerations many Kagglers make when evaluating whether or not to team up with someone for the first time. You can only dispel these perceived risks by presenting yourself as someone with a strong track record on Kaggle; that is, someone who has taken part in some competitions alone and, in particular, has published Notebooks and participated in discussions. This will add great credibility to your proposal and make it more likely that you are accepted into a team.

When you have finally joined a team, it is important to establish efficient and dedicated forms of communication between the team members (for instance, by creating a channel on Slack or Discord). It is also essential to agree on daily operations that involve both:

  • Deciding how to divide your experimentation efforts.
  • Deciding how to use the daily submissions, which are limited in number (often a cause of conflict in the team). In the end, only the team leader chooses the final two submissions, but the process of getting there naturally involves discussion and disagreement. Be prepared to demonstrate to your teammates why you have decided on certain submissions as final by showing them your local cross-validation strategy and results (a minimal sketch of such a comparison follows this list).
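
To make that last point concrete, here is a minimal, hypothetical sketch of how you might back up a choice of final submission with local cross-validation evidence. The synthetic data, the candidate models, and the ROC AUC metric are assumptions for illustration, not a prescription:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import StratifiedKFold, cross_val_score

    # Hypothetical training data standing in for the competition dataset
    X, y = make_classification(n_samples=2_000, n_features=20, random_state=0)

    # Candidate models behind two possible final submissions
    candidates = {
        "submission_a_logreg": LogisticRegression(max_iter=1_000),
        "submission_b_rf": RandomForestClassifier(n_estimators=300, random_state=0),
    }

    # Use the same fixed folds for every candidate, so the scores are comparable
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

    for name, model in candidates.items():
        scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
        # Report mean and spread: teammates can judge both level and stability
        print(f"{name}: CV AUC = {scores.mean():.4f} +/- {scores.std():.4f}")

Sharing a small report like the one this prints, alongside the corresponding public leaderboard scores, turns the choice of the final two submissions into a comparison of evidence rather than a matter of opinion.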

After you have experienced working together in a team in a positive manner, you will surely have gained the respect and trust of other team members. In future competitions, you will probably find it easier to team up again with the same people, or join a different team that they are part of with their help.

The people you will meet and get to work with on Kaggle include data scientists, data enthusiasts, students, domain specialists, and more. Below, we speak to a diverse cross-section of Kagglers, who describe their day jobs and how Kaggle fits into their lives.

Yirun Zhang

https://www.kaggle.com/gogo827jz

Yirun Zhang is a final-year PhD student at King’s College London. A Notebooks and Discussion Grandmaster, he was a member of the winning team in the Jane Street Market Prediction competition (https://www.kaggle.com/c/jane-street-market-prediction).

Can you tell us about yourself?

My research area lies in the field of applying machine learning algorithms to solving challenging problems in modern wireless communication networks such as time series forecasting, resource allocation, and optimization. I have also been involved in projects that study AI privacy, federated learning, and data compression and transmission.

Apart from my daily PhD research, I have been active on Kaggle for almost two years, since the second year of my PhD. The first competition I took part in on Kaggle was Instant Gratification, in which I used a variety of machine learning and statistical methods from the sklearn library. This competition helped me develop a general sense of what a machine learning pipeline for Kaggle competitions looks like.

I have been actively sharing my knowledge with the community through Notebooks and discussion posts on Kaggle, and am now a Kaggle Notebooks and Discussion Grandmaster. Through sharing and discussing with others on the forum, I have gained precious feedback and new knowledge, which ultimately also helped me recently win a Kaggle competition.

Tell us a bit about the competition you won.

Jane Street Market Prediction was a really tough one. The reason is that it was hard to build a robust cross-validation (CV) strategy, and lots of people were just using the public leaderboard as the validation set. They were training neural networks for hundreds of epochs without any proper validation strategy, overfitting the public leaderboard. Our team worked hard to maintain our own CV strategy, and we survived the shake-up.

How different is your approach to Kaggle from what you do in your day-to-day work?

Kaggle competitions are very different from my daily PhD research. The former are very intense and provide instant feedback, while the latter is a long-term process. However, I have found that the new knowledge and methodology I learn from Kaggle competitions is also very useful in my PhD research.

Osamu Akiyama

https://www.kaggle.com/osciiart

Osamu Akiyama, aka OsciiArt, is a Kaggler whose day job does not involve data science. He’s a medical doctor at Osaka University Hospital and a Competitions Master.

Can you tell us about yourself?

I’m a second-year resident working at Osaka University Hospital. I received my master’s degree in Life Science from Kyoto University. After working in an R&D job at a pharmaceutical company, I transferred to the Faculty of Medicine of Osaka University and obtained a medical license in Japan.

I started to learn data science and AI on my own because I was shocked by AlphaGo. I started participating on Kaggle in order to learn and test my skills in data science and AI. My first competition was NOAA Fisheries Steller Sea Lion Population Count in 2017. I participate in Kaggle competitions regularly and have earned three gold medals.

Has Kaggle helped you in your career?

Because I’m not educated in information science, I used my results in Kaggle to demonstrate my skill when I applied for an internship at an AI company and when I applied to be a short-term student in an AI laboratory. As I’m just a medical doctor, I’ve never used my data science skills in my main job. However, thanks to my Kaggle results, I sometimes have the opportunity to participate in medical data research.

What is your favorite type of competition and why?

My favorite kind of competition is medical data competitions. I love trying to find insights in medical data using my knowledge of medicine.

How do you approach a Kaggle competition?

I love finding a hidden characteristic of the competition data that most other competitors are not aware of, or trying a unique approach customized to the characteristics of the competition data. Such an approach is not successful in most cases, but it’s still fun to try.

Tell us about a particularly challenging competition you entered, and what insights you used to tackle the task.

I’d like to mention Freesound Audio Tagging 2019, which was a multi-label classification task for sound data. The training data was composed of a small amount of reliably labeled data (clean data) and a larger amount of data with unreliable labels (noisy data). Additionally, there was a difference between the data distribution of the clean data and that of the noisy data. To tackle this difficulty, we used two strategies. The first was multitask learning, in which training on the noisy data was treated as a different task from training on the clean data. The second was pseudo-labeling (a kind of semi-supervised learning), in which the noisy data was relabeled with the labels predicted by a model trained on the clean data.
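
As a side note, pseudo-labeling can be sketched generically in a few lines of Python. This is not the team’s actual audio pipeline; the synthetic features, the confidence threshold, and the classifier are placeholders chosen only to illustrate the idea:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Hypothetical stand-ins for features from the clean and the noisy subsets
    rng = np.random.default_rng(0)
    X_clean = rng.normal(size=(500, 32))
    y_clean = rng.integers(0, 2, size=500)      # reliable labels
    X_noisy = rng.normal(size=(2_000, 32))      # labels considered unreliable

    # 1. Train a model only on the reliably labeled (clean) data
    base_model = LogisticRegression(max_iter=1_000).fit(X_clean, y_clean)

    # 2. Relabel the noisy data with the model's predictions (pseudo-labels),
    #    keeping only the confident ones
    proba = base_model.predict_proba(X_noisy)
    confident = proba.max(axis=1) > 0.9
    y_pseudo = proba.argmax(axis=1)[confident]

    # 3. Retrain on the clean data plus the confidently pseudo-labeled noisy data
    X_combined = np.vstack([X_clean, X_noisy[confident]])
    y_combined = np.concatenate([y_clean, y_pseudo])
    final_model = LogisticRegression(max_iter=1_000).fit(X_combined, y_combined)

In practice, the confidence threshold is tuned, and the relabel-and-retrain cycle can be repeated for several rounds.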

Do you use other competition platforms? How do they compare to Kaggle?

I use Signate (https://signate.jp/) and guruguru (https://www.guruguru.science/). These are Japanese data science competition platforms. They are not as big as platforms like Kaggle; competitions on these platforms use smaller datasets than Kaggle in general, so it is easier to participate. Also, sometimes there are interesting competitions that are different from the ones on Kaggle.

Mikel Bober-Irizar

https://www.kaggle.com/anokas

Mikel Bober-Irizar, aka Anokas, is a Competitions Grandmaster, a Master in Notebooks and Discussion, and a machine learning scientist at ForecomAI. He is also a student of Computer Science at the University of Cambridge and the youngest ever Grandmaster on Kaggle.

Can you tell us about yourself?

I joined Kaggle in 2016, back when I was 14 and had no idea what I was doing – I had just read about machine learning online and it seemed cool. In my first few competitions, I started by copying other people’s public code from the forums and making small tweaks to it. Over a few competitions, I slowly gained an understanding of how things worked, motivated by trying to climb the leaderboard – until I started making good progress, which culminated in coming second in the Avito Duplicate Ads Competition later that year.

Since then, I have participated in 75 competitions, becoming the youngest competition Grandmaster and the first ever Triple Master in 2018. I have since been a Visiting Research Fellow at the University of Surrey, and I’m now studying Computer Science at the University of Cambridge, where I’m also doing research in machine learning and security.

What’s your favorite kind of competition and why? In terms of techniques and solving approaches, what is your speciality on Kaggle?

I really enjoy competitions with lots of opportunity for feature engineering, and those with lots of different types of data, which allow you to be really creative in the approach you take to solving the problem – it’s a lot more fun than a competition where everyone has to take the same approach and you’re fighting over the last decimal place.

I wouldn’t say I have a specialty in terms of approach, but enjoy trying different things.

Tell us about a particularly challenging competition you entered, and what insights you used to tackle the task.

A couple of years ago, Google ran a competition for detecting objects within images and the relationships between them (e.g., “chair at table”). Other teams spent ages taking a conventional approach and training large neural networks to tackle the tasks, which I didn’t have the knowledge or compute to compete with. I chose to attack the problem from a different angle, and using some neat heuristics and tree models I ended up in seventh place with just a few hours of work.

Has Kaggle helped you in your career?

Kaggle has led to lots of opportunities for me and has been a really great community to get to know. I’ve met lots of people and learned a lot throughout all the competitions I’ve participated in. But Kaggle is also how I got into machine learning in the first place – and I don’t think I would be in this field otherwise. So yes, it’s helped a lot.

What mistakes have you made in competitions in the past?

It’s quite easy to end up with a complicated solution that you can’t replicate from scratch, since chances are you’ll be using various versions of code and intermediate datasets in your final solution. Then, if you’re lucky enough to win, it can be very stressful to deliver working code to the host! If you’re doing well, it’s a good idea to pin down what your solution is and clean up your code.

It’s also easy to get into a situation where you use different validation sets for different models, or don’t retain validation predictions, which can make it hard to compare them or do meta-learning later on in the competition.

Are there any particular tools or libraries that you would recommend using for data analysis or machine learning?

I really like XGBoost and its newer cousin, LightGBM, which still tend to beat neural networks on tabular data. SHAP is really nice for explaining models (even complex ones), which can give you more insights into what to try next.
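
For readers unfamiliar with these libraries, here is a minimal sketch of the combination described above: a gradient-boosted tree model explained with SHAP. The dataset and the parameters are placeholders, not a recommendation from the interview:

    import shap
    import xgboost as xgb
    from sklearn.datasets import load_breast_cancer

    # A hypothetical tabular dataset standing in for competition data
    X, y = load_breast_cancer(return_X_y=True, as_frame=True)

    # A gradient-boosted tree model of the kind mentioned above
    model = xgb.XGBClassifier(n_estimators=200, max_depth=4)
    model.fit(X, y)

    # SHAP decomposes each prediction into per-feature contributions
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)

    # Global view: which features drive the model, and in which direction
    shap.summary_plot(shap_values, X)

The summary plot ranks features by their average impact on the model output, which is exactly the kind of insight that suggests what to try next.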

What’s the most important thing someone should keep in mind or do when they’re entering a competition?

I think it’s important to try not to get bogged down in implementing ultra-complicated solutions, and instead try to make incremental solutions.

Competitions now are a lot harder than when I first started out, so it’s a good idea to look at other people’s code (lots of people make this public during the competition) and try to learn from them. You might want to consider joining a team with some other Kagglers: competitions in teams have been the most fun competitions for me, and have always been a fantastic learning experience.

And finally: most ideas tend to not work – if you want to win a competition, you need to persevere and keep experimenting!

Kaggle has certainly been influential in the rich lives and careers of the previous three interviewees, and they are only just getting started. Below, we speak to two Kagglers who now hold senior roles in their respective companies, and whose long and fruitful journeys owe something to Kaggle as well.

Dan Becker

https://www.kaggle.com/dansbecker

First, we have Dan Becker, a Notebooks Grandmaster and Vice President of Product, Decision Intelligence, at DataRobot. Kaggle has played a significant part in Dan’s career.

Can you tell us about yourself?

I first tried using machine learning at a 3-person start-up in 2000 where we tried to use neural networks to help retailers optimize the reserve prices they set for items on eBay. We had no clue what we were doing, and we failed miserably.

By 2002, I was confident that machine learning could never work. I got a PhD in economics and took a job as an economist for the US government. I wanted to move to Colorado, but there weren’t many jobs there looking for economics PhDs. So I was looking for a less academic credential.

In 2010, I saw a newspaper article about the Heritage Health Prize. It was an early Kaggle competition with a $3 million prize. I still believed that simpler models like what I used as an economist would give better predictions than fancy machine learning models. So I started competing, thinking that a good score in this competition would be the credential I needed to find an interesting job in Colorado. My first submission to that competition was not last place, but it was pretty close. My heart sank when I watched my model get scored, and then saw everyone else was so far ahead of me. I briefly gave up any hope of doing well in the competition, but I was frustrated not even to be average.

I spent all my nights and weekends working on the competition to climb up the leaderboard. I relearned machine learning, which had progressed a lot in the 10 years since I’d first tried it. I’d learn more and upload a new model each day. It took a lot of time, but it was rewarding to march up the leaderboard each day. By the time my score was in the middle of the leaderboard, I thought continued work might get me in the top 10%. So I kept working. Soon I was in the top 10%, thinking I might get in the top 10 competitors.

When I was in the top 10, an analytics consulting company reached out to me to ask if I wanted to be hired and compete under their company name, which they would use for marketing. I told them I would do it if I could work from Colorado. So the Kaggle competition helped me achieve my original goal.

We finished in 2nd place. There was no prize for 2nd place, but everything I’ve done in my career since then has been enabled by this one Kaggle competition. It was a bigger success than I ever could have imagined.

How else has Kaggle helped you in your career?

Kaggle has almost entirely made my career. My first job as a data scientist came when someone recruited me off the leaderboard. My next job after that was working for Kaggle. Then I worked at DataRobot, whose recruiting strategy at the time was to hire people who had done well in Kaggle competitions. Then I went back to Kaggle to start Kaggle Learn, which is Kaggle’s data science education platform. The list goes on. Every job I’ve had in the last decade is clearly attributable to my initial Kaggle success.

As I switched from economics to data science, my Kaggle achievements were at the heart of why I was hired. Being further in my career now, I don’t think in terms of portfolios... and I’m fortunate that I’m recruited more than I look for work.

What’s your favorite kind of competition and why? In terms of techniques and solving approaches, what is your specialty on Kaggle?

I’ve been around the community for a long time, but I haven’t intensely dedicated myself to a competition in 7 or 8 years. I enjoy new types of competitions. For example, I was first exposed to deep learning in 2013 as part of Kaggle’s first competitions where deep learning was competitive. This was before Keras, TensorFlow, PyTorch, or any of the deep learning frameworks that exist today. No one in the community really knew how to do deep learning, so everyone was learning something new for the first time.

Kaggle also ran an adversarial modeling competition, where some people built models that tried to manipulate images slightly to fool other models. That was very experimental, and I don’t know if they’ll ever run anything like that again. But I really like the experimental stuff, when everyone in the community is figuring things out together in the forums.

How do you approach a Kaggle competition? How different is this approach to what you do in your day-to-day work?

The last few times I’ve done competitions, I focused on “what tooling can I build for this competition that would automate my work across projects?”. That hasn’t been especially successful, but it’s an interesting challenge. It’s very different from how I approach everything else professionally.

Outside of competitions, I LOVE analytics and just looking at data on interesting topics. I sometimes say that my strength as a data scientist is that I just look at the data (in ways that aren’t filtered by ML models).

I also spend a lot of time thinking about how we go from an ML model’s prediction to what decision we make. For example, if a machine learning model predicts that a grocery store will sell 1,000 mangos before the next shipment comes, how many should that grocery store hold in stock? Some people assume it’s 1,000... exactly what you forecast you can sell. That’s wrong.

You need to think about trade-offs between the cost of spoiling mangos if you buy too many vs the cost of running out. And what’s their shelf life? Can you carry extra stock until after your next shipment comes? There’s a lot of optimization to be done there that’s part of my day-to-day work, and it’s stuff that doesn’t show up in Kaggle competitions.

Tell us about a particularly challenging competition you entered, and what insights you used to tackle the task.

I tried to build an automated system that did joins and feature engineering for the Practice Fusion Diabetes Classification challenge. The main thing I learned was that if you have more than a few files, you still need a person to look at the data and understand what feature engineering makes sense.

In your experience, what do inexperienced Kagglers often overlook? What do you know now that you wish you’d known when you first started?

New participants don’t realize how high the bar is to do well in Kaggle competitions. They think they can jump in and score in the top 50% with a pretty generic approach... and that’s usually not true. The thing I was most surprised by was the value of using leaderboard scores for different models in assigning weights when ensembling previous submissions.

What mistakes have you made in competitions in the past?

I’ve screwed up last-minute details of submissions in multi-stage competitions several times (and ended up in last place or near last place as a result).

Are there any particular tools or libraries that you would recommend using for data analysis or machine learning?

Mostly the standard stuff.

Outside of Kaggle competitions, I personally like Altair for visualization... and I write a lot of SQL. SQL is designed for looking at simple aggregations or trends rather than building complex models, but I think that’s a feature rather than a bug.

Jeong-Yoon Lee

https://www.kaggle.com/jeongyoonlee

Finally, we have Jeong-Yoon Lee, a multiple-medal-winning Competitions Master and Senior Research Scientist in the Rankers and Search Algorithm Engineering team at Netflix Research.

Can you tell us about yourself?

My name is Jeong, and I’m a Senior Research Scientist at Netflix. I started on Kaggle back in 2011, when I finished my PhD and joined an analytics consulting start-up, Opera Solutions. There, I met avid Kaggle competitors including Michael Jahrer, and we participated in KDD Cups and Kaggle competitions together. Since then, even after leaving the company, I have continued working on competitions both as a competitor and as an organizer. Lately, I don’t spend as much time on Kaggle as I used to, but I still check it out from time to time to learn the latest tools and approaches in ML.

Has Kaggle helped you in your career?

Tremendously. First, it provides credentials in ML. Many hiring managers (when I was an interviewee) as well as candidates (when I was an interviewer) mentioned that my Kaggle track record had caught their attention. Second, it keeps me learning state-of-the-art approaches in ML. Having worked on over 100 competitions across different domains, I’m familiar with more approaches to almost any ML problem than my peers. Third, it provides a network of top-class data scientists across the world. I’ve met so many talented data scientists through Kaggle and enjoy working with them. I translated Abhishek Thakur’s book, organized a panel at KDD with Mario, Giba, and Abhishek, and am being interviewed for Luca’s book. ;)

In 2012, I used Factorization Machines, which were introduced by Steffen Rendle at KDD Cup 2012, and improved prediction performance by 30% over an existing SVM model within a month of joining a new company. At a start-up I co-founded, our main pitch was using ensemble algorithms to beat the market-standard linear regression. At Uber, I introduced adversarial validation to address covariate shift in the features of our machine learning pipelines.
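
Adversarial validation, mentioned here and covered earlier in this book, can be illustrated with a short, generic sketch; this is not the pipeline used at Uber, and the synthetic data and the classifier are placeholders:

    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    # Hypothetical train and test feature tables (placeholders for real competition data)
    rng = np.random.default_rng(0)
    cols = [f"f{i}" for i in range(5)]
    train = pd.DataFrame(rng.normal(0.0, 1.0, size=(1_000, 5)), columns=cols)
    test = pd.DataFrame(rng.normal(0.3, 1.0, size=(1_000, 5)), columns=cols)

    # Label each row by its origin: 0 = train, 1 = test
    X = pd.concat([train, test], ignore_index=True)
    y = np.array([0] * len(train) + [1] * len(test))

    # If a classifier can tell train and test apart (AUC well above 0.5),
    # the feature distributions have shifted between the two sets
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
    print(f"Adversarial validation AUC: {auc:.3f}")

An AUC close to 0.5 suggests the train and test feature distributions are similar; a clearly higher AUC points to covariate shift, and the classifier’s feature importances indicate which features drift the most.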

What’s your favorite kind of competition and why? In terms of techniques and solving approaches, what is your specialty on Kaggle?

I like competitions with small to medium-sized datasets, which are mostly tabular data competitions, because I can quickly iterate on different approaches even on my laptop, anytime, anywhere. During my peak time on Kaggle in 2015, I often built my solutions on airplanes or in between my babysitting shifts. My triplets were born in late 2014 and I was working at a new start-up I’d co-founded.

I don’t think I have any special modeling techniques, but my specialty is more around competition management, which includes recruiting team members, setting up a collaboration framework (e.g., Git, S3, Messenger, Wiki, internal leaderboard, cross-validation splits), helping the team work effectively throughout the competition, etc. So I’m not a competition Grandmaster myself, but was able to reach the top 10 because other Grandmasters liked to work with me.

How do you approach a Kaggle competition? How different is this approach to what you do in your day-to-day work?

I try to build a pipeline that enables fast iterations and incremental improvements. The more ideas you try, the better chance you have to do well in a competition. The principle applies to my day-to-day work as well. The scope is different, though. At work, we start by defining problems and identifying the data, while at Kaggle, both are given, and we start from EDA.

In your experience, what do inexperienced Kagglers often overlook? What do you know now that you wish you’d known when you first started?

Recently, I noticed that many users simply fork a Notebook shared by other users and fine-tune it to get better scores. Eventually what matters is learning, not the Kaggle ranking or points. I recommend that new Kagglers spend more time building their own solutions.

What’s the most important thing someone should keep in mind or do when they’re entering a competition?

It’s about learning, not about winning.

Do you use other competition platforms? How do they compare to Kaggle?

I’m advising Dacon AI, a Korean ML competition platform company. It started in 2018 and has hosted 96 competitions so far. It’s still in an early stage compared to Kaggle, but provides similar experiences to Korean users.

Participating in Kaggle Days and other Kaggle meetups

A good way to build connections with other Kagglers (and also to be more easily accepted into a team) is simply to meet them. Meetups and conferences have always been a good way to do so, even when they do not specifically focus on Kaggle competitions, because speakers may talk about their experiences on Kaggle or because the topics covered have appeared in Kaggle competitions. For instance, many Research competitions require successful competitors to write papers on their experience, and those papers may be presented or cited during a conference talk.

There were no special events directly connected with Kaggle until 2018, when LogicAI, a company created by Maria Parysz and Paweł Jankiewicz, arranged the first Kaggle Days event in Warsaw, Poland, in collaboration with Kaggle. They gathered over 100 participants and 8 Kaggle Grandmasters as speakers.

More Kaggle Days events followed, with the materials and talks from each event made available online.

Starting from the second event, in Paris, smaller events in the form of meetups were also held in various cities (over 50 meetups in 30 different locations). Participating in a major event or in a meetup is a very good opportunity to meet other Kagglers and make friends, and it can be helpful both for career purposes and for teaming up in future Kaggle competitions.

In fact, one of the authors found their next job in just this way.

Getting spotted and other job opportunities

For some time, Kaggle was a hotspot where employers could find rare competencies in data analysis and machine learning modeling. Kaggle itself offered a job board among the discussion forums, and many recruiters roamed the leaderboards looking for profiles to contact. Companies held contests explicitly to find candidates (Facebook, Intel, and Yelp arranged recruiting competitions for this purpose) or to conveniently pick up the best competitors after seeing them perform excellently on certain kinds of problems (as the insurance company AXA did after its telematics competitions). The peak of all this was marked by a Wired interview with Gilberto Titericz, which stated that “highly ranked solvers are flooded with job offers” (https://www.wired.com/story/solve-these-tough-data-problems-and-watch-job-offers-roll-in/).

Recently, things have changed somewhat and many Kagglers report that the best that you can expect when you win or score well in a competition is some contact from recruiters for a couple of months. Let’s look at how things have changed and why.

Nowadays, you seldom find job offers requiring Kaggle experience, since companies most often require previous experience in the field (even better, in the same industry or knowledge domain), an academic background in math-heavy disciplines, or certifications from Google, Amazon, or Microsoft. Your presence on Kaggle will still have some effect because it will allow you to:

  • Be spotted by recruiters that monitor Kaggle rankings and competitions
  • Be spotted by companies themselves, since many managers and human resource departments keep an eye on Kaggle profiles
  • Have some proof of your coding and machine learning ability that could help companies select you, perhaps without requiring you to take any further tests
  • Have specific experience of problems highly relevant to certain companies that you cannot acquire otherwise because data is not easily accessible to everyone (for instance, telematics, fraud detection, or deepfakes, which have all been topics of Kaggle competitions)

Seldom will your results and rankings be taken at face value, though, because it is difficult to distinguish the part that is actually due to your skill from other factors affecting the results that are of less interest to a company thinking of hiring you (for instance, the time you have available to devote to competitions, hardware availability, or some luck).

Your Kaggle rankings and results will more likely be noticed in the following cases:

  • You have scored well in a competition whose problem is particularly important for the company.
  • You have systematically scored well in multiple competitions around topics of interest for the company, a sign of real competency that means you are not simply labeling yourself a “data scientist” or a “machine learning engineer” without a solid basis.
  • Through your Kaggle participation, you are showing a true passion for data analysis, to the point where you are investing your free time in it without any compensation. This is a positive, but it may also turn into a double-edged sword and bring lower monetary offers unless you show that you recognize your value.

While they might not make the difference alone, your Kaggle rankings and results can act as differentiators. Recruiters and companies may use Kaggle rankings to compile lists of potential candidates. The two most closely watched rankings are Competitions and Notebooks (hence, they also see the most intense competition and the largest numbers of Grandmasters of the four ranked areas), but sometimes recruiters also look at the rankings of a specific competition. When certain rare competencies (for instance, in NLP or computer vision) are sought after, it is easier to find them in competitions that require you to use them skillfully in order to be successful.

Another great differentiator comes at interview time. You can quote your competitions to show how you solved problems, how you coded solutions, and how you interacted and collaborated with teammates. On these occasions, more than the ranking or medal you got from Kaggle, it is important to talk about the specifics of the Kaggle competition, such as the industry it referred to, the type of data you had to deal with and why it interested you, and also to present your actions during the competition using the STAR approach, often used in job interviews.

The STAR approach

In the STAR approach, you structure what you did in a competition around the framework of Situation, Task, Action, and Result. This method aims to have you talk more about behaviors than techniques, thus putting more emphasis on your own capabilities than on those of the algorithm you chose; anyone else could have used the same algorithm, but it was you who managed to use it so successfully.

The method works principally when dealing with success stories, but you can also apply it to unsuccessful ones, especially in situations where you gained important insights into the reasons for the failure that prevented you from failing in the same way again.

To apply the method, you break down your story into four components:

  • Situation: Describe the context and the details of the situation so the interviewer can understand, at a glance, the problems and opportunities
  • Task: Describe your specific role and responsibilities in the situation, helping to frame your individual contribution in terms of skills and behaviors
  • Action: Explain what action you took in order to handle the task
  • Result: Illustrate the results of your actions as well as the overall result

Some companies do explicitly ask for the STAR approach (or its relative, the Goal-Impact-Challenges-Finding method, where more emphasis is put on the results); others do not, but expect something similar.

The best answers are those that suit the values and objectives of the company you are interviewing for.

Since just reporting the rankings and medals you got in a competition may not be enough to impress your interviewer, reformulating your successful experience in a Kaggle competition is paramount. The approach can work whether you competed solo or in a team; in the latter case, an important aspect to describe is how you interacted with and positively influenced your teammates. Let’s discuss some ways you could do that.

First, you describe the situation that arose in the competition. This could be in the initial phases, in the experimentation phases, or in the final wrap-up. It is important you provide clear context in order for the listener to evaluate whether your behavior was correct for the situation. Be very detailed and explain the situation and why it required your attention and action.

Then, you should explain the task that you took on. For instance, it could be cleaning your data, performing exploratory analysis, creating a benchmark model, or continuously improving your solution.

Next, you describe how you executed the task. Here, it would be quite handy if you could present a Medium article or a GitHub project in support of your description (as we discussed in the previous chapter). Systematically presenting your experience and competence through well-written documentation and good coding will reinforce your value proposition in front of the interviewer.

Finally, you have to explain the result obtained, which could be either qualitative (for instance, how you coordinated the work of a team competing on Kaggle) or quantitative (for instance, how much your contribution affected the final result).

Summary (and some parting words)

In this chapter, we have discussed how competing on Kaggle can help improve your career prospects. We have touched on building connections, both by teaming up on competitions and participating in events related to past competitions, and also on using your Kaggle experience in order to find a new job. We have discussed how, based on our experience and the experience of other Kagglers, results on Kaggle alone cannot ensure that you get a position. However, they can help you get attention from recruiters and human resource departments and then reinforce how you present competencies in data science (if they are supported by a carefully-built portfolio, as we described in the previous chapter).

This chapter also marks the conclusion of the book. Through fourteen chapters, we have discussed Kaggle competitions, Datasets, Notebooks, and discussions. We covered technical topics in machine learning and deep learning (from evaluation metrics to simulation competitions) with the aim of helping you achieve more both on Kaggle and after Kaggle.

Having been involved in Kaggle competitions for ten years, we know very well that you can find everything you may need to know on Kaggle – but everything is dispersed across hundreds of competitions and thousands of Notebooks, discussions, and Datasets. Finding what you need, when you need it, can prove daunting for anyone starting off on Kaggle. We compiled what we think is essential, indispensable knowledge to guide you through all the competitions you may want to take part in. That is why this has not been a book on data science in a strict sense, but a book specifically on data science on Kaggle.

Aside from technical and practical hints, we also wanted to convey that, in over ten years, we have always found a way to turn our experiences on Kaggle into positive ones. You can re-read this work as a book that describes our endless journey through the world of data science competitions. A journey on Kaggle does not end when you get all the Grandmaster titles and rank first worldwide. It actually never ends, because you can re-invent how you participate and leverage your experience in competitions in endless ways. As this book ends, your journey on Kaggle begins, and we wish you a long, rich, and fruitful experience, as it has been for us. Have a great journey!

Join our book’s Discord space

Join the book’s Discord workspace for a monthly Ask me Anything session with the authors:

https://packt.link/KaggleDiscord
