How do you onboard newbies onto an existing codebase?

For several years now, the number of open source projects has been growing, and the number of developers increases every year. This raises the question of how to onboard newcomers onto these projects. It's easy to find articles on the internet explaining how to start contributing to an existing codebase. But today we are going to reverse the question: as a team, how can we help new contributors understand and master the codebase faster and better? Whether the project is small or big, closed source or open to the world, this is an essential and yet often forgotten aspect of code quality. The topic is becoming even more important with the current trend towards remote work.


Documentation, of course

We are going to start with an obvious, yet indispensable topic: the documentation. Of course, we are talking about the developer documentation and not the user documentation. In projects with many contributors on GitHub, like React, TensorFlow or Angular, we regularly find two files for newcomers: a code of conduct and a contributing file.


The first one explains how to be part of the team or community by defining the standards of an environment that respects everyone. This is where we exclude unprofessional behaviour and demand respect for everyone and everyone's work. These rules, although self-evident, are important to write down in large-scale projects. For smaller projects, and especially for private ones, it could be a general note for the whole company or team.


The second file, on the other hand, is much more oriented towards the technical side of contributing to the project. It contains the processes and rules that drive the work on the repository. It's the source of truth for all developers, not only new ones. Whenever there is a question about the merge strategy, pull request requirements or commit message format, for example, the contributing file must contain the answer. Every decision about development strategy should be explained in plain sight. You can never be too precise: it's better to be too detailed than to leave blurred lines that outsiders can misinterpret.


Now that new developers know how to participate, they need to understand the project, and for that they should read the technical documentation. For many programmers this is the most difficult manual to write, but it's one of the most important. It is essential for discovering the technical complexity of the project, and it is also a good companion during the early stages, guiding new people as they learn the codebase.


Once you have all of that, don't forget to update it. The project evolves, and nobody can write down all the necessary information alone, so the documentation must be updated along with the project itself. Having documentation is good; up-to-date documentation is way better!


Your homemade TDD

Now that your new member has read and learned alone, it's time to share. Lots of teams skip this step and go straight to programming, but at Ponicode, we don't do things like everybody else. We love doing TDD! No, I'm not talking about Test Driven Development but about Technical Deep Dive. The idea is for a senior developer to spend time with a junior one to explain the project architecture and answer questions. This may seem obvious, but it is often the simplest things that work best. We simply chose to make this workshop mandatory for everyone. It helps us answer questions quickly during onboarding instead of leaving developers in the dark for days.


As mentioned above, the next phase is programming. But once again, it's not about dropping your new friend alone in the darkest part of the codebase. So we chose to set up a pair programming workshop. I think you know the principle: two or more developers work together on the same issue. It's a perfect way to continue the previous deep dive and to pass on work habits and best practices, in both directions. It's not only about training a developer; above all, it's about sharing knowledge and ideas. A fresh eye can bring good ideas and highlight bugs or possible improvements. As time passes, pair programming becomes less necessary, but I think it's a practice worth keeping even after onboarding to maintain team spirit and spread knowledge.


Issue flag, not pirate flag

It's often difficult to find ideal first tasks when a new member arrives. This is paradoxical, because projects are always full of small bugs or non-urgent features that could be worked on but that nobody thinks of at that moment. This is why many projects have a "new member" label on their GitHub issues to indicate that they are a good place to start. It doesn't matter if you are on Trello, Jira or even Excel: the idea is to maintain a list of topics that can be given to newcomers. To be really useful, this list should not simply contain the name of the feature or bug. Ideally, each entry has as much detail as possible, such as:

  • The desired behaviour, or the steps to reproduce if it's a bug
  • The name of the reporter, so the newcomer knows whom to ask
  • A screenshot if necessary
  • The function/filename where the code must be updated (if possible)
  • Any indication on the implementation (if possible)

Basically everything that can guide the developer.
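
As an illustration, the details listed above can be sketched as a small data structure. This is only a sketch under assumptions: the StarterIssue class, its field names and the missing_details helper are hypothetical, and which fields count as required is our own choice for the example, not something imposed by GitHub, Trello or Jira.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StarterIssue:
    """Hypothetical record for one entry in the newcomer topic list."""
    title: str
    # Desired behaviour, or the steps to reproduce if it's a bug
    expected_behaviour: str = ""
    # Person the newcomer can ask questions to
    reporter: str = ""
    # Optional extras, filled in when possible
    screenshot: Optional[str] = None
    code_location: Optional[str] = None
    implementation_hint: Optional[str] = None


def missing_details(issue: StarterIssue) -> List[str]:
    """Return the names of the required fields that are still empty."""
    required = ("title", "expected_behaviour", "reporter")
    return [name for name in required if not getattr(issue, name).strip()]


# Example usage: an entry with only a title is not ready for a newcomer yet.
draft = StarterIssue(title="Fix typo in login error message")
print(missing_details(draft))
```

A check like this could run before an entry gets the newcomer label, so that an incomplete ticket is completed by the team rather than discovered by the new member.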


You need to test - Unit test

I'm not going to insult you by explaining all the benefits of tests, and especially unit tests. We're going to talk about their advantages for newcomers. I think it's very reassuring to know that some tests will fail if you break something. Detecting regressions is literally the purpose of unit testing, but once again, simple ideas are often the best ones. A real plus is to run unit tests automatically in a continuous integration pipeline so you won't forget to run them. Tests are also good code samples for understanding how a function or method is used.
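
To make this concrete, here is a minimal sketch of a unit test that doubles as a usage sample. The slugify function and its tests are invented for the example and are not taken from any real project; they use Python's standard unittest module.

```python
import unittest


def slugify(title: str) -> str:
    """Turn a title into a lowercase, dash-separated slug (example function)."""
    words = title.lower().split()
    return "-".join(words)


class SlugifyTest(unittest.TestCase):
    """Each test shows a newcomer how slugify is meant to be called."""

    def test_lowercases_and_joins_words(self):
        self.assertEqual(slugify("Hello New Member"), "hello-new-member")

    def test_collapses_extra_spaces(self):
        # If someone later replaces split() with split(" "), this test fails
        # and the regression is caught before it is merged.
        self.assertEqual(slugify("  Hello   World "), "hello-world")
```

Saved in a test file, these tests can be run with "python -m unittest", which is also the kind of command a continuous integration job could call on every pull request.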


Review

Last but not least, code review. In my personal opinion, review is the most important part of the development process. Beyond reading, testing and requesting changes on the pull request, this step should be a great place for the team to share. Pull request reviews are the best place to ask questions about how a piece of code works or about any implementation detail. The comment section should not be seen as a punishment but as a real opportunity for pair programming and knowledge sharing. Personally, I've never learned more about a project and its best practices than by reading and asking questions about other people's code.


There is still a lot to say about onboarding onto a complex codebase, such as code formatting, folder structure, comment policy or naming conventions, but here we have covered what matters most to us. Our new member training program is constantly improving and we love sharing about it. So don't hesitate to send an email to ping@ponicode.com to say hi!