Life of a D-Nitro Engineer - Operations Edition
D-Nitro is a developer productivity and enablement team within KPN’s Technology Solutions TechBase department. With a team of 5, we help 2000 other engineers. What’s that like, and what is it that I actually do?
I joined D-Nitro 2.5 years ago. Yes, I joined a new team in the week before lockdown. Luckily I was already quite familiar with my new colleagues, having used D-Nitro’s paved road and services for the 2 years before that. That made onboarding to a new team a more pleasant experience in such uncertain times.
My colleagues have done a great job describing what a typical day looks like and the range of things we work on, so I’d suggest you take a look at their blogs first.
I’d like to show you another angle of our work.
As you can imagine, we get a lot of questions about the paved road, best practices, tools we offer, and even about subjects we don’t touch or know anything about. Most people reach out to us on our designated Slack channel, some email us, and others, well, others just know how to find us, no matter how hard we try to hide.
It quickly became clear to the team that answering all these questions ad hoc wasn’t making anyone happy. A lot of the innovation work we do requires concentration, and having that broken up every 10 minutes by an incoming question wasn’t doing our happiness or productivity any good.
This is where we introduced the concept of Operator of the Week in our team: one person who, for a full week, doesn’t do any sprint work and focuses only on Operations. The same person is also on call for our out-of-office-hours incident phone line, which luckily hasn’t been called in a long time (knock on wood).
Start of the week
So what does my week as Operator of the Week look like? Every Monday morning at 10 AM we have a week start meeting where we discuss the weekend, the previous week, and what we’ll be focussing on this week. This is also the handover moment for our Operator of the Week. Anything that needs to be chased? Any people waiting for answers? At 10 AM my wrist will also start to vibrate to let me know that the out-of-office-hours incident line has been forwarded to my phone number.
One of the things we offer is ‘Jenkins as a Service’: every team can request a Jenkins instance in the cloud and have it running within 15 minutes, and we make sure it’s always running the latest updates. We also manage an array of shared agents that teams can run their pipelines on. With 160 teams each having their own instance (sometimes multiple), you can imagine that Jenkins alone generates quite a lot of support questions.
Just this week we got questions about a build being slower than normal. Sometimes it’s on us because an agent is a little overworked; other times we can give the end user some pointers on how to optimise their build. We’ve built up quite a collection of knowledge base articles where we share best practices, examples, and pointers.
Looking into this, we noticed we were losing a lot of time in the back and forth with users: not knowing what team someone was from, what instance was having an issue, or even what pipeline they were asking about. We didn’t want to go corporate and require a ticket system. We like that the barrier to entry is low, especially since a lot of people also help each other in our channel. It’s nice to have that all out in the open.
Making everyone’s life easier
So let’s backtrack a little. Earlier on I mentioned that the Operator of the Week doesn’t have to do any normal sprint work. What I didn’t explicitly say is that they are still expected to contribute. Answering questions in Slack isn’t a full-time job, nor do we want it to be.
The time between questions is used to optimise our operations workflow. Think about setting up alerting for expiring certificates, or building self-service flows for common requests we receive, like creating an Artifactory repo for Docker, requesting a new LDAP group and matching Git project for your team, or force restarting a Jenkins instance that got stuck on something. It’s about identifying the biggest pain points and making life better for yourself and our end customers.
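To make the certificate alerting idea a bit more concrete, here is a minimal sketch of the kind of check it boils down to. It is not our actual setup: the host names, the 30-day warning window, and the function name `expiring_soon` are all made up for illustration, and the expiry dates are assumed to have been collected elsewhere.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

# Assumed threshold for illustration; pick whatever lead time suits your team.
WARNING_WINDOW = timedelta(days=30)


def expiring_soon(
    certs: Dict[str, datetime],
    now: datetime,
    window: timedelta = WARNING_WINDOW,
) -> List[str]:
    """Return the names of certificates that expire within `window` of `now`.

    Certificates that have already expired are included as well,
    since their remaining lifetime is negative.
    """
    return sorted(
        name for name, not_after in certs.items() if not_after - now <= window
    )


if __name__ == "__main__":
    # Hypothetical inventory of certificate expiry dates.
    inventory = {
        "jenkins.example.com": datetime(2022, 9, 15, tzinfo=timezone.utc),
        "artifactory.example.com": datetime(2023, 1, 1, tzinfo=timezone.utc),
    }
    today = datetime(2022, 9, 1, tzinfo=timezone.utc)
    for host in expiring_soon(inventory, today):
        print(f"Certificate for {host} needs renewing soon")
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps a check like this easy to test.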
D-3PO, pleasure to meet you!
One of those ‘Operations’ inventions is D-3PO, our very own Slack bot that started out as an April Fools joke that would ‘letmegooglethatforyou.com’ your question. Over the last year, though, D-3PO has become a fully fledged Slack bot. We can set up automatic replies for certain trigger words: if someone mentions JIRA during planned maintenance, we can have it automatically reply with when it will be available again. When someone asks a question about Jenkins but doesn’t put a link to their instance in their message, D-3PO will ask them to do so. This saves us a lot of time in back and forth and gives us the room to focus on the actual problems.
Often we’ll get a request from a team that’s struggling with CI/CD, releasing, or any other D-Nitro adjacent subject. They’ll ask us for a session to discuss best practices, but also to try and pick our brains for better solutions. Although we love doing those, and we always try to make some time available, it’s not the most scalable solution. When possible we will try to record sessions for later use. One of the great parts, however, is that we learn a lot from what we see other people struggle with, giving us both ideas and incentive to make things easier and clearer for the end user.
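For the curious, the two rules described above could be sketched roughly like this. This is an illustration of the idea, not D-3PO’s real code: the function name `auto_reply`, the reply texts, the maintenance end time, and the "any URL counts as a link" check are all assumptions, and the Slack wiring is left out entirely.

```python
import re
from typing import Optional

# Hypothetical trigger words and replies; the real ones are configured per maintenance.
MAINTENANCE_REPLIES = {
    "jira": "JIRA is down for planned maintenance and should be back at 14:00.",
}

# Assumption: any URL in the message counts as a link to the Jenkins instance.
URL_PATTERN = re.compile(r"https?://\S+")

ASK_FOR_LINK = "Could you add a link to your Jenkins instance so we can take a look?"


def auto_reply(message: str) -> Optional[str]:
    """Return an automatic reply for `message`, or None if a human should answer."""
    text = message.lower()

    # Rule 1: a trigger word was mentioned while maintenance is ongoing.
    for trigger, reply in MAINTENANCE_REPLIES.items():
        if re.search(rf"\b{re.escape(trigger)}\b", text):
            return reply

    # Rule 2: a Jenkins question without a link to the instance.
    if re.search(r"\bjenkins\b", text) and not URL_PATTERN.search(message):
        return ASK_FOR_LINK

    return None


if __name__ == "__main__":
    print(auto_reply("Is JIRA down for anyone else?"))
    print(auto_reply("My Jenkins build is really slow today"))
```

Matching on whole words rather than substrings avoids the bot jumping in when a trigger word only appears inside another word.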
You’re never alone
Sometimes, when the Moon is positioned just right and the workload is high enough, we’ll hit a perfect storm where multiple things go wrong at once. Luckily we have both a great team and a great department. In those cases my colleagues will help out where possible to make sure our services are right back on track. Never, while being part of D-Nitro and TechBase, have I experienced a moment where it was just me by myself against the world.
We are hiring! If you read the above or the blogs of my teammates and are interested in joining us, you are in luck: for the first time in 4.5 years we have a public vacancy, so make sure to check it out.