Behind the Scenes at Digital Gravy: Engineering Practices That Power Etch

Many Etch users are curious about how we build our products at Digital Gravy. Here’s an inside look at our engineering initiatives that help us build the right product, stay on schedule, and keep our team energized.

Written by

Matteo Greco

Published on

24 October 2024
Engineering Leadership

Our Core Philosophy

Our goal is deceptively simple: build the right thing, at the right time, with the right level of depth, while keeping the team happy and healthy. I know, easier said than done, right?

To get into the nitty-gritty, let’s break this down into five key areas:

  1. Deciding what to work on
  2. Doing the work
  3. Documenting the work
  4. Deploying the work
  5. Supporting our users

How We Decide What to Work on

Some teams work like a “Widget Factory” [1]: users express needs, product people turn them into tasks, and engineers turn those into code. This often fails because engineers are passive participants in the process: they are just told what to do, with little context and no decision-making power. They depend on product people to tell them what’s next, and they can’t push back on unattainable goals, because all decisions are made behind closed doors.

We believe in decentralized decision-making. Our teams are empowered to make autonomous decisions about their work, which builds leadership skills and accelerates learning. Kevin and I provide guidance through a mix of our product + technical vision and our users’ feedback – a “this is where we are, and that is where we’d like to go” kind of thing. The team then gets the map and compass out, and figures out the best way to get there.

To identify the right next steps, we use a lightweight version of Event Modeling [2]. This methodology helps us plan the most efficient path forward by:

  • Mapping out the user journey through each feature
  • Identifying the necessary code changes to support that journey

As teams split these into tasks, they sometimes ask me for guidance on what to work on next. I usually take each task one by one and ask them these four questions:

  1. Is this task strategically aligned with our mission? Should we even do this?
  2. Is this the right time to take this task on? Is anything else more urgent?
  3. Is this task actionable? Is anything blocking it?
  4. Is this task small enough? Should we split it further?

These questions help us identify problematic tasks that either need to be broken down further, de-prioritized in favor of something else, or not done at all. The tasks that make it through are then added to our project management software, where we monitor their progress throughout a development cycle of one week.
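
As a rough illustration, the four questions above can be read as a small decision procedure. The sketch below is not our actual tooling: the `Task` fields, the `Verdict` names, and the choice to treat a blocked task the same as a badly timed one are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    DROP = "not done at all"
    DEFER = "de-prioritized in favor of something else"
    SPLIT = "needs to be broken down further"
    SCHEDULE = "added to this week's cycle"


@dataclass
class Task:
    # Hypothetical fields, one per question in the list above.
    title: str
    aligned_with_mission: bool
    right_time: bool
    actionable: bool
    small_enough: bool


def triage(task: Task) -> Verdict:
    """Ask the four questions in order and stop at the first 'no'."""
    if not task.aligned_with_mission:
        return Verdict.DROP
    # Assumption: a blocked task is deferred, just like a badly timed one.
    if not task.right_time or not task.actionable:
        return Verdict.DEFER
    if not task.small_enough:
        return Verdict.SPLIT
    return Verdict.SCHEDULE
```

The order matters: there is no point in splitting a task that should not be done at all, so alignment is checked first and size last.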

But our most impactful decision has undoubtedly been committing to at least one release per week, every Thursday. Our users’ feedback is crucial in validating our direction, and each release gives us an opportunity to gather it and learn. If we’re wrong, we’d rather find out in a week than in a month. In fact, even a week is often too long, and we’re working toward even more frequent releases.

How We Do the Work

Doing the work is often thought of as simply coding, but we’ve found there’s much more to it than that. We focus in particular on efficiency and collaboration:

Continuous Integration & Delivery (CI/CD) [3]

We’ve embraced a workflow that prioritizes working in the main branch of our codebase, or merging our work back into it as quickly as possible. If you’re new to the term, think of a branch as an independent copy of the code that developers can change without affecting anyone else. It’s as if you and your coworkers could each build your own version of the same page for your client’s website, testing different ideas without getting in each other’s way.

While this independence is valuable, you’re only going to deliver one version of each page to your client. That means that, at some point, you’ll have to merge all those independent versions of your page together. And where you and your coworkers have built the same section in different ways, you’ll have to pick: your version, their version, or a combination of them. This is called a merge conflict.

Working on separate branches for long periods of time is bound to create merge conflicts, and resolving them gets complicated once the branches have drifted too far apart. By staying close to the main branch, we make these time-consuming merge conflicts rarer and easier to solve.
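
To make the page analogy concrete, here is a minimal sketch of a three-way merge over page sections. It is a deliberate simplification: version control tools such as Git compare lines of text rather than named sections, and `merge_pages` and the sample data are hypothetical names invented for this example.

```python
def merge_pages(
    base: dict[str, str], mine: dict[str, str], theirs: dict[str, str]
) -> tuple[dict[str, str], list[str]]:
    """Merge two edited copies of a page against their common starting point.

    Returns the merged sections plus the names of sections in conflict.
    """
    merged: dict[str, str] = {}
    conflicts: list[str] = []
    for section in sorted(base.keys() | mine.keys() | theirs.keys()):
        b, m, t = base.get(section), mine.get(section), theirs.get(section)
        if m == t:
            result = m  # both copies agree, so there is nothing to decide
        elif m == b:
            result = t  # only their copy changed this section
        elif t == b:
            result = m  # only my copy changed this section
        else:
            conflicts.append(section)  # both changed it, in different ways
            continue
        if result is not None:  # None means the section was deleted
            merged[section] = result
    return merged, conflicts


base = {"hero": "Welcome", "footer": "Contact us"}
mine = {"hero": "Welcome back", "footer": "Contact us"}
theirs = {"hero": "Hello there", "footer": "Contact the team"}

merged, conflicts = merge_pages(base, mine, theirs)
print(merged)     # {'footer': 'Contact the team'}
print(conflicts)  # ['hero']
```

The footer merges cleanly because only one copy touched it. The hero section was changed on both sides, so a person has to choose. The longer two copies live apart, the more sections end up in that second bucket.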

We also maintain a strict policy of keeping the main branch in a releasable state. This means that the main branch is as error-free as we can guarantee it, and we could package up a release at any time, confident it would install and work without issues. In order for this to happen, the teams have to prioritize issues that show up in the main branch over any other coding activity. This way, we also make sure not to propagate an issue further, affecting subsequent changes to the code.

Quality Assurance

In order to ensure that the main branch is always in a releasable state, we employ testing automation. Our infrastructure runs a series of tests on every change committed to our codebase. When a change causes a test to fail, the entire team is notified immediately and can address the issue.

As we write new code, we simultaneously create tests to verify it works as intended. And when a new bug pops up, we turn it into more tests too. This helps us enrich our testing infrastructure, making old bugs less likely to resurface in the future.
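
As an illustration of turning a bug into a test, consider the sketch below. The `slugify` helper and the bug it once had are hypothetical and not taken from the Etch codebase; the test functions use plain assertions, which a runner such as pytest would collect automatically.

```python
import re


def slugify(title: str) -> str:
    """Turn a page title into a URL-friendly slug (hypothetical helper)."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    # Stripping the edges is the fix for the hypothetical bug described below.
    return slug.strip("-")


def test_basic_title():
    # Written alongside the original code.
    assert slugify("Hello World") == "hello-world"


def test_trailing_punctuation_regression():
    # Added after a hypothetical bug report: titles ending in punctuation
    # used to produce a slug with a dangling hyphen ("hello-world-").
    assert slugify("Hello World!") == "hello-world"
```

Once the second test exists, the automated checks described above would fail on any future change that reintroduces the dangling hyphen, before that change could reach a release.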

Team Collaboration

To accelerate development while preventing knowledge silos, we encourage each team member to pair with other members at least once per week. During these sessions, team members review each other’s code, share new approaches to solving problems, and ultimately elevate everyone’s skills.

We also take a pragmatic approach to technical debt. Everyone takes some on, often as a tool to move faster. However, teams often let it sit there for too long, until it becomes a serious problem that’s hard to even get started on.

Instead of letting it accumulate, our teams are encouraged to perform small refactoring tasks as part of their regular development cycle. Small changes are easier to manage and safer to deploy than large-scale rewrites. We’ve found that this approach works really well.

How We Document

Most engineers would probably agree that documenting work is their least favorite part of the job. In an effort to keep the right information available to the right people, while keeping our team members happy, we take a lean approach here:

  • Architecture Decision Records (ADRs) [4]: We use simple text files stored with our code that capture important decisions, their timing, rationale, and implications.
  • Project Management: We use Linear [5] for task tracking, with plenty of automations to help keep our engineers focused on the most valuable part of the job.
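
For readers who haven’t seen an ADR before, the sketch below shows roughly what one of those text files might hold. The field names, the file naming scheme, and the placeholder date are assumptions made for illustration; they are not our actual template.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DecisionRecord:
    """One architecture decision; all field names are illustrative."""
    number: int
    title: str
    decided_on: date
    decision: str
    rationale: str
    implications: str

    def filename(self) -> str:
        # Numbered, slugged file name so records sort in the order they were made.
        slug = "-".join(self.title.lower().split())
        return f"{self.number:04d}-{slug}.txt"

    def render(self) -> str:
        # Plain text, so the record can be reviewed like any other code change.
        lines = [
            f"{self.number}. {self.title}",
            f"Date: {self.decided_on.isoformat()}",
            "",
            "Decision:",
            self.decision,
            "",
            "Rationale:",
            self.rationale,
            "",
            "Implications:",
            self.implications,
        ]
        return "\n".join(lines) + "\n"


record = DecisionRecord(
    number=1,
    title="Release every Thursday",
    decided_on=date(2024, 1, 1),  # placeholder date, not the real one
    decision="Ship at least one release per week.",
    rationale="Frequent releases let user feedback correct our direction sooner.",
    implications="The main branch has to stay in a releasable state.",
)
print(record.filename())
print(record.render())
```

Keeping the record this small is the point: a few short fields are quick to write, so the decision actually gets captured while it is still fresh.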

One aspect we’re working on improving is user-facing documentation.

How We Deploy

When it comes to making our changes available to our users, we’ve focused on building an infrastructure that reduces manual effort to a minimum. That means our releasers are happier and less likely to make mistakes. Packaging up a new release requires just a few clicks, and every deployment goes through automated checks and verifications before being made available.

Looking ahead, we’re working on an even more streamlined process that will allow us to deploy changes directly to a testing website that you can clone. This would give our users the ability to check out the absolute latest changes we’ve produced, while maintaining the necessary safety checks. Less overhead for us means faster feedback from our users, which in turn means faster improvements and an overall better product.

How We Support Our Users

Like most software companies, we have a dedicated team of support specialists – if you’re a user, you likely know them well. But we try to take it a step further by encouraging each team member, regardless of their job description, to spend 15 minutes a day on our community forums, looking for users who need help. If they already know the answer, they can just provide it. If they don’t, but can figure it out, they should make a note to help in the next couple of days (sooner, if the matter is urgent). And if they can’t help at all, they should at least tag someone who can, with Kevin and me as a backup option.

This accomplishes two things:

  1. It gives our users answers faster
  2. It gives our team members a reality check: How are real people using our software? What are they struggling with? What should we work on next?

The Science Behind Our Practices

Most of these practices are based on research from the DORA [6] research program and are considered industry best practices. While our implementation of some practices is more mature than others, they’ve all proven effective for our team.

Have questions about how we work? Drop them in the comments below or ask me in an upcoming “Etch Developer Hours” – I’m happy to dive deeper into any aspect of our engineering practices.

  1. “Yes caviar is great, here’s a ham sandwich” | Swizec Teller
  2. Event Modeling – Designing Modern Information Systems
  3. Minimum Viable CD | Minimum Viable Continuous Delivery
  4. adr.github.io
  5. Linear – Plan and build products
  6. DORA | Get Better at Getting Better