Technology Apr 18, 2026 · 5 min read

Should We Still Care About Our Code?

DEV Community
by Alexander Ertli
As part of a new feature, I was looking at a "Human in the Loop" wrapper:

func (h *HITLWrapper) Exec(
    ctx context.Context,
    startTime time.Time,
    input any,
    debug bool,
    hook *taskengine.HookCall,
) (any, taskengine.DataType, error)

That’s a lot of code... but it’s not the interesting part here.

I wanted to take a look at something smaller, something simpler... a little helper:

// buildDiff generates a simple human-readable unified diff for file-write
// operations (write_file and sed). It avoids external dependencies by doing a
// straightforward line-level comparison.

That’s a solid contract. And here is what we got:

// buildDiff generates a simple human-readable unified diff for file-write
// operations (write_file and sed). It avoids external dependencies by doing a
// straightforward line-level comparison.
func buildDiff(hookName, toolName string, args map[string]any) string {
    switch {
    case hookName == "local_fs" && toolName == "write_file":
        path, _ := args["path"].(string)
        newContent, _ := args["content"].(string)
        if path == "" || newContent == "" {
            return ""
        }
        oldBytes, _ := os.ReadFile(path)
        return unifiedDiff(path, string(oldBytes), newContent)

    case hookName == "local_fs" && toolName == "sed":
        path, _ := args["path"].(string)
        pattern, _ := args["pattern"].(string)
        replacement, _ := args["replacement"].(string)
        if path == "" || pattern == "" {
            return ""
        }
        oldBytes, err := os.ReadFile(path)
        if err != nil {
            return ""
        }
        newContent := strings.ReplaceAll(string(oldBytes), pattern, replacement)
        return unifiedDiff(path, string(oldBytes), newContent)
    }
    return ""
}

Yep, this is AI-generated code. It's part of ~3,000 new or changed lines. Everything compiles, everything is tested, coverage looks good.

So… why even write about this?

It’s in the title: Should we still care about our code?

When you can generate 1,000 lines within minutes—and every line compiles and passes tests—code like this starts to look “good enough.”

Is that a problem?

You know what, let’s make the AI review its own code:

1. Empty content check is wrong

if path == "" || newContent == "" {
    return ""
}

Writing an empty file is a valid operation. This silently skips the diff for a legitimate write_file that empties a file.
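A minimal sketch of how the check could tell "argument missing" apart from "argument is an empty string", using Go's comma-ok type assertion. The helper name stringArg is hypothetical and not part of the original code.

```go
package main

import "fmt"

// stringArg is a hypothetical helper: it reports whether key is present and
// holds a string, so an empty string stays distinguishable from a missing
// or mistyped argument.
func stringArg(args map[string]any, key string) (string, bool) {
	v, ok := args[key].(string)
	return v, ok
}

func main() {
	args := map[string]any{"path": "notes.txt", "content": ""}

	// An empty string is still a present, valid value.
	content, ok := stringArg(args, "content")
	fmt.Printf("content=%q present=%t\n", content, ok)

	// A key that was never supplied is reported as absent.
	_, ok = stringArg(args, "mode")
	fmt.Printf("mode present=%t\n", ok)
}
```

With this shape, buildDiff could bail out only when the content argument is absent, and still produce a diff for a write that empties the file.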

2. Missing file → empty diff, not a "new file" diff

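When the target file doesn't exist, os.ReadFile returns an error, and the two branches handle it differently.

That means:

  • For write_file, the error is discarded, so a brand new file is diffed against empty content with nothing marking it as new. A permission error looks exactly the same.
  • For sed on a missing file, we return "" and the user sees no diff at all.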
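One way to separate "the file is new" from "the read failed" is to classify the error with errors.Is. This is a sketch only; readOld is a hypothetical helper, and how the caller renders a new-file diff is left open.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// readOld is a hypothetical helper. It returns the current content of path,
// reports isNew=true when the file does not exist yet, and passes every
// other read failure back to the caller instead of hiding it.
func readOld(path string) (string, bool, error) {
	b, err := os.ReadFile(path)
	switch {
	case err == nil:
		return string(b), false, nil
	case errors.Is(err, fs.ErrNotExist):
		return "", true, nil
	default:
		return "", false, fmt.Errorf("read %s: %w", path, err)
	}
}

func main() {
	// Assumes no file with this name exists in the working directory.
	_, isNew, err := readOld("no-such-file.txt")
	fmt.Println("isNew:", isNew, "err:", err)

	// Reading a directory fails with an error other than "not exist".
	_, _, err = readOld(".")
	fmt.Println("directory read failed:", err != nil)
}
```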

3. sed uses string replace, not regex

...
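To make the gap concrete, here is a small sketch comparing a literal replacement with a regular-expression one. previewRegexp is a hypothetical helper, and Go's regexp package uses RE2 syntax with $1-style group references, so it is only an approximation of what a real sed would do.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// previewRegexp is a hypothetical helper that applies pattern as a Go (RE2)
// regular expression and surfaces an invalid pattern as an error.
func previewRegexp(old, pattern, replacement string) (string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", fmt.Errorf("invalid pattern %q: %w", pattern, err)
	}
	return re.ReplaceAllString(old, replacement), nil
}

func main() {
	old := "port=8080\nport=9090\n"
	pattern := `port=[0-9]+`

	// Literal replacement: the pattern text never occurs verbatim,
	// so the content comes back unchanged.
	literal := strings.ReplaceAll(old, pattern, "port=0")

	// Regex replacement: both lines match.
	viaRegexp, err := previewRegexp(old, pattern, "port=0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("literal: %q\nregexp:  %q\n", literal, viaRegexp)
}
```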

4. unifiedDiff is a black box

...

That’s it? Now everything is good?
😄

You know what… if a human had written this, I would very politely reject both versions.

Why? Sure, we could run another round of AI review. Then another. And another.

But that’s not the problem.

Let’s go back to where we started:

func (h *HITLWrapper) Exec(
    ctx context.Context,
    startTime time.Time,
    input any,
    debug bool,
    hook *taskengine.HookCall,
) (any, taskengine.DataType, error)

What stands out here?

It returns an error.

Now compare that to the helper:

func buildDiff(hookName, toolName string, args map[string]any) string

See the difference?

Let’s zoom out even further:

// Package localhooks provides local hook integrations.
package localhooks

import (
    "context"
    "errors"
    "fmt"
    "os"
    "strings"
    "time"

    "github.com/contenox/contenox/hitlservice"
    "github.com/contenox/contenox/libtracker"
    "github.com/contenox/contenox/taskengine"
    "github.com/getkin/kin-openapi/openapi3"
)

See that import "os"?

And in the helper:

oldBytes, _ := os.ReadFile(path)

Outside of the ignored error—how do we know that path actually refers to something we can safely read via os.ReadFile?

It could be:

  • A relative path (./config.txt) – relative to what? The tool’s working directory? The agent’s sandbox?
  • An absolute path (/etc/hosts) – but the process might be containerized or restricted.
  • A virtual path (workspace://project/main.go) – the local_fs hook might understand this, but os.ReadFile won’t.
  • A dangerous path involving symlinks, .. traversal, or special files (/dev/random, /proc/self/mem).
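The cases above can be turned into an explicit guard. The sketch below is one possible shape, not the project's actual policy: resolveInRoot and the idea of a single root directory are assumptions. It checks paths lexically and rejects a symlink or special file as the final component, but a symlinked parent directory would still get through, so treat it as a starting point only.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// resolveInRoot is a hypothetical guard. It accepts only relative paths that
// stay lexically inside root and, if the target exists, requires it to be a
// regular file (Lstat does not follow a symlink in the last component).
func resolveInRoot(root, path string) (string, error) {
	if strings.Contains(path, "://") {
		return "", fmt.Errorf("%q looks like a virtual path, not a local file", path)
	}
	if !filepath.IsLocal(path) {
		return "", fmt.Errorf("%q is absolute or escapes the root", path)
	}
	full := filepath.Join(root, path)
	info, err := os.Lstat(full)
	if errors.Is(err, os.ErrNotExist) {
		return full, nil // not there yet: a new file inside the root
	}
	if err != nil {
		return "", err
	}
	if !info.Mode().IsRegular() {
		return "", fmt.Errorf("%q is not a regular file", path)
	}
	return full, nil
}

func main() {
	paths := []string{"./config.txt", "/etc/hosts", "../secret", "workspace://project/main.go"}
	for _, p := range paths {
		full, err := resolveInRoot(".", p)
		fmt.Printf("%-30s -> %q, err=%v\n", p, full, err)
	}
}
```

The point is less the specific checks than the fact that they are written down: whoever generates buildDiff now has a defined meaning of path to work against.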

The code compiles. The tests pass. But the assumptions are undefined.

And the model filled them in anyway.

So—should we still care about our code?

Yes. Definitely.

But not in the way we used to.

We’re not going to review thousands of generated lines line by line. We don’t—and realistically, we can’t.

What we can do is define the boundaries the code is allowed to operate within:

  • What does path actually mean?
  • What filesystem is accessible?
  • What errors must be handled vs ignored?
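As an illustration of answering those three questions in a signature, the sketch below passes the filesystem in as an io/fs.FS instead of reaching for the os package. The names oldContent and demoWorkspace are hypothetical, and the in-memory fstest.MapFS only stands in for whatever the local_fs hook really exposes.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"testing/fstest"
)

// oldContent is a hypothetical helper. The accessible filesystem is an
// explicit parameter, path must be a slash-separated path relative to its
// root, and only "file does not exist" is treated as a non-error.
func oldContent(fsys fs.FS, path string) (content string, exists bool, err error) {
	if !fs.ValidPath(path) {
		return "", false, fmt.Errorf("invalid workspace path %q", path)
	}
	b, err := fs.ReadFile(fsys, path)
	if errors.Is(err, fs.ErrNotExist) {
		return "", false, nil
	}
	if err != nil {
		return "", false, fmt.Errorf("read %s: %w", path, err)
	}
	return string(b), true, nil
}

// demoWorkspace builds a tiny in-memory filesystem for the example.
func demoWorkspace() fs.FS {
	return fstest.MapFS{"main.go": {Data: []byte("package main\n")}}
}

func main() {
	ws := demoWorkspace()
	for _, p := range []string{"main.go", "new.go", "../etc/passwd"} {
		content, exists, err := oldContent(ws, p)
		fmt.Printf("%q exists=%t content=%q err=%v\n", p, exists, content, err)
	}
}
```

A side effect of this design is that the helper becomes testable without touching the disk, which makes the constraint cheap to keep enforced.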

If we don’t define those constraints, the model will.

And its guesses will compile. They’ll pass tests. They’ll even look reasonable.

That’s the real danger.

So no—we don’t scale code review anymore.

We scale constraints.

Design first. Generate second.

And honestly, this is exactly the class of problem that led us to introduce a Human-in-the-Loop layer in the first place—not to review every line, but to enforce the boundaries the model can’t reliably infer.


This article was originally published by DEV Community and written by Alexander Ertli.
