From Vibes to Specs

Two Ways I Built the Same Search-and-Replace CLI Tool

Oct 07, 2025

Vibe coding is often the fastest way to get an idea off the ground. Even though I work at *codeplain, where we focus on spec-driven development, I still vibe-code all the time.

One of those moments came up in a Node.js project I was working on, where I realized the LLM had hard-coded testing secrets into my repo. Manually patching each file felt tedious and error-prone. Instead, I saw an opportunity to automate it: Build a CLI that takes a target folder and a YAML secrets mapping, then recursively replaces real secrets with mocked ones. I called it Search-and-Replace CLI.

My goal was to build the CLI quickly and reliably without spending too much time writing code by hand. Vibe coding — building by chatting with an AI and iterating in natural language — is my usual way to get projects off the ground fast. However, I also wanted to see if a structured, test-backed approach could save me from the trial-and-error loop I usually hit with vibe coding.

So I built the same CLI twice — once by vibes, once by specs — to find out.

Approach (a): Vibe Code the CLI

My first approach was vibe coding. I wrote a prompt for Cursor, hoping for a one-shot solution.

Goal: Recursively replace secrets in all text files of a target folder

Inputs:
  --target-folder <path> (validate path exists)
  --secrets-file <yaml>. See secrets_config.yaml for the schema.

Rules:
  - Python CLI
  - Read YAML mappings (real → mock)
  - Skip binaries
  - Overwrite files in place
  - Log processed files and replacement counts for every file to stdout
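
To make these rules concrete, here is a minimal sketch of the replace engine the prompt describes. It is an illustration only, not the draft Cursor generated: the function names, the NUL-byte heuristic for spotting binaries, and the UTF-8 assumption are all hypothetical choices, and the mapping is passed in as an already-parsed dict.

```python
import os


def looks_binary(path, probe_size=1024):
    """Heuristic (an assumption): a NUL byte near the start means binary."""
    with open(path, "rb") as fh:
        return b"\0" in fh.read(probe_size)


def replace_in_file(path, mapping):
    """Rewrite one file in place; return how many replacements were made."""
    # newline="" keeps the file's original line endings untouched.
    with open(path, "r", encoding="utf-8", newline="") as fh:
        text = fh.read()
    count = 0
    for real, mock in mapping.items():
        count += text.count(real)
        text = text.replace(real, mock)
    if count:
        with open(path, "w", encoding="utf-8", newline="") as fh:
            fh.write(text)
    return count


def replace_in_tree(target_folder, mapping):
    """Walk target_folder recursively; return {path: replacement count}."""
    report = {}
    for dirpath, _dirnames, filenames in os.walk(target_folder):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if looks_binary(path):
                continue
            try:
                report[path] = replace_in_file(path, mapping)
            except UnicodeDecodeError:
                continue  # not valid UTF-8: treat as non-text and skip
    return report
```

Printing each entry of the returned report would cover the logging rule; argument parsing and YAML loading are left out to keep the sketch short.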

Cursor produced a runnable draft in two minutes. To validate the generated code, I manually created test fixture folders (e.g., example1-no-replace/, example2-replace-secret2/), ran the CLI against them, and inspected the diffs between the outputs and the fixtures.
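
A check like this can be scripted. The sketch below compares a folder the CLI has processed against a folder holding the expected result and lists every file that is missing, extra, or different. The post does not describe how the fixtures were laid out, so the two-folder arrangement and the helper names are assumptions.

```python
import os


def list_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def diff_trees(actual_root, expected_root):
    """Return sorted relative paths that are missing, extra, or differ."""
    actual = list_files(actual_root)
    expected = list_files(expected_root)
    mismatches = set(actual ^ expected)  # present on one side only
    for rel in actual & expected:
        with open(os.path.join(actual_root, rel), "rb") as a, \
                open(os.path.join(expected_root, rel), "rb") as b:
            if a.read() != b.read():
                mismatches.add(rel)
    return sorted(mismatches)
```

An empty list means the processed folder matches the expected one byte for byte.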

The first snag was that the generated code validated the YAML schema incorrectly. Fixing it took a few iterative passes with the LLM, plus hand checks to confirm the mapping rules. Once it was “working,” I tried a refactor for readability, which broke earlier cases and sent me back into the generate → test → fix loop. After a couple of attempts, I reverted to the “working” version and retried the refactoring step.
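
For a sense of what such validation involves, here is a minimal sketch. The actual schema lives in secrets_config.yaml and is not shown in this post, so the flat "real secret → mock secret" mapping of non-empty strings is purely an assumption, as is the function name. The input is the already-parsed YAML (for example, the result of PyYAML's `yaml.safe_load`).

```python
def validate_secrets_mapping(data):
    """Return a list of problems; an empty list means the mapping looks valid.

    Assumed schema (hypothetical): a flat mapping of real secret -> mock
    secret, where both sides are non-empty strings.
    """
    if not isinstance(data, dict):
        return ["top level must be a mapping of real secret -> mock secret"]
    problems = []
    for real, mock in data.items():
        if not isinstance(real, str) or not real:
            problems.append(f"key {real!r} must be a non-empty string")
        if not isinstance(mock, str) or not mock:
            problems.append(f"value for {real!r} must be a non-empty string")
    return problems
```

A check like this catches one easy-to-miss case: an unquoted numeric value in YAML is loaded as an integer rather than a string, which would otherwise fail later during replacement.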

Regenerations weren’t stable either. In one run, Cursor scaffolded test folders for me; in the next, it dropped them entirely. YAML validation appeared in some versions and disappeared in others. This variability meant I had to rely on manual testing for every change, since I couldn’t assume consistency between generations.

In short, vibe coding gave me speed up front — I had a running CLI in minutes — but without guardrails to ensure consistency, each edit risked breaking what was already working. Manual testing became the only way to build trust in the tool, and that quickly turned into the bottleneck.

Approach (b): Spec‑Driven

For the second attempt, I built the search-and-replace CLI with ***plain, the specification language we’re developing at *codeplain. Unlike vibe coding, where the AI generates code directly from a prompt, ***plain starts from a specification written in extended Markdown [1].

I broke the work into three verifiable units:

1. CLI scaffold: set up the entry point and command structure. ***plain supports templating, so this is added automatically by the “include” statement below.
2. Arguments & config: parse --target-folder and --secrets-file.
3. Recursive replace engine: walk the directory tree and replace in place.

In ***plain, we refer to each such building block as a functional requirement. Decomposed into these smaller units, the work translates into the following ***plain specification [2]:

{% include "python-console-app-template.plain", main_executable_file_name: "search_and_replace.py" %}

# Search and replace recursively

***Definitions:***

- The Secrets File is a YAML file that contains a mapping from the real secrets to the mock secrets.
- The Target Folder is a folder that contains the files in which to replace secrets.

***Functional Requirements:***

- The App should accept options `--target-folder` (The Target Folder) and `--secrets-file` (The Secrets File). The Secrets File [secrets_config.yaml](secrets_config.yaml) contains the expected schema of The Secrets File.

- The App should replace all original secrets with mock values recursively across The Target Folder.

For each functional requirement (FR), *codeplain generates not only the implementation code but also unit and conformance tests, and then runs them. It doesn’t proceed to the next FR until the current tests pass. When starting a new FR, all previously generated tests are run to catch regressions early.

For example, if generating FR3 (“replace recursively”) breaks FR2 (“reading --target-folder and --secrets-file”), FR2’s tests fail and halt progress until the problem is corrected. That guardrail matters more as the project grows.
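
The gating logic described here can be sketched in a few lines. This is not *codeplain’s implementation: `generate` and `run_suite` are hypothetical callbacks standing in for code generation and test execution, and where a real system would presumably try to repair the failure, this sketch simply stops.

```python
def render_requirements(requirements, generate, run_suite):
    """Process requirements in order, gating each step on all tests so far.

    generate(fr) returns a test suite for that requirement, and
    run_suite(suite) returns True when the suite passes. Both are
    hypothetical callbacks supplied by the caller.
    """
    suites = []  # (requirement, suite) pairs accumulated so far
    for fr in requirements:
        suites.append((fr, generate(fr)))
        # Re-run every suite, old and new, to catch regressions early.
        failed = [name for name, suite in suites if not run_suite(suite)]
        if failed:
            raise RuntimeError(f"while rendering {fr}: tests failed for {failed}")
    return [name for name, _suite in suites]
```

Because earlier suites are re-run at every step, a regression surfaces at the requirement that introduced it rather than at the end.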

In ***plain, both implementation and conformance tests come from the same specification, but the conformance tests are generated independently. They treat the program as a black box and verify only observable outcomes — exit codes, logs, and file changes — against the spec, not its implementation code.
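
To illustrate what “black box” means in practice, here is a hand-written sketch of such a check. It is not a test *codeplain generated. It starts the program as a subprocess and looks only at the exit code, stdout, and the resulting file. The flat `real: mock` layout of the secrets file and the expectation that processed file names appear in the log are assumptions.

```python
import os
import subprocess
import tempfile


def run_conformance_check(cli_command):
    """Black-box check: run the CLI and inspect only observable outcomes.

    cli_command is the argv prefix that starts the program, for example
    [sys.executable, "search_and_replace.py"]. The secrets file layout
    written below is an assumed schema.
    """
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "target")
        os.makedirs(os.path.join(target, "nested"))
        victim = os.path.join(target, "nested", "settings.txt")
        with open(victim, "w", encoding="utf-8") as fh:
            fh.write("token=REAL_SECRET\n")
        secrets_file = os.path.join(workdir, "secrets.yaml")
        with open(secrets_file, "w", encoding="utf-8") as fh:
            fh.write("REAL_SECRET: MOCK_SECRET\n")

        result = subprocess.run(
            [*cli_command, "--target-folder", target,
             "--secrets-file", secrets_file],
            capture_output=True,
            text=True,
        )
        with open(victim, encoding="utf-8") as fh:
            content = fh.read()

    return {
        "exit_code_ok": result.returncode == 0,
        "file_rewritten": content == "token=MOCK_SECRET\n",
        "file_logged": "settings.txt" in result.stdout,
    }
```

Nothing in the check imports or inspects the implementation, so it stays valid even if the code behind the CLI is regenerated from scratch.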

This all happens with the specification as the source of truth: The implementation is derived from the spec, so understanding the spec maps directly to understanding the program’s behavior.

After about ten minutes of generation and tests, I ran the CLI on example1-no-replace/ and example2-replace-secret2/ and it produced the expected replacements on the first run.

Building with specs takes a bit more upfront investment to write and render, but it results in more reliable code generation and fewer time-consuming iterations.

Updating the Search-and-Replace CLI

After testing both versions on sample folders, I tried running both CLIs on my Node.js project. Both versions immediately got stuck replacing strings in node_modules/ — a folder that should clearly be ignored.

With vibe coding, the codebase is the source of truth, so every fix triggers regeneration and a full manual diff review. Even though I accepted 95% of the generated code as-is, I still had to review the entire 100%. Excluding node_modules/ meant regenerating, reviewing the diff, and manually rerunning fixtures to verify the change.

With spec-driven development, the spec is the source of truth: update the spec and rerender only the affected parts. Instead of “prompt-and-pray”, we update the relevant functional requirement (FR3) and rerender just that portion.

- The App should replace all original secrets with mock values recursively across The Target Folder, excluding the node_modules/ directory.

Automatically generated conformance tests assert that files under node_modules/ are never modified. With that constraint in place, the tool did what I intended: safely search-and-replace secrets without touching the node_modules/ folder.
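
In Python, an exclusion like this usually comes down to pruning the directory walk. The sketch below shows one common way to do it with `os.walk`; it is not the code *codeplain rendered, the helper name is hypothetical, and it assumes node_modules/ should be skipped at any depth, which the requirement does not spell out.

```python
import os

EXCLUDED_DIRS = frozenset({"node_modules"})


def iter_candidate_files(target_folder, excluded=EXCLUDED_DIRS):
    """Yield file paths under target_folder, skipping excluded directories."""
    for dirpath, dirnames, filenames in os.walk(target_folder):
        # Editing dirnames in place tells os.walk not to descend into them.
        dirnames[:] = [d for d in dirnames if d not in excluded]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Pruning during the walk, rather than filtering paths afterwards, also means the tool never spends time traversing the excluded folders.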

To summarize, here’s how steering the expected behaviour differs between vibe coding and spec-driven development:

- Vibe coding: New prompt → update implementation → review a large diff → run manual tests for regressions.
- Spec-driven: Update/add an FR → rerender impacted FRs → proceed with confidence.

[Table: a head-to-head comparison between vibe coding and spec-driven development.]

Building the same tool twice, once with vibes and once with specs, turned out to be a valuable learning experience in itself. The experiment showed me that vibe coding and spec-driven development aren’t opposites; they’re complementary approaches with different trade-offs.

While building the CLI, vibe coding gave me speed up front: I had a working CLI in minutes and could explore quickly. But without guardrails, every edit risked breaking something, and manual testing soon became the bottleneck.

Spec-driven development took a little more upfront effort, but once the specification was in place, the generated functionality was deterministic. Automatically generated tests acted as guardrails, checking that the generated code conforms to the spec. When I needed to exclude node_modules/, I only had to update the spec, and the tests gave me confidence that nothing else broke.

The trade-off is clear: vibe coding is great for hacking together ideas or prototypes, while spec-driven development shines when reliability, collaboration, and trust matter. Think of Python: you use interactive mode while exploring, then shift to script mode once the idea matures. The point isn’t either/or; it’s choosing the right tool at the right time.

[1] Beyond Vibe Coding: Introducing ***plain

[2] The include statement here adds the CLI scaffold (FR1) and sets search_and_replace.py as the entry point.

A guest post by

Tjaž Eržen

founding engineer @ *codeplain
