A philosophical and practical guide to writing tests that matter.

Core Principle

Test behavior, not implementation.

A meaningful test verifies what a service does, not how it does it internally.

The Meaningful Test Checklist

Before writing a test, ask:

  1. Does this test verify observable behavior? If it only tests internal implementation details, skip it.
  2. Would this test fail if the service broke for a real user? If not, it’s not meaningful.
  3. Is this test independent of environment? Tests that depend on process.env or external state are fragile.
  4. Am I testing the service or its dependencies? Mock boundaries should be at side effects, not data providers.
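A small, hypothetical example makes the first two questions concrete (totalWithTax and roundCents are invented for illustration, not taken from the codebase):

```typescript
// Tiny, hypothetical service code used only to illustrate the checklist.
const roundCents = (n: number): number => Math.round(n * 100) / 100;

function totalWithTax(subtotal: number, taxRate: number): number {
  return roundCents(subtotal * (1 + taxRate));
}

// ✅ Behavioral: verifies the observable result a caller depends on.
// This would fail if the service broke for a real user.
console.assert(totalWithTax(10, 0.07) === 10.7, "behavioral check");

// ❌ Implementation: asserting that roundCents was called (e.g. via a spy)
// would break if totalWithTax inlined the rounding, even though every
// caller still gets exactly the same answer.
```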

When to Mock vs Use Real Dependencies

Dependency Type       | Mock? | Reason
----------------------|-------|-------------------------------------------------
Database connections  | Yes   | Side effect, requires infrastructure
HTTP clients          | Yes   | Side effect, external network
File system writes    | Yes   | Side effect, modifies state
ConfigService         | No    | Pure data provider, instantiate with test config
Value objects         | No    | No side effects
Utility functions     | No    | Deterministic transformations

The ConfigService Example

Before (unnecessary mocking):

import { mockDeep } from "@tkoehlerlg/bun-mock-extended";

const mockConfig = mockDeep<ConfigService>({
  get: (key: string) => config[key],
});

After (use real dependency):

import { ConfigService } from "@nestjs/config";

const createConfigService = (): ConfigService =>
  new ConfigService({
    EMBEDDING_PROVIDER: "mock",
    EMBEDDING_DIMENSIONS: 1536,
  });

The real ConfigService is a pure data container. It has no side effects. There’s no reason to mock it.
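A sketch of what this looks like from the consuming service's side. A minimal stand-in for ConfigService is defined inline so the snippet is self-contained; in the real suite you would import the actual class from @nestjs/config, and EmbeddingService here is only an illustrative shape:

```typescript
// Minimal stand-in for @nestjs/config's ConfigService, defined inline
// so this snippet runs standalone. In real tests, import the real class.
class ConfigService {
  constructor(private readonly values: Record<string, unknown>) {}
  get<T>(key: string): T | undefined {
    return this.values[key] as T | undefined;
  }
}

// The service reads plain data from config. No side effects are involved,
// so the test passes a real instance instead of a mock.
class EmbeddingService {
  constructor(private readonly config: ConfigService) {}
  dimensions(): number {
    return this.config.get<number>("EMBEDDING_DIMENSIONS") ?? 1536;
  }
}

const service = new EmbeddingService(
  new ConfigService({ EMBEDDING_PROVIDER: "mock", EMBEDDING_DIMENSIONS: 1536 }),
);
console.assert(service.dimensions() === 1536);
```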

Anatomy of a Meaningful Test Suite

EmbeddingService Example

describe("EmbeddingService", () => {
  describe("generateEmbedding", () => {
    // ✅ Behavioral: verifies the contract (correct dimensions)
    it("returns embedding with correct dimensions", async () => {
      const embedding = await service.generateEmbedding("test text");
      expect(embedding.length).toBe(1536);
    });

    // ✅ Behavioral: verifies mathematical property (normalization)
    it("returns normalized embedding (magnitude ≈ 1)", async () => {
      const embedding = await service.generateEmbedding("test text");
      const magnitude = Math.sqrt(embedding.reduce((sum, v) => sum + v * v, 0));
      expect(magnitude).toBeCloseTo(1.0, 5);
    });

    // ✅ Behavioral: verifies determinism (same input = same output)
    it("is deterministic for same input", async () => {
      const embedding1 = await service.generateEmbedding("hello world");
      const embedding2 = await service.generateEmbedding("hello world");
      expect(embedding1).toEqual(embedding2);
    });

    // ✅ Behavioral: verifies differentiation (different inputs differ)
    it("returns different embeddings for different inputs", async () => {
      const embedding1 = await service.generateEmbedding("hello");
      const embedding2 = await service.generateEmbedding("world");
      expect(embedding1).not.toEqual(embedding2);
    });

    // ✅ Behavioral: verifies error handling for invalid input
    it("throws for empty text", async () => {
      await expect(service.generateEmbedding("")).rejects.toThrow(
        "Cannot generate embedding for empty text",
      );
    });
  });
});

What We Removed (Not Meaningful)

// ❌ Tests environment behavior, not service behavior
it("should throw error when API key is missing", async () => {
  // This depends on process.env state
});

// ❌ Tests ConfigService, not EmbeddingService
it("should use configured dimensions", async () => {
  // If ConfigService works, this works
});

// ❌ Tests internal implementation
it("should call the OpenAI client with correct parameters", async () => {
  // Testing HOW, not WHAT
});

The Side Effect Boundary

Mock at the boundary where side effects occur:

┌───────────────────────────────────────────────────────────┐
│                       Your Service                        │
│                                                           │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐  │
│  │ ConfigService │  │  Pure Utils   │  │ Value Objects │  │
│  │    (real)     │  │    (real)     │  │    (real)     │  │
│  └───────────────┘  └───────────────┘  └───────────────┘  │
│                                                           │
│   ═══════════════ SIDE EFFECT BOUNDARY ═══════════════    │
│                                                           │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐  │
│  │   Database    │  │  HTTP Client  │  │  File System  │  │
│  │    (MOCK)     │  │    (MOCK)     │  │    (MOCK)     │  │
│  └───────────────┘  └───────────────┘  └───────────────┘  │
└───────────────────────────────────────────────────────────┘
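In test code the boundary becomes an interface that only infrastructure implements. The sketch below uses illustrative names (DocumentService, DocumentStore, normalize are not from the codebase): pure collaborators stay real, and only the port below the boundary is faked.

```typescript
interface DocumentRow { id: string; text: string }

// Side-effect boundary: anything implementing this touches infrastructure.
interface DocumentStore {
  findById(id: string): Promise<DocumentRow | null>;
}

// Pure utility above the boundary: used for real, never mocked.
const normalize = (text: string): string => text.trim().toLowerCase();

class DocumentService {
  constructor(private readonly store: DocumentStore) {}
  async normalizedText(id: string): Promise<string | null> {
    const row = await this.store.findById(id);
    return row ? normalize(row.text) : null;
  }
}

// In a test, only the store (below the boundary) is replaced.
const fakeStore: DocumentStore = {
  findById: async (id) =>
    id === "123" ? { id, text: "  Hello World  " } : null,
};

new DocumentService(fakeStore)
  .normalizedText("123")
  .then((text) => console.assert(text === "hello world"));
```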

Reducing Test Count

More tests ≠ better coverage.

The EmbeddingService went from 49 tests to 10 tests by asking “is this meaningful?” for each one.

Signs of Test Bloat

  • Testing the same behavior multiple ways
  • Testing framework/library behavior
  • Testing that mocks return what you told them to return
  • Testing environment-dependent behavior
  • Testing implementation details that could change
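The third sign, asserting that a mock returns what you scripted, often looks deceptively like a real test. A hypothetical example:

```typescript
// ❌ Bloat: this "test" only proves the stub echoes its own script.
// No production code runs between arrange and assert.
const stubbedClient = {
  fetchUser: async (_id: string) => ({ id: "123", name: "Ada" }),
};

stubbedClient.fetchUser("123").then((user) => {
  // Passes forever, fails never: it exercises zero service logic.
  console.assert(user.name === "Ada");
});
```

The meaningful version passes stubbedClient into the service under test and asserts on the service's output, so real logic sits between the stub and the assertion.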

Mock Library Usage (When You Must Mock)

When you do need mocks (for side effects), use raw bun:test mocks. See Raw Mocks Guide for complete patterns.

import { mock } from "bun:test";

// mockResults and mockUser are test fixtures defined elsewhere in the suite.
const mockDb: any = {
  query: mock(async (sql: string) => mockResults[sql]),
  findById: mock((id: string) => (id === "123" ? mockUser : null)),
  count: mock(() => 42),
};

Key insight: Use mockImplementation() for argument-dependent behavior rather than library-specific features like calledWith().
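The pattern looks like the sketch below. A tiny hand-rolled mock() stand-in is defined here so the snippet runs anywhere; bun:test's Jest-compatible mock() exposes the same mockImplementation() method.

```typescript
type MockFn<A extends unknown[], R> = ((...args: A) => R) & {
  mockImplementation: (next: (...args: A) => R) => void;
};

// Minimal stand-in for bun:test's mock(), just enough to show the pattern.
function mock<A extends unknown[], R>(impl: (...args: A) => R): MockFn<A, R> {
  let current = impl;
  const fn = ((...args: A) => current(...args)) as MockFn<A, R>;
  fn.mockImplementation = (next) => { current = next; };
  return fn;
}

// Argument-dependent behavior lives in the implementation itself,
// instead of library-specific matchers like calledWith().
const findById = mock((id: string) => (id === "123" ? { id } : null));
console.assert(findById("123")?.id === "123");
console.assert(findById("999") === null);

// Per-test overrides: swap the implementation, not the assertion style.
findById.mockImplementation(() => null);
console.assert(findById("123") === null);
```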

Summary

  1. Ask “is this meaningful?” before every test
  2. Test behavior, not implementation
  3. Use real dependencies when they have no side effects
  4. Mock only at the side effect boundary
  5. Fewer meaningful tests > many meaningless tests