A philosophical and practical guide to writing tests that matter.

Core Principle

Test behavior, not implementation.

A meaningful test verifies what a service does, not how it does it internally.

The Meaningful Test Checklist

Before writing a test, ask:

  1. Does this test verify observable behavior? If it only tests internal implementation details, skip it.
  2. Would this test fail if the service broke for a real user? If not, it’s not meaningful.
  3. Is this test independent of environment? Tests that depend on process.env or external state are fragile.
  4. Am I testing the service or its dependencies? Mock boundaries should be at side effects, not data providers.
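A small, hypothetical example makes the first two questions concrete (totalWithTax and roundCents are invented for illustration, not taken from the codebase):

```typescript
// Tiny, hypothetical service code used only to illustrate the checklist.
const roundCents = (n: number): number => Math.round(n * 100) / 100;

function totalWithTax(subtotal: number, taxRate: number): number {
  return roundCents(subtotal * (1 + taxRate));
}

// ✅ Behavioral: verifies the observable result a caller depends on.
// This would fail if the service broke for a real user.
console.assert(totalWithTax(10, 0.07) === 10.7, "behavioral check");

// ❌ Implementation: asserting that roundCents was called (e.g. via a spy)
// would break if totalWithTax inlined the rounding, even though every
// caller still gets exactly the same answer.
```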

When to Mock vs Use Real Dependencies

Dependency Type       | Mock? | Reason
----------------------|-------|-------------------------------------------------
Database connections  | Yes   | Side effect, requires infrastructure
HTTP clients          | Yes   | Side effect, external network
File system writes    | Yes   | Side effect, modifies state
ConfigService         | No    | Pure data provider, instantiate with test config
Value objects         | No    | No side effects
Utility functions     | No    | Deterministic transformations

The ConfigService Example

Before (unnecessary mocking):

import { mockDeep } from "@tkoehlerlg/bun-mock-extended";

const mockConfig = mockDeep<ConfigService>({
  get: (key: string) => config[key],
});

After (use real dependency):

import { ConfigService } from "@nestjs/config";

const createConfigService = (): ConfigService =>
  new ConfigService({
    EMBEDDING_PROVIDER: "mock",
    EMBEDDING_DIMENSIONS: 1536,
  });

The real ConfigService is a pure data container. It has no side effects. There’s no reason to mock it.
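A sketch of what this looks like from the consuming service's side. A minimal stand-in for ConfigService is defined inline so the snippet is self-contained; in the real suite you would import the actual class from @nestjs/config, and EmbeddingService here is only an illustrative shape:

```typescript
// Minimal stand-in for @nestjs/config's ConfigService, defined inline
// so this snippet runs standalone. In real tests, import the real class.
class ConfigService {
  constructor(private readonly values: Record<string, unknown>) {}
  get<T>(key: string): T | undefined {
    return this.values[key] as T | undefined;
  }
}

// The service reads plain data from config. No side effects are involved,
// so the test passes a real instance instead of a mock.
class EmbeddingService {
  constructor(private readonly config: ConfigService) {}
  dimensions(): number {
    return this.config.get<number>("EMBEDDING_DIMENSIONS") ?? 1536;
  }
}

const service = new EmbeddingService(
  new ConfigService({ EMBEDDING_PROVIDER: "mock", EMBEDDING_DIMENSIONS: 1536 }),
);
console.assert(service.dimensions() === 1536);
```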

Anatomy of a Meaningful Test Suite

EmbeddingService Example

describe("EmbeddingService", () => {
  describe("generateEmbedding", () => {
    // ✅ Behavioral: verifies the contract (correct dimensions)
    it("returns embedding with correct dimensions", async () => {
      const embedding = await service.generateEmbedding("test text");
      expect(embedding.length).toBe(1536);
    });

    // ✅ Behavioral: verifies mathematical property (normalization)
    it("returns normalized embedding (magnitude ≈ 1)", async () => {
      const embedding = await service.generateEmbedding("test text");
      const magnitude = Math.sqrt(embedding.reduce((sum, v) => sum + v * v, 0));
      expect(magnitude).toBeCloseTo(1.0, 5);
    });

    // ✅ Behavioral: verifies determinism (same input = same output)
    it("is deterministic for same input", async () => {
      const embedding1 = await service.generateEmbedding("hello world");
      const embedding2 = await service.generateEmbedding("hello world");
      expect(embedding1).toEqual(embedding2);
    });

    // ✅ Behavioral: verifies differentiation (different inputs differ)
    it("returns different embeddings for different inputs", async () => {
      const embedding1 = await service.generateEmbedding("hello");
      const embedding2 = await service.generateEmbedding("world");
      expect(embedding1).not.toEqual(embedding2);
    });

    // ✅ Behavioral: verifies error handling for invalid input
    it("throws for empty text", async () => {
      await expect(service.generateEmbedding("")).rejects.toThrow(
        "Cannot generate embedding for empty text",
      );
    });
  });
});

What We Removed (Not Meaningful)

// ❌ Tests environment behavior, not service behavior
it("should throw error when API key is missing", async () => {
  // This depends on process.env state
});

// ❌ Tests ConfigService, not EmbeddingService
it("should use configured dimensions", async () => {
  // If ConfigService works, this works
});

// ❌ Tests internal implementation
it("should call the OpenAI client with correct parameters", async () => {
  // Testing HOW, not WHAT
});

The Side Effect Boundary

Mock at the boundary where side effects occur:

┌───────────────────────────────────────────────────────────┐
│                       Your Service                        │
│                                                           │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐  │
│  │ ConfigService │  │  Pure Utils   │  │ Value Objects │  │
│  │    (real)     │  │    (real)     │  │    (real)     │  │
│  └───────────────┘  └───────────────┘  └───────────────┘  │
│                                                           │
│   ═══════════════ SIDE EFFECT BOUNDARY ═══════════════    │
│                                                           │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐  │
│  │   Database    │  │  HTTP Client  │  │  File System  │  │
│  │    (MOCK)     │  │    (MOCK)     │  │    (MOCK)     │  │
│  └───────────────┘  └───────────────┘  └───────────────┘  │
└───────────────────────────────────────────────────────────┘
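In test code the boundary becomes an interface that only infrastructure implements. The sketch below uses illustrative names (DocumentService, DocumentStore, normalize are not from the codebase): pure collaborators stay real, and only the port below the boundary is faked.

```typescript
interface DocumentRow { id: string; text: string }

// Side-effect boundary: anything implementing this touches infrastructure.
interface DocumentStore {
  findById(id: string): Promise<DocumentRow | null>;
}

// Pure utility above the boundary: used for real, never mocked.
const normalize = (text: string): string => text.trim().toLowerCase();

class DocumentService {
  constructor(private readonly store: DocumentStore) {}
  async normalizedText(id: string): Promise<string | null> {
    const row = await this.store.findById(id);
    return row ? normalize(row.text) : null;
  }
}

// In a test, only the store (below the boundary) is replaced.
const fakeStore: DocumentStore = {
  findById: async (id) =>
    id === "123" ? { id, text: "  Hello World  " } : null,
};

new DocumentService(fakeStore)
  .normalizedText("123")
  .then((text) => console.assert(text === "hello world"));
```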

Reducing Test Count

More tests ≠ better coverage.

The EmbeddingService went from 49 tests to 10 tests by asking “is this meaningful?” for each one.

Signs of Test Bloat

  • Testing the same behavior multiple ways
  • Testing framework/library behavior
  • Testing that mocks return what you told them to return
  • Testing environment-dependent behavior
  • Testing implementation details that could change
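The third sign, asserting that a mock returns what you scripted, often looks deceptively like a real test. A hypothetical example:

```typescript
// ❌ Bloat: this "test" only proves the stub echoes its own script.
// No production code runs between arrange and assert.
const stubbedClient = {
  fetchUser: async (_id: string) => ({ id: "123", name: "Ada" }),
};

stubbedClient.fetchUser("123").then((user) => {
  // Passes forever, fails never: it exercises zero service logic.
  console.assert(user.name === "Ada");
});
```

The meaningful version passes stubbedClient into the service under test and asserts on the service's output, so real logic sits between the stub and the assertion.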

Mock Library Usage (When You Must Mock)

When you do need mocks (for side effects), use raw bun:test mocks. See Raw Mocks Guide for complete patterns.

import { mock } from "bun:test";

// mockResults and mockUser are test fixtures defined elsewhere in the suite.
const mockDb: any = {
  query: mock(async (sql: string) => mockResults[sql]),
  findById: mock((id: string) => (id === "123" ? mockUser : null)),
  count: mock(() => 42),
};

Key insight: Use mockImplementation() for argument-dependent behavior rather than library-specific features like calledWith().
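The pattern looks like the sketch below. A tiny hand-rolled mock() stand-in is defined here so the snippet runs anywhere; bun:test's Jest-compatible mock() exposes the same mockImplementation() method.

```typescript
type MockFn<A extends unknown[], R> = ((...args: A) => R) & {
  mockImplementation: (next: (...args: A) => R) => void;
};

// Minimal stand-in for bun:test's mock(), just enough to show the pattern.
function mock<A extends unknown[], R>(impl: (...args: A) => R): MockFn<A, R> {
  let current = impl;
  const fn = ((...args: A) => current(...args)) as MockFn<A, R>;
  fn.mockImplementation = (next) => { current = next; };
  return fn;
}

// Argument-dependent behavior lives in the implementation itself,
// instead of library-specific matchers like calledWith().
const findById = mock((id: string) => (id === "123" ? { id } : null));
console.assert(findById("123")?.id === "123");
console.assert(findById("999") === null);

// Per-test overrides: swap the implementation, not the assertion style.
findById.mockImplementation(() => null);
console.assert(findById("123") === null);
```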

Summary

  1. Ask “is this meaningful?” before every test
  2. Test behavior, not implementation
  3. Use real dependencies when they have no side effects
  4. Mock only at the side effect boundary
  5. Fewer meaningful tests > many meaningless tests