Meaningful Tests Guide
A philosophical and practical guide to writing tests that matter.
Core Principle
Test behavior, not implementation.
A meaningful test verifies what a service does, not how it does it internally.
The Meaningful Test Checklist
Before writing a test, ask:
- Does this test verify observable behavior? If it only tests internal implementation details, skip it.
- Would this test fail if the service broke for a real user? If not, it’s not meaningful.
- Is this test independent of environment? Tests that depend on process.env or external state are fragile.
- Am I testing the service or its dependencies? Mock boundaries should be at side effects, not data providers.
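The environment question can be illustrated with a minimal sketch. The names `EmbeddingSettings` and `describeProvider` are hypothetical and not taken from the project; the only point is that the function receives its configuration as an argument instead of reading process.env, so a test can supply values directly.

```typescript
// Hypothetical settings shape, invented for this sketch.
interface EmbeddingSettings {
  provider: string;
  dimensions: number;
}

// Environment-independent: everything the function needs arrives as an
// argument, so the result does not depend on the machine running the test.
function describeProvider(settings: EmbeddingSettings): string {
  if (settings.dimensions <= 0) {
    throw new Error("dimensions must be positive");
  }
  return `${settings.provider}:${settings.dimensions}`;
}

// A test passes explicit values and never touches process.env.
const label = describeProvider({ provider: "mock", dimensions: 1536 });
console.log(label); // mock:1536
```

A test written this way fails only when the function's own logic changes, not when a CI job forgets to export a variable.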
When to Mock vs Use Real Dependencies
| Dependency Type | Mock? | Reason |
|---|---|---|
| Database connections | Yes | Side effect, requires infrastructure |
| HTTP clients | Yes | Side effect, external network |
| File system writes | Yes | Side effect, modifies state |
| ConfigService | No | Pure data provider, instantiate with test config |
| Value objects | No | No side effects |
| Utility functions | No | Deterministic transformations |
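The split in the table can be sketched roughly as follows. `HttpClient`, `UserLookupService`, and `normalizeName` are hypothetical names invented for illustration, and the test double is written by hand rather than with any mocking library. Only the network boundary is replaced; the pure utility runs for real.

```typescript
// Hypothetical side-effect boundary: the one dependency that needs a network.
interface HttpClient {
  getJson(url: string): Promise<unknown>;
}

// Pure utility: a deterministic transformation, so tests use the real function.
function normalizeName(raw: string): string {
  return raw.trim().toLowerCase();
}

// Hypothetical service under test; its side-effecting dependency is injected.
class UserLookupService {
  private readonly http: HttpClient;

  constructor(http: HttpClient) {
    this.http = http;
  }

  async displayName(id: string): Promise<string> {
    const body = (await this.http.getJson(`/users/${id}`)) as { name: string };
    return normalizeName(body.name);
  }
}

// Test double for the boundary only: a canned response, no network call.
const fakeHttp: HttpClient = {
  getJson: async () => ({ name: "  Ada Lovelace " }),
};

const service = new UserLookupService(fakeHttp);
service.displayName("123").then((name) => console.log(name)); // ada lovelace
```

The assertion a test would make here is on the returned name, which is what a caller observes, not on how `getJson` was invoked.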
The ConfigService Example
Before (unnecessary mocking):
```typescript
import { mockDeep } from "@tkoehlerlg/bun-mock-extended";

const mockConfig = mockDeep<ConfigService>({
  get: (key: string) => config[key],
});
```

After (use real dependency):

```typescript
import { ConfigService } from "@nestjs/config";

const createConfigService = (): ConfigService =>
  new ConfigService({
    EMBEDDING_PROVIDER: "mock",
    EMBEDDING_DIMENSIONS: 1536,
  });
```

The real ConfigService is a pure data container. It has no side effects. There’s no reason to mock it.
Anatomy of a Meaningful Test Suite
EmbeddingService Example
```typescript
describe("EmbeddingService", () => {
  describe("generateEmbedding", () => {
    // ✅ Behavioral: verifies the contract (correct dimensions)
    it("returns embedding with correct dimensions", async () => {
      const embedding = await service.generateEmbedding("test text");
      expect(embedding.length).toBe(1536);
    });

    // ✅ Behavioral: verifies mathematical property (normalization)
    it("returns normalized embedding (magnitude ≈ 1)", async () => {
      const embedding = await service.generateEmbedding("test text");
      const magnitude = Math.sqrt(embedding.reduce((sum, v) => sum + v * v, 0));
      expect(magnitude).toBeCloseTo(1.0, 5);
    });

    // ✅ Behavioral: verifies determinism (same input = same output)
    it("is deterministic for same input", async () => {
      const embedding1 = await service.generateEmbedding("hello world");
      const embedding2 = await service.generateEmbedding("hello world");
      expect(embedding1).toEqual(embedding2);
    });

    // ✅ Behavioral: verifies differentiation (different inputs differ)
    it("returns different embeddings for different inputs", async () => {
      const embedding1 = await service.generateEmbedding("hello");
      const embedding2 = await service.generateEmbedding("world");
      expect(embedding1).not.toEqual(embedding2);
    });

    // ✅ Behavioral: verifies error handling for invalid input
    it("throws for empty text", async () => {
      await expect(service.generateEmbedding("")).rejects.toThrow(
        "Cannot generate embedding for empty text",
      );
    });
  });
});
```

What We Removed (Not Meaningful)
```typescript
// ❌ Tests environment behavior, not service behavior
it("should throw error when API key is missing", async () => {
  // This depends on process.env state
});

// ❌ Tests ConfigService, not EmbeddingService
it("should use configured dimensions", async () => {
  // If ConfigService works, this works
});

// ❌ Tests internal implementation
it("should call the OpenAI client with correct parameters", async () => {
  // Testing HOW, not WHAT
});
```

The Side Effect Boundary
Mock at the boundary where side effects occur:
```
┌─────────────────────────────────────────────────────────┐
│                      Your Service                       │
│                                                         │
│  ┌───────────────┐ ┌───────────────┐ ┌───────────────┐  │
│  │ ConfigService │ │  Pure Utils   │ │ Value Objects │  │
│  │    (real)     │ │    (real)     │ │    (real)     │  │
│  └───────────────┘ └───────────────┘ └───────────────┘  │
│                                                         │
│  ═══════════════ SIDE EFFECT BOUNDARY ═══════════════   │
│                                                         │
│  ┌───────────────┐ ┌───────────────┐ ┌───────────────┐  │
│  │   Database    │ │  HTTP Client  │ │  File System  │  │
│  │    (MOCK)     │ │    (MOCK)     │ │    (MOCK)     │  │
│  └───────────────┘ └───────────────┘ └───────────────┘  │
└─────────────────────────────────────────────────────────┘
```

Reducing Test Count
More tests ≠ better coverage.
The EmbeddingService went from 49 tests to 10 tests by asking “is this meaningful?” for each one.
Signs of Test Bloat
- Testing the same behavior multiple ways
- Testing framework/library behavior
- Testing that mocks return what you told them to return
- Testing environment-dependent behavior
- Testing implementation details that could change
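One of these signs, testing that mocks return what you told them to return, is easy to show in a small sketch. `PriceSource` and `totalWithTax` are hypothetical names invented here, and plain boolean checks stand in for test assertions so the example needs no test runner.

```typescript
// Hypothetical dependency and function under test, invented for this sketch.
interface PriceSource {
  priceOf(sku: string): number;
}

function totalWithTax(source: PriceSource, skus: string[], taxRate: number): number {
  const subtotal = skus.reduce((sum, sku) => sum + source.priceOf(sku), 0);
  // Round to two decimal places.
  return Math.round(subtotal * (1 + taxRate) * 100) / 100;
}

// Hand-written stub with fixed prices.
const stubSource: PriceSource = {
  priceOf: (sku) => (sku === "book" ? 10 : 5),
};

// Bloat: this only proves the stub returns what it was told to return.
const tautology = stubSource.priceOf("book") === 10;

// Meaningful: this exercises the function's own logic (summing, applying tax).
const total = totalWithTax(stubSource, ["book", "pen"], 0.2);

console.log(tautology, total); // true 18
```

The first check can never fail unless the stub itself is edited, so it adds to the test count without protecting any behavior.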
Mock Library Usage (When You Must Mock)
When you do need mocks (for side effects), use raw bun:test mocks. See Raw Mocks Guide for complete patterns.
```typescript
import { mock } from "bun:test";

const mockDb: any = {
  query: mock(async (sql: string) => mockResults[sql]),
  findById: mock((id: string) => (id === "123" ? mockUser : null)),
  count: mock(() => 42),
};
```

Key insight: Use mockImplementation() for argument-dependent behavior rather than library-specific features like calledWith().
Summary
- Ask “is this meaningful?” before every test
- Test behavior, not implementation
- Use real dependencies when they have no side effects
- Mock only at the side effect boundary
- Fewer meaningful tests > many meaningless tests