ADR-002: Docs from Code via AST

Date: 2026-03-29
Status: Accepted

Context

The FCC framework has a large public API surface: approximately 50 modules, 120 classes, and 500 public methods. API documentation must stay synchronized with the code. Hand-maintained documentation inevitably drifts: a function signature changes or a parameter is added, and the docs silently become wrong.

We evaluated four approaches for generating API documentation from code:

  1. Sphinx with autodoc. The standard Python documentation tool. Generates HTML from docstrings by importing modules at build time.
  2. pdoc / mkdocstrings. Lighter alternatives to Sphinx that also rely on runtime inspection.
  3. Python ast module. Parse source files into abstract syntax trees without importing them, then extract API information from the tree structure.
  4. Regular expressions. Pattern-match function signatures and docstrings from raw source text.

Key requirements:

  • Must work without importing modules (to avoid optional dependency issues and import side effects).
  • Must produce structured data that can feed into the knowledge graph and template system.
  • Must support selective generation (one module, one class, or the entire package).
  • Must integrate with FCC's existing Jinja2 template system.

Decision

We will use Python's built-in ast module for API documentation generation.

The implementation consists of:

  1. CodeAnalyzer -- parses .py files into ast.Module nodes, walks the tree, and extracts a structured APIModel containing ModuleInfo, ClassInfo, FunctionInfo, and ParameterInfo objects.
  2. APIDocGenerator -- takes the APIModel and produces Markdown documentation using Jinja2 templates from src/fcc/templates/docs/.
  3. CI integration -- a pipeline step that runs the analyzer, generates docs, and verifies that all public APIs have docstrings.
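As a rough illustration of the first component, the extraction step could look like the sketch below. It is not FCC's actual CodeAnalyzer: the field names on ParameterInfo, FunctionInfo, ClassInfo, and ModuleInfo, and the analyze_source helper, are assumptions made for this example. It only handles top-level definitions and skips *args and **kwargs.

```python
import ast
from dataclasses import dataclass, field
from typing import List, Optional


# Hypothetical data model; the real APIModel fields are not shown in this ADR.
@dataclass
class ParameterInfo:
    name: str
    annotation: Optional[str] = None


@dataclass
class FunctionInfo:
    name: str
    parameters: List[ParameterInfo] = field(default_factory=list)
    returns: Optional[str] = None
    docstring: Optional[str] = None


@dataclass
class ClassInfo:
    name: str
    methods: List[FunctionInfo] = field(default_factory=list)
    docstring: Optional[str] = None


@dataclass
class ModuleInfo:
    name: str
    classes: List[ClassInfo] = field(default_factory=list)
    functions: List[FunctionInfo] = field(default_factory=list)
    docstring: Optional[str] = None


_FUNC_NODES = (ast.FunctionDef, ast.AsyncFunctionDef)


def _function_info(node):
    """Build a FunctionInfo from a function node (ignores *args/**kwargs)."""
    args = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
    params = [
        ParameterInfo(a.arg, ast.unparse(a.annotation) if a.annotation else None)
        for a in args
    ]
    return FunctionInfo(
        name=node.name,
        parameters=params,
        returns=ast.unparse(node.returns) if node.returns else None,
        docstring=ast.get_docstring(node),
    )


def analyze_source(source, module_name):
    """Parse source text and collect public top-level classes and functions."""
    tree = ast.parse(source)
    info = ModuleInfo(name=module_name, docstring=ast.get_docstring(tree))
    for node in tree.body:
        if node.__class__ in _FUNC_NODES and not node.name.startswith("_"):
            info.functions.append(_function_info(node))
        elif isinstance(node, ast.ClassDef) and not node.name.startswith("_"):
            cls = ClassInfo(name=node.name, docstring=ast.get_docstring(node))
            cls.methods = [
                _function_info(item)
                for item in node.body
                if isinstance(item, _FUNC_NODES) and not item.name.startswith("_")
            ]
            info.classes.append(cls)
    return info


# The sample imports a package that does not exist; parsing never imports it.
SAMPLE = '''
"""Demo module."""
import not_installed_anywhere

class Greeter:
    """Greets people."""

    def greet(self, name: str, punctuation: str = "!") -> str:
        """Return a greeting."""
        return "Hello " + name + punctuation

    def _private(self):
        pass

def helper(x: int) -> int:
    return x + 1
'''

model = analyze_source(SAMPLE, "demo")
```

The resulting model is plain data, so it can be passed to a template renderer or serialized without any reference back to the parsed tree.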

The ast module is a Python standard library module with no external dependencies. It handles:

  • Multi-line function signatures
  • Complex type annotations (generics, unions, protocols)
  • Decorators (@dataclass, @frozen, @traced, @property, @classmethod)
  • Nested class definitions
  • Module-level docstrings and constants
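To make a few of these capabilities concrete, the following sketch parses a small made-up snippet and reads back decorators and annotations as text with ast.unparse (available from Python 3.9). The Span class and merge function are invented for the example and are not part of FCC.

```python
import ast

# Made-up source with a decorated class, a property, and a multi-line signature.
SOURCE = '''
from dataclasses import dataclass
from typing import Optional, Union

@dataclass(frozen=True)
class Span:
    name: str

    @property
    def label(self) -> str:
        return self.name

def merge(
    left: dict[str, list[int]],
    right: Optional[Union[int, str]] = None,
    *,
    strict: bool = False,
) -> dict[str, list[int]]:
    return left
'''

tree = ast.parse(SOURCE)
span = next(n for n in tree.body if isinstance(n, ast.ClassDef))
merge = next(n for n in tree.body if isinstance(n, ast.FunctionDef))

# Decorators are stored as expression nodes on the definition they decorate.
class_decorators = [ast.unparse(d) for d in span.decorator_list]
label = next(n for n in span.body if isinstance(n, ast.FunctionDef))
method_decorators = [ast.unparse(d) for d in label.decorator_list]

# Annotations are expression nodes too; unparse turns them back into source text.
annotations = {
    arg.arg: ast.unparse(arg.annotation)
    for arg in merge.args.args + merge.args.kwonlyargs
}
return_annotation = ast.unparse(merge.returns)
```

Because the annotations are never evaluated, the names they mention do not have to be importable in the environment that builds the docs.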

Consequences

Positive

  • No import required. The analyzer works on source files directly. Optional dependencies (opentelemetry, streamlit, cto) do not need to be installed for documentation generation.
  • No side effects. Parsing the AST does not execute any code, eliminating import-time side effects.
  • Structured output. The APIModel is a structured data object that can feed into the knowledge graph (Chapter 2 of Book 3), the search index (Chapter 1 of Book 3), and the template system.
  • Selective generation. The analyzer can target a single file, a directory, or the entire package.
  • Template integration. Reuses FCC's existing Jinja2 infrastructure. Custom output formats (Markdown, JSON-LD, YAML) are supported via template selection.
  • Zero additional dependencies. The ast module is in the Python standard library.
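The first, second, and fourth points can be illustrated together. The sketch below is an assumption about how targeting might work, not FCC's interface: iter_source_files and public_names are hypothetical helpers. One of the sample files imports a package and raises at module level, which would break an import-based tool if the package were missing, but is harmless to a parser.

```python
import ast
import tempfile
from pathlib import Path

_DEFS = (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def iter_source_files(target):
    """Yield the file itself, or every .py file below a directory."""
    path = Path(target)
    if path.is_file():
        yield path
    else:
        yield from sorted(path.rglob("*.py"))


def public_names(target):
    """Map each file name to its public top-level class and function names."""
    result = {}
    for path in iter_source_files(target):
        tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        result[path.name] = [
            node.name
            for node in tree.body
            if isinstance(node, _DEFS) and not node.name.startswith("_")
        ]
    return result


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Importing this file would fail or raise; parsing it does neither.
    (root / "tracing.py").write_text(
        "import opentelemetry\n"
        "raise RuntimeError('never runs during parsing')\n\n"
        "class Tracer:\n    pass\n\n"
        "def _hidden():\n    pass\n",
        encoding="utf-8",
    )
    (root / "ui").mkdir()
    (root / "ui" / "app.py").write_text("def main():\n    pass\n", encoding="utf-8")

    single = public_names(root / "tracing.py")  # one file
    whole = public_names(root)                  # the whole tree
```

Keying results by file name keeps the example short; a real tool would more likely key by dotted module path so that files with the same name in different packages do not collide.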

Negative

  • No runtime information. The AST does not capture runtime behavior (dynamic attributes, monkey-patching, metaclass effects). In practice, FCC uses static patterns (dataclasses, Protocol) whose fields and method signatures are declared directly in the source and therefore appear in the AST.
  • Docstring parsing is heuristic. The analyzer must parse docstrings (Google-style, NumPy-style, or reStructuredText) into structured sections. Docstrings that deviate from these formats may be split incorrectly or lose sections.
  • No cross-reference resolution for external types. If a function parameter references a type from an external library, the analyzer cannot resolve it to the external library's documentation.
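To show why docstring parsing is heuristic, here is a minimal sketch of a Google-style section splitter. It is an illustration under stated assumptions, not FCC's parser: the function names, the list of recognized headers, and the entry pattern are all choices made for this example. It relies on indentation and a trailing colon, so a docstring that breaks either convention will be split wrongly.

```python
import inspect
import re
import textwrap

# Recognized section headers; an assumption, not an exhaustive list.
_SECTION = re.compile(r"^(Args|Returns|Raises|Yields|Attributes|Example|Examples|Note):\s*$")
# An entry looks like "name (type): description" or "name: description".
_ENTRY = re.compile(r"^(\*{0,2}\w+)\s*(?:\(([^)]*)\))?:\s*(.*)$")


def split_sections(docstring):
    """Split a Google-style docstring into named sections plus a summary."""
    sections = {"summary": []}
    current = "summary"
    for line in inspect.cleandoc(docstring).splitlines():
        match = _SECTION.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        else:
            sections[current].append(line)
    return {
        name: textwrap.dedent("\n".join(body)).strip()
        for name, body in sections.items()
    }


def parse_entries(section_text):
    """Parse an Args-like section; indented lines continue the previous entry."""
    entries = {}
    last = None
    for line in section_text.splitlines():
        match = _ENTRY.match(line)
        if match:
            last = match.group(1)
            entries[last] = {"type": match.group(2), "description": match.group(3)}
        elif last is not None and line.strip():
            entries[last]["description"] += " " + line.strip()
    return entries


DOC = """Render a module page.

    Longer description here.

    Args:
        module (ModuleInfo): Parsed module to render.
        template: Template name,
            relative to the template directory.

    Returns:
        str: The rendered Markdown.

    Raises:
        ValueError: If the template is unknown.
    """

sections = split_sections(DOC)
args = parse_entries(sections["Args"])
```

A description line that starts at the entry's own indentation and happens to contain a colon would be misread as a new entry, which is the kind of failure the bullet above refers to.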

Mitigations

  • FCC uses Google-style docstrings consistently, which the parser handles well.
  • External type references are rendered as plain text; a future enhancement could add configurable external type URL mappings.
  • A CI check verifies that all public APIs have docstrings, catching omissions early.
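The docstring check in the last mitigation could be approximated as follows. This is a sketch, assuming "public" means any name without a leading underscore; missing_docstrings is a hypothetical helper, not the actual CI step.

```python
import ast

_DEFS = (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def missing_docstrings(source, module_name):
    """Return qualified names of public definitions that lack a docstring."""
    tree = ast.parse(source)
    missing = []
    if ast.get_docstring(tree) is None:
        missing.append(module_name)

    def visit(body, prefix):
        for node in body:
            if not isinstance(node, _DEFS) or node.name.startswith("_"):
                continue
            qualified = prefix + "." + node.name
            if ast.get_docstring(node) is None:
                missing.append(qualified)
            # Recurse into classes so that methods and nested classes are checked.
            if isinstance(node, ast.ClassDef):
                visit(node.body, qualified)

    visit(tree.body, module_name)
    return missing


SAMPLE = '''
"""Documented module."""

class Greeter:
    """Documented class."""

    def greet(self):
        return "hi"

    def _internal(self):
        return None

def helper():
    """Documented function."""

def undocumented():
    pass
'''

problems = missing_docstrings(SAMPLE, "demo")
```

A pipeline step could run this over every source file and exit with a non-zero status when the combined list is not empty.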