How It Works

MorphArch turns source code and Git history into a repository-level dependency model that you can inspect from the terminal.

The basic idea is simple:

raw dependency graphs are useful for debugging
grouped views are better for understanding large systems

MorphArch builds both, but exposes them at different levels of detail.

The Pipeline

1. Configuration

MorphArch loads morpharch.toml if present.

The config file can change:

ignore paths and presets
scoring weights and thresholds
boundary rules
clustering strategy
semantic families, rules, and clustering constraints
presentation aliases, kind mode, and color mode

If no config file exists, defaults are used.

2. Repository Discovery

MorphArch walks Git history using gix.

During discovery it:

follows the repository's first-parent history
enumerates commits and file changes
detects workspace structure
skips ignored subtrees early

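Following only first-parent history yields a single linear sequence of commits, and skipping ignored subtrees early avoids unnecessary work. Together these keep repeated scans and history replay deterministic and practical.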
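As a rough illustration of what following first-parent history means, here is a minimal sketch that uses only the Rust standard library. It does not use `gix`; the `first_parent_chain` function and the in-memory `parents` map are hypothetical stand-ins for a real commit graph.

```rust
use std::collections::HashMap;

/// Walk first-parent history starting at `head`, newest commit first.
/// `parents` maps a commit id to its parent ids, with the first parent first.
/// Assumes the map describes a valid commit DAG (no cycles).
fn first_parent_chain(head: &str, parents: &HashMap<String, Vec<String>>) -> Vec<String> {
    let mut chain = Vec::new();
    let mut current = Some(head.to_string());
    while let Some(id) = current {
        // Only the first parent is followed, so commits that arrived
        // through the second parent of a merge are never visited.
        current = parents.get(&id).and_then(|ps| ps.first().cloned());
        chain.push(id);
    }
    chain
}
```

Because each commit contributes at most one successor, the result is always a single ordered list, which is what makes replay order reproducible.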

3. Parsing

MorphArch uses language-aware import extraction.

In practice that means:

safe fast paths that ignore comments and strings
AST fallback when the fast path is not reliable
accurate dependency edges for supported languages, without relying on plain regex matching

Supported languages include Rust, TypeScript, JavaScript, Python, and Go.
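To make the idea of a fast path that ignores comments and strings concrete, here is a small sketch for Rust-style sources. It is not MorphArch's parser: `mask_comments_and_strings` and `extract_use_paths` are hypothetical names, and the scanner deliberately ignores raw strings, char literals, `pub use`, and multi-line `use` statements. Inputs like those are the kind of case where a fallback to a real syntax tree would be needed.

```rust
enum State {
    Code,
    LineComment,
    BlockComment,
    Str,
}

/// Replace comments and string literals with spaces, keeping newlines,
/// so a later line scan cannot mistake their contents for imports.
fn mask_comments_and_strings(src: &str) -> String {
    let chars: Vec<char> = src.chars().collect();
    let mut out = String::with_capacity(src.len());
    let mut state = State::Code;
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        let next = chars.get(i + 1).copied();
        // Masked characters become spaces, but line breaks are preserved.
        let blank = if c == '\n' { '\n' } else { ' ' };
        match state {
            State::Code => match (c, next) {
                ('/', Some('/')) => {
                    state = State::LineComment;
                    out.push(' ');
                }
                ('/', Some('*')) => {
                    state = State::BlockComment;
                    out.push_str("  ");
                    i += 1; // also consume the '*'
                }
                ('"', _) => {
                    state = State::Str;
                    out.push(' ');
                }
                _ => out.push(c),
            },
            State::LineComment => {
                if c == '\n' {
                    state = State::Code;
                }
                out.push(blank);
            }
            State::BlockComment => {
                if c == '*' && next == Some('/') {
                    state = State::Code;
                    out.push_str("  ");
                    i += 1; // also consume the '/'
                } else {
                    out.push(blank);
                }
            }
            State::Str => match c {
                '\\' => {
                    // Skip the escaped character so `\"` does not end the string.
                    out.push(' ');
                    if let Some(n) = next {
                        out.push(if n == '\n' { '\n' } else { ' ' });
                    }
                    i += 1;
                }
                '"' => {
                    state = State::Code;
                    out.push(' ');
                }
                _ => out.push(blank),
            },
        }
        i += 1;
    }
    out
}

/// Collect the paths of single-line `use` statements found in real code.
fn extract_use_paths(src: &str) -> Vec<String> {
    let masked = mask_comments_and_strings(src);
    masked
        .lines()
        .filter_map(|line| line.trim().strip_prefix("use "))
        .map(|rest| rest.trim_end_matches(';').trim().to_string())
        .collect()
}
```

A naive line match on `use ` would report imports that appear inside comments or string literals. Masking first removes those false positives while keeping the scan a single linear pass.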

4. Dependency Graph Synthesis

Parsed imports are mapped into a repository-level dependency graph.

nodes represent packages or modules
edges represent dependency relationships
weights represent how many concrete imports sit behind a higher-level edge

This graph becomes the basis for scoring, grouping, and inspect mode.
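The relationship between concrete imports and weighted higher-level edges can be sketched as follows. This is an illustration under simple assumptions, not MorphArch's implementation: `FileImport`, `package_of`, and `package_edges` are hypothetical names, and treating the first path segment as the package is an assumption made only for this example.

```rust
use std::collections::BTreeMap;

/// One concrete import: the importing file and the imported file.
struct FileImport<'a> {
    from: &'a str,
    to: &'a str,
}

/// Assumed rule for this sketch: a file's package is its first path segment.
fn package_of(path: &str) -> &str {
    path.split('/').next().unwrap_or(path)
}

/// Aggregate file-level imports into package-level edges.
/// The value of each entry is the number of concrete imports behind the edge.
fn package_edges(imports: &[FileImport]) -> BTreeMap<(String, String), usize> {
    let mut edges = BTreeMap::new();
    for imp in imports {
        let (src, dst) = (package_of(imp.from), package_of(imp.to));
        if src == dst {
            continue; // imports inside one package do not create an edge
        }
        *edges.entry((src.to_string(), dst.to_string())).or_insert(0) += 1;
    }
    edges
}
```

Two files in `cli/` importing two files in `core/` would collapse into a single `cli -> core` edge with weight 2.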

5. Architecture Evaluation

MorphArch computes a health score across six debt dimensions:

cycle
layering
hub
coupling
cognitive
instability
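As an example of the kind of check that sits behind a dimension such as cycle debt, the sketch below tests whether a dependency graph contains any cycle. It uses Kahn's algorithm from the standard library only; how MorphArch actually measures cycle debt is not specified here, and `has_cycle` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Returns true if the directed graph described by `edges` contains a cycle.
/// Kahn's algorithm: repeatedly remove nodes with no incoming edges.
/// If some nodes can never be removed, the graph has at least one cycle.
fn has_cycle(edges: &[(&str, &str)]) -> bool {
    let mut indegree: HashMap<&str, usize> = HashMap::new();
    let mut out: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(from, to) in edges {
        indegree.entry(from).or_insert(0);
        *indegree.entry(to).or_insert(0) += 1;
        out.entry(from).or_default().push(to);
    }
    // Start with every node that nothing depends on.
    let mut ready: Vec<&str> = indegree
        .iter()
        .filter(|(_, d)| **d == 0)
        .map(|(n, _)| *n)
        .collect();
    let mut removed = 0;
    while let Some(node) = ready.pop() {
        removed += 1;
        for &next in out.get(node).into_iter().flatten() {
            let d = indegree
                .get_mut(next)
                .expect("every edge target has an in-degree entry");
            *d -= 1;
            if *d == 0 {
                ready.push(next);
            }
        }
    }
    removed < indegree.len()
}
```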

It also applies:

explicit boundary rules
scale-aware expectations
hotspot and blast radius analysis
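One common way to think about blast radius is as the set of nodes that depend on a given node, directly or transitively. The sketch below computes that set with a breadth-first search over reversed edges. MorphArch's own definition may differ, and `blast_radius` is a hypothetical name.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Nodes that depend on `target`, directly or transitively.
/// `edges` are (dependent, dependency) pairs.
fn blast_radius(target: &str, edges: &[(&str, &str)]) -> BTreeSet<String> {
    // Reverse adjacency: dependency -> its direct dependents.
    let mut dependents: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(from, to) in edges {
        dependents.entry(to).or_default().push(from);
    }
    let mut seen = BTreeSet::new();
    let mut queue = VecDeque::from([target]);
    while let Some(node) = queue.pop_front() {
        for &dep in dependents.get(node).into_iter().flatten() {
            // `insert` returns false for nodes already visited, which also
            // keeps the search from looping on cyclic graphs.
            if dep != target && seen.insert(dep.to_string()) {
                queue.push_back(dep);
            }
        }
    }
    seen
}
```

A node with a large result set is one where a change is likely to ripple widely.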

6. Semantic Grouping and Clustering

Grouping is what keeps the TUI usable on large repositories.

MorphArch groups the raw graph into clusters using a hybrid approach:

semantic grouping from names and paths
structural grouping when naming is weak
quality passes that split overly generic fallback clusters
optional collapsing of external dependency families

Users can override semantic families, rules, hard grouping constraints, and presentation labels through morpharch.toml.
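A much simplified sketch of rule-driven grouping with a fallback is shown below. It only covers name and path based grouping; the structural pass and the quality passes are out of scope. `FamilyRule` and `cluster_modules` are hypothetical names and do not reflect the actual `morpharch.toml` schema.

```rust
use std::collections::BTreeMap;

/// Hypothetical rule: modules whose path starts with `prefix` belong to `family`.
struct FamilyRule<'a> {
    prefix: &'a str,
    family: &'a str,
}

/// Assign each module to a cluster. Rules are checked in order and the first
/// match wins; unmatched modules fall back to their first path segment.
fn cluster_modules(modules: &[&str], rules: &[FamilyRule]) -> BTreeMap<String, Vec<String>> {
    let mut clusters: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for &module in modules {
        let family = rules
            .iter()
            .find(|r| module.starts_with(r.prefix))
            .map(|r| r.family)
            .unwrap_or_else(|| module.split('/').next().unwrap_or(module));
        clusters
            .entry(family.to_string())
            .or_default()
            .push(module.to_string());
    }
    clusters
}
```

Explicit rules take priority, so a user-supplied family can pull related modules out of a generic fallback cluster.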

7. Persistence and Replay

MorphArch stores scan data in a repo-scoped local cache.

That cache includes:

snapshot frames for each scanned commit
checkpoints for efficient reconstruction
saved scan state for incremental updates

This is what makes repeated scans and timeline replay practical without starting from scratch every time.
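The checkpoint idea can be illustrated with a small in-memory model: a full snapshot is stored occasionally, and the frames in between store only what changed. The `Frame`, `Delta`, and `reconstruct` names are hypothetical, and the real cache layout is not described here.

```rust
use std::collections::BTreeSet;

type Edge = (String, String);

/// Change relative to the previous frame (hypothetical shape).
struct Delta {
    added: Vec<Edge>,
    removed: Vec<Edge>,
}

/// A stored frame is either a full checkpoint or a delta.
enum Frame {
    Checkpoint(BTreeSet<Edge>),
    Delta(Delta),
}

/// Rebuild the edge set at `index` from the nearest earlier checkpoint.
/// Returns None if `index` is out of range or no checkpoint precedes it.
fn reconstruct(frames: &[Frame], index: usize) -> Option<BTreeSet<Edge>> {
    if index >= frames.len() {
        return None;
    }
    // Latest checkpoint at or before `index`.
    let start = (0..=index)
        .rev()
        .find(|&i| matches!(frames[i], Frame::Checkpoint(_)))?;
    let mut edges = match &frames[start] {
        Frame::Checkpoint(full) => full.clone(),
        Frame::Delta(_) => return None, // not reachable: `start` is a checkpoint
    };
    // Apply only the deltas between the checkpoint and the requested frame.
    for frame in &frames[start + 1..=index] {
        if let Frame::Delta(delta) = frame {
            for e in &delta.removed {
                edges.remove(e);
            }
            for e in &delta.added {
                edges.insert(e.clone());
            }
        }
    }
    Some(edges)
}
```

The cost of reaching any frame is bounded by the distance to the previous checkpoint rather than by the length of the whole history.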

8. Presentation Surfaces

The TUI is built from three semantic surfaces.

`Map`

Cluster-level repository overview.

major subsystems
strongest links
readable repo shape

`Cluster details`

Subsystem detail view.

diagnosis
top members or dependencies
incoming/outgoing link pressure
selected member or dependency lens

`Inspect`

Focused debug view.

selected member centered
one-hop inbound/outbound graph context
raw graph rendering reserved for debugging

This is why MorphArch does not need to keep the full raw graph on screen all the time.
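The one-hop context used by Inspect can be pictured as a simple filter over the edge list. This is a sketch with a hypothetical `one_hop` function, not the TUI's actual code.

```rust
/// One-hop neighborhood of `selected`.
/// Returns (inbound, outbound): the nodes that depend on `selected`
/// and the nodes that `selected` depends on, each sorted and deduplicated.
fn one_hop<'a>(selected: &str, edges: &[(&'a str, &'a str)]) -> (Vec<&'a str>, Vec<&'a str>) {
    let mut inbound = Vec::new();
    let mut outbound = Vec::new();
    for &(from, to) in edges {
        if to == selected {
            inbound.push(from);
        }
        if from == selected {
            outbound.push(to);
        }
    }
    inbound.sort();
    inbound.dedup();
    outbound.sort();
    outbound.dedup();
    (inbound, outbound)
}
```

Everything further than one hop from the selected member is left out, which keeps the rendered graph small regardless of repository size.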

Why Raw Graphs Are Not the Default

Large node-link graphs become noisy quickly, especially in a terminal.

MorphArch avoids this by:

starting with clusters instead of individual modules
summarizing link pressure before drawing it
using text-first cluster views when geometry would be noisy
keeping raw graph rendering for inspect mode

That tradeoff is deliberate. The goal is to make repository structure easier to review without removing graph-level debugging when it is needed.
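Summarizing before drawing can be as simple as ranking links by weight and keeping the top few. The sketch below assumes links are `(from, to, weight)` tuples; `strongest_links` is a hypothetical helper, and the real summarization may be more involved.

```rust
/// Keep only the `k` heaviest links, heaviest first.
/// Ties are broken by name so the output is deterministic.
fn strongest_links<'a>(
    links: &[(&'a str, &'a str, usize)],
    k: usize,
) -> Vec<(&'a str, &'a str, usize)> {
    let mut sorted = links.to_vec();
    sorted.sort_by(|a, b| {
        b.2.cmp(&a.2)
            .then_with(|| (a.0, a.1).cmp(&(b.0, b.1)))
    });
    sorted.truncate(k);
    sorted
}
```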

Technical Stack

| Component | Technology |
| --- | --- |
| Runtime | Rust |
| Git engine | `gix` |
| Parsing | fast paths + `tree-sitter` fallback |
| Graph algorithms | `petgraph` |
| TUI | `ratatui` + `crossterm` |
| Persistence | SQLite via `rusqlite` |

Performance Characteristics

MorphArch is optimized for repeated scans and historical navigation.

Important techniques:

subtree-level skipping for unchanged directories
blob parse caching
parallel parsing and graph processing
repo-scoped SQLite checkpoint + delta storage for fast replay
saved scan state for incremental updates

That is what makes timeline scrubbing and repeated watch sessions responsive.
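Blob parse caching relies on the fact that a Git blob id identifies file content, so a file that is unchanged across commits only needs to be parsed once. Here is a minimal in-memory sketch; `ParseCache` and `imports_for` are hypothetical names, and the real cache is presumably persisted rather than held in a `HashMap`.

```rust
use std::collections::HashMap;

/// Cache of parsed imports keyed by blob id (content hash).
struct ParseCache {
    entries: HashMap<String, Vec<String>>,
    /// Number of times the parser actually ran.
    parses: usize,
}

impl ParseCache {
    fn new() -> Self {
        ParseCache {
            entries: HashMap::new(),
            parses: 0,
        }
    }

    /// Return the imports for a blob, running `parse` only on a cache miss.
    fn imports_for(
        &mut self,
        blob_id: &str,
        content: &str,
        parse: impl Fn(&str) -> Vec<String>,
    ) -> &[String] {
        if !self.entries.contains_key(blob_id) {
            self.parses += 1;
            self.entries.insert(blob_id.to_string(), parse(content));
        }
        &self.entries[blob_id]
    }
}
```

When scanning many commits, most files are identical from one commit to the next, so most lookups are hits and the parser runs far less often than the number of files visited.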
