Expeditions, Not Sprints
We don't run sprints. We launch expeditions — hypothesis-driven units of work with clear success criteria, tracked on a kanban board. Each expedition has:
- A hypothesis to validate or refute
- Success metrics defined before work begins
- A captain (human or AI agent) responsible for execution
- Test coverage across three tiers before it can merge
The metaphor is deliberate. An expedition discovers something. A sprint just burns time.
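The shape of an expedition can be sketched as a small record type. This is an illustrative sketch, not the project's actual data model; the class and field names (`Expedition`, `tiers_passing`, `can_merge`) are assumptions:

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Expedition:
    """One hypothesis-driven unit of work (names are illustrative)."""

    hypothesis: str                    # testable claim to validate or refute
    success_metrics: dict[str, float]  # defined before work begins
    captain: str                       # human or AI agent responsible
    tiers_passing: set[str] = field(default_factory=set)  # e.g. {"T1", "T2"}

    def can_merge(self) -> bool:
        # Test coverage at all three tiers is required before merge
        return {"T1", "T2", "T3"} <= self.tiers_passing
```

The merge gate here is the point: an expedition with only unit and integration coverage is not done until the live-being tier passes too.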
Hypothesis-Driven Development
Every piece of work starts with a testable claim:
> H801.1: CQ-guided extraction produces ≥12 unique predicates per document, compared to ≤6 from unguided extraction.
We define the hypothesis, build the minimum implementation to test it, measure results, and record findings — regardless of whether the hypothesis is confirmed or refuted. Failed hypotheses are as valuable as confirmed ones.
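A hypothesis check like H801.1 above reduces to a small, mechanical evaluation. A minimal sketch, assuming the predicate sets have already been extracted (the function name and result shape are illustrative, not the project's API):

```python
def evaluate_h801_1(guided: set[str], unguided: set[str]) -> dict:
    """H801.1: CQ-guided extraction yields >= 12 unique predicates
    per document, vs <= 6 from unguided extraction."""
    confirmed = len(guided) >= 12 and len(unguided) <= 6
    return {
        "hypothesis": "H801.1",
        "guided_count": len(guided),
        "unguided_count": len(unguided),
        "confirmed": confirmed,  # a refuted result gets recorded too
    }
```

Whatever the outcome, the record is kept: a `confirmed: False` result is a finding, not a failure to report.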
Three-Tier Testing
We are a heavy TDD/BDD shop. No feature merges without test coverage at all three tiers:
| Tier | Type | What It Tests |
|---|---|---|
| T1 | Unit Tests | Individual functions and classes in isolation |
| T2 | Integration Tests | Component interactions, fixture integrity |
| T3 | Live Being Tests | A real being, via CLI, demonstrating the capability |
The critical insight is Tier 3. We don't just test that the code works — we test that a being can actually do the thing. Live being tests drive the being through CLI commands, the way BDD tests drive a browser:
```python
# T3 pattern: CLI-first, no internal API imports
import json

result = run_being_cmd("study", being_id, document_path)
assert result.returncode == 0

result = run_being_cmd("report", being_id, "--format", "json")
report = json.loads(result.stdout)
assert report["y1_count"] >= expected_minimum
```
No feature is complete until a live being demonstrates the capability.
Multi-Agent Development
The NuSy project is built by a fleet of agents — AI developers working in parallel, coordinated through a shared kanban board and Git:
| Agent | Platform | Role |
|---|---|---|
| DGX Claude | NVIDIA DGX H100 | GPU training, heavy computation |
| M5 Claude | MacBook M5 | Development, testing, architecture |
| M4-Mini Claude | Mac Mini M4 | Code review, testing, documentation |
Agents pick up expeditions, execute them, submit PRs, and review each other's work. The human captain directs priorities and makes architectural decisions. The agents do the engineering.
Knowledge as Code
Every piece of definitional knowledge — domain ontologies, learning thresholds, safety rules, hypothesis definitions — lives in Yurtle format: Markdown with TTL (Turtle RDF) frontmatter, loadable into a being's knowledge graph.
This means:
- Beings can reason about their own configuration — it's in the graph, not hidden in Python
- Knowledge is auditable — versioned in Git, human-readable, diff-able
- Changes are safe — modify thresholds without code deployment
- Everything is queryable — SPARQL over all definitional knowledge
The principle: if it's knowledge, it goes in Yurtle. If it's procedure, it goes in Python.
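Loading a Yurtle file starts with splitting the TTL frontmatter from the Markdown body. A minimal sketch, assuming the frontmatter is delimited by `---` lines in the style of YAML frontmatter (the function name and the exact delimiter convention are assumptions):

```python
def split_yurtle(text: str) -> tuple[str, str]:
    """Split a Yurtle document into (ttl_frontmatter, markdown_body).

    Assumes the TTL block is fenced by lines containing only '---',
    like YAML frontmatter. Returns empty frontmatter if none found.
    """
    lines = text.splitlines()
    if not lines or lines[0] != "---":
        return "", text
    try:
        end = lines[1:].index("---") + 1  # closing delimiter, original index
    except ValueError:
        return "", text  # unterminated frontmatter: treat as plain Markdown
    ttl = "\n".join(lines[1:end])
    body = "\n".join(lines[end + 1:])
    return ttl, body
```

The extracted TTL string could then be loaded into a being's graph, for example with rdflib's `Graph().parse(data=ttl, format="turtle")`, making every threshold and rule reachable by SPARQL.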
Open Development
Our codebase, research, and expedition history are open. Every blog post on this site links to its source Yurtle file on GitHub. The research notes distill what we've learned from building NuSy — the successes and the failures.
We believe in radical transparency: if your AI can't show its work, you shouldn't trust its answers. The same applies to the people building the AI.