Encoding the Scientific Method

For research work, we strongly recommend treating Flywheel nodes as evidence-backed hypotheses, claims, questions, or intermediate conclusions. Flywheel does not require a special field for that structure. Put the reasoning in Markdown, attach the evidence that supports or rejects it, and use the graph to keep competing explanations visible at the same time.

This gives you a flexible schema without losing the discipline of the scientific method. A node can still be a plain note, dataset record, operational checkpoint, or reading log when that is the right shape. When the node is making a research claim, default to evidence-backed writing.

The recommendation

Use each claim-bearing node to answer four questions:

What hypothesis, claim, or question is this node about?
What evidence currently supports it?
What evidence currently argues against it?
What should happen next?

Map those answers to Flywheel primitives:

Use node content for the hypothesis, reasoning, caveats, and next steps.
Use the summary as the current read, especially after new evidence lands.
Use artifacts for concrete evidence such as logs, tables, plots, reports, diffs, checkpoints, benchmark outputs, and source excerpts.
Use executions for runs that generated evidence or tested a claim.
Use tags for research state such as open, supported, rejected, needs-replication, or blocked.
Use graph edges and branches for competing hypotheses, refinements, and follow-up experiments.
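
The mapping above can be sketched as a plain data structure. This is a minimal illustration in Python, not Flywheel's API: the `ClaimNode` class and all of its field names are hypothetical, chosen only to mirror the primitives listed above.

```python
from dataclasses import dataclass, field


@dataclass
class ClaimNode:
    """Hypothetical stand-in for a claim-bearing node; not a Flywheel class."""

    content: str  # Markdown: hypothesis, reasoning, caveats, next steps
    summary: str = ""  # the current read
    artifacts: list[str] = field(default_factory=list)  # evidence files
    executions: list[str] = field(default_factory=list)  # runs that produced evidence
    tags: set[str] = field(default_factory=set)  # research state
    children: list["ClaimNode"] = field(default_factory=list)  # branches


# Illustrative values only; the file name is made up.
node = ClaimNode(content="## Hypothesis\n\nThe evaluation script changed normalization.")
node.artifacts.append("eval_script.diff")
node.tags.add("open")
node.summary = "Untested. Diff attached; no comparison run yet."
```

Keeping each primitive in its own field makes it easy to see which part of a node is the argument, which part is the evidence, and which part is scannable state.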

A node as a hypothesis

A useful research node is usually written as a small argument, not just a title. Keep the central claim near the top, then separate evidence for and against it.

## Hypothesis

State the claim or question being tested.

## Evidence for

- Link or summarize observations that support the claim.
- Attach durable artifacts for anything empirical.

## Evidence against

- Note failed runs, counterexamples, uncertainty, and alternative explanations.
- Attach the evidence even when it weakens the claim.

## Current read

Say what you currently believe and how strongly.

## Next step

Name the next experiment, branch, review, or decision.

The template is a default for research claims, not a required format for every node. The important part is that another person can inspect the node later and see the claim, the evidence, the counterevidence, and the next decision point.
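
If nodes are created from scripts, the template can be generated rather than typed. The sketch below assumes nothing about Flywheel itself: `render_claim` is a hypothetical helper that only builds the Markdown string, and how that string is written to a node is left out.

```python
def render_claim(hypothesis, evidence_for, evidence_against, current_read, next_step):
    """Return Markdown that follows the five-section template above."""

    def bullets(items):
        # An empty section stays visible so missing evidence is obvious.
        return "\n".join(f"- {item}" for item in items) or "- None recorded yet."

    sections = [
        ("Hypothesis", hypothesis),
        ("Evidence for", bullets(evidence_for)),
        ("Evidence against", bullets(evidence_against)),
        ("Current read", current_read),
        ("Next step", next_step),
    ]
    return "\n\n".join(f"## {title}\n\n{body}" for title, body in sections) + "\n"


# Illustrative values only.
markdown = render_claim(
    hypothesis="The new prompt increases invalid outputs.",
    evidence_for=["Invalid-output count by prompt version (see attached table)."],
    evidence_against=[],
    current_read="Plausible but unconfirmed.",
    next_step="Rerun the benchmark with the previous prompt.",
)
print(markdown)
```

Rendering an explicit placeholder for an empty section is a deliberate choice here: a reader can tell that counterevidence was looked for and not found yet, rather than never considered.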

Evidence for and against

Attach evidence as artifacts instead of burying it in prose. Prose can explain why the evidence matters, but the durable output should stay inspectable:

Tables and CSVs for measurements.
Plots and images for visual inspection.
Logs and transcripts for execution traces.
Reports and notebooks for analysis.
Diffs and patches for code changes.
Checkpoints and model outputs for reproducibility.

When evidence changes your view, update the summary. The content can preserve the longer reasoning trail, while the summary gives readers the latest read without requiring them to reread the whole node.
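
The split between an append-only reasoning trail and a replaceable summary can be sketched as follows. The node here is a plain dictionary and `record_evidence` is a hypothetical helper, not a Flywheel function; the artifact name, note, and date are made up for illustration.

```python
from datetime import date


def record_evidence(node, artifact, note, new_summary, on=None):
    """Attach an artifact, append to the reasoning trail, and replace the summary."""
    stamp = (on or date.today()).isoformat()
    node["artifacts"].append(artifact)
    # Content is append-only here so the earlier reasoning stays auditable.
    node["content"] += f"\n- {stamp}: {note} (artifact: {artifact})"
    # The summary is overwritten: it holds only the latest read.
    node["summary"] = new_summary
    return node


node = {
    "content": "## Evidence against",
    "summary": "Leaning towards the data loader explanation.",
    "artifacts": [],
}
record_evidence(
    node,
    artifact="ordering_check.csv",
    note="Sample order is identical across both runs.",
    new_summary="Data loader explanation looks unlikely; ordering unchanged.",
    on=date(2024, 1, 15),
)
```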

Competing hypotheses as branches

When there are multiple plausible explanations, branch them explicitly. Start from a shared observation or question, then create child nodes for each candidate explanation.

For example:

Parent: Why did benchmark accuracy drop?
Branch A: The data loader changed sample ordering.
Branch B: The new prompt increases invalid outputs.
Branch C: The evaluation script changed normalization.

Attach evidence to each branch as work proceeds. A branch can become supported, rejected, merged into another explanation, or kept open for more work. This is usually clearer than keeping several mutually exclusive claims inside one long node.
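
As a rough sketch, the example above can be modeled as a small tree. The `branch` and `outline` helpers and the dictionary layout are hypothetical and exist only for illustration; in Flywheel the structure would live in the graph itself.

```python
def branch(title, state="open", children=None):
    """Build a hypothetical node record; not a Flywheel call."""
    return {"title": title, "state": state, "children": children or []}


investigation = branch(
    "Why did benchmark accuracy drop?",
    children=[
        branch("The data loader changed sample ordering."),
        branch("The new prompt increases invalid outputs."),
        branch("The evaluation script changed normalization."),
    ],
)


def outline(node, depth=0):
    """Return one indented line per node, prefixed with its state."""
    lines = [f"{'  ' * depth}[{node['state']}] {node['title']}"]
    for child in node["children"]:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(investigation)))
```

Every branch starts as open. As evidence arrives, changing one branch's state leaves its siblings untouched, which is the property that is hard to keep when all three explanations share one node.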

Tags and summaries

Use tags for state that needs to be scanned or filtered. Use summaries for the current interpretation. Keep the longer Markdown content as the audit trail.

Good tags are short and operational:

open
supported
rejected
needs-evidence
needs-replication
blocked

Avoid making tags carry the whole argument. If the state is complex, write it in the summary and point to the artifacts or child nodes that justify it.
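
Short tags pay off when they are filtered mechanically. The sketch below assumes nodes have been exported as dictionaries with a `tags` set; that layout and both helper functions are hypothetical, not part of Flywheel.

```python
# The tag vocabulary suggested above.
KNOWN_TAGS = {"open", "supported", "rejected", "needs-evidence", "needs-replication", "blocked"}


def titles_with_tag(nodes, tag):
    """Return titles of nodes carrying the given state tag."""
    return [node["title"] for node in nodes if tag in node["tags"]]


def unknown_tags(nodes):
    """Return tags outside the shared vocabulary, such as typos or overloaded tags."""
    used = {tag for node in nodes for tag in node["tags"]}
    return sorted(used - KNOWN_TAGS)


# Illustrative nodes only.
nodes = [
    {"title": "Branch A", "tags": {"open", "needs-evidence"}},
    {"title": "Branch B", "tags": {"blocked"}},
    {"title": "Branch C", "tags": {"probably-fine-after-rerun"}},
]

print(titles_with_tag(nodes, "blocked"))
print(unknown_tags(nodes))
```

A tag like the one on Branch C is a sign that an argument is being squeezed into a label; that detail belongs in the summary instead.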

What not to do

Avoid these patterns:

One large node that holds several competing hypotheses.
A research claim with no attached evidence.
Evidence that exists only as a sentence in Markdown when there is a concrete artifact available.
Outdated state that survives only in old prose after the summary or tags have moved on.
Branches that duplicate the same claim without a distinct question, experiment, or counterargument.
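
Some of these patterns can be caught with a simple check. The sketch below covers three of them (several hypotheses in one node, a claim with no evidence, and a duplicated branch) and assumes a hypothetical dictionary layout with an `is_claim` flag; none of these names come from Flywheel.

```python
def lint_node(node, sibling_titles=()):
    """Return warnings for some of the anti-patterns listed above."""
    warnings = []
    lines = [line.strip() for line in node["content"].splitlines()]
    # More than one Hypothesis heading suggests competing claims in one node.
    if lines.count("## Hypothesis") > 1:
        warnings.append("several hypotheses in one node; consider branching")
    if node["is_claim"] and not node["artifacts"]:
        warnings.append("research claim with no attached evidence")
    # Exact title match only; near-duplicates still need a human reader.
    if node["title"] in sibling_titles:
        warnings.append("duplicates a sibling branch")
    return warnings


# Illustrative node only.
node = {
    "title": "The data loader changed sample ordering.",
    "content": "## Hypothesis\n\nOrdering changed.\n\n## Hypothesis\n\nSeed changed.",
    "is_claim": True,
    "artifacts": [],
}
for warning in lint_node(node):
    print(warning)
```

A check like this only flags structure. Whether the evidence actually supports the claim is still a judgment for the reader.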

Flywheel works best when the graph shows the shape of the investigation, the content explains the reasoning, and the artifacts preserve the evidence.