Skill Improvement System

Telemetry, overlays, the curator, and autonomous mutation — how the skill set improves itself over time.

A static skill set decays: the world moves, edge cases accumulate, and yesterday's phrasing stops triggering on today's request. LisaOS treats skill quality as a closed loop — every run is observed, shortcomings are surfaced, and improvements are proposed, gated, and merged.

Telemetry

Skill invocations are logged as activity against the work that spawned them. Over time this produces a record of which skills fire, on what, how often, and with what outcome. Telemetry is the raw signal the rest of the system reasons over: it describes what happened and does not, by itself, change any skill.
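As an illustration of the kind of record this produces, here is a minimal sketch in Python. The `Invocation` fields, the `"ok"`/`"failed"` outcome vocabulary, and the `summarize` helper are all hypothetical names chosen for the example; they are not LisaOS's actual schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Invocation:
    skill: str      # hypothetical skill name
    work_item: str  # the work that spawned the invocation
    outcome: str    # assumed vocabulary: "ok" or "failed"


def summarize(log):
    """Count how often each skill fired and what share of runs ended 'ok'."""
    fires = Counter(inv.skill for inv in log)
    outcomes = Counter((inv.skill, inv.outcome) for inv in log)
    return {
        skill: {"fires": n, "ok_rate": outcomes[(skill, "ok")] / n}
        for skill, n in fires.items()
    }


# Illustrative log entries (made up for the example).
log = [
    Invocation("summarize-pr", "task-1", "ok"),
    Invocation("summarize-pr", "task-2", "failed"),
    Invocation("triage-issue", "task-3", "ok"),
]
summary = summarize(log)
```

The summary only describes what happened. Deciding whether a skill needs attention is left to the later stages.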

Overlays

An overlay is a non-destructive amendment to a skill: a targeted adjustment layered on top of the base skill rather than a rewrite of it. Overlays let an improvement be tried, measured, and reverted without disturbing the canonical skill body. A proven overlay can later be folded into the base; an unproven one is discarded at no cost to the original.
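The non-destructive property can be sketched as follows. This assumes a skill is a small immutable record and an overlay is a set of field replacements; `Skill`, `Overlay`, and `apply_overlays` are hypothetical names for illustration, not the system's real types.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Skill:
    name: str
    description: str
    body: str


@dataclass(frozen=True)
class Overlay:
    label: str
    changes: dict  # field name -> replacement value (assumed representation)


def apply_overlays(base, overlays):
    """Return the effective skill; the base object is never modified."""
    skill = base
    for overlay in overlays:
        skill = replace(skill, **overlay.changes)
    return skill


base = Skill("summarize-pr", "Summarize a pull request", "Read the diff, then summarize.")
trial = Overlay("mention-reviews", {"description": "Summarize a pull request or code review"})

with_overlay = apply_overlays(base, [trial])  # try the improvement
reverted = apply_overlays(base, [])           # discard it: the base is untouched
```

Under this representation, folding a proven overlay into the base amounts to adopting `with_overlay` as the new canonical skill.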

The curator

The curator assesses the health of the skill and agent fleet and produces standing health assessments. At session start these surface as a health summary, so a session begins with awareness of which skills or agents are drifting. The curator is the judgement layer between raw telemetry and a concrete proposal.
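The text does not say how the curator decides that a skill is drifting. One plausible reading, shown purely as an assumption, is a comparison of recent success rates against a stored baseline. The `health_summary` function and the `tolerance` parameter below are hypothetical.

```python
def health_summary(recent_ok_rates, baseline_ok_rates, tolerance=0.1):
    """List skills whose recent ok-rate fell more than `tolerance` below baseline.

    Assumption: 'drift' means a drop in success rate. The real curator may
    weigh other signals.
    """
    drifting = [
        name
        for name, rate in recent_ok_rates.items()
        # Skills with no baseline are compared against themselves (no drift).
        if baseline_ok_rates.get(name, rate) - rate > tolerance
    ]
    return sorted(drifting)


# Made-up rates for illustration only.
recent = {"summarize-pr": 0.6, "triage-issue": 0.9, "new-skill": 0.4}
baseline = {"summarize-pr": 0.9, "triage-issue": 0.92}
flagged = health_summary(recent, baseline)
```

Here only `summarize-pr` is flagged: `triage-issue` moved within tolerance, and `new-skill` has no baseline to drift from.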

The improvement pipeline

Proposals from the improvement pipeline are surfaced to the operator at session start, highest-risk first. Agent-definition proposals (which change how an agent behaves) rank above ordinary skill or source proposals, because their blast radius is larger. Each proposal routes through the same approval discipline a hand-authored change would face — nothing merges unreviewed.
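The ordering and the approval gate described above can be sketched as a sort followed by a filter. The `Proposal` record, the kind labels, and the numeric ranks are assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed ranking: a lower number is surfaced earlier. Agent-definition
# proposals come first because their blast radius is larger.
KIND_RANK = {"agent-definition": 0, "skill": 1, "source": 1}


@dataclass(frozen=True)
class Proposal:
    title: str
    kind: str
    approved: bool = False


def surface(proposals):
    """Order proposals highest-risk first; ties keep their original order."""
    return sorted(proposals, key=lambda p: KIND_RANK[p.kind])


def merge_ready(proposals):
    """Only reviewed and approved proposals may merge."""
    return [p for p in proposals if p.approved]


pending = [
    Proposal("Tweak skill wording", "skill"),
    Proposal("Change reviewer agent", "agent-definition", approved=True),
    Proposal("Patch source helper", "source"),
]
ordered = surface(pending)
ready = merge_ready(pending)
```

Because Python's `sorted` is stable, skill and source proposals that share a rank stay in the order they arrived.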

Autonomous mutation

For skills whose quality can be measured mechanically, the lifecycle tooling supports an AutoResearch mutation loop: it generates candidate variations, grades them against a held-out set, and keeps a variant only if it beats the incumbent on the evaluation by a margin that can be distinguished from noise. A change that does not clear the bar is not shipped. This is the same principle the grading tools apply, run in a loop and pointed at the skill's own description and body.
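A minimal sketch of such a loop follows. The acceptance rule, a paired comparison that requires the mean improvement to exceed two standard errors, is one common way to separate a real margin from noise and is an assumption here. The source does not state which test AutoResearch uses, and `beats_incumbent`, `mutation_loop`, and the `grade` callback are hypothetical names.

```python
import math
from statistics import mean, stdev


def beats_incumbent(incumbent_scores, candidate_scores, z=2.0):
    """Paired comparison over the same held-out cases.

    Accept only if the mean per-case improvement exceeds `z` standard errors.
    """
    diffs = [c - i for c, i in zip(candidate_scores, incumbent_scores)]
    if len(diffs) < 2:
        return False  # too little evidence to judge
    avg = mean(diffs)
    se = stdev(diffs) / math.sqrt(len(diffs))
    if se == 0:
        return avg > 0  # identical improvement on every case
    return avg > z * se


def mutation_loop(incumbent, candidates, grade):
    """Keep a candidate only when it clears the bar against the current best."""
    best, best_scores = incumbent, grade(incumbent)
    for candidate in candidates:
        scores = grade(candidate)
        if beats_incumbent(best_scores, scores):
            best, best_scores = candidate, scores
    return best


# Toy grader with made-up per-case scores on a six-case held-out set.
HELD_OUT_SCORES = {
    "v0": [0.5, 0.6, 0.5, 0.6, 0.5, 0.6],      # incumbent
    "v1": [0.6, 0.5, 0.6, 0.5, 0.6, 0.6],      # slightly higher mean, but noisy
    "v2": [0.8, 0.9, 0.8, 0.85, 0.8, 0.9],     # consistently better
}
winner = mutation_loop("v0", ["v1", "v2"], HELD_OUT_SCORES.__getitem__)
```

In this toy run `v1` has a marginally higher average than `v0`, yet it is rejected because the gain is within the noise. Only `v2`, which improves on every case, replaces the incumbent.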

Two doors, one loop

The heavier improvement machinery — telemetry aggregation, overlays, the curator, the mutation loop — is anchored on the always-on server door, so it keeps running when the local shell is closed. The local shell consumes the results (the health summary, the surfaced proposals) at session start. Both doors read and write the same core, so an improvement proposed on one is visible on the other. See architecture.