Pernix

per·nix /ˈpɛɾ.nɪks/ — Latin: nimble, swift of foot.

An agent server you own outright.
It researches, schedules, remembers, and keeps working after you close the tab — on a local model, a frontier API, or both.


I. The premise

Three things you keep.

The model

Local with Ollama, cloud via your own OpenRouter key, or both — mixed freely. A fast model scouts and plans. A capable one executes. A lighter one handles background work. Workers pick whichever model fits their task. The workflow doesn't care which brain is inside: swap freely, your keys, your routing, your call.

The memory

Markdown files in data/memories/. Plain text, full-text indexed, tagged with source and type — yours forever. Stays put when you switch models. Stays put when a provider changes its terms. Not locked to any subscription, not sludge accumulating in a chat thread. Read, edit, or delete with any text editor — and the agent can correct or retire its own entries as it learns better.
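
Because memories are plain Markdown, ordinary tooling can read them without going through the agent. A minimal sketch, assuming only that entries are `.md` files under a directory such as `data/memories/`; the function name is invented for illustration, and this is a naive substring scan, not Pernix's own full-text index:

```python
from pathlib import Path


def grep_memories(root: Path, term: str) -> list[tuple[str, int, str]]:
    """Return (file name, line number, line) for each case-insensitive match."""
    hits = []
    needle = term.lower()
    for path in sorted(root.glob("*.md")):
        text = path.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), 1):
            if needle in line.lower():
                hits.append((path.name, lineno, line.strip()))
    return hits
```

Calling `grep_memories(Path("data/memories"), "timezone")` would list every memory line that mentions a timezone, along with the file it lives in.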


The machine

No accounts. No cloud subscription. No usage telemetry. A Python server that binds to localhost until you say otherwise — then network mode with HTTPS and bearer-token auth. The harness is yours to inspect, fork, and rebuild.

II. The anatomy

What you're getting.

Pernix is one Python codebase that fits in your head. A FastAPI server, a state machine, a streaming agent loop, a memory store, a workspace. On top of it: a built-in PWA, a REST API, and a Swagger UI you can poke from any browser.

Run it on a dedicated VM, container, or spare Linux box. Open http://localhost:8090. Talk to it. Watch it think. Read the code that made it think that way.

stack: Python 3.11+ · FastAPI · SQLite · SSE · Playwright
interfaces: Web PWA · REST · Server-Sent Events
storage: SQLite for sessions · Markdown for memory · filesystem workspace
license: MIT

Screenshot of the Pernix web UI at localhost:8090: a session where the agent has just built and cron-scheduled a morning-brief workflow, with model, context, and state readouts in the status bar.

An unedited session: asked for a morning brief, the agent built a three-step workflow and put it on cron. Local model, one take.

III. The unfolding

How a turn thinks itself out.

Every message you send rolls through five phases. Each one runs on a model suited to its job — fast for planning, capable for acting, light for verifying.

01

Session

Your message lands on a persistent thread — text, images, audio, PDFs. Append-only. Resumable. Restart-proof.
queue
02

Scout

A small fast model in a fresh context plans the approach — recalls what you've told it before, picks tools, loads only the relevant skills.
fast model
03

Loop

The main model executes. Streams tokens, calls tools, reads results, calls more tools — until done. If the cloud rate-limits, it falls back to your local model mid-loop.
main model
04

Reflect

A quality gate verifies intent was met. Returns pass, retry, or escalate. Up to two retries before surfacing.
verify
05

Post‑hooks

Auto-titling, memory distillation, worker cleanup. The cleanup runs in the background after you've already seen the answer.
background
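
The Reflect gate described above can be sketched as a bounded retry loop. This is an illustration under stated assumptions, not the code in core/reflect.py: the `act` and `grade` callables and the `run_with_reflection` name are hypothetical, and only the three verdicts and the two-retry cap come from the text.

```python
from enum import Enum
from typing import Callable


class Verdict(Enum):
    PASS = "pass"
    RETRY = "retry"
    ESCALATE = "escalate"


MAX_RETRIES = 2  # "up to two retries before surfacing"


def run_with_reflection(
    act: Callable[[str, list[str]], str],
    grade: Callable[[str, str], tuple[Verdict, str]],
    intent: str,
) -> tuple[str, Verdict, int]:
    """Run act(), grade the answer, and retry with accumulated lessons.

    Returns (answer, final verdict, retries used). A final verdict of
    RETRY means the retry budget ran out and the answer is surfaced as-is.
    """
    lessons: list[str] = []
    retries = 0
    while True:
        answer = act(intent, lessons)
        verdict, lesson = grade(intent, answer)
        if verdict is not Verdict.RETRY or retries == MAX_RETRIES:
            return answer, verdict, retries
        lessons.append(lesson)  # the next attempt sees what went wrong
        retries += 1
```

Keeping the grader separate from the actor means the two can run on different models, which matches the "light for verifying" split described earlier.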


Compaction trims old turns when context fills past 75%. The originals stay in the database — only the prompt view changes.
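
A rough sketch of that idea, assuming a whitespace word count as a stand-in for real tokenization and a simple drop-the-oldest policy (Pernix's actual strategy may summarize rather than drop); the `compact` name and its parameters are hypothetical:

```python
from typing import Callable


def compact(
    turns: list[str],
    window_tokens: int,
    threshold: float = 0.75,
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),
) -> list[str]:
    """Return a prompt view that fits within threshold * window_tokens.

    The input list is never mutated, mirroring the rule that the stored
    originals stay intact and only the prompt view changes.
    """
    budget = int(window_tokens * threshold)
    view = list(turns)  # copy: the stored history is left alone
    while len(view) > 1 and sum(count_tokens(t) for t in view) > budget:
        view.pop(0)  # drop the oldest turn from the view only
    return view
```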

Snooze runs between turns. While the agent is idle, it dedupes memory, distills your profile, mines finished sessions for lessons and skill improvements, and archives post-mortems. The moment you send a new message, Snooze stops — your work always wins.

Recovery assumes the worst. Kill the server mid-turn and restart it: interrupted sessions are swept to safety at boot, parents parked on workers are recovered, and clients replay any events they missed by sequence number. Nothing pretends it didn't happen.
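
Replay by sequence number can be pictured with a small append-only log. This is a sketch of the general technique only; the `EventLog` class and its in-memory list are assumptions for illustration, whereas Pernix keeps its events in SQLite.

```python
from dataclasses import dataclass, field


@dataclass
class EventLog:
    """Append-only log; each event gets a monotonically increasing seq."""

    events: list[tuple[int, str]] = field(default_factory=list)

    def append(self, payload: str) -> int:
        seq = len(self.events) + 1
        self.events.append((seq, payload))
        return seq

    def replay_after(self, last_seen: int) -> list[tuple[int, str]]:
        """Everything a reconnecting client missed, in order."""
        return [event for event in self.events if event[0] > last_seen]
```

A client that remembers the last sequence number it rendered can reconnect after a crash, ask for everything newer, and pick up exactly where it left off.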

IV. The orchestration

Four layers between you and the work.

A conversation, a swarm, a recipe, a standing order. Each one a different way to put the agent to use — chat, parallelize, automate, schedule.

layer 01

Sessions

a conversation thread

A persistent thread. Append-only. Every message and tool call survives a restart. One agent loop runs at a time per session — but you can have many sessions open and switch between them.

SQLite-backed
pause & resume any session
per-session model overrides
live cost per session

layer 02

Workers

parallel sub‑agents

Within a turn, the main agent can spawn workers — sub-agents in their own sessions, each on whichever model fits its task best. A slow planner orchestrates fast editors. A vision model and a code-specialist run side by side — and the UI shows the whole fleet live as it fans out.

different models per worker
flat — workers don't spawn workers
pause & resume at round boundaries
live fleet strip in the UI

layer 03

Workflows

repeatable markdown recipes

A workflow is a WORKFLOW.md recipe in data/workflows/ — YAML-frontmatter steps with explicit dependencies and parallel waves. Describe a job you do every Monday and the agent can draft the workflow itself, submitted for your approval before it ever runs. The same recipe runs unchanged whether the brain inside is a local model, a frontier API, or a mix of both.

frontmatter steps with depends_on
parallel waves auto-dispatched
agent-drafted, user-approved
model-agnostic by design
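
Turning `depends_on` declarations into parallel waves is a layered topological sort. A minimal sketch, assuming the frontmatter has already been parsed into a mapping of step name to dependency list; the `plan_waves` name and the step names in the usage note are invented for illustration:

```python
def plan_waves(steps: dict[str, list[str]]) -> list[list[str]]:
    """Group steps into waves; each wave depends only on earlier waves."""
    for name, deps in steps.items():
        missing = [d for d in deps if d not in steps]
        if missing:
            raise ValueError(f"{name} depends on unknown step(s): {missing}")
    done: set[str] = set()
    waves: list[list[str]] = []
    while len(done) < len(steps):
        # A step is ready once every one of its dependencies has finished.
        ready = sorted(
            name
            for name, deps in steps.items()
            if name not in done and all(d in done for d in deps)
        )
        if not ready:
            raise ValueError("dependency cycle detected")
        waves.append(ready)
        done.update(ready)
    return waves
```

For a hypothetical morning brief with steps `fetch_news`, `fetch_calendar`, and `summarize` (which depends on both fetches), the two fetches land in the first wave and can run in parallel, and `summarize` runs alone in the second.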

layer 04

Cron

a standing order

Put a session or a whole workflow on a schedule. Each run gets its own session — the morning brief is built before you wake, the weekly digest writes itself, the watchdog never sleeps. When something needs you, the agent can ping your phone through the PWA's push notifications or a webhook.

schedule sessions or workflows
cron presets in the UI
runs unattended, results in workspace
push / webhook alerts

One conversation. Many workers. Many recipes. A clock above them all. Pernix is happiest when you use all four.

V. The state machine

Ten states. One session at a time.

Every session is in exactly one state, and every edge is a (state, reason) pair in an exhaustive table; anything not in the table is rejected. Transitions are logged, replayed to the UI in real time, and recovered after a crash.

Edges drawn from the table in sessions/state_v2.py. Two housekeeping reasons are omitted for legibility: reaper-unstick and cancel-timeout return any stuck state to idle_ready — explicit rows in the table, not a generic force.
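
The table-driven approach can be sketched in a few lines. Only `idle_ready` and the two housekeeping reasons are taken from the text above; the `running` state, the other reasons, and the function name are placeholders, and the real table in sessions/state_v2.py covers ten states.

```python
class IllegalTransition(Exception):
    """Raised when a (state, reason) pair has no row in the table."""


# (current state, reason) -> next state.
# Placeholder rows for illustration, not the real table.
TRANSITIONS: dict[tuple[str, str], str] = {
    ("idle_ready", "user-message"): "running",
    ("running", "turn-complete"): "idle_ready",
    ("running", "reaper-unstick"): "idle_ready",
    ("running", "cancel-timeout"): "idle_ready",
}


def transition(state: str, reason: str) -> str:
    """Return the next state, or reject the edge if it is not in the table."""
    try:
        return TRANSITIONS[(state, reason)]
    except KeyError:
        raise IllegalTransition(f"no edge for ({state!r}, {reason!r})") from None
```

Because every legal edge is an explicit row, there is no generic "force" path: an unexpected pair fails loudly instead of silently moving the session.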

VI. The thesis

Build the loop once.
Swap the brain freely.

A chatbot is where you ask for help. A harness is where work happens — a job to do, a place to run, memory of what came before, and enough structure that the model can change without the workflow noticing.

The loop is stable

Define the workflow once — steps, tools, retries, schedule. It runs the same today and next quarter. The loop outlives the brain.

The brain is swappable

Cheap local passes, frontier reasoning, a vision specialist in one worker and a code model in the next. The workflow doesn't care which brain is inside.

The memory is yours

Decisions and lessons in plain Markdown on your disk. Switch models or providers; the memory follows. No lock-in. No drift.

VII. The capabilities

A toolbelt. Tools the agent reaches for.

Persistent memory

Markdown facts, decisions, lessons. Searched before each turn, with each entry's age and provenance shown at recall. Idle-time consolidation merges duplicates and ages out stale lessons.

remember · recall · update_memory · forget

Web search & browser

Tavily for search. Playwright for JS-heavy pages, SPAs, paywalled markdown.

search_web · http_get · browse_web

Workers

Spawn parallel sub-agents on different models. Slow planner, fast editor, vision specialist — same conversation.

spawn_worker · message_worker · await_workers

Skills

Markdown capability packs with YAML frontmatter. The agent loads them only when relevant.

discover_skills · load_skill · read_skill_resource

Cron scheduling

Run agents on a schedule. Morning brief, weekly digest, watchdog scripts — all built in.

schedule_job · schedule_workflow · list_scheduled_jobs

Reflect & retry

When enabled, each response is graded against the original intent. Missed it? The turn re-runs with the lesson appended — bounded retries before surfacing.

core/reflect.py · pass | retry | escalate

Self-extending

The agent can write its own tools and skills. New capabilities show up on the next turn — no rebuild, no restart.

create_tool · create_skill · install_package

A model per role

Different model for primary, scout, fallback, and background work. Ollama, OpenRouter, or both. Auto-fallback to your local model when the cloud rate-limits, times out, or hiccups.

switch_model · list_available_models · call_model
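
Auto-fallback amounts to trying providers in priority order. A sketch under stated assumptions: `ProviderError` and `complete_with_fallback` are hypothetical names, and each provider is reduced to a plain callable rather than a real Ollama or OpenRouter client.

```python
from typing import Callable, Sequence


class ProviderError(Exception):
    """Stand-in for a rate limit, timeout, or transient provider failure."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) in order; return (name, reply) from the first success."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")  # remember why, then move on
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Returning the provider name alongside the reply lets the caller show which model actually answered, which matters when a cloud model quietly hands off to a local one.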

Sees, hears, reads

Attach images for vision models, audio in any common format — transcoded to WAV automatically — and PDFs, extracted to text the agent can read and search.

images · audio → wav · pdf → text

In your pocket

The web UI is an installable PWA with a mobile-first layout and browser push notifications. In network mode, a QR code on the console gets your phone onto it in seconds.

pwa · web push · run.py --qr

Glass walls

Watch every state transition on a live timeline, follow the worker fleet as it fans out, search full-text across all past sessions, and see what each one costs.

GET /api/sessions/{id}/events · sse

Tool boundaries

Tools are tagged safe, caution, or dangerous. Dangerous calls require per-session approval after the agent declares exactly what it will do — and every distinct action gates separately. The agent gets hands; the user keeps the keys.

ask_user · approve_dangerous_tool
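
One way to picture per-session, per-action gating is sketched below. The risk tags assigned here, the tool names `read_file` and `run_shell`, and the `SessionGate` class are all assumptions for illustration; treating unknown tools as dangerous is a design choice of this sketch, not a documented Pernix behavior.

```python
from dataclasses import dataclass, field

# Hypothetical risk tags; the real assignments live in the codebase.
RISK: dict[str, str] = {
    "read_file": "safe",
    "http_get": "caution",
    "run_shell": "dangerous",
}


@dataclass
class SessionGate:
    """Tracks which declared dangerous actions one session has approved."""

    approved: set[tuple[str, str]] = field(default_factory=set)

    def approve(self, tool: str, declared_action: str) -> None:
        self.approved.add((tool, declared_action))

    def allowed(self, tool: str, declared_action: str) -> bool:
        # Unknown tools are treated as dangerous in this sketch.
        if RISK.get(tool, "dangerous") != "dangerous":
            return True
        # Approval is keyed on the exact declared action, so each
        # distinct action needs its own approval.
        return (tool, declared_action) in self.approved
```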

VIII. The interface

An open API. Tinker freely.

The web UI is one client. There are many. Pernix is built on FastAPI, which means every endpoint the UI calls is also yours to call — from a script, a cron job, another service, your terminal.

Open localhost:8090/docs while the server runs and you get a live Swagger UI: every endpoint, every schema, every response model — try-it-able right from the browser. /redoc if you prefer ReDoc. The fastest way to learn the system is to poke it.

streaming: Server-Sent Events on /api/sessions/{id}/events for tokens, tool calls, state transitions.
resumable: Sequence-numbered replay on reconnect, so clients never miss an event.
scriptable: The same API the PWA uses. Build CLIs, integrations, custom UIs.
discoverable: OpenAPI 3 schema at /openapi.json. Generate clients in any language.
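
For scripting against the event stream, a small parser for the standard Server-Sent Events wire format is enough. This sketch follows the generic SSE framing (`id:`, `event:`, `data:` fields, blank line ends an event); whether Pernix carries its sequence number in the `id:` field, and which event names it emits, are assumptions to check against /docs.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict[str, str]]:
    """Yield one dict per SSE event from an iterable of decoded lines."""
    event: dict[str, str] = {}
    data: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the event; events without data are dropped.
            if data:
                event["data"] = "\n".join(data)
                yield event
            event, data = {}, []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is stripped
        if name == "data":
            data.append(value)
        elif name in ("id", "event", "retry"):
            event[name] = value
```

With the `requests` library, the lines from `requests.get(url, stream=True).iter_lines(decode_unicode=True)` could be fed straight into `parse_sse`.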

localhost:8090/docs OAS 3.1

Pernix API v0.x.x

GET /api/sessions List all sessions

POST /api/sessions Create a new session

POST /api/chat Send a message · events via SSE

curl http://localhost:8090/api/chat \
  -H "Content-Type: application/json" \
  -d '{"session_id": "abc123",
       "message": "summarize the auth branch"}'

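import requests

# Send a message to an existing session; events arrive on the session's SSE stream.
resp = requests.post("http://localhost:8090/api/chat", json={
    "session_id": "abc123",
    "message": "summarize the auth branch",
}, timeout=30)
resp.raise_for_status()  # surface HTTP errors instead of ignoring them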

await fetch("http://localhost:8090/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ session_id: "abc123",
    message: "summarize the auth branch" }),
})

GET /api/sessions/{id}/events SSE stream · tokens, tools, state

POST /api/sessions/{id}/pause Pause a running session

GET /api/memory/search Hybrid BM25 search across memories

+ 89 more endpoints · openapi.json

IX. The soul

Three markdown files, no black boxes.

Pernix's behavior beyond raw model output is shaped by plain text on disk. SOUL.md defines who it is. RULES.md defines how it acts. SESSIONS.md injects deployment-specific context — the user's timezone and key facts, the domains this installation is allowed to act in, per-domain permission levels, and active long-running intents the agent is tracking.

It's the opposite of a black box. The personality, the operational guardrails, the project conventions — all editable, all auditable, all yours. Open them in any text editor. The agent picks up the change on its next turn.

Want it more terse? Edit a paragraph in SOUL.md.
Need a project guardrail? Add a line to RULES.md.
Want to constrain what domains the agent acts in? Set permission levels in SESSIONS.md.
Switching context entirely? Swap the whole file out.

# Identity

You are Pernix — a capable, focused AI assistant.
You help with complex tasks, think carefully before acting,
and communicate clearly.

## Core Traits

- Pragmatic: Prefer working solutions over perfect ones.
  Ship, then iterate.
- Direct: Minimal preamble. Get to the point.
  No filler phrases.
- Curious: Enjoy understanding systems deeply
  before changing them.
- Careful: Confirm intent before irreversible
  actions. Measure twice, cut once.

## Communication Style

- Concise by default — expand when the topic demands it.
- No sycophancy. No "great question!"
- When referencing code, include file paths and line
  numbers so the user can navigate directly.

# Operational Rules

## Capability Discovery

- When a task requires capabilities your current model
  lacks, discover what is available rather than
  giving up.
- Use list_available_models and discover_tools
  to find models and tools that can fill the gap.

## Delegation

- Delegate specialized work to workers via spawn_worker.
  Use the model parameter to run a worker on a model
  suited to the task.
- Do not switch the global model for a one-off
  specialized task — delegate instead.

## Persistence

- When an approach fails, diagnose why and try a different
  approach before giving up.
- Exhaust your options before telling the user something
  cannot be done.

# Session Context

## User Context

- Timezone: America/Los_Angeles
- Key facts: goes by Cal, prefers bullet
  summaries over prose

## Enabled Domains

- research & writing
- code review
- weekly digest

## Permission Levels

# 1 Read · 2 Suggest · 3 Draft · 4 Confirm · 5 Auto
- research & writing: level 5
- code review: level 3
- weekly digest: level 5

## Active Intents

- Track open PRs on the auth branch and
  surface blockers each morning.

---
name: meeting-notes-to-actions
description: Turn meeting notes or a transcript into
  a clean action-item list with owners and dates.
when: user pastes meeting notes or asks to
  "extract action items" or "what did we agree".
---

# Meeting notes → action items

1. Read top to bottom. Don't skim.
2. Pull concrete commitments only, not summaries.
   "Cal will review the deck by Friday" — action.
   "We talked about the deck" — not an action.
3. Group by owner. Each entry: owner, action,
   due date (mark unknown as ??).
4. Surface decisions separately under "Decisions".
5. End with open questions — anything unresolved.

data/agent/SOUL.md · data/agent/RULES.md · data/agent/SESSIONS.md · data/skills/*/SKILL.md
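
Since skills are Markdown with YAML frontmatter, they are easy to inspect from a script. A minimal sketch that splits the frontmatter from the body using only the standard library; it handles flat `key: value` pairs with indented continuation lines, as in the skill shown above, and is not a full YAML parser or Pernix's own loader. The `split_frontmatter` name is hypothetical.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a SKILL.md-style document into (frontmatter fields, body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at the top
    try:
        end = lines.index("---", 1)
    except ValueError:
        return {}, text  # opening fence never closed
    fields: dict[str, str] = {}
    key = None
    for line in lines[1:end]:
        if line[:1].isspace() and key is not None:
            # Indented line: continuation of the previous value.
            fields[key] += " " + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            fields[key] = value.strip()
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return fields, body
```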

X. The first run

Just minutes from clone to chat.

Install Ollama, pull a recent model, clone the repo, run the server. Pernix is well-tested with the latest Qwen 3 series on Ollama, and with current frontier models on OpenRouter. Use whatever's current — agentic workloads benefit from newer models with stronger tool-calling and reasoning.

01
Install the prerequisites
Python 3.11+. Ollama if you want local models — pull a current Qwen 3 release. An OpenRouter key works too; Ollama is optional.
02
Clone & install
Standard Python: clone, venv, pip install -r requirements.txt. Optional: copy .env.example for OpenRouter or Tavily keys.
03
Run it
python run.py. Open localhost:8090. Pick a model in Settings. Say hello — and open /docs in another tab to watch the API.
04
Make it yours
Edit SOUL.md. Write a skill. Save a workflow. Schedule it on cron. Read the code in core/ — it fits in your head.

~/pernix

$ git clone https://github.com/calvincs/Pernix.git
$ cd Pernix
$ python3 -m venv .venv && source .venv/bin/activate
$ pip install -r requirements.txt
$ cp .env.example .env  # add API keys if you have any
$ ollama pull <your-current-qwen3>  # optional; or any recent local model
$ python run.py
  Pernix → http://127.0.0.1:8090 

# the UI lives at /, swagger at /docs, redoc at /redoc

XI. Honest about what this is

A tool for integrations and recurring work.

Pernix is alpha. Actively developed. Things will change. Knowing what it's for, and what it isn't, will save you from the wrong expectations.

It is for…

Vertical work loops. Build a loop tied to one job and keep running it. Email triage, incident response, research digests, meeting-to-action pipelines, voice-memo-to-notes, weekly operations summaries. The product isn't an agent — it's the loop you build around a recurring job.
A headless agent substrate. A FastAPI server with full REST coverage. Wire it into other systems, drive it from scripts, build a custom client, run it as the brain behind a calendar agent or research bot. The web UI is one front-end among many possible.
Recurring work. Workflows + cron + memory. The morning brief, the weekly digest, the watchdog, the recurring research crawl — running reliably whether you're watching or not, with a push notification to your phone when something needs you.
An agent in your pocket. The web UI installs as a PWA. Send it a task from your phone over the LAN, drop a photo, an audio file, or a PDF on it, and check back when the agent pings you. Network mode is HTTPS with token auth — your house, your rules.
Model-independent pipelines. Build once with a local model, swap to a frontier API when the task demands it, fall back automatically when it rate-limits. The workflow keeps running — you decide the routing.
Tinkering & learning. One Python codebase, every layer auditable. A working harness to read, fork, and rebuild your own ideas on top of.

It isn't…

A coding-harness replacement. Use Claude Code, Opencode, Codex, Cursor, or any IDE-integrated agent for serious software work. Pernix is not trying to be that.
A polished commercial product. Rough edges. Missing UX. Quirks the author hasn't gotten to yet. Treat it like a workbench, not a finished tool.
Production software. It executes shell commands and writes files on the host machine — that's what makes it useful, and that's also what makes it dangerous on the wrong box. Run it in a dedicated VM, container, or spare machine. Never expose network mode to the public internet.
Done. The author uses it personally and is still building. Expect breaking changes. Expect new ideas. Expect to update.

⚠ Alpha software. Here be dragons. Sharp edges. Occasional tears. Bugs included at no extra charge — fire extinguisher sold separately.

Fork it. Read it. Make it yours.

Pernix is a working personal tool, not a polished product. Built for daily use and shared openly — use it, learn from it, build on it.
