A live update
on work and life.
Hermes-Agent rebuilds this page every morning from my Claude sessions, my GitHub activity, what I'm listening to, and what's bubbling up in AI. Half journal, half status board — kept honest by being public.
This week I'm tracking the AI access restrictions and policy shifts affecting my personal tools, while keeping phillipsben.com up to date. I'm also working on getting local inference running so I can keep my agents going when cloud models get throttled or pulled.
- Complicated — Avril Lavigne
- Send Me On My Way — Rusted Root
- Power — Ren
Shipping this week.
§ 01 · GITHUB
Where the commits went.
§ 02 · BY REPO
Recent Claude work.
§ 03 · SUMMARIZED
I've been heads-down on BudgetTracker for the past couple of weeks, and the scope has grown well past what I'd originally planned.
The app started as a personal finance tracker, but I kept finding things missing. The AI Insights chat was single-shot, so it couldn't answer merchant-specific questions like "how much did I spend on vaping in May." Reworking it as an agent loop with a list_transactions tool that accepts an array of search terms let it answer those in one pass instead of one model call per synonym. Getting that working on the local model required figuring out that the tool path only activates when the provider actually reports supportsTools — a config flag I had to trace through three files before I found it.
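As a rough illustration of the multi-term tool described above, here is a minimal sketch. The `Transaction` shape, the argument names, and the `toolsFor` helper are assumptions made for this example and are not taken from BudgetTracker's code; only `list_transactions` and `supportsTools` come from the post.

```typescript
// Hypothetical shapes: the real BudgetTracker schema is not shown in the post.
interface Transaction {
  date: string; // ISO date, e.g. "2024-05-03"
  merchant: string;
  amount: number; // positive means money spent
}

interface ListTransactionsArgs {
  searchTerms: string[]; // all synonyms are matched in a single call
  from?: string; // inclusive ISO date bounds (assumed parameters)
  to?: string;
}

interface ListTransactionsResult {
  matches: Transaction[];
  total: number;
}

// Returns every transaction whose merchant contains ANY of the search terms.
function listTransactions(
  all: Transaction[],
  args: ListTransactionsArgs
): ListTransactionsResult {
  const terms = args.searchTerms
    .map((t) => t.trim().toLowerCase())
    .filter((t) => t.length > 0);

  const matches = all.filter((tx) => {
    // ISO dates in YYYY-MM-DD form compare correctly as strings.
    if (args.from !== undefined && tx.date < args.from) return false;
    if (args.to !== undefined && tx.date > args.to) return false;
    const merchant = tx.merchant.toLowerCase();
    return terms.some((term) => merchant.indexOf(term) !== -1);
  });

  // Sum in cents so the arithmetic stays deterministic and out of the model.
  const totalCents = matches.reduce(
    (sum, tx) => sum + Math.round(tx.amount * 100),
    0
  );
  return { matches, total: totalCents / 100 };
}

// Hypothetical gate: only offer the tool when the provider reports support.
interface Provider {
  supportsTools: boolean;
}

function toolsFor(provider: Provider): string[] {
  return provider.supportsTools ? ["list_transactions"] : [];
}
```

With this shape, a question about vaping could be answered by one call with `searchTerms: ["vape", "vapor"]` rather than one model round-trip per synonym.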
The more interesting stretch was building out a read-only MCP server so Hermes, my local agent, could query financial data directly. That ended up being 13 tools — upcoming obligations, cash flow projection, burn rate, spending anomalies, savings goals, price inflation alerts, and a few others — each backed by deterministic SQL so the LLM never does the arithmetic. Hooking it up live surfaced two latent Postgres bugs: one where a date subtraction with an untyped parameter threw a type error at runtime (fixed by casting ::int), and another where raw SQL join rows came back as strings instead of Dates and broke .toISOString() calls. Both had been dormant since the MSSQL-to-Postgres migration.
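The two Postgres fixes can be sketched roughly as follows. The table and column names in the query are invented for illustration, and `toDate` is a hypothetical helper, not the app's actual code. The idea is simply to give the parameter an explicit type and to normalize date columns before calling `.toISOString()`.

```typescript
// Hypothetical query: "transactions" and "posted_on" are illustrative names.
// The ::int cast gives the parameter an explicit type, so Postgres treats
// the expression as date minus integer (a number of days).
const RECENT_TRANSACTIONS_SQL = `
  SELECT merchant, amount, posted_on
  FROM transactions
  WHERE posted_on >= CURRENT_DATE - $1::int
`;

// Raw query rows may carry dates as strings rather than Date objects,
// so normalize before using Date methods.
function toDate(value: Date | string): Date {
  const d = value instanceof Date ? value : new Date(value);
  if (isNaN(d.getTime())) {
    throw new Error("Unparseable date value: " + String(value));
  }
  return d;
}

// Assumed row shape for this sketch.
interface RawRow {
  merchant: string;
  posted_on: Date | string;
}

function postedOnIso(row: RawRow): string {
  return toDate(row.posted_on).toISOString();
}
```

This sketch assumes the strings are ISO 8601. Other timestamp formats may parse differently across JavaScript engines and would need explicit handling.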
I also ran a security audit across the app: path traversal on receipt image serving, JWT token type confusion (refresh tokens accepted as access tokens), SSRF on user-supplied LLM endpoints, and upload content spoofing where declared MIME types weren't verified against the actual file bytes. The design decision I found most interesting was the SSRF boundary — rather than blocking private IPs, I blocked user-typed URLs entirely while leaving the operator-configured model endpoint exempt, since that one lives in server environment variables and isn't user-controlled.
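Three of these checks are small enough to sketch. Everything here is a simplified assumption rather than the app's real code: the `typ` claim name, the `resolveLlmEndpoint` helper, and the short signature table are invented for illustration, and signature verification of the JWT is assumed to have happened before these functions run.

```typescript
type TokenType = "access" | "refresh";

// Hypothetical claim layout: the post does not say how token type is encoded.
interface TokenClaims {
  sub: string;
  typ: TokenType;
  exp: number; // expiry, seconds since epoch
}

// Reject refresh tokens presented where an access token is required.
function requireAccessToken(claims: TokenClaims, nowSeconds: number): TokenClaims {
  if (claims.typ !== "access") throw new Error("wrong token type");
  if (claims.exp <= nowSeconds) throw new Error("token expired");
  return claims;
}

// SSRF boundary: user-typed URLs are refused outright, and only the
// operator-configured endpoint (e.g. read from server env vars) is used.
function resolveLlmEndpoint(
  userEndpoint: string | undefined,
  operatorEndpoint: string
): string {
  if (userEndpoint !== undefined && userEndpoint.trim() !== "") {
    throw new Error("user-supplied LLM endpoints are not accepted");
  }
  return operatorEndpoint;
}

// Leading "magic bytes" for a few common upload types.
const SIGNATURES: { mime: string; bytes: number[] }[] = [
  { mime: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: "image/jpeg", bytes: [0xff, 0xd8, 0xff] },
  { mime: "application/pdf", bytes: [0x25, 0x50, 0x44, 0x46] }, // "%PDF"
];

// Detect the type from the file's leading bytes, ignoring what was declared.
function sniffMime(file: Uint8Array): string | null {
  for (const sig of SIGNATURES) {
    if (file.length < sig.bytes.length) continue;
    if (sig.bytes.every((b, i) => file[i] === b)) return sig.mime;
  }
  return null;
}

// Accept an upload only when the declared type agrees with the bytes.
function declaredTypeMatches(declared: string, file: Uint8Array): boolean {
  return sniffMime(file) === declared.toLowerCase();
}
```

Refusing user-typed endpoints entirely avoids maintaining a private-IP blocklist, at the cost of users not being able to point the app at their own model server.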
The income detection piece was the most debugging-intensive. I wrote a biweekly/monthly pattern detector over transaction history, but two bugs only showed up on real data: a dense 98-deposit series was being misclassified as semimonthly because the day-of-month clustering logic assumed gaps that biweekly deposits don't leave, and a 12-year history with job gaps was causing the whole detection to fail because the rolling mean was too noisy over that span. Switching to the most recent 12 deposits and rewriting the clustering to count distinct date buckets instead of gaps fixed both.
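One way the bucket-counting idea could look is sketched below. The thresholds, the `Cadence` type, and the exact classification rule are assumptions made for this example, since the post only says the fix used the most recent 12 deposits and counted distinct date buckets instead of gaps.

```typescript
type Cadence = "monthly" | "semimonthly" | "biweekly" | "unknown";

// Assumed thresholds, chosen for illustration only.
const RECENT_WINDOW = 12; // the post's "most recent 12 deposits"
const MIN_DEPOSITS = 4; // too few points to classify
const MAX_SEMIMONTHLY_DAYS = 4; // two paydays plus some weekend shifts

function detectCadence(depositDates: Date[]): Cadence {
  // Sort ascending and keep only the most recent window, so old job gaps
  // in a long history cannot skew the result.
  const recent = depositDates
    .slice()
    .sort((a, b) => a.getTime() - b.getTime())
    .slice(-RECENT_WINDOW);
  if (recent.length < MIN_DEPOSITS) return "unknown";

  // Count distinct calendar-month buckets and distinct day-of-month buckets.
  // UTC accessors keep the buckets stable across server timezones.
  const monthBuckets = new Set<string>();
  const dayBuckets = new Set<number>();
  for (const d of recent) {
    monthBuckets.add(d.getUTCFullYear() + "-" + d.getUTCMonth());
    dayBuckets.add(d.getUTCDate());
  }

  // Roughly one deposit per month means monthly pay.
  const perMonth = recent.length / monthBuckets.size;
  if (perMonth <= 1.2) return "monthly";

  // Semimonthly pay lands on a few fixed days of the month; biweekly pay
  // drifts through the calendar and fills many day buckets.
  return dayBuckets.size <= MAX_SEMIMONTHLY_DAYS ? "semimonthly" : "biweekly";
}
```

Under these assumptions a dense biweekly series is separated from semimonthly pay by how many day-of-month buckets it touches, which does not depend on gaps between deposits.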
In the feeds.
§ 04 · AI NEWS
- 01 · Statement on US government directive to suspend access to Fable 5 and Mythos 5 (Hacker News, 3080 pts)
  US gov ordered Anthropic to suspend Fable 5 and Mythos 5; major AI policy shift.
- 02 · Open source AI must win (Hacker News, 1537 pts)
  Open-source AI movement rallying around self-hosted inference as models get restricted.
- 03 · GLM 5.2 Is Out (Hacker News, 581 pts)
  GLM 5.2 released; Chinese AI labs continue competing head-to-head with US frontier models.
- 04 · Police officer investigated for using AI to 'create evidence' in multiple cases (Hacker News, 321 pts)
  Law enforcement officer investigated for using AI to manufacture evidence; serious misuse case.