I've been working on a way for agents to query production systems to help me debug issues and close the loop on things I work on day to day. It works as a hook that rewrites ssh, awscli, gcloud, az, and kubectl commands to verify they are read-only and safe. It also keeps track of sessions in files, and when agents debug the same things again it gives hints in the tool calls like:
━━ Past Investigation (May 10, 87% similar) ━━
Root cause: php-fpm pool exhaustion causing nginx 502
Hosts involved: web1
Investigation path:
web1: systemctl status nginx
web1: journalctl -u nginx --no-pager -n 20
web1: systemctl status php-fpm
Consider checking: systemctl status php-fpm
Tools with memory is an interesting idea as well but lmk what you think!
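To make the memory idea concrete, here is a rough sketch of how a hint like the one above could be produced. This is not Lily's actual code: the `Investigation` record, the `find_hint` helper, the 0.6 threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Investigation:
    """Hypothetical record of one past debugging session."""
    root_cause: str
    host: str
    commands: list[str]


def similarity(current: list[str], past: list[str]) -> float:
    # Compare the two command sequences; 1.0 means identical sequences.
    return SequenceMatcher(None, current, past).ratio()


def find_hint(current: list[str], history: list[Investigation],
              threshold: float = 0.6) -> Optional[str]:
    """Return a hint from the most similar past session, or None."""
    best, best_score = None, threshold
    for past in history:
        score = similarity(current, past.commands)
        if score >= best_score:
            best, best_score = past, score
    if best is None:
        return None
    # Suggest the first past command the agent has not run yet.
    remaining = [c for c in best.commands if c not in current]
    lines = [
        f"Past Investigation ({best_score:.0%} similar)",
        f"Root cause: {best.root_cause}",
        f"Hosts involved: {best.host}",
    ]
    if remaining:
        lines.append(f"Consider checking: {remaining[0]}")
    return "\n".join(lines)


history = [Investigation(
    root_cause="php-fpm pool exhaustion causing nginx 502",
    host="web1",
    commands=[
        "systemctl status nginx",
        "journalctl -u nginx --no-pager -n 20",
        "systemctl status php-fpm",
    ],
)]
current = ["systemctl status nginx", "journalctl -u nginx --no-pager -n 20"]
hint = find_hint(current, history)
```

The hint only fires once the current session overlaps enough with a stored one, so unrelated sessions get no extra output in their tool calls.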
I liked the GLM coding plan before they raised their prices; now their rate limits are stricter since they are compute constrained. It is still a good deal at 1/3 the price of Claude for the same quality.
[Lily](https://github.com/aspectrr/lily) A CLI tool that can be installed into any coding agent via a hook and gives read-only access to production systems (wraps ssh, kubectl, awscli, gcloud, az) so agents can investigate issues in production. I built it for myself and my team to save us a lot of time during initial investigations, because I didn't want to have to babysit agents or just hope that "telling them they are in production" would prevent issues.
[clue.ssh](https://github.com/aspectrr/clue.ssh) A clue game over SSH based on the AI wave, where the goal is to find who stole the H100. Pretty fun and coding agents can play too.
[Chasing Losses](https://github.com/aspectrr/chasing_losses) I was interested in whether LLMs chase losses when playing roulette. I'm still investigating this, but I've found that different models will bet different amounts at different frequencies even when prompted the same. I'm struggling with not wanting to guide them too much while also wanting to see how they react when put under pressure.
>Sysadmins, Devops engineers will be the last ones replaced by AI.
Most setups aren't properly documented, which makes the discovery and exploitability part the major bottleneck. Once that part is facilitated by AI, the sysadmin/devops team is downsized.
Yeah, this isn't even the worst thing I've seen an agent do. One time I (foolishly) ran Claude Code directly on my server and it managed to completely bring down my entire Elasticsearch cluster. Never again. It's why I built Lily: https://github.com/aspectrr/lily
Hey HN, I have seen many different ways of letting AI run bash commands on remote hosts, but none of them address both of these issues:
a. safety (read-only)
b. not installing anything on the remote host
So this is my implementation of one that does.
It uses seven layers of verification on the client and reconstructs the commands with safe quoting to prevent unsafe chars or other attack vectors. Check out: https://github.com/aspectrr/lily?tab=security-ov-file
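For a feel of what "verify, then reconstruct with safe quoting" can look like, here is a minimal sketch of just two such steps. It is not Lily's implementation: the `READ_ONLY_SUBCOMMANDS` allowlist and the `rebuild_read_only` helper are made up for illustration, and the real rule set is in the repo.

```python
import shlex

# Hypothetical allowlist (an assumption, not Lily's rules):
# program -> subcommands treated as read-only.
READ_ONLY_SUBCOMMANDS = {
    "systemctl": {"status", "is-active"},
    "kubectl": {"get", "describe", "logs"},
}


def rebuild_read_only(command: str) -> str:
    """Check a command against the allowlist and re-quote every token."""
    tokens = shlex.split(command)  # raises ValueError on unbalanced quotes
    if len(tokens) < 2:
        raise PermissionError("expected a program and a subcommand")
    program, subcommand = tokens[0], tokens[1]
    allowed = READ_ONLY_SUBCOMMANDS.get(program)
    if allowed is None or subcommand not in allowed:
        raise PermissionError(f"not allowlisted: {program} {subcommand}")
    # Rebuild the command from individually quoted tokens so shell
    # metacharacters (;, $(), |, etc.) arrive as literal arguments.
    return " ".join(shlex.quote(token) for token in tokens)


safe = rebuild_read_only("systemctl status nginx; rm -rf /")
# safe == "systemctl status 'nginx;' rm -rf /"
```

In the example the `;` ends up inside a quoted argument, so the remote shell sees one `systemctl status` call with odd unit names rather than a second command.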