PhysiClaw’s software is one CLI called physiclaw, and getting it running is four
commands: install it, check the environment, fetch the on-device vision model, and start the
server. macOS is the supported platform today; the only real difference on other OSes is the
serial-port name, which the wizard finds for you anyway.
One script installs everything — uv, Python 3.12, and the physiclaw CLI — isolated under
~/.local/bin so it never touches your system Python:
curl -fsSL https://raw.githubusercontent.com/physiclaw/PhysiClaw/main/install.sh | bash
When it finishes, physiclaw is on your PATH. Confirm it’s there:
physiclaw --version
If the shell can’t find the command, open a new terminal (so the updated PATH loads) and try
again.
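If the command is still not found after that, a quick check like the one below can tell the two likely causes apart. It is a minimal sketch and not part of PhysiClaw: path_contains is a hypothetical helper, and the only facts it relies on are the install location (~/.local/bin) and the command name given above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of PhysiClaw): succeed if directory $1
# appears as an entry in the colon-separated list $2 (defaults to $PATH).
path_contains() {
  local dir="$1" list="${2:-$PATH}"
  case ":$list:" in
    *":$dir:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if command -v physiclaw >/dev/null 2>&1; then
  echo "physiclaw found at: $(command -v physiclaw)"
elif path_contains "$HOME/.local/bin"; then
  echo "\$HOME/.local/bin is on PATH, but physiclaw is not in it; re-run the installer."
else
  echo "\$HOME/.local/bin is not on PATH; open a new terminal or add it in your shell profile."
fi
```

The first branch is the healthy case. The second points to an incomplete install, and the third to a shell that has not picked up the updated PATH yet.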
physiclaw doctor is a read-only health check — it never changes anything. It prints the
Python version, your config path, whether the vision model is present, what hardware it can
see, and a numbered list of what to do next.
physiclaw doctor
On a fresh install it will flag two things as missing: the vision model and a running server.
That’s expected; the next two steps fix both. Run doctor again any time something feels off;
it’s the fastest way to see the whole system’s state at a glance.
PhysiClaw reads the screen with a small icon-detection model that runs on your machine. Fetch it once:
physiclaw setup local-vision-model
This downloads about 100 MB and converts it to a fast inference format. The conversion installs some heavy dependencies in a throwaway environment and deletes them on success, so your install stays lean; only the finished model stays behind.
Why local, and not a cloud API?
Privacy
Every frame of your phone’s screen is processed on your own machine. Screenshots of your messages, banking, and accounts never leave the desk.
Offline & fast
No network round-trip per look. The detector runs locally, so a peek stays in the
~4-second range even on a flaky connection.
physiclaw server
This starts the MCP server on port 8048 and opens the setup wizard in your browser. Leave this shell running: it is the process that holds the serial port and the camera and talks to your phone. On first start it prints, among other things:
PhysiClaw MCP server on http://localhost:8048/mcp
QR code (scan with phone): http://localhost:8048/api/bridge/qr
The browser window it opens is the hardware-setup wizard. You can drive setup there or from a second terminal; both do the same thing, and Calibrate walks through it.
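When scripting against the server from a second terminal, it can help to wait until port 8048 answers before doing anything else. The sketch below is one way to do that with curl. wait_for_server is a hypothetical helper, not a PhysiClaw command, and it assumes the QR URL printed above answers a plain GET promptly.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a URL until the server answers or we give up.
# Any HTTP response (even an error status) counts as "up", because curl
# without -f only fails when it cannot connect or the request times out.
wait_for_server() {
  local url="$1" tries="${2:-30}" i=0
  while [ "$i" -lt "$tries" ]; do
    # --noproxy '*': this is a local server, so bypass any configured proxy.
    if curl -s -o /dev/null --noproxy '*' --max-time 2 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Usage, while `physiclaw server` is running in another shell:
#   wait_for_server http://localhost:8048/api/bridge/qr 30 && echo "server is up"
```

The second argument is the number of one-second attempts, so 30 gives the server roughly half a minute to come up.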
PhysiClaw speaks MCP (Model Context Protocol) — the standard way an AI agent calls external tools. Any MCP client (Claude Desktop, an IDE, your own script) connects to the same endpoint:
http://localhost:8048/mcp
Once calibrated, the client will see PhysiClaw’s tools (peek, tap, swipe, and the rest)
and can start operating the phone. First task wires up a client and runs
one end to end.
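For a feel of what a client sends first, here is a rough sketch of an MCP initialize request made with curl. It rests on assumptions the text above does not confirm: that the endpoint uses MCP's Streamable HTTP transport (JSON-RPC over POST) and accepts protocol version 2025-03-26. The function names and the clientInfo values are placeholders.

```shell
#!/usr/bin/env bash
# Build the JSON-RPC "initialize" request that opens an MCP session.
# The client name and version are placeholders, not values PhysiClaw requires.
mcp_initialize_body() {
  cat <<'EOF'
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"example-client","version":"0.0.1"}}}
EOF
}

# Assumption: the endpoint accepts MCP's Streamable HTTP transport, where the
# client POSTs JSON and accepts either JSON or an event stream in reply.
mcp_initialize() {
  local url="${1:-http://localhost:8048/mcp}"
  mcp_initialize_body | curl -sS --max-time 10 "$url" \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json, text/event-stream' \
    --data-binary @-
}

# Usage, with `physiclaw server` running in another shell:
#   mcp_initialize
```

In practice an MCP client library handles this handshake for you; the raw request is mainly useful for checking that the endpoint responds at all.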
PhysiClaw auto-detects the arm’s USB serial port, so you rarely need its name. When a troubleshooting step does ask for it, the format differs by OS:
macOS: /dev/tty.usbserial-XXXX (or /dev/tty.usbmodemXXXX). List candidates with
ls /dev/tty.usb*.
Linux: /dev/ttyUSB0 (CH340 adapters) or /dev/ttyACM0. List with ls /dev/ttyUSB* /dev/ttyACM*.
Windows: COM3, COM4, … Check Device Manager → Ports (COM & LPT) for the active number.
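The per-OS lookups above can be folded into one small script. This is a sketch only: serial_patterns and list_serial_candidates are hypothetical helpers, and the patterns are simply the device names listed above, selected by what uname -s reports.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the device-name patterns for an OS name as
# reported by `uname -s`. The patterns are the ones listed in the text.
serial_patterns() {
  case "$1" in
    Darwin) echo "/dev/tty.usbserial-* /dev/tty.usbmodem*" ;;
    Linux)  echo "/dev/ttyUSB* /dev/ttyACM*" ;;
    *)      return 1 ;;   # e.g. Windows: use Device Manager instead
  esac
}

# Print every existing device that matches; return non-zero if none do.
list_serial_candidates() {
  local patterns p found=1
  patterns="$(serial_patterns "$(uname -s)")" || {
    echo "No device patterns for this OS; on Windows check Device Manager." >&2
    return 1
  }
  # Unquoted on purpose so the shell expands the globs; a pattern that
  # matches nothing stays literal and is filtered out by the -e test.
  for p in $patterns; do
    [ -e "$p" ] && { echo "$p"; found=0; }
  done
  return "$found"
}

# Usage:
#   list_serial_candidates || echo "no serial device found; is the arm plugged in?"
```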