Install the software

PhysiClaw’s software is one CLI called physiclaw, and getting it running is four commands: install it, check the environment, fetch the on-device vision model, and start the server. macOS is the supported platform today; the only real difference on other OSes is the serial-port name, which the wizard finds for you anyway.

One script installs everything — uv, Python 3.12, and the physiclaw CLI — isolated under ~/.local/bin so it never touches your system Python:

bash
curl -fsSL https://raw.githubusercontent.com/physiclaw/PhysiClaw/main/install.sh | bash

When it finishes, physiclaw is on your PATH. Confirm it’s there:

bash
physiclaw --version

If the shell can’t find the command, open a new terminal (so the updated PATH loads) and try again.
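
If a new terminal does not help, it may be worth checking whether the install directory is on your PATH at all. The sketch below is one way to do that, assuming the installer placed the CLI under ~/.local/bin as described above. The helper name path_contains is made up for this example and is not part of PhysiClaw.

```shell
# Return success if directory $1 appears in the colon-separated list $2.
# (path_contains is a hypothetical helper, not a PhysiClaw command.)
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if command -v physiclaw >/dev/null 2>&1; then
  echo "physiclaw found at: $(command -v physiclaw)"
elif path_contains "$HOME/.local/bin" "$PATH"; then
  echo "$HOME/.local/bin is on PATH, but physiclaw is not in it"
else
  echo "$HOME/.local/bin is not on PATH in this shell"
fi
```

Wrapping both sides in colons makes the match exact, so /a does not falsely match /a/b.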

physiclaw doctor is a read-only health check — it never changes anything. It prints the Python version, your config path, whether the vision model is present, what hardware it can see, and a numbered list of what to do next.

bash
physiclaw doctor

On a fresh install it will flag two things as missing — the vision model and a running server. That’s expected; the next two steps fix both. Run doctor again any time something feels off; it’s the fastest way to see the whole system’s state at a glance.

PhysiClaw reads the screen with a small icon-detection model that runs on your machine. Fetch it once:

bash
physiclaw setup local-vision-model

This downloads about 100 MB and converts it to a fast inference format. The conversion borrows some heavy dependencies in a throwaway environment and deletes them on success, so your install stays lean — only the finished model stays behind.

Why local, and not a cloud API?

Privacy

Every frame of your phone’s screen is processed on your own machine. Screenshots of your messages, banking, and accounts never leave the desk.

Offline & fast

No network round-trip per look. The detector runs locally, so a peek stays in the ~4-second range even on a flaky connection.

With the model in place, start the server:

bash
physiclaw server

This starts the MCP server on port 8048 and opens the setup wizard in your browser. Leave this shell running — it’s the process that holds the serial port and the camera and talks to your phone. On first start it prints, among other things:

PhysiClaw MCP server on http://localhost:8048/mcp
QR code (scan with phone): http://localhost:8048/api/bridge/qr

The browser window it opens is the hardware-setup wizard. You can drive setup there or from a second terminal; both do the same thing, and the Calibrate guide walks through it.
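
If you script anything from that second terminal, it can help to wait until the server is actually answering. Here is a minimal sketch using curl, assuming the QR URL printed at startup returns an ordinary HTTP response. The function name wait_for_server and its defaults are illustrative, not part of PhysiClaw.

```shell
# Poll a URL once per second until it answers or the attempts run out.
# (wait_for_server is a hypothetical helper, not a PhysiClaw command.)
wait_for_server() {
  url=$1
  attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Without -f, curl exits 0 on any HTTP response, including error statuses,
    # so success here only means that something is listening and replying.
    if curl -s -o /dev/null --max-time 2 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, `wait_for_server http://localhost:8048/api/bridge/qr && echo "server is up"` would block for up to about 30 seconds before giving up.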

PhysiClaw speaks MCP (Model Context Protocol) — the standard way an AI agent calls external tools. Any MCP client (Claude Desktop, an IDE, your own script) connects to the same endpoint:

http://localhost:8048/mcp

Once calibrated, the client will see PhysiClaw’s tools (peek, tap, swipe, and the rest) and can start operating the phone. The First task guide wires up a client and runs a task end to end.
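
To see what a client's first message to that endpoint might look like, the sketch below posts a JSON-RPC initialize request with curl. The request shape comes from the MCP specification. It assumes the server uses MCP's Streamable HTTP transport, which this page does not confirm. The function names, the protocol version string, and the client name "probe" are placeholders chosen for illustration.

```shell
# Build a JSON-RPC "initialize" request body as defined by the MCP specification.
# The client name and version are placeholders.
mcp_initialize_body() {
  printf '%s' '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"probe","version":"0.0.1"}}}'
}

# POST the request to an MCP endpoint and print whatever comes back.
# Assumption: the endpoint accepts the Streamable HTTP transport.
mcp_probe() {
  url=${1:-http://localhost:8048/mcp}
  curl -sS --max-time 5 -X POST "$url" \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json, text/event-stream' \
    -d "$(mcp_initialize_body)"
}
```

With the server running, calling `mcp_probe` should print a JSON-RPC reply if the transport assumption holds. A real MCP client handles this handshake for you, so this is only a connectivity check.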

PhysiClaw auto-detects the arm’s USB serial port, so you rarely need its name. When a troubleshooting step does ask for it, the format differs by OS:

On macOS, the port looks like /dev/tty.usbserial-XXXX (or /dev/tty.usbmodemXXXX). List candidates with ls /dev/tty.usb*.
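
When nothing is plugged in, ls /dev/tty.usb* reports an error instead of an empty list. The sketch below is a quieter variant for scripts run with bash or sh. The function name list_usb_serial_ports and its optional directory argument are illustrative and not part of PhysiClaw.

```shell
# Print USB serial device candidates under a directory (default: /dev), one per line.
# (list_usb_serial_ports is a hypothetical helper, not a PhysiClaw command.)
list_usb_serial_ports() {
  dir=${1:-/dev}
  for port in "$dir"/tty.usb*; do
    # In bash and sh an unmatched glob stays literal, so skip paths that do not exist.
    [ -e "$port" ] || continue
    echo "$port"
  done
}

list_usb_serial_ports
```

An empty result means no matching device is visible, which usually points to a cable or driver issue rather than a naming problem.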