The shortest path from a sentence to a real iPhone tap
Skip XCUITest, simulators, and JSON-RPC plumbing. Plug in a Claude or GPT key, hand the agent a goal, and watch it work the real device.
Inside the loop
One sentence in, a dozen tool calls out
The agent narrates its plan, dispatches OCR, taps, and screenshots until the goal is met — every call audited end to end.
chat / claude-sonnet-4 running
U
Find the latest unread DM in WeChat from the marketing channel and summarize it.
AI
Launching WeChat, then I'll scan the chat list for the marketing channel and look for the unread badge.
launch_app
bundle_id: "com.tencent.xin"
→ ok
find_text_on_screen
text: "marketing"
→ match at (210, 384)
tap_by_text
text: "marketing"
describe_screen
→ { "unread": "Q3 plan review at 3pm…" }
AI
The latest message is a meeting reminder for the Q3 plan review at 3pm today — no action required from you.
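The transcript above boils down to a loop: pick a tool, dispatch it, record the result, repeat until the goal is met. Here is a minimal Python sketch of that loop, under loud assumptions: `FakeDevice`, `run_plan`, and the canned return values are hypothetical stand-ins that only mirror the trace shown here, not the actual Agentfy client or its response format. A real agent would also choose each step from the previous result; this sketch replays a fixed plan.

```python
from typing import Any


class FakeDevice:
    """Hypothetical stand-in for a device; results are copied from the transcript."""

    def launch_app(self, bundle_id: str) -> str:
        return "ok"

    def find_text_on_screen(self, text: str) -> dict[str, Any]:
        return {"match": (210, 384)}

    def tap_by_text(self, text: str) -> str:
        # The transcript shows no result for this call; "ok" is an assumption.
        return "ok"

    def describe_screen(self) -> dict[str, str]:
        return {"unread": "Q3 plan review at 3pm…"}


def run_plan(device: Any, plan: list[tuple[str, dict[str, Any]]]) -> list[dict[str, Any]]:
    """Dispatch each (tool, args) step in order and keep an audit trail."""
    audit = []
    for tool, args in plan:
        handler = getattr(device, tool)  # look the tool up by name
        result = handler(**args)
        audit.append({"tool": tool, "args": args, "result": result})
    return audit


# The four calls from the transcript, in order.
PLAN = [
    ("launch_app", {"bundle_id": "com.tencent.xin"}),
    ("find_text_on_screen", {"text": "marketing"}),
    ("tap_by_text", {"text": "marketing"}),
    ("describe_screen", {}),
]

audit = run_plan(FakeDevice(), PLAN)
for entry in audit:
    print(entry["tool"], "->", entry["result"])
```

Keeping every call in one `audit` list is what makes the run reviewable afterwards: each entry carries the tool name, its arguments, and what came back.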
Architecture
From your editor's chat box to a real iPhone
The MCP client speaks stdio, the bridge translates to authenticated HTTPS, and the dashboard fans out over a reverse tunnel, with all four hops completing in under 200ms.
architecture
MCP client            Claude / GPT / any LLM
    │
    │  tool-call
    ▼
agentfy-mcp-server    ← 40+ device tools
    │
    │  HTTPS + X-API-Key
    ▼
app.agentfy.io        ← tenant scope, audit log
    │
    │  reverse tunnel
    ▼
real iPhones          ← 1 device, or 100
Tools: 40+
Latency: < 200ms
Per-tenant: isolated
Setup: 60s
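To make the middle hop concrete, here is a hedged sketch of how a bridge might wrap one tool call in an authenticated HTTPS request. Only the host `app.agentfy.io` and the `X-API-Key` header come from the diagram; the path `/api/tools/call`, the JSON body shape, and the helper `build_tool_request` are assumptions invented for illustration. The request is only constructed here, never sent.

```python
import json
import urllib.request

BASE_URL = "https://app.agentfy.io"
# Assumption: this path is hypothetical, not a documented endpoint.
HYPOTHETICAL_PATH = "/api/tools/call"


def build_tool_request(api_key: str, tool: str, arguments: dict) -> urllib.request.Request:
    """Build (but do not send) a POST carrying one tool call."""
    # Assumption: the body layout below is illustrative only.
    body = json.dumps({"tool": tool, "arguments": arguments}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + HYPOTHETICAL_PATH,
        data=body,
        method="POST",
        headers={
            "X-API-Key": api_key,  # per-tenant credential from the diagram
            "Content-Type": "application/json",
        },
    )


req = build_tool_request("YOUR_API_KEY", "launch_app", {"bundle_id": "com.tencent.xin"})
print(req.get_method(), req.full_url)
```

Putting the key in a header rather than the body keeps the credential separate from the tool arguments.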
Tool surface
40+ tools, exposed as first-class MCP calls
The same tools your scripts can call — the agent just happens to be the most general consumer.
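For the curious, this is roughly what a first-class MCP call looks like on the wire. MCP invokes tools with a JSON-RPC 2.0 `tools/call` request, and the stdio transport sends one JSON message per line. The sketch below builds such a message for `find_text_on_screen`, using the argument shown in the chat transcript; the helper name `tool_call` is hypothetical, and the initialization handshake an MCP session needs before any tool call is omitted. An MCP client normally builds all of this for you.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids, unique within this sketch


def tool_call(name: str, arguments: dict) -> str:
    """Serialize one MCP `tools/call` request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # json.dumps emits no raw newlines by default, so the result fits one line.
    return json.dumps(message)


line = tool_call("find_text_on_screen", {"text": "marketing"})
print(line)
```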
Device input
What the agent can do on the screen
tap tap_by_text swipe long_press text press_home press_lock
Perception
What the agent can see
screenshot describe_screen find_text_on_screen find_element_on_screen ocr
App control
Lifecycle and deep links
launch_app terminate_app get_foreground_app open_url list_apps
Sub-agents + AI
Hand off the messy bits
ai_takeover ai_solve_captcha ai_extract ai_classify
Network + state
Talk to the outside world
http extract jsonpath set log
Vault + clipboard
Secrets and host-device IO
${vault.X} read_clipboard write_clipboard paste_to_phone
Build with…
Bring your LLM key, bring an iPhone, start agenting
BYOK on every plan. 40+ MCP tools. 60 seconds to first call.
Start free trial