Protocol
PUNCH — Peer Unification via NAT Crossing Handshake
A signalling protocol for SRT session brokerage.
Overview
Punch is a signalling protocol that enables SRT endpoints to discover each other and negotiate connections without manual IP/port exchange. It is to SRT what WHIP is to WebRTC — a standardised session establishment mechanism over HTTP and WebSocket.
Punch does not modify or extend SRT itself. It operates alongside SRT, coordinating the out-of-band information exchange that SRT requires but does not define.
Transport
Punch uses two transport layers:
| Layer | Transport | Purpose |
|---|---|---|
| Session management | HTTP REST | Create, query, and close sessions |
| Peer signalling | WebSocket | Real-time peer discovery, health, tally |
Both layers share the same origin (punch.thåst.se). Authentication is via bearer tokens in the Authorization header.
Session lifecycle
States
| State | Description | Transitions to |
|---|---|---|
| WAITING | Session created, fewer than 2 peers registered | READY, CLOSED |
| READY | All required peer coordinates exchanged | CONNECTED, CLOSED |
| CONNECTED | Peers report active SRT connection | CLOSED |
| RELAYING | Direct hole punch failed, relay path active | CONNECTED, CLOSED |
| CLOSED | Session terminated (explicit or TTL expiry) | (none) |
Transitions
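The "Transitions to" column of the States table can be read as a lookup table. The following is a minimal Python sketch that assumes nothing beyond that column; the names `TRANSITIONS` and `can_transition` are illustrative and not part of Punch. The table lists no state that leads into RELAYING, so the sketch leaves that edge out rather than guessing where it originates.

```python
# Allowed transitions, copied from the "Transitions to" column of the States table.
TRANSITIONS = {
    "WAITING": {"READY", "CLOSED"},
    "READY": {"CONNECTED", "CLOSED"},
    "CONNECTED": {"CLOSED"},
    "RELAYING": {"CONNECTED", "CLOSED"},
    "CLOSED": set(),  # terminal state
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the States table lists target as reachable from current."""
    if current not in TRANSITIONS:
        raise ValueError(f"unknown state: {current}")
    return target in TRANSITIONS[current]
```

A client could use such a check to reject out-of-order session updates, for example a CLOSED session that appears to move back to WAITING.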
HTTP API
Create session

    POST /api/session
    Content-Type: application/json

    {
      "name": "nab-floor-cam1",
      "streams": ["cam-wide", "cam-close"],
      "latency": 200,
      "ttl": 1800
    }

Response 201 Created:

    {
      "session": "nab-floor-cam1",
      "token": "p_eyJhbGciOiJIUzI1NiJ9...",
      "url": "https://punch.thåst.se/s/nab-floor-cam1",
      "qr": "https://punch.thåst.se/s/nab-floor-cam1/qr",
      "streams": ["cam-wide", "cam-close"],
      "latency": 200,
      "ttl": 1800,
      "state": "WAITING",
      "created": "2026-03-06T10:30:00Z"
    }

Fields:
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Human-readable session identifier (max 64 chars, URL-safe) |
| streams | string[] | No | Named stream slots (default: single unnamed stream) |
| latency | number | No | Recommended SRT latency in ms (default: 200) |
| ttl | number | No | Session timeout in seconds (default: 1800 = 30 min) |
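As an illustration of the field rules above, the following Python sketch builds a request body for POST /api/session with the documented defaults. The function name `build_create_body` is hypothetical, and the reading of "URL-safe" as the RFC 3986 unreserved characters is an assumption; the source does not define the exact character set.

```python
import json
import re

# Assumption: "URL-safe" means RFC 3986 unreserved characters.
# The 64-character limit comes from the Fields table.
_NAME_RE = re.compile(r"[A-Za-z0-9._~-]{1,64}")


def build_create_body(name, streams=None, latency=200, ttl=1800):
    """Return the JSON body for POST /api/session using the documented defaults."""
    if not _NAME_RE.fullmatch(name):
        raise ValueError("name must be 1-64 URL-safe characters")
    body = {"name": name, "latency": latency, "ttl": ttl}
    if streams:
        # Omitting "streams" requests the default single unnamed stream.
        body["streams"] = list(streams)
    return json.dumps(body)
```

Calling `build_create_body("nab-floor-cam1", ["cam-wide", "cam-close"])` produces a body equivalent to the request example above.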
Get session

    GET /api/session/:name
    Authorization: Bearer p_eyJ...

Response 200 OK:

    {
      "session": "nab-floor-cam1",
      "state": "CONNECTED",
      "streams": {
        "cam-wide": {
          "state": "connected",
          "peer_a": { "ip": "203.0.113.5", "port": 9000, "label": "CAM-1" },
          "peer_b": { "ip": "198.51.100.3", "port": 9000, "label": "MCR-IN-1" },
          "health": { "rtt": 45, "retransmit": 0.2, "bitrate": 8500000 }
        },
        "cam-close": {
          "state": "waiting",
          "peer_a": null,
          "peer_b": null,
          "health": null
        }
      },
      "latency": 200,
      "created": "2026-03-06T10:30:00Z",
      "expiresAt": 1741256400000
    }

Close session

    DELETE /api/session/:name
    Authorization: Bearer p_eyJ...

Response 204 No Content
All connected peers receive a session_closed WebSocket message.
Generate connection strings

    GET /api/session/:name/connect?stream=cam-wide&role=sender
    Authorization: Bearer p_eyJ...

Query parameters:
| Parameter | Required | Description |
|---|---|---|
| stream | No | Stream identifier; defaults to the session’s first stream. |
| role | No | Local SRT role (sender or receiver); affects format-specific output. Defaults to sender. |
Response 200 OK:

    {
      "stream": "cam-wide",
      "role": "sender",
      "for_peer_a": {
        "ffmpeg": "ffmpeg -re -i input.ts -c copy -f mpegts \"srt://198.51.100.3:38501?mode=rendezvous&latency=200000&passphrase=xK9mQ2...&pkt_size=1316&connect_timeout=120000\"",
        "obs": "srt://198.51.100.3:38501?mode=rendezvous&localip=0.0.0.0&localport=38501&latency=200000&passphrase=xK9mQ2...&connect_timeout=120000",
        "vmix": "vMix SRT settings (enter manually in the GUI):\n Hostname: 198.51.100.3\n Port: 38501\n Local Port: 38501\n Latency: 200 ms\n Passphrase: xK9mQ2...\n Mode: Rendezvous\n Key Length: 0 (auto)",
        "gstreamer": "srtsink uri=srt://198.51.100.3:38501 mode=rendezvous latency=200 passphrase=\"xK9mQ2...\" connect-timeout=120000",
        "srt-live-transmit": "srt-live-transmit \"udp://:5000\" \"srt://198.51.100.3:38501?mode=rendezvous&latency=200000&passphrase=xK9mQ2...&pkt_size=1316&connect_timeout=120000\""
      },
      "for_peer_b": {
        "ffmpeg": "...",
        "obs": "...",
        "vmix": "...",
        "gstreamer": "...",
        "srt-live-transmit": "..."
      }
    }

The response carries two parallel format maps: for_peer_a is the set of strings peer A should run (targeting peer B’s endpoint), and for_peer_b is the mirror. Both maps cover every supported encoder format. Producers typically dispatch one map to each operator.
Supported formats: ffmpeg, obs, vmix, gstreamer, srt-live-transmit
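To illustrate how a dashboard might hand each operator the right string, here is a small Python sketch that selects one entry from a parsed connect response. The helper name `connection_string` and the "a"/"b" shorthand for the two maps are illustrative choices, not part of the API.

```python
SUPPORTED_FORMATS = ("ffmpeg", "obs", "vmix", "gstreamer", "srt-live-transmit")


def connection_string(response: dict, peer: str, fmt: str) -> str:
    """Return the string that peer "a" or "b" should use for the given format."""
    if peer not in ("a", "b"):
        raise ValueError("peer must be 'a' or 'b'")
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    # The response keys are for_peer_a and for_peer_b, each a map of format to string.
    return response[f"for_peer_{peer}"][fmt]
```

With the response above parsed into `resp`, `connection_string(resp, "a", "obs")` would return the OBS URL intended for peer A.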
WebSocket API
Connection

    wss://punch.thåst.se/api/ws/:session_name
    Authorization: Bearer p_eyJ...

Or via query parameter:

    wss://punch.thåst.se/api/ws/:session_name?token=p_eyJ...

Messages: Peer → Server
register
Register this peer’s SRT endpoint information.

    {
      "type": "register",
      "stream": "cam-wide",
      "port": 9000,
      "role": "sender",
      "meta": {
        "label": "CAM-1",
        "audio": "stereo",
        "codec": "h264",
        "resolution": "1920x1080",
        "framerate": 50
      }
    }

| Field | Type | Required | Description |
|---|---|---|---|
| stream | string | No | Stream slot to claim (omit for single-stream sessions) |
| port | number | Yes | Local SRT port (1024-65535) |
| role | string | No | sender or receiver (informational) |
| meta | object | No | Arbitrary metadata about this endpoint |
The server extracts the peer’s public IP from the connection. The peer does not need to know its own public IP.
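A client-side sketch of assembling the register message, applying the constraints from the field table (port range, optional fields). The function name `build_register` is illustrative. In line with the note above, the message carries no IP address.

```python
import json


def build_register(port, stream=None, role=None, meta=None):
    """Serialise a register message; only port is required."""
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(port, bool) or not isinstance(port, int) or not 1024 <= port <= 65535:
        raise ValueError("port must be an integer in 1024-65535")
    if role is not None and role not in ("sender", "receiver"):
        raise ValueError("role must be 'sender' or 'receiver'")
    msg = {"type": "register", "port": port}
    # Optional fields are left out entirely when not supplied.
    if stream is not None:
        msg["stream"] = stream
    if role is not None:
        msg["role"] = role
    if meta is not None:
        msg["meta"] = meta
    return json.dumps(msg)
```

For a single-stream session, `build_register(9000)` is sufficient.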
health
Report SRT connection health metrics.

    {
      "type": "health",
      "stream": "cam-wide",
      "rtt": 45,
      "retransmit": 0.2,
      "bitrate": 8500000,
      "dropped": 0,
      "buffer": 85
    }

| Field | Type | Description |
|---|---|---|
| rtt | number | Round-trip time in ms |
| retransmit | number | Retransmit ratio (percentage) |
| bitrate | number | Current bitrate in bits/second |
| dropped | number | Dropped packets since last report |
| buffer | number | Receive buffer fullness (percentage) |
Recommended reporting interval: 1-5 seconds.
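The sketch below shows one way a peer might turn raw counters into a health message. It assumes the retransmit percentage is retransmitted packets divided by packets sent over the reporting interval; the source only states that the field is a percentage. The function and parameter names are illustrative.

```python
import json


def build_health(stream, rtt_ms, sent_pkts, retrans_pkts, bitrate_bps, dropped, buffer_pct):
    """Serialise a health message from counters collected since the last report."""
    # Assumption: retransmit = retransmitted / sent * 100, rounded to 2 decimals.
    retransmit = round(100.0 * retrans_pkts / sent_pkts, 2) if sent_pkts else 0.0
    return json.dumps({
        "type": "health",
        "stream": stream,
        "rtt": rtt_ms,
        "retransmit": retransmit,
        "bitrate": bitrate_bps,
        "dropped": dropped,
        "buffer": buffer_pct,
    })
```

A reporter would call this on a timer within the recommended 1-5 second interval and send the result over the WebSocket.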
status
Report SRT connection state changes.

    { "type": "status", "stream": "cam-wide", "srt": "connected" }

Values for srt: connecting, connected, disconnected, error
tally
Set tally state for this stream (typically sent by a vision mixer/switcher integration).

    { "type": "tally", "stream": "cam-wide", "state": "program" }

Values for state: off, preview, program
ready
Signal that this peer has prepared its SRT process and is ready to begin rendezvous. The server emits a coordinated start (see below) to both peers once both sides have signalled ready.

    { "type": "ready", "stream": "cam-wide" }

Messages: Server → Peer
peer
Peer coordinate exchange — sent when the opposing peer registers.

    {
      "type": "peer",
      "stream": "cam-wide",
      "ip": "198.51.100.3",
      "port": 38501,
      "localPort": 38501,
      "passphrase": "xK9mQ2vL8nRp3wYh",
      "latency": 200,
      "meta": { "label": "MCR-IN-1" }
    }

| Field | Type | Description |
|---|---|---|
| ip | string | Remote peer’s reported public IP. |
| port | number | Remote peer’s UDP port. |
| localPort | number | The local UDP port this peer must bind to for SRT rendezvous. |
| passphrase | string | Shared SRT passphrase (≥ 10 chars). |
| latency | number | Suggested SRT latency in milliseconds (20–8000). |
| meta | object | Opaque metadata advertised by the remote peer. |
Upon receiving this message, the peer should initiate SRT in rendezvous mode to the provided ip:port, binding locally to localPort, with the given passphrase and latency. Both peers in a stream are assigned the same localPort — this symmetric port assignment maximises the range of NATs through which SRT rendezvous can complete.
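As a rough illustration, a peer could turn the fields of a peer message into an srt:// URL. The shape below mirrors the obs sample string in the connect response, including the latency value scaled by 1000 (the samples show latency=200000 for a 200 ms session). Whether a particular SRT tool accepts these query parameters or units is an assumption carried over from those samples, and the function name `rendezvous_url` is hypothetical.

```python
from urllib.parse import urlencode


def rendezvous_url(peer_msg: dict) -> str:
    """Build an srt:// rendezvous URL from a parsed peer message (IPv4 only)."""
    query = urlencode({
        "mode": "rendezvous",
        # Bind locally to the port assigned by the server.
        "localport": peer_msg["localPort"],
        # Assumption: scaled by 1000 to match the sample connection strings.
        "latency": peer_msg["latency"] * 1000,
        "passphrase": peer_msg["passphrase"],
    })
    return f"srt://{peer_msg['ip']}:{peer_msg['port']}?{query}"
```

`urlencode` percent-encodes the passphrase, which matters if it ever contains characters such as `&` or `=`.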
peer_match
Sent only to admin observer connections when both peers in a stream are matched. Admins use this to render for_peer_a / for_peer_b connection strings.

    {
      "type": "peer_match",
      "stream": "cam-wide",
      "peer_a": { "ip": "203.0.113.5", "port": 38501 },
      "peer_b": { "ip": "198.51.100.3", "port": 38501 },
      "localPort": 38501,
      "passphrase": "xK9mQ2vL8nRp3wYh",
      "latency": 200
    }

start
Coordinated go-signal — sent to both peers in a stream once both have transmitted ready. Peers should begin SRT rendezvous immediately on receipt.

    { "type": "start", "stream": "cam-wide" }

session
Session state update — broadcast to all connected peers.

    {
      "type": "session",
      "state": "CONNECTED",
      "streams": {
        "cam-wide": { "state": "connected", "peers": 2 },
        "cam-close": { "state": "waiting", "peers": 0 }
      },
      "connected": 1,
      "total": 2
    }

tally
Tally state update — broadcast to all peers of the affected stream.

    { "type": "tally", "stream": "cam-wide", "state": "program" }

health_report
Aggregated health for dashboard clients (sent to admin-role connections).

    {
      "type": "health_report",
      "streams": {
        "cam-wide": { "rtt": 45, "retransmit": 0.2, "bitrate": 8500000, "status": "healthy" }
      }
    }

Health status thresholds:
| Status | RTT | Retransmit | Condition |
|---|---|---|---|
| healthy | <100ms | <1% | All nominal |
| warning | 100-200ms | 1-5% | Degraded but functional |
| critical | >200ms | >5% | Likely visible artefacts |
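The threshold table can be expressed as a small classifier. The table does not say how the two metrics combine, so this sketch assumes the worse of the two decides the status; the function name `classify_health` is illustrative.

```python
def classify_health(rtt_ms: float, retransmit_pct: float) -> str:
    """Map RTT and retransmit percentage to healthy, warning, or critical."""

    def level(value, warn_at, crit_above):
        # 0 = healthy, 1 = warning, 2 = critical for a single metric.
        if value > crit_above:
            return 2
        if value >= warn_at:
            return 1
        return 0

    # Assumption: the overall status is the worse of the two per-metric levels.
    worst = max(level(rtt_ms, 100, 200), level(retransmit_pct, 1, 5))
    return ("healthy", "warning", "critical")[worst]
```

The boundary values follow the table: exactly 200 ms or exactly 5% still counts as warning, and anything above is critical.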
error
Error notification.

    { "type": "error", "code": "SESSION_FULL", "message": "All stream slots are occupied" }

Error codes (WebSocket):
| Code | Description |
|---|---|
| SESSION_FULL | All stream slots occupied |
| SESSION_EXPIRED | Session TTL exceeded |
| STREAM_TAKEN | Requested stream slot already claimed |
| AUTH_FAILED | Token not valid for this stream or operation |
| INVALID_MESSAGE | Malformed WebSocket message or rate-limit exceeded |
The HTTP error surface adds SESSION_EXISTS, RATE_LIMITED, TURNSTILE_FAILED, and SERVER_ERROR. The full enumeration is normative in spec/openapi.yaml.
session_closed
Sent to all peers when a session is closed (by admin or TTL).

    { "type": "session_closed", "reason": "admin_closed" }

Reasons: admin_closed, ttl_expired, error
Authentication
Token format
Tokens are HMAC-SHA256 signed JSON payloads, base64url-encoded, prefixed with p_:

    p_eyJzZXNzaW9uIjoibmFiLWZsb29yLWNhbTEiLCJyb2xlIjoiYWRtaW4iLCJleHAiOjE3NDEyMzQ1Njd9.signature

Token claims

    { "session": "nab-floor-cam1", "role": "admin", "stream": null, "exp": 1741234567 }

| Claim | Type | Description |
|---|---|---|
| session | string | Session this token grants access to |
| role | string | admin (full control) or peer (stream-scoped) |
| stream | string? | Specific stream slot (null = all streams) |
| exp | number | Expiry timestamp (Unix seconds) |
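The following Python sketch shows how a token of this shape could be signed and verified. The source fixes only the p_ prefix, the base64url-encoded JSON payload, and HMAC-SHA256. The sketch additionally assumes that the signature is the unpadded base64url encoding of the HMAC computed over the encoded payload, and that the secret is a server-side byte string; both are assumptions, as are the function names.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that unpadded base64url strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    payload = _b64url_encode(json.dumps(claims, separators=(",", ":")).encode("utf-8"))
    # Assumption: the HMAC covers the encoded payload string.
    sig = hmac.new(secret, payload.encode("ascii"), hashlib.sha256).digest()
    return f"p_{payload}.{_b64url_encode(sig)}"


def verify_token(token: str, secret: bytes, now=None) -> dict:
    """Return the claims if the signature matches and exp is in the future."""
    if not token.startswith("p_") or token.count(".") != 1:
        raise ValueError("malformed token")
    payload, sig = token[2:].split(".")
    expected = hmac.new(secret, payload.encode("ascii"), hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(sig)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload))
    if claims["exp"] <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims
```

The signature is checked before the payload is parsed, so unauthenticated JSON never reaches the claims logic.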
Roles
| Role | Create session | Close session | Register peer | View health | Set tally |
|---|---|---|---|---|---|
| admin | Yes | Yes | Yes | Yes | Yes |
| peer | No | No | Own stream only | Own stream | No |
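The roles table can be sketched as a permission check over token claims. The action names (`create_session`, `set_tally`, and so on) and the function `is_allowed` are hypothetical labels for the table's columns, not identifiers defined by Punch.

```python
# Columns of the roles table, keyed by role. Action names are illustrative.
PERMISSIONS = {
    "admin": {"create_session", "close_session", "register_peer", "view_health", "set_tally"},
    "peer": {"register_peer", "view_health"},
}


def is_allowed(claims: dict, action: str, stream=None) -> bool:
    """Check an action against the role and stream scope carried in the claims."""
    role = claims.get("role")
    if action not in PERMISSIONS.get(role, set()):
        return False
    if role == "peer":
        # Peer tokens are stream-scoped; a null stream claim covers all streams.
        own = claims.get("stream")
        return own is None or own == stream
    return True
```

For example, a peer token scoped to cam-wide passes `is_allowed(claims, "register_peer", "cam-wide")` but fails the same check for cam-close.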
Passphrase distribution
When a session is created, Punch generates a cryptographically random AES passphrase (32 characters). This passphrase is:
- Stored in the Durable Object (never in KV or external storage)
- Distributed to peers via the peer WebSocket message (WSS)
- Embedded in the connection strings returned by GET /api/session/:name/connect to admin tokens, so the dashboard can render copy-paste FFmpeg/OBS strings (HTTPS, TLS 1.3)
- Used by both sides for SRT AES-128 or AES-256 encryption
Both delivery paths are TLS-protected and require an authenticated bearer token. Punch never logs the passphrase, and it is not exposed to any unauthenticated request. This eliminates the need to exchange passphrases out-of-band.
Rate limits
| Endpoint | Limit | Window |
|---|---|---|
| POST /api/session | 10 | per minute |
| GET /api/session/:name | 60 | per minute |
| WebSocket messages | 100 | per second per connection |
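The source does not say which algorithm enforces these limits. As one possibility, the sketch below uses a fixed-window counter keyed by client; the class name, the key scheme, and the injectable clock are all illustrative assumptions.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` events per `window_s` seconds for each key."""

    def __init__(self, limit: int, window_s: float, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self._windows = {}  # key -> (window index, count in that window)

    def allow(self, key: str) -> bool:
        index = int(self.clock() // self.window_s)
        window, count = self._windows.get(key, (index, 0))
        if window != index:
            # A new window has started, so the counter resets.
            window, count = index, 0
        if count >= self.limit:
            self._windows[key] = (window, count)
            return False
        self._windows[key] = (window, count + 1)
        return True
```

`FixedWindowLimiter(10, 60)` would correspond to the POST /api/session row and `FixedWindowLimiter(100, 1)` to the per-connection WebSocket row. A fixed window permits short bursts at window boundaries; a sliding window or token bucket would smooth that out at the cost of more state.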
Versioning
The protocol version is communicated via the Punch-Version response header:

    Punch-Version: 1.0

WebSocket messages may include a v field for forward compatibility:

    { "type": "register", "v": 1, "port": 9000 }

Servers must ignore unknown fields. Clients must handle unknown message types gracefully.
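One way a client might satisfy the forward-compatibility rules is a dispatcher that skips message types it does not recognise, with handlers that read only the fields they know. This is a minimal sketch; the name `dispatch` and the handler-map shape are illustrative.

```python
import json


def dispatch(raw: str, handlers: dict):
    """Route a WebSocket text frame to a handler; return None for unknown types."""
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        return None
    handler = handlers.get(msg.get("type"))
    if handler is None:
        # Unknown message type: skip it rather than treating it as an error.
        return None
    # Handlers pick out the fields they need, so extra fields pass through harmlessly.
    return handler(msg)
```

For instance, `dispatch('{"type": "start", "stream": "cam-wide", "v": 2}', {"start": lambda m: m["stream"]})` returns `"cam-wide"` even though the handler knows nothing about `v`.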