Protocol

PUNCH — Peer Unification via NAT Crossing Handshake

A signalling protocol for SRT session brokerage.


Overview

Punch is a signalling protocol that enables SRT endpoints to discover each other and negotiate connections without manual IP/port exchange. It is to SRT what WHIP is to WebRTC — a standardised session establishment mechanism over HTTP and WebSocket.

Punch does not modify or extend SRT itself. It operates alongside SRT, coordinating the out-of-band information exchange that SRT requires but does not define.

Transport

Punch uses two transport layers:

| Layer | Transport | Purpose |
| --- | --- | --- |
| Session management | HTTP REST | Create, query, and close sessions |
| Peer signalling | WebSocket | Real-time peer discovery, health, tally |

Both layers share the same origin (punch.thåst.se). Authentication is via bearer tokens in the Authorization header.

Session lifecycle

States

| State | Description | Transitions to |
| --- | --- | --- |
| WAITING | Session created, fewer than 2 peers registered | READY, CLOSED |
| READY | All required peer coordinates exchanged | CONNECTED, CLOSED |
| CONNECTED | Peers report active SRT connection | CLOSED |
| RELAYING | Direct hole punch failed, relay path active | CONNECTED, CLOSED |
| CLOSED | Session terminated (explicit or TTL expiry) | none |

Transitions

  - create(): → WAITING
  - register(peer_2), all required peers present: WAITING → READY
  - srt_connected(), any peer reports SRT up: READY → CONNECTED
  - hole_punch_fail(), future relay fallback: → RELAYING
  - close() / TTL: WAITING, READY, CONNECTED, or RELAYING → CLOSED
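
The lifecycle can be sketched as a small lookup table. This is only an illustration built from the "Transitions to" column of the States table; the names `State`, `ALLOWED`, and `transition` are hypothetical and not part of Punch. Because that column lists no state that leads into RELAYING (the relay fallback is marked as future), RELAYING is unreachable in this sketch.

```python
from enum import Enum


class State(Enum):
    WAITING = "WAITING"
    READY = "READY"
    CONNECTED = "CONNECTED"
    RELAYING = "RELAYING"
    CLOSED = "CLOSED"


# Allowed next states, copied from the "Transitions to" column of the States table.
ALLOWED = {
    State.WAITING: {State.READY, State.CLOSED},
    State.READY: {State.CONNECTED, State.CLOSED},
    State.CONNECTED: {State.CLOSED},
    State.RELAYING: {State.CONNECTED, State.CLOSED},
    State.CLOSED: set(),  # terminal state
}


def transition(current: State, target: State) -> State:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

A client that mirrors session state locally could call `transition(State.WAITING, State.READY)` when a `session` update arrives and treat a `ValueError` as a sign that it missed a message.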

HTTP API

Create session

POST /api/session
Content-Type: application/json

{
  "name": "nab-floor-cam1",
  "streams": ["cam-wide", "cam-close"],
  "latency": 200,
  "ttl": 1800
}

Response 201 Created:

{
  "session": "nab-floor-cam1",
  "token": "p_eyJhbGciOiJIUzI1NiJ9...",
  "url": "https://punch.thåst.se/s/nab-floor-cam1",
  "qr": "https://punch.thåst.se/s/nab-floor-cam1/qr",
  "streams": ["cam-wide", "cam-close"],
  "latency": 200,
  "ttl": 1800,
  "state": "WAITING",
  "created": "2026-03-06T10:30:00Z"
}

Fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Human-readable session identifier (max 64 chars, URL-safe) |
| streams | string[] | No | Named stream slots (default: single unnamed stream) |
| latency | number | No | Recommended SRT latency in ms (default: 200) |
| ttl | number | No | Session timeout in seconds (default: 1800 = 30 min) |
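
As a rough illustration of the fields above, a client could assemble the request body like this. The helper name `build_create_request` is hypothetical, and the character set accepted for `name` is an assumption: the spec only says "URL-safe", so this sketch allows ASCII letters, digits, `-`, and `_`.

```python
import json
import re

# Assumption: "URL-safe" means ASCII letters, digits, "-" and "_".
# The spec does not spell out the exact character set.
NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def build_create_request(name, streams=None, latency=200, ttl=1800):
    """Build the JSON body for POST /api/session, using the documented defaults."""
    if not NAME_RE.fullmatch(name):
        raise ValueError("name must be 1-64 URL-safe characters")
    body = {"name": name, "latency": latency, "ttl": ttl}
    if streams:
        # Omitting the field asks for the default single unnamed stream.
        body["streams"] = list(streams)
    return json.dumps(body)
```

Calling `build_create_request("nab-floor-cam1", ["cam-wide", "cam-close"])` yields a body equivalent to the request shown above.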

Get session

GET /api/session/:name
Authorization: Bearer p_eyJ...

Response 200 OK:

{
  "session": "nab-floor-cam1",
  "state": "CONNECTED",
  "streams": {
    "cam-wide": {
      "state": "connected",
      "peer_a": { "ip": "203.0.113.5", "port": 9000, "label": "CAM-1" },
      "peer_b": { "ip": "198.51.100.3", "port": 9000, "label": "MCR-IN-1" },
      "health": { "rtt": 45, "retransmit": 0.2, "bitrate": 8500000 }
    },
    "cam-close": {
      "state": "waiting",
      "peer_a": null,
      "peer_b": null,
      "health": null
    }
  },
  "latency": 200,
  "created": "2026-03-06T10:30:00Z",
  "expiresAt": 1741256400000
}

Close session

DELETE /api/session/:name
Authorization: Bearer p_eyJ...

Response 204 No Content

All connected peers receive a session_closed WebSocket message.

Generate connection strings

GET /api/session/:name/connect?stream=cam-wide&role=sender
Authorization: Bearer p_eyJ...

Query parameters:

| Parameter | Required | Description |
| --- | --- | --- |
| stream | No | Stream identifier; defaults to the session’s first stream. |
| role | No | Local SRT role (sender or receiver); affects format-specific output. Defaults to sender. |

Response 200 OK:

{
  "stream": "cam-wide",
  "role": "sender",
  "for_peer_a": {
    "ffmpeg": "ffmpeg -re -i input.ts -c copy -f mpegts \"srt://198.51.100.3:38501?mode=rendezvous&latency=200000&passphrase=xK9mQ2...&pkt_size=1316&connect_timeout=120000\"",
    "obs": "srt://198.51.100.3:38501?mode=rendezvous&localip=0.0.0.0&localport=38501&latency=200000&passphrase=xK9mQ2...&connect_timeout=120000",
    "vmix": "vMix SRT settings (enter manually in the GUI):\n Hostname: 198.51.100.3\n Port: 38501\n Local Port: 38501\n Latency: 200 ms\n Passphrase: xK9mQ2...\n Mode: Rendezvous\n Key Length: 0 (auto)",
    "gstreamer": "srtsink uri=srt://198.51.100.3:38501 mode=rendezvous latency=200 passphrase=\"xK9mQ2...\" connect-timeout=120000",
    "srt-live-transmit": "srt-live-transmit \"udp://:5000\" \"srt://198.51.100.3:38501?mode=rendezvous&latency=200000&passphrase=xK9mQ2...&pkt_size=1316&connect_timeout=120000\""
  },
  "for_peer_b": {
    "ffmpeg": "...",
    "obs": "...",
    "vmix": "...",
    "gstreamer": "...",
    "srt-live-transmit": "..."
  }
}

The response carries two parallel format maps: for_peer_a is the set of strings peer A should run (targeting peer B’s endpoint), and for_peer_b is the mirror. Both maps cover every supported encoder format. Producers typically dispatch one map to each operator.

Supported formats: ffmpeg, obs, vmix, gstreamer, srt-live-transmit
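
A minimal sketch of how a dashboard might build the request path and then pick one string out of the parsed response. The helper names `connect_path` and `command_for` are hypothetical; only the endpoint, the query parameters, and the `for_peer_a` / `for_peer_b` keys come from the description above.

```python
from urllib.parse import quote, urlencode

SUPPORTED_FORMATS = ("ffmpeg", "obs", "vmix", "gstreamer", "srt-live-transmit")


def connect_path(session, stream=None, role="sender"):
    """Return the path and query string for the connect endpoint."""
    if role not in ("sender", "receiver"):
        raise ValueError("role must be 'sender' or 'receiver'")
    params = {}
    if stream is not None:
        # When omitted, the server falls back to the session's first stream.
        params["stream"] = stream
    params["role"] = role
    return f"/api/session/{quote(session, safe='')}/connect?{urlencode(params)}"


def command_for(response, peer, fmt):
    """Pick one connection string from a parsed connect response."""
    if peer not in ("a", "b"):
        raise ValueError("peer must be 'a' or 'b'")
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    return response[f"for_peer_{peer}"][fmt]
```

For example, `command_for(parsed, "a", "obs")` returns the OBS URL that peer A should use, where `parsed` is the decoded JSON body of the response.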

WebSocket API

Connection

wss://punch.thåst.se/api/ws/:session_name
Authorization: Bearer p_eyJ...

Or via query parameter:

wss://punch.thåst.se/api/ws/:session_name?token=p_eyJ...

Messages: Peer → Server

register

Register this peer’s SRT endpoint information.

{
  "type": "register",
  "stream": "cam-wide",
  "port": 9000,
  "role": "sender",
  "meta": {
    "label": "CAM-1",
    "audio": "stereo",
    "codec": "h264",
    "resolution": "1920x1080",
    "framerate": 50
  }
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| stream | string | No | Stream slot to claim (omit for single-stream sessions) |
| port | number | Yes | Local SRT port (1024-65535) |
| role | string | No | sender or receiver (informational) |
| meta | object | No | Arbitrary metadata about this endpoint |

The server extracts the peer’s public IP from the connection. The peer does not need to know its own public IP.
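
The following sketch shows one way a peer could assemble a `register` message and reject an out-of-range port before sending. `build_register` is a hypothetical helper; the field names and the 1024-65535 range are taken from the table above.

```python
import json


def build_register(port, stream=None, role=None, meta=None):
    """Serialise a register message, leaving out optional fields that are unset."""
    if isinstance(port, bool) or not isinstance(port, int) or not 1024 <= port <= 65535:
        raise ValueError("port must be an integer in 1024-65535")
    if role not in (None, "sender", "receiver"):
        raise ValueError("role must be 'sender' or 'receiver'")
    msg = {"type": "register", "port": port}
    if stream is not None:
        msg["stream"] = stream  # omit for single-stream sessions
    if role is not None:
        msg["role"] = role
    if meta is not None:
        msg["meta"] = dict(meta)
    return json.dumps(msg)
```

No IP address appears in the message because the server derives it from the connection.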

health

Report SRT connection health metrics.

{
  "type": "health",
  "stream": "cam-wide",
  "rtt": 45,
  "retransmit": 0.2,
  "bitrate": 8500000,
  "dropped": 0,
  "buffer": 85
}

| Field | Type | Description |
| --- | --- | --- |
| rtt | number | Round-trip time in ms |
| retransmit | number | Retransmit ratio (percentage) |
| bitrate | number | Current bitrate in bits/second |
| dropped | number | Dropped packets since last report |
| buffer | number | Receive buffer fullness (percentage) |

Recommended reporting interval: 1-5 seconds.

status

Report SRT connection state changes.

{
  "type": "status",
  "stream": "cam-wide",
  "srt": "connected"
}

Values for srt: connecting, connected, disconnected, error

tally

Set tally state for this stream (typically sent by a vision mixer/switcher integration).

{
  "type": "tally",
  "stream": "cam-wide",
  "state": "program"
}

Values for state: off, preview, program

ready

Signal that this peer has prepared its SRT process and is ready to begin rendezvous. The server emits a coordinated start (see below) to both peers once both sides have signalled ready.

{
  "type": "ready",
  "stream": "cam-wide"
}

Messages: Server → Peer

peer

Peer coordinate exchange — sent when the opposing peer registers.

{
  "type": "peer",
  "stream": "cam-wide",
  "ip": "198.51.100.3",
  "port": 38501,
  "localPort": 38501,
  "passphrase": "xK9mQ2vL8nRp3wYh",
  "latency": 200,
  "meta": {
    "label": "MCR-IN-1"
  }
}

| Field | Type | Description |
| --- | --- | --- |
| ip | string | Remote peer’s reported public IP. |
| port | number | Remote peer’s UDP port. |
| localPort | number | The local UDP port this peer must bind to for SRT rendezvous. |
| passphrase | string | Shared SRT passphrase (≥ 10 chars). |
| latency | number | Suggested SRT latency in milliseconds (20–8000). |
| meta | object | Opaque metadata advertised by the remote peer. |

Upon receiving this message, the peer should initiate SRT in rendezvous mode to the provided ip:port, binding locally to localPort, with the given passphrase and latency. Both peers in a stream are assigned the same localPort — this symmetric port assignment maximises the range of NATs through which SRT rendezvous can complete.
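
To make that step concrete, here is a hedged sketch that turns a `peer` message into an SRT URI shaped like the "obs" connection string shown earlier. The function name `rendezvous_uri` is hypothetical. The millisecond-to-microsecond conversion is inferred from the documented ffmpeg and obs strings, which carry `latency=200000` for a 200 ms session; other tools may expect different parameter names or units.

```python
from urllib.parse import urlencode


def rendezvous_uri(peer):
    """Build an SRT rendezvous URI from a parsed `peer` message (IPv4 only)."""
    if len(peer["passphrase"]) < 10:
        raise ValueError("passphrase must be at least 10 characters")
    if not 20 <= peer["latency"] <= 8000:
        raise ValueError("latency must be 20-8000 ms")
    query = urlencode({
        "mode": "rendezvous",
        "localport": peer["localPort"],
        # Assumption: microseconds, matching the documented ffmpeg/obs examples.
        "latency": peer["latency"] * 1000,
        "passphrase": peer["passphrase"],
    })
    return f"srt://{peer['ip']}:{peer['port']}?{query}"
```

Feeding it the example message above produces `srt://198.51.100.3:38501?mode=rendezvous&localport=38501&latency=200000&passphrase=xK9mQ2vL8nRp3wYh`. An IPv6 address would need brackets around the host, which this sketch does not handle.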

peer_match

Sent only to admin observer connections when both peers in a stream are matched. Admins use this to render for_peer_a / for_peer_b connection strings.

{
  "type": "peer_match",
  "stream": "cam-wide",
  "peer_a": { "ip": "203.0.113.5", "port": 38501 },
  "peer_b": { "ip": "198.51.100.3", "port": 38501 },
  "localPort": 38501,
  "passphrase": "xK9mQ2vL8nRp3wYh",
  "latency": 200
}

start

Coordinated go-signal — sent to both peers in a stream once both have transmitted ready. Peers should begin SRT rendezvous immediately on receipt.

{
  "type": "start",
  "stream": "cam-wide"
}

session

Session state update — broadcast to all connected peers.

{
  "type": "session",
  "state": "CONNECTED",
  "streams": {
    "cam-wide": { "state": "connected", "peers": 2 },
    "cam-close": { "state": "waiting", "peers": 0 }
  },
  "connected": 1,
  "total": 2
}

tally

Tally state update — broadcast to all peers of the affected stream.

{
  "type": "tally",
  "stream": "cam-wide",
  "state": "program"
}

health_report

Aggregated health for dashboard clients (sent to admin-role connections).

{
  "type": "health_report",
  "streams": {
    "cam-wide": {
      "rtt": 45,
      "retransmit": 0.2,
      "bitrate": 8500000,
      "status": "healthy"
    }
  }
}

Health status thresholds:

| Status | RTT | Retransmit | Condition |
| --- | --- | --- | --- |
| healthy | <100 ms | <1% | All nominal |
| warning | 100-200 ms | 1-5% | Degraded but functional |
| critical | >200 ms | >5% | Likely visible artefacts |
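
The thresholds can be expressed as a small classifier. The table does not say how the RTT and retransmit columns combine, so this sketch assumes the worse of the two wins; `classify` is a hypothetical name, and the boundary values 100 ms / 200 ms and 1% / 5% are treated as part of the warning band, as the ranges in the table suggest.

```python
def classify(rtt_ms, retransmit_pct):
    """Map RTT (ms) and retransmit ratio (%) to healthy, warning, or critical."""

    def level(value, warn_at, crit_above):
        if value > crit_above:
            return 2
        if value >= warn_at:
            return 1
        return 0

    # Assumption: the overall status is the worse of the two metrics.
    worst = max(level(rtt_ms, 100, 200), level(retransmit_pct, 1, 5))
    return ("healthy", "warning", "critical")[worst]
```

With the sample report above, `classify(45, 0.2)` returns `"healthy"`.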

error

Error notification.

{
  "type": "error",
  "code": "SESSION_FULL",
  "message": "All stream slots are occupied"
}

Error codes (WebSocket):

| Code | Description |
| --- | --- |
| SESSION_FULL | All stream slots occupied |
| SESSION_EXPIRED | Session TTL exceeded |
| STREAM_TAKEN | Requested stream slot already claimed |
| AUTH_FAILED | Token not valid for this stream or operation |
| INVALID_MESSAGE | Malformed WebSocket message or rate-limit exceeded |

The HTTP error surface adds SESSION_EXISTS, RATE_LIMITED, TURNSTILE_FAILED, and SERVER_ERROR. The full enumeration is normative in spec/openapi.yaml.

session_closed

Sent to all peers when a session is closed (by admin or TTL).

{
  "type": "session_closed",
  "reason": "admin_closed"
}

Reasons: admin_closed, ttl_expired, error

Authentication

Token format

Tokens are HMAC-SHA256 signed JSON payloads, base64url-encoded, prefixed with p_:

p_eyJzZXNzaW9uIjoibmFiLWZsb29yLWNhbTEiLCJyb2xlIjoiYWRtaW4iLCJleHAiOjE3NDEyMzQ1Njd9.signature

Token claims

{
  "session": "nab-floor-cam1",
  "role": "admin",
  "stream": null,
  "exp": 1741234567
}

| Claim | Type | Description |
| --- | --- | --- |
| session | string | Session this token grants access to |
| role | string | admin (full control) or peer (stream-scoped) |
| stream | string? | Specific stream slot (null = all streams) |
| exp | number | Expiry timestamp (Unix seconds) |
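
A hedged sketch of handling tokens in this format. Reading the claims only requires stripping the `p_` prefix and base64url-decoding the segment before the dot. The signing side is an assumption: the spec does not state which bytes the HMAC covers or how the signature is encoded, so this example signs the encoded payload segment and base64url-encodes the MAC. All function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(segment):
    # Restore the padding that base64url tokens usually strip.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def decode_claims(token):
    """Decode the payload WITHOUT checking the signature (display use only)."""
    if not token.startswith("p_"):
        raise ValueError("missing p_ prefix")
    payload, _, _signature = token[2:].partition(".")
    return json.loads(_b64url_decode(payload))


def sign(claims, secret):
    """Mint a token. Assumption: the MAC covers the encoded payload segment."""
    payload = _b64url_encode(json.dumps(claims, separators=(",", ":")).encode())
    mac = hmac.new(secret, payload.encode("ascii"), hashlib.sha256).digest()
    return f"p_{payload}.{_b64url_encode(mac)}"


def verify(token, secret, now=None):
    """Return the claims if the signature matches and exp lies in the future."""
    claims = decode_claims(token)
    payload, _, signature = token[2:].partition(".")
    mac = hmac.new(secret, payload.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(_b64url_encode(mac).encode(), signature.encode()):
        raise ValueError("bad signature")
    if claims["exp"] <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims
```

Applied to the example token above, `decode_claims` returns the session `nab-floor-cam1`, the role `admin`, and the expiry `1741234567`.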

Roles

| Role | Create session | Close session | Register peer | View health | Set tally |
| --- | --- | --- | --- | --- | --- |
| admin | Yes | Yes | Yes | Yes | Yes |
| peer | No | No | Own stream only | Own stream | No |

Passphrase distribution

When a session is created, Punch generates a cryptographically random AES passphrase (32 characters). This passphrase is:

  1. Stored in the Durable Object (never in KV or external storage)
  2. Distributed to peers via the peer WebSocket message (WSS)
  3. Embedded in the connection strings that GET /api/session/:name/connect returns to holders of admin tokens, so the dashboard can render copy-paste FFmpeg/OBS strings (HTTPS, TLS 1.3)
  4. Used by both sides for SRT AES-128 or AES-256 encryption

Both delivery paths are TLS-protected and require an authenticated bearer token. Punch never logs the passphrase, and it is not exposed to any unauthenticated request. This eliminates the need to exchange passphrases out-of-band.
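
For illustration, a passphrase of this kind can be produced with Python's `secrets` module. The alphabet is an assumption (the spec only gives the length), and `generate_passphrase` is a hypothetical name. The 10-79 character bounds are the limits SRT itself places on passphrases.

```python
import secrets
import string

# Assumption: letters and digits only; the spec does not name the alphabet.
ALPHABET = string.ascii_letters + string.digits


def generate_passphrase(length=32):
    """Return a random passphrase drawn from a CSPRNG."""
    if not 10 <= length <= 79:
        raise ValueError("SRT passphrases must be 10-79 characters long")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

`secrets.choice` draws from the operating system's cryptographic random source, which is the property the first sentence of this section calls for.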

Rate limits

| Endpoint | Limit | Window |
| --- | --- | --- |
| POST /api/session | 10 | per minute |
| GET /api/session/:name | 60 | per minute |
| WebSocket messages | 100 | per second per connection |
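
A client can stay under these limits by throttling itself before sending. The sketch below uses a sliding window; how the server actually counts requests is not specified, so this is only a conservative client-side guard, and `SlidingWindowLimiter` is a hypothetical class.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` events in any `window_s`-second span."""

    def __init__(self, limit, window_s, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock  # injectable so the limiter can be tested
        self._sent = deque()

    def allow(self):
        now = self.clock()
        # Forget events that have left the window.
        while self._sent and now - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if len(self._sent) >= self.limit:
            return False
        self._sent.append(now)
        return True
```

For the WebSocket limit a peer might create `SlidingWindowLimiter(100, 1.0)` and skip or delay a health report whenever `allow()` returns `False`.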

Versioning

The protocol version is communicated via the Punch-Version response header:

Punch-Version: 1.0

WebSocket messages may include a v field for forward compatibility:

{ "type": "register", "v": 1, "port": 9000 }

Servers must ignore unknown fields. Clients must handle unknown message types gracefully.
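
One way a client could meet the second requirement is to route messages through a handler table and drop anything it does not recognise. This is a sketch under that assumption; `dispatch` and the handler mapping are hypothetical and not part of the protocol.

```python
import json


def dispatch(raw, handlers, on_unknown=None):
    """Route one WebSocket text frame to a handler keyed by its "type" field.

    Unknown or malformed messages are not treated as fatal: the function
    returns None, optionally calling on_unknown first.
    """
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(msg, dict):
        return None
    kind = msg.get("type")
    handler = handlers.get(kind) if isinstance(kind, str) else None
    if handler is None:
        if on_unknown is not None:
            on_unknown(msg)
        return None
    # Handlers read only the fields they know about; extra fields are ignored.
    return handler(msg)
```

A peer might register handlers for `peer`, `start`, `session`, `tally`, `error`, and `session_closed`, and log everything else through `on_unknown`.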