RFC
- Document: Punch Protocol Specification
- Version: 1.0.0
- Status: Stable
- Date: 2026-05-03
- Editor: The Thåst Collective
Abstract
This document specifies the Punch protocol, a session brokerage protocol for SRT (Secure Reliable Transport). Punch coordinates the out-of-band exchange that two SRT endpoints require but that SRT itself does not define: peer discovery, public-address resolution, passphrase distribution, and the synchronisation needed for SRT rendezvous-mode hole punching.
Punch is not a media transport. It does not carry, transcode, relay, or inspect SRT payloads. The signalling server brokers the introduction; once peers are matched, media flows directly between them.
Status of This Document
This is version 1.0.0 of the Punch protocol specification. Subsequent
backwards-compatible revisions will increment the minor version. Breaking
changes will increment the major version and be accompanied by a migration
note. The current version is implemented at https://punch.thåst.se.
The companion artefacts are:
- openapi.yaml — OpenAPI 3.1 description of the HTTP API
- punch-messages.schema.json — JSON Schema 2020-12 for WebSocket messages
- ../protocol.md — informal narrative reference
In the event of disagreement between this document and the companion files, the JSON Schema and OpenAPI document are normative for their respective surfaces; this document is normative for protocol semantics, state machines, and security considerations.
1. Conventions
The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC 2119 and RFC 8174 when, and only when, they appear in all capitals.
All time values are in milliseconds unless otherwise specified. All sizes are in octets unless otherwise specified.
JSON in this document follows RFC 8259. Field names are case-sensitive and MUST be encoded as UTF-8.
2. Terminology
- Session — A named brokerage instance. Identified by a session name matching [A-Za-z0-9][A-Za-z0-9._-]{0,63}.
- Stream — A logical media flow inside a session. A session MAY contain one or more streams. Each stream brokers exactly one pair of peers.
- Peer — An SRT endpoint participating in a stream. Peers are symmetric (Punch does not distinguish caller from listener), but each peer MAY advertise its intended SRT role (sender or receiver).
- Admin — The session creator. Holds an admin token granting full observer access (including peer_match events) and the ability to close the session.
- Producer — The human or automation that creates a session and distributes peer URLs. Holds the admin token.
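The session-name rule in the Session definition can be checked mechanically. The following is a minimal sketch, assuming the pattern is meant to match the whole name; the helper name `isValidSessionName` is hypothetical and not defined by this specification.

```typescript
// Pattern taken from the Session definition above; ^ and $ are added so the
// whole name must match, not just a substring.
const SESSION_NAME_PATTERN = /^[A-Za-z0-9][A-Za-z0-9._-]{0,63}$/;

// Hypothetical helper; the name is not defined by the specification.
function isValidSessionName(name: string): boolean {
  return SESSION_NAME_PATTERN.test(name);
}
```

Under this reading a name is 1 to 64 characters long and cannot begin with a dot, underscore, or hyphen.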
3. Architecture Overview
Punch never sees SRT media. Once peer coordinates are exchanged, the media path is direct peer-to-peer.
A producer creates a session via HTTP. Each peer subsequently opens a WebSocket to the session, registers a UDP port, and receives — when matched — the remote peer’s coordinates plus a shared passphrase. Peers then initiate SRT in rendezvous mode and the signalling layer is no longer involved in media transport.
4. Session Lifecycle
4.1 States
A session MUST be in exactly one of the following states:
| State | Meaning |
|---|---|
| WAITING | Session created; fewer than the required peers per stream are registered. |
| READY | All required peers for at least one stream have registered and have been matched. |
| CONNECTED | At least one peer has reported an active SRT connection (status: connected). |
| RELAYING | Reserved for future relay fallback. MUST NOT be entered by v1.0.0 implementations. |
| CLOSED | Session terminated, either by admin action or TTL expiry. Terminal state. |
4.2 Transitions
- created → WAITING
- all peers matched → READY (per-stream; session enters READY when any stream is matched)
- peer reports srt → CONNECTED
- admin DELETE → CLOSED (any state)
- ttl expiry → CLOSED (any state)

A session MUST NOT transition out of CLOSED. Implementations MUST
reject all subsequent operations on a closed session with 404 (over HTTP)
or close the WebSocket with code 4404 (see §8.2).
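The transitions above can be sketched as a pure function. This is an illustration under stated assumptions, not the reference implementation: the event names are hypothetical, and the sketch assumes that a connection report only moves a session from READY to CONNECTED, which the transition list does not state explicitly.

```typescript
type SessionState = "WAITING" | "READY" | "CONNECTED" | "RELAYING" | "CLOSED";

// Hypothetical event names, chosen for this sketch only.
type SessionEvent = "stream_matched" | "peer_connected" | "admin_delete" | "ttl_expiry";

function nextState(state: SessionState, event: SessionEvent): SessionState {
  // CLOSED is terminal: no event may leave it.
  if (state === "CLOSED") return "CLOSED";
  // Admin DELETE and TTL expiry close the session from any state.
  if (event === "admin_delete" || event === "ttl_expiry") return "CLOSED";
  // The session enters READY when any stream is matched.
  if (event === "stream_matched" && state === "WAITING") return "READY";
  // Assumption: a connection report is only meaningful once READY.
  if (event === "peer_connected" && state === "READY") return "CONNECTED";
  // Everything else leaves the state unchanged; RELAYING is never entered.
  return state;
}
```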
4.3 Time-to-Live
Every session has an absolute expiry time expiresAt, recorded as Unix
milliseconds at session creation. Implementations MUST close the session
no later than expiresAt. Default TTL is 1800 s; the producer MAY
request any value up to 86400 s.
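As a sketch of the TTL arithmetic, assuming the producer supplies the TTL in seconds and that out-of-range requests are rejected rather than clamped (this section does not say which), `expiresAt` could be derived as follows. The function and constant names are hypothetical.

```typescript
const DEFAULT_TTL_S = 1800;
const MAX_TTL_S = 86400;

// Returns the absolute expiry in Unix milliseconds.
// Assumption: invalid or oversized requests are rejected, not clamped.
function computeExpiresAt(nowMs: number, requestedTtlS?: number): number {
  const ttlS = requestedTtlS ?? DEFAULT_TTL_S;
  if (!Number.isFinite(ttlS) || ttlS <= 0 || ttlS > MAX_TTL_S) {
    throw new RangeError(`TTL must be between 1 and ${MAX_TTL_S} seconds`);
  }
  return nowMs + ttlS * 1000;
}

// A session is due for closure once the clock reaches expiresAt.
function isExpired(expiresAt: number, nowMs: number): boolean {
  return nowMs >= expiresAt;
}
```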
5. HTTP API
The full HTTP surface is defined in openapi.yaml. This section enumerates
normative requirements that are not expressible in OpenAPI alone.
5.1 Authentication
All HTTP endpoints under /api/session/{name} MUST require a bearer token in the Authorization header. POST /api/session, which creates the session, is exempt.
The token format is described in §7. Implementations MUST reject
requests with missing, malformed, or session-mismatched tokens with HTTP
401 and an Error body bearing code: AUTH_FAILED.
5.2 Rate limiting
Implementations MUST rate-limit POST /api/session per source IP. The
default is 10 requests per minute per IP. Implementations MAY rate-limit
other endpoints. Rate-limited requests MUST return 429 and SHOULD
include Retry-After.
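One simple way to meet the default limit is a fixed-window counter keyed by source IP. The sketch below is illustrative only; the specification does not prescribe an algorithm, and the class name and its return convention (seconds to wait, suitable for Retry-After) are assumptions.

```typescript
interface RateWindow {
  startMs: number;
  count: number;
}

class FixedWindowLimiter {
  private readonly windows = new Map<string, RateWindow>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number = 10, windowMs: number = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns 0 when the request is allowed, otherwise the number of whole
  // seconds the caller should wait (usable as a Retry-After value).
  check(ip: string, nowMs: number): number {
    const current = this.windows.get(ip);
    if (current === undefined || nowMs - current.startMs >= this.windowMs) {
      // First request from this IP, or the previous window has elapsed.
      this.windows.set(ip, { startMs: nowMs, count: 1 });
      return 0;
    }
    if (current.count < this.limit) {
      current.count += 1;
      return 0;
    }
    return Math.ceil((current.startMs + this.windowMs - nowMs) / 1000);
  }
}
```

A fixed window permits short bursts at window boundaries; a sliding window or token bucket would smooth that at the cost of more state.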
5.3 Idempotency
POST /api/session is NOT idempotent. A session name uniquely
identifies a session for the duration of its lifetime. A subsequent
POST with the same name while that session exists MUST fail with 409 and
code: SESSION_EXISTS.
5.4 CAPTCHA
Implementations MAY require a Cloudflare Turnstile token on
POST /api/session to mitigate automated session creation. The token, when
required, is supplied as turnstileToken in the request body. Failed
verification MUST return 403 with code: TURNSTILE_FAILED. When the
deployment is configured without a Turnstile secret, the field MUST be
ignored.
6. WebSocket Signalling
The full message envelope is defined in punch-messages.schema.json. This
section specifies the message exchange and runtime invariants.
6.1 Connection establishment
A peer MUST open the WebSocket at /api/ws/{session}?t={token}. The
token is supplied as a query parameter because most browser WebSocket
clients cannot set the Authorization header on the upgrade request. The
server MUST validate the token before completing the upgrade.
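For illustration, a client might assemble the signalling URL as below. The base URL `wss://punch.example` and the function name are placeholders chosen for this sketch; only the path shape and the `t` query parameter come from this section.

```typescript
// Builds /api/ws/{session}?t={token} against a deployment base URL.
function signallingUrl(base: string, session: string, token: string): string {
  const url = new URL(`/api/ws/${encodeURIComponent(session)}`, base);
  url.searchParams.set("t", token);
  return url.toString();
}

// Placeholder host and token, for illustration only.
const exampleUrl = signallingUrl("wss://punch.example", "demo", "p_abc.def");
// exampleUrl === "wss://punch.example/api/ws/demo?t=p_abc.def"
```

Because the token travels in the URL, clients should take care that it does not end up in shared logs or browser history.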
6.2 Message exchange
After a successful upgrade, the peer SHOULD send a register message
within 30 s. Failure to register within that window MAY result in
the server closing the WebSocket with code 4408.
The server’s response set following register is:
- When the stream’s complementary peer has already registered: a peer message containing the remote peer’s ip, port, the local bind port localPort, the shared passphrase, and the suggested latency. The complementary peer receives an equivalent peer message. Admin observer connections receive a peer_match message carrying both endpoints.
- Always: a session message reflecting the new SessionState, broadcast to every connection in the session.
- Once both peers in the stream have signalled ready: a start message to each peer, instructing them to begin SRT rendezvous.
Implementations MUST NOT require any specific ordering between the
peer and session messages emitted in response to a single register.
Clients MUST handle them in either order.
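A client can satisfy the either-order requirement by folding each message into local state instead of waiting for a fixed sequence. The sketch below assumes a `type` discriminator and a `state` field on the session message; the authoritative message shapes are in punch-messages.schema.json and may differ.

```typescript
// Assumed message shapes; field names other than ip, port, localPort,
// passphrase and latency are guesses made for this sketch.
interface PeerMessage {
  type: "peer";
  ip: string;
  port: number;
  localPort: number;
  passphrase: string;
  latency: number;
}
interface SessionMessage { type: "session"; state: string; }
interface StartMessage { type: "start"; }
type ServerMessage = PeerMessage | SessionMessage | StartMessage;

interface ClientView {
  peer: PeerMessage | null;
  sessionState: string | null;
  started: boolean;
}

const initialView: ClientView = { peer: null, sessionState: null, started: false };

// Each message updates only its own slice of the view, so the result does
// not depend on whether peer or session arrives first.
function applyMessage(view: ClientView, msg: ServerMessage): ClientView {
  if (msg.type === "peer") return { ...view, peer: msg };
  if (msg.type === "session") return { ...view, sessionState: msg.state };
  return { ...view, started: true };
}

// SRT rendezvous needs both the remote coordinates and the start signal.
function canStartRendezvous(view: ClientView): boolean {
  return view.peer !== null && view.started;
}
```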
6.3 Keep-alive
Implementations MUST support WebSocket ping/pong to detect dead
peers. Servers SHOULD treat a peer as dead if no traffic
(message, ping, or pong) is observed for 90 s and SHOULD close the
WebSocket with code 4408. Clients SHOULD send an application-level
keep-alive (punch.ping) every 30 s when no other traffic is flowing.
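The two timers in this section reduce to simple comparisons against the time of the last observed traffic. The following sketch uses hypothetical function names and leaves out the wire format of punch.ping, which is defined in the message schema.

```typescript
const DEAD_PEER_TIMEOUT_MS = 90_000;
const CLIENT_PING_INTERVAL_MS = 30_000;

// Server side: a peer is considered dead after 90 s without any traffic
// (message, ping, or pong).
function isPeerDead(lastTrafficMs: number, nowMs: number): boolean {
  return nowMs - lastTrafficMs >= DEAD_PEER_TIMEOUT_MS;
}

// Client side: send punch.ping only when the connection has been quiet
// for 30 s, so regular traffic suppresses redundant keep-alives.
function shouldSendPing(lastTrafficMs: number, nowMs: number): boolean {
  return nowMs - lastTrafficMs >= CLIENT_PING_INTERVAL_MS;
}
```

With these defaults a client gets three keep-alive opportunities before the server's dead-peer timeout elapses.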
6.4 Message ordering
Within a single WebSocket the server MUST preserve message order
relative to the events that produced them. No ordering guarantee is made
across different peers in a session — clients MUST NOT assume that
peer A’s register is observed before peer B’s register, only that the
matching peer messages are emitted only after both register messages
have been processed.
6.5 Hibernation
Cloudflare Workers Durable Objects support WebSocket hibernation. Servers
MAY hibernate idle WebSockets and reattach state via the
PeerAttachment envelope (see src/protocol.ts). Hibernation MUST be
transparent to clients.
7. Token Format
Punch tokens are HMAC-signed JSON payloads encoded similarly to JWT but without the JOSE header layer.

    token := "p_" base64url(payload) "." base64url(hmac-sha256(payload))

payload is a JSON object:

    {
      "session": "string",
      "role": "admin" | "peer",
      "stream": "string | null",
      "exp": 1777810754
    }

- session (REQUIRED) — the session this token is bound to. Implementations MUST reject the token if the request’s session does not match.
- role (REQUIRED) — admin for the producer, peer for join URLs.
- stream (REQUIRED) — bound stream identifier, or null for admin tokens.
- exp (REQUIRED) — Unix epoch seconds. Implementations MUST reject expired tokens.
The signing key is a deployment-wide secret. Implementations MUST treat the secret as confidential, MUST NOT transmit it to clients, and SHOULD rotate it on a regular schedule.
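The token grammar can be illustrated with Node's built-in crypto module. This is a sketch under assumptions, not the reference implementation: it assumes the HMAC is computed over the UTF-8 bytes of the serialised JSON payload (the grammar does not say whether the raw or the base64url-encoded form is signed), that a token is expired once the clock reaches exp, and the function names are hypothetical.

```typescript
import { Buffer } from "node:buffer";
import { createHmac, timingSafeEqual } from "node:crypto";

interface TokenPayload {
  session: string;
  role: "admin" | "peer";
  stream: string | null;
  exp: number; // Unix epoch seconds
}

function signToken(payload: TokenPayload, secret: string): string {
  const body = Buffer.from(JSON.stringify(payload), "utf8");
  // Assumption: the MAC covers the raw JSON bytes.
  const mac = createHmac("sha256", secret).update(body).digest();
  return `p_${body.toString("base64url")}.${mac.toString("base64url")}`;
}

// Returns the payload when the token is authentic, bound to the given
// session, and unexpired; otherwise null.
function verifyToken(token: string, secret: string, session: string, nowS: number): TokenPayload | null {
  if (!token.startsWith("p_")) return null;
  const parts = token.slice(2).split(".");
  if (parts.length !== 2) return null;
  const [b64Body = "", b64Mac = ""] = parts;
  const body = Buffer.from(b64Body, "base64url");
  const given = Buffer.from(b64Mac, "base64url");
  const expected = createHmac("sha256", secret).update(body).digest();
  // Constant-time comparison; lengths must match before calling it.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  let payload: TokenPayload;
  try {
    payload = JSON.parse(body.toString("utf8")) as TokenPayload;
  } catch {
    return null;
  }
  if (payload.session !== session) return null;
  if (nowS >= payload.exp) return null;
  return payload;
}
```

Checking the signature before parsing the JSON keeps unauthenticated input away from the parser. A production verifier would also validate the payload's field types and enforce the role and stream scope described in §9.1.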
8. Error Handling
8.1 HTTP
HTTP error responses MUST include a JSON body of the form:

    { "error": "<code>", "message": "<human-readable>" }

The <code> value MUST be one of the values enumerated in
Error.error in openapi.yaml.
8.2 WebSocket close codes
In v1.0.0, implementations close the WebSocket with 1000 on normal
shutdown (admin DELETE, TTL expiry) and with 1011 on an internal error or
when a peer is reaped by the dead-peer sweep. Application-level error
information is delivered as an error message immediately before close
where possible.
The 4xxx code range listed below is reserved for future versions and
MUST NOT be relied upon by v1.0.0 clients. Implementations claiming
conformance to a later version MAY emit these codes; v1.0.0
implementations MUST use 1000 and 1011 only.
| Code | Meaning (reserved for v1.1+) |
|---|---|
| 4400 | Invalid message format. |
| 4401 | Authentication failed. |
| 4404 | Session not found or already closed. |
| 4408 | Idle timeout or registration timeout. |
| 4409 | Stream slot already taken. |
| 4429 | Per-peer message rate limit exceeded. |
The 4xxx range is chosen because RFC 6455 reserves codes 4000 to 4999 for private use, which keeps Punch error codes clear of the IANA WebSocket close code registry.
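A v1.0.0 client can therefore treat close codes coarsely. The classification below is a sketch with hypothetical category names; it only encodes the rule that 1000 and 1011 carry meaning today while the 4xxx range must not be relied upon.

```typescript
type CloseKind = "normal" | "server_error" | "reserved_punch" | "other";

function classifyClose(code: number): CloseKind {
  if (code === 1000) return "normal"; // admin DELETE or TTL expiry
  if (code === 1011) return "server_error"; // internal error or dead-peer sweep
  // Reserved for later versions; a v1.0.0 client should not branch on the
  // specific value, only note that it fell in the Punch range.
  if (code >= 4000 && code <= 4999) return "reserved_punch";
  return "other";
}
```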
9. Security Considerations
9.1 Token scope
A token grants exactly the permissions implied by role and stream. A
peer token bound to stream cam-wide MUST NOT be accepted for any
other stream or for admin operations.
9.2 Passphrases
Each session MUST generate a passphrase with at least 256 bits of cryptographically secure randomness. Passphrases MUST NOT be reused across sessions and MUST NOT be logged or persisted beyond the session’s lifetime.
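One way to obtain 256 bits of secure randomness in a printable form is to draw 32 bytes from the platform CSPRNG and encode them as base64url. This is a sketch using Node's crypto module; the function name is hypothetical and the encoding is an assumption, since this section does not fix a passphrase alphabet.

```typescript
import { randomBytes } from "node:crypto";

// 32 random bytes = 256 bits; base64url without padding yields 43 characters.
function generatePassphrase(): string {
  return randomBytes(32).toString("base64url");
}
```

The 43-character result should fit within the passphrase length limits that SRT implementations commonly enforce, though that is worth confirming against the SRT library in use.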
9.3 Reflection / amplification
The signalling server speaks only HTTP and WebSocket and MUST NOT echo client-supplied data into UDP traffic. It therefore has no UDP attack surface.
9.4 Coordinate leakage
Peer IP/port pairs are reported by clients themselves. The server MUST NOT disclose a peer’s coordinates to anyone other than the matched peer in the same stream and the session admin observer.
9.5 Rate limiting
Implementations MUST rate-limit session creation per source IP and SHOULD rate-limit per-token message volume to mitigate abuse of the WebSocket signalling channel.
9.6 Transport
Implementations MUST serve all HTTP and WebSocket endpoints over TLS
1.2 or higher. Implementations MUST NOT redirect from https to
http under any circumstances.
9.7 Privacy
The signalling server MUST NOT log session passphrases, admin tokens, or stream payloads. The server MAY log peer IP addresses for authentication-failure and rate-limit events for operational reasons; such logs SHOULD be retained no longer than necessary for operational analysis.
10. IANA Considerations
This protocol uses no IANA-allocated identifiers. Should a future revision require a registered URI scheme, MIME type, or port number, the allocation request will reference this document.
11. Versioning
The protocol version reported by GET /api/health reflects the
implementation’s adherence to the named version of this specification.
Implementations MUST include the version in the Punch-Version
response header on all HTTP responses.
A client MAY detect compatibility by inspecting the version field
on the health endpoint or the Punch-Version header on any response.
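Given the versioning policy stated in Status of This Document (minor increments are backwards-compatible, major increments are breaking), a client could compare major versions only. The sketch below assumes the reported value is a plain MAJOR.MINOR.PATCH string; the function names are hypothetical.

```typescript
// Extracts the major component, or null when the string is not
// MAJOR.MINOR.PATCH (assumed format of the Punch-Version value).
function majorOf(version: string): number | null {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version.trim());
  if (match === null) return null;
  return Number(match[1]);
}

// Compatible when the server's major version equals the one the client
// was built against; minor and patch differences are tolerated.
function isCompatible(serverVersion: string, clientMajor: number): boolean {
  return majorOf(serverVersion) === clientMajor;
}
```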
12. References
Normative
- RFC 2119 — Key words for use in RFCs to Indicate Requirement Levels.
- RFC 8174 — Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words.
- RFC 8259 — The JavaScript Object Notation (JSON) Data Interchange Format.
- RFC 6455 — The WebSocket Protocol.
Informative
- SRT Alliance, “Secure Reliable Transport (SRT) Protocol”, 2024.
- WHIP — RFC 9725, WebRTC-HTTP Ingestion Protocol (analogous design pattern).
Appendix A: State machine reference
Appendix B: Implementation notes
The reference implementation lives at https://github.com/FiLORUX/punch.
Notable implementation choices that are not normative for the protocol
itself:
- Storage — One Cloudflare Durable Object per session. The protocol does not require Durable Objects; any storage layer offering per-session strong consistency suffices.
- WebSocket hibernation — A Cloudflare-specific optimisation. The PeerAttachment envelope is an internal concern of hibernation; clients never observe it.
- Rate limiting — In-memory per Worker isolate. Adequate for moderate traffic; production deployments at scale SHOULD use Cloudflare Workers Rate Limiting bindings or an external rate-limit store.
Appendix C: Change log
| Version | Date | Notes |
|---|---|---|
| 1.0.0 | 2026-05-03 | Initial stable release. |