RFC

Document: Punch Protocol Specification
Version: 1.0.0
Status: Stable
Date: 2026-05-03
Editor: The Thåst Collective


Abstract

This document specifies the Punch protocol, a session brokerage protocol for SRT (Secure Reliable Transport). Punch coordinates the out-of-band exchange that two SRT endpoints require but that SRT itself does not define: peer discovery, public-address resolution, passphrase distribution, and the synchronisation needed for SRT rendezvous-mode hole punching.

Punch is not a media transport. It does not carry, transcode, relay, or inspect SRT payloads. The signalling server brokers the introduction; once peers are matched, media flows directly between them.

Status of This Document

This is version 1.0.0 of the Punch protocol specification. Subsequent backwards-compatible revisions will increment the minor version. Breaking changes will increment the major version and be accompanied by a migration note. The current version is implemented at https://punch.thåst.se.

The companion artefacts are:

  • openapi.yaml — OpenAPI 3.1 description of the HTTP API
  • punch-messages.schema.json — JSON Schema 2020-12 for WebSocket messages
  • ../protocol.md — informal narrative reference

In the event of disagreement between this document and the companion files, the JSON Schema and OpenAPI document are normative for their respective surfaces; this document is normative for protocol semantics, state machines, and security considerations.

1. Conventions

The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC 2119 and RFC 8174 when, and only when, they appear in all capitals.

All time values are in milliseconds unless otherwise specified. All sizes are in octets unless otherwise specified.

JSON in this document follows RFC 8259. Field names are case-sensitive and MUST be encoded as UTF-8.

2. Terminology

  • Session — A named brokerage instance. Identified by a session name matching [A-Za-z0-9][A-Za-z0-9._-]{0,63}.
  • Stream — A logical media flow inside a session. A session MAY contain one or more streams. Each stream brokers exactly one pair of peers.
  • Peer — An SRT endpoint participating in a stream. Peers are symmetric — Punch does not distinguish caller from listener — but each peer MAY advertise its intended SRT role (sender or receiver).
  • Admin — The session creator. Holds an admin token granting full observer access (including peer_match events) and the ability to close the session.
  • Producer — The human or automation that creates a session and distributes peer URLs. Holds the admin token.
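
The session-name pattern above can be checked with a single regular expression. This is a minimal sketch that assumes the pattern is meant to match the whole name (hence the ^ and $ anchors); the helper name isValidSessionName is hypothetical and not part of the specification.

```typescript
// Pattern from the Session definition: one alphanumeric character,
// followed by up to 63 characters from [A-Za-z0-9._-].
// Assumption: the whole name must match, so the pattern is anchored.
const SESSION_NAME = /^[A-Za-z0-9][A-Za-z0-9._-]{0,63}$/;

// Hypothetical helper; the name is illustrative only.
function isValidSessionName(name: string): boolean {
  return SESSION_NAME.test(name);
}
```

Under this reading a name is between 1 and 64 characters long and never starts with a dot, underscore, or hyphen.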

3. Architecture Overview

Figure: Peer A (SRT sender) and Peer B (SRT receiver) each hold a WebSocket connection to the Punch signalling server. The SRT rendezvous path (UDP hole-punched) runs directly between Peer A and Peer B.

Punch never sees SRT media. Once peer coordinates are exchanged, the media path is direct peer-to-peer.

A producer creates a session via HTTP. Each peer subsequently opens a WebSocket to the session, registers a UDP port, and receives — when matched — the remote peer’s coordinates plus a shared passphrase. Peers then initiate SRT in rendezvous mode and the signalling layer is no longer involved in media transport.

4. Session Lifecycle

4.1 States

A session MUST be in exactly one of the following states:

State      Meaning
WAITING    Session created; fewer than the required peers per stream are registered.
READY      All required peers for at least one stream have registered and have been matched.
CONNECTED  At least one peer has reported an active SRT connection (status: connected).
RELAYING   Reserved for future relay fallback. MUST NOT be entered by v1.0.0 implementations.
CLOSED     Session terminated, either by admin action or TTL expiry. Terminal state.

4.2 Transitions

created → WAITING
all peers matched → READY (per-stream; session enters READY when any stream is matched)
peer reports status: connected → CONNECTED
admin DELETE → CLOSED (any state)
ttl expiry → CLOSED (any state)

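A session MUST NOT transition out of CLOSED. Implementations MUST reject all subsequent operations on a closed session with 404 (over HTTP) or by closing the WebSocket. Code 4404 is reserved for this condition in later versions; v1.0.0 implementations use only the close codes permitted by §8.2.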
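
The transitions above can be sketched as a pure function. The event names are hypothetical labels for the four triggers listed in this section. The sketch assumes that CONNECTED is only reached from READY (as Appendix A draws it) and that any event not listed leaves the state unchanged.

```typescript
type SessionState = "WAITING" | "READY" | "CONNECTED" | "RELAYING" | "CLOSED";

// Hypothetical event names for the triggers in §4.2.
type SessionEvent = "stream_matched" | "srt_connected" | "admin_delete" | "ttl_expiry";

function nextState(state: SessionState, event: SessionEvent): SessionState {
  // CLOSED is terminal: nothing moves a session out of it.
  if (state === "CLOSED") return state;
  // Admin DELETE and TTL expiry close the session from any state.
  if (event === "admin_delete" || event === "ttl_expiry") return "CLOSED";
  // The session becomes READY when any stream is matched.
  if (event === "stream_matched" && state === "WAITING") return "READY";
  // Assumption: a connected report only matters once a stream is matched.
  if (event === "srt_connected" && state === "READY") return "CONNECTED";
  // RELAYING is never entered, matching the v1.0.0 prohibition.
  return state;
}
```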

4.3 Time-to-Live

Every session has an absolute expiry time expiresAt, recorded as Unix milliseconds at session creation. Implementations MUST close the session no later than expiresAt. Default TTL is 1800 s; the producer MAY request any value up to 86400 s.
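
As an illustration of the arithmetic, expiresAt can be derived from the creation time and the requested TTL. The sketch below assumes that an out-of-range TTL is rejected rather than clamped and that TTLs are whole seconds; the specification does not say which, and both function names are hypothetical.

```typescript
const DEFAULT_TTL_S = 1800;  // default TTL from §4.3, in seconds
const MAX_TTL_S = 86400;     // largest TTL a producer may request, in seconds

// Returns expiresAt as Unix milliseconds.
// Assumption: invalid TTLs are rejected; a server could equally clamp them.
function computeExpiresAt(createdAtMs: number, ttlSeconds: number = DEFAULT_TTL_S): number {
  if (!Number.isInteger(ttlSeconds) || ttlSeconds <= 0 || ttlSeconds > MAX_TTL_S) {
    throw new RangeError(`ttl must be an integer between 1 and ${MAX_TTL_S} seconds`);
  }
  return createdAtMs + ttlSeconds * 1000;
}

// The session must be closed no later than expiresAt.
function isExpired(expiresAtMs: number, nowMs: number): boolean {
  return nowMs >= expiresAtMs;
}
```

Note the unit change: the TTL is requested in seconds while expiresAt is recorded in milliseconds.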

5. HTTP API

The full HTTP surface is defined in openapi.yaml. This section enumerates normative requirements that are not expressible in OpenAPI alone.

5.1 Authentication

A bearer token in the Authorization header is REQUIRED on all HTTP endpoints under /api/session/{name}; POST /api/session, which creates the session, is exempt. The token format is described in §7. Implementations MUST reject requests with missing, malformed, or session-mismatched tokens with HTTP 401 and an Error body bearing code: AUTH_FAILED.

5.2 Rate limiting

Implementations MUST rate-limit POST /api/session per source IP. The default is 10 requests per minute per IP. Implementations MAY rate-limit other endpoints. Rate-limited requests MUST return 429 and SHOULD include Retry-After.
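
One simple way to meet the default is a fixed-window counter keyed by source IP. This is a sketch of one possible strategy, not the reference implementation's limiter; the class name and the convention of returning the Retry-After value in seconds (0 meaning "allowed") are assumptions made for illustration.

```typescript
class FixedWindowLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  // Defaults mirror §5.2: 10 requests per minute per IP.
  constructor(limit: number = 10, windowMs: number = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns 0 when the request is allowed; otherwise the number of
  // seconds to send in Retry-After alongside the 429 response.
  check(ip: string, nowMs: number): number {
    const w = this.windows.get(ip);
    if (w === undefined || nowMs - w.start >= this.windowMs) {
      this.windows.set(ip, { start: nowMs, count: 1 });
      return 0;
    }
    if (w.count < this.limit) {
      w.count += 1;
      return 0;
    }
    return Math.ceil((w.start + this.windowMs - nowMs) / 1000);
  }
}
```

A fixed window permits short bursts across a window boundary; a sliding window or token bucket would smooth that at the cost of more state.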

5.3 Idempotency

POST /api/session is NOT idempotent. A session name uniquely identifies a session for the duration of its lifetime. A subsequent POST with the same name MUST fail with 409 and code: SESSION_EXISTS.

5.4 CAPTCHA

Implementations MAY require a Cloudflare Turnstile token on POST /api/session to mitigate automated session creation. The token, when required, is supplied as turnstileToken in the request body. Failed verification MUST return 403 with code: TURNSTILE_FAILED. When the deployment is configured without a Turnstile secret, the field MUST be ignored.

6. WebSocket Signalling

The full message envelope is defined in punch-messages.schema.json. This section specifies the message exchange and runtime invariants.

6.1 Connection establishment

A peer MUST open the WebSocket at /api/ws/{session}?t={token}. The token is supplied as a query parameter because most browser WebSocket clients cannot set the Authorization header on the upgrade request. The server MUST validate the token before completing the upgrade.
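
A client can assemble the signalling URL as follows. This is a minimal sketch: the function name is hypothetical, and the origin shown in the test values is a placeholder rather than a real deployment.

```typescript
// Builds the WebSocket URL from §6.1 given an http(s) origin.
// https:// becomes wss:// and http:// becomes ws://.
function buildSignallingUrl(origin: string, session: string, token: string): string {
  const base = origin.replace(/^http/, "ws").replace(/\/+$/, "");
  // Both values are percent-encoded defensively; valid session names and
  // base64url tokens contain only characters that pass through unchanged.
  return `${base}/api/ws/${encodeURIComponent(session)}?t=${encodeURIComponent(token)}`;
}
```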

6.2 Message exchange

After a successful upgrade, the peer SHOULD send a register message within 30 s. If it does not, the server MAY close the WebSocket. Code 4408 is reserved for this timeout in later versions; v1.0.0 implementations use only the close codes permitted by §8.2.

The server’s response set following register is:

  1. When the stream’s complementary peer has already registered: a peer message containing the remote peer’s ip, port, the local bind port localPort, the shared passphrase, and the suggested latency. The complementary peer receives an equivalent peer message. Admin observer connections receive a peer_match message carrying both endpoints.
  2. Always: a session message reflecting the new SessionState, broadcast to every connection in the session.
  3. Once both peers in the stream have signalled ready: a start message to each peer, instructing them to begin SRT rendezvous.

Implementations MUST NOT require any specific ordering between the peer and session messages emitted in response to a single register. Clients MUST handle them in either order.
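
A client that folds each server message into its local view, rather than waiting for messages in a fixed sequence, satisfies the either-order requirement naturally. The sketch below assumes a type discriminator on every message and a state field on session messages; the normative envelope is in punch-messages.schema.json and may differ. All names are hypothetical.

```typescript
interface PeerInfo {
  ip: string;
  port: number;
  localPort: number;
  passphrase: string;
  latency: number;
}

// Assumed message shapes; consult the JSON Schema for the real envelope.
type ServerMessage =
  | { type: "peer"; ip: string; port: number; localPort: number; passphrase: string; latency: number }
  | { type: "session"; state: string }
  | { type: "start" };

interface ClientView {
  remote: PeerInfo | null;
  sessionState: string | null;
  startRequested: boolean;
}

const initialView: ClientView = { remote: null, sessionState: null, startRequested: false };

// Each message updates one independent part of the view, so the
// result does not depend on whether peer or session arrives first.
function applyMessage(view: ClientView, msg: ServerMessage): ClientView {
  if (msg.type === "peer") {
    const remote: PeerInfo = {
      ip: msg.ip,
      port: msg.port,
      localPort: msg.localPort,
      passphrase: msg.passphrase,
      latency: msg.latency,
    };
    return { ...view, remote };
  }
  if (msg.type === "session") return { ...view, sessionState: msg.state };
  return { ...view, startRequested: true };
}

// SRT rendezvous can begin once start has arrived and the remote is known.
function canStartRendezvous(view: ClientView): boolean {
  return view.startRequested && view.remote !== null;
}
```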

6.3 Keep-alive

Implementations MUST support WebSocket ping/pong to detect dead peers. Servers SHOULD treat a peer as dead if no traffic (message, ping, or pong) is observed for 90 s and SHOULD then close the WebSocket. v1.0.0 implementations close with 1011 (see §8.2); code 4408 is reserved for this purpose in later versions. Clients SHOULD send an application-level keep-alive (punch.ping) every 30 s when no other traffic is flowing.
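
The two timers can be expressed as pure predicates over timestamps, which keeps them easy to test. This is a sketch under the assumption that the server records a last-seen time per peer; the identifiers and the map layout are hypothetical.

```typescript
const DEAD_AFTER_MS = 90_000;  // server-side idle threshold from §6.3
const PING_EVERY_MS = 30_000;  // client-side keep-alive interval from §6.3

// Server side: returns the peers with no observed traffic for 90 s or more.
function findDeadPeers(lastSeenMs: Map<string, number>, nowMs: number): string[] {
  const dead: string[] = [];
  lastSeenMs.forEach((seen, peerId) => {
    if (nowMs - seen >= DEAD_AFTER_MS) dead.push(peerId);
  });
  return dead;
}

// Client side: true when a punch.ping is due because the link has been quiet.
function shouldSendPing(lastTrafficMs: number, nowMs: number): boolean {
  return nowMs - lastTrafficMs >= PING_EVERY_MS;
}
```

With a 30 s client interval and a 90 s server threshold, a peer can miss two consecutive keep-alives before it is reaped.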

6.4 Message ordering

Within a single WebSocket, the server MUST preserve message order relative to the events that produced them. No ordering guarantee is made across different peers in a session: clients MUST NOT assume that peer A’s register is observed before peer B’s register. The only guarantee is that the matching peer messages are emitted after both register messages have been processed.

6.5 Hibernation

Cloudflare Workers Durable Objects support WebSocket hibernation. Servers MAY hibernate idle WebSockets and reattach state via the PeerAttachment envelope (see src/protocol.ts). Hibernation MUST be transparent to clients.

7. Token Format

Punch tokens are HMAC-signed JSON payloads encoded similarly to JWT but without the JOSE header layer.

token := "p_" base64url(payload) "." base64url(hmac-sha256(payload))

payload is a JSON object:

{
  "session": "string",
  "role": "admin" | "peer",
  "stream": "string | null",
  "exp": 1777810754
}

  • session (REQUIRED) — the session this token is bound to. Implementations MUST reject the token if the request’s session does not match.
  • role (REQUIRED) — admin for the producer, peer for join URLs.
  • stream (REQUIRED) — bound stream identifier, or null for admin tokens.
  • exp (REQUIRED) — Unix epoch seconds. Implementations MUST reject expired tokens.

The signing key is a deployment-wide secret. Implementations MUST treat the secret as confidential, MUST NOT transmit it to clients, and SHOULD rotate it on a regular schedule.
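
The token grammar can be exercised with Node's built-in crypto module. This is a sketch, not the reference implementation: it assumes the HMAC is computed over the raw UTF-8 JSON bytes of the payload (the grammar does not say whether the raw or the encoded form is signed) and that a token is expired once exp is less than or equal to the current time. The function names are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

interface TokenPayload {
  session: string;
  role: "admin" | "peer";
  stream: string | null;
  exp: number; // Unix epoch seconds
}

function signToken(payload: TokenPayload, secret: string): string {
  const body = Buffer.from(JSON.stringify(payload), "utf8");
  // Assumption: the MAC covers the raw JSON bytes, not their base64url form.
  const mac = createHmac("sha256", secret).update(body).digest();
  return `p_${body.toString("base64url")}.${mac.toString("base64url")}`;
}

// Returns the payload when the token is well-formed, correctly signed,
// bound to the given session, and not expired; otherwise null.
function verifyToken(token: string, secret: string, session: string, nowSeconds: number): TokenPayload | null {
  const match = /^p_([A-Za-z0-9_-]+)\.([A-Za-z0-9_-]+)$/.exec(token);
  if (match === null) return null;
  const body = Buffer.from(match[1], "base64url");
  const given = Buffer.from(match[2], "base64url");
  const expected = createHmac("sha256", secret).update(body).digest();
  // Constant-time comparison; lengths must match before calling it.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  let payload: TokenPayload;
  try {
    payload = JSON.parse(body.toString("utf8")) as TokenPayload;
  } catch {
    return null;
  }
  if (payload === null || typeof payload !== "object") return null;
  if (payload.session !== session) return null;
  if (typeof payload.exp !== "number" || payload.exp <= nowSeconds) return null;
  return payload;
}
```

The signature is checked before the payload is parsed, so unauthenticated JSON never reaches the parser.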

8. Error Handling

8.1 HTTP

HTTP error responses MUST include a JSON body of the form:

{ "error": "<code>", "message": "<human-readable>" }

The <code> value MUST be one of the values enumerated in Error.error in openapi.yaml.
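
A client can narrow an unknown response body to this shape before inspecting the code. The sketch checks only the two fields shown above; the set of permitted code values lives in openapi.yaml and is not reproduced here. The type and function names are hypothetical.

```typescript
interface PunchError {
  error: string;   // machine-readable code, for example AUTH_FAILED
  message: string; // human-readable description
}

// Type guard: true when the value has the §8.1 error-body shape.
function isPunchError(value: unknown): value is PunchError {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v["error"] === "string" && typeof v["message"] === "string";
}
```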

8.2 WebSocket close codes

In v1.0.0, implementations close the WebSocket with 1000 on normal shutdown (admin DELETE, TTL expiry) and with 1011 on an internal error or when a peer is reaped by the dead-peer sweep. Where possible, application-level error information is delivered as an error message immediately before the close.

The 4xxx code range listed below is reserved for future versions and MUST NOT be relied upon by v1.0.0 clients. Implementations claiming conformance to a later version MAY emit these codes; v1.0.0 implementations MUST use 1000 and 1011 only.

Code  Meaning (reserved for v1.1+)
4400  Invalid message format.
4401  Authentication failed.
4404  Session not found or already closed.
4408  Idle timeout or registration timeout.
4409  Stream slot already taken.
4429  Per-peer message rate limit exceeded.

The 4xxx range is chosen to keep Punch error codes orthogonal to the IANA WebSocket close code registry.
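
A client-side close handler might map codes to descriptions as sketched below. The table is copied from this section; the function name and the fallback wording are hypothetical. A v1.0.0 client should treat only 1000 and 1011 as meaningful and regard the 4xxx entries as informational.

```typescript
// Codes reserved for v1.1+; v1.0.0 servers do not emit them.
const RESERVED_CLOSE_CODES: Record<number, string> = {
  4400: "Invalid message format",
  4401: "Authentication failed",
  4404: "Session not found or already closed",
  4408: "Idle timeout or registration timeout",
  4409: "Stream slot already taken",
  4429: "Per-peer message rate limit exceeded",
};

function describeClose(code: number): string {
  if (code === 1000) return "Normal shutdown (admin DELETE or TTL expiry)";
  if (code === 1011) return "Internal error or dead-peer sweep";
  const reserved = RESERVED_CLOSE_CODES[code];
  if (reserved !== undefined) return `${reserved} (reserved for v1.1+)`;
  return `Unrecognised close code ${code}`;
}
```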

9. Security Considerations

9.1 Token scope

A token grants exactly the permissions implied by role and stream. A peer token bound to stream cam-wide MUST NOT be accepted for any other stream or for admin operations.
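
The scope rule can be written as a small authorisation check. This sketch models only two operations, and it assumes that admin tokens observe but do not register as peers, which this document does not state explicitly. The operation names are hypothetical.

```typescript
interface Claims {
  role: "admin" | "peer";
  stream: string | null;
}

// Hypothetical operation set, limited to what this sketch needs.
type Operation =
  | { kind: "close_session" }
  | { kind: "register"; stream: string };

function isAllowed(claims: Claims, op: Operation): boolean {
  // Admin operations require the admin role.
  if (op.kind === "close_session") return claims.role === "admin";
  // Assumption: only peer tokens register, and only on their bound stream.
  return claims.role === "peer" && claims.stream === op.stream;
}
```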

9.2 Passphrases

Each session MUST generate a passphrase with at least 256 bits of cryptographically secure randomness. Passphrases MUST NOT be reused across sessions and MUST NOT be logged or persisted beyond the session’s lifetime.
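
A 256-bit passphrase can be drawn from the operating system's CSPRNG. The sketch uses Node's crypto module and base64url encoding; the encoding is an assumption chosen to give a compact, URL-safe string, and the function name is hypothetical.

```typescript
import { randomBytes } from "node:crypto";

// 32 random bytes = 256 bits of entropy, encoded as 43 base64url characters.
function generatePassphrase(): string {
  return randomBytes(32).toString("base64url");
}
```

The value should be held in memory for the session's lifetime only and kept out of logs, in line with the requirements above.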

9.3 Reflection / amplification

The signalling server is HTTP-only and MUST NOT echo client-supplied data into UDP traffic. The server has no UDP attack surface.

9.4 Coordinate leakage

Peer IP/port pairs are reported by clients themselves. The server MUST NOT disclose a peer’s coordinates to anyone other than the matched peer in the same stream and the session admin observer.

9.5 Rate limiting

Implementations MUST rate-limit session creation per source IP and SHOULD rate-limit per-token message volume to mitigate abuse of the WebSocket signalling channel.

9.6 Transport

Implementations MUST serve all HTTP and WebSocket endpoints over TLS 1.2 or higher. Implementations MUST NOT redirect from https to http under any circumstances.

9.7 Privacy

The signalling server MUST NOT log session passphrases, admin tokens, or stream payloads. The server MAY log peer IP addresses for authentication-failure and rate-limit events for operational reasons; such logs SHOULD be retained no longer than necessary for operational analysis.

10. IANA Considerations

This protocol uses no IANA-allocated identifiers. Should a future revision require a registered URI scheme, MIME type, or port number, the allocation request will reference this document.

11. Versioning

The protocol version reported by GET /api/health reflects the implementation’s adherence to the named version of this specification. Implementations MUST include the version in the Punch-Version response header on all HTTP responses.

A client MAY detect compatibility by inspecting the version field on the health endpoint or the Punch-Version header on any response.
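
One possible compatibility check follows from the versioning rules in "Status of This Document": minor revisions are backwards-compatible and major revisions are breaking. The sketch assumes a MAJOR.MINOR.PATCH version string and treats a server as compatible when the major versions match and the server's minor version is at least the client's. Both the policy and the function names are assumptions.

```typescript
function parseVersion(v: string): [number, number, number] | null {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(v.trim());
  return m === null ? null : [Number(m[1]), Number(m[2]), Number(m[3])];
}

// clientVersion: the spec version the client was written against.
// serverVersion: the value of Punch-Version or the health endpoint's version field.
function isCompatible(clientVersion: string, serverVersion: string): boolean {
  const c = parseVersion(clientVersion);
  const s = parseVersion(serverVersion);
  if (c === null || s === null) return false;
  // Same major version, and the server is not older than the client expects.
  return c[0] === s[0] && s[1] >= c[1];
}
```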

12. References

Normative

  • RFC 2119 — Key words for use in RFCs to Indicate Requirement Levels.
  • RFC 8174 — Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words.
  • RFC 8259 — The JavaScript Object Notation (JSON) Data Interchange Format.
  • RFC 6455 — The WebSocket Protocol.

Informative

  • SRT Alliance, “Secure Reliable Transport (SRT) Protocol”, 2024.
  • WHIP — RFC 9725, WebRTC HTTP Ingest Protocol (analogous design pattern).

Appendix A: State machine reference

create() → WAITING
WAITING → READY (all peers matched)
READY → CONNECTED (srt connected)
WAITING → CLOSED (admin close / TTL expiry / fatal error)
READY → CLOSED (admin close / TTL expiry / fatal error)
CONNECTED → CLOSED (admin close / TTL expiry / fatal error)

Appendix B: Implementation notes

The reference implementation lives at https://github.com/FiLORUX/punch. Notable implementation choices that are not normative for the protocol itself:

  • Storage — One Cloudflare Durable Object per session. The protocol does not require Durable Objects; any storage layer offering per-session strong consistency suffices.
  • WebSocket hibernation — A Cloudflare-specific optimisation. The PeerAttachment envelope is an internal concern of hibernation; clients never observe it.
  • Rate limiting — In-memory per Worker isolate. Adequate for moderate traffic; production deployments at scale SHOULD use Cloudflare Workers Rate Limiting bindings or an external rate-limit store.

Appendix C: Change log

Version  Date        Notes
1.0.0    2026-05-03  Initial stable release.