Internet-Draft HTTP Identity Digest December 2024
Pardue & West Expires 9 June 2025
Workgroup: HyperText Transfer Protocol
Internet-Draft: draft-pardue-http-identity-digest-latest
Published: December 2024
Intended Status: Standards Track
Expires: 9 June 2025
Authors: L. Pardue (Cloudflare), M. West (Google)

HTTP Identity Digest

Abstract

The Repr-Digest and Content-Digest integrity fields are subject to HTTP content coding considerations. There are some use cases that benefit from the unambiguous exchange of integrity digests of the unencoded representation. The Identity-Digest and Want-Identity-Digest fields complement existing integrity fields for this purpose.

About This Document

This note is to be removed before publishing as an RFC.

The latest revision of this draft can be found at https://LPardue.github.io/draft-pardue-http-identity-digest/draft-pardue-http-identity-digest.html. Status information for this document may be found at https://datatracker.ietf.org/doc/draft-pardue-http-identity-digest/.

Discussion of this document takes place on the HyperText Transfer Protocol Working Group mailing list (mailto:ietf-http-wg@w3.org), which is archived at https://lists.w3.org/Archives/Public/ietf-http-wg/.

Source for this draft and an issue tracker can be found at https://github.com/LPardue/draft-pardue-http-identity-digest.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 9 June 2025.

1. Introduction

The Integrity fields defined in [DIGEST-FIELDS] are suitable for a range of use cases. However, because the fields are subject to HTTP content coding considerations, it is difficult to support use cases that could benefit from the exchange of integrity digests of the unencoded representation.

As a simple example, an application using HTTP might be presented with request or response representation data that has been transparently decoded. Attempting to verify the integrity of the data against the Repr-Digest would first require re-encoding that data using the same coding indicated by the Content-Encoding header field (Section 8.4 of [HTTP]), which is not always possible (see Section 6.5 of [DIGEST-FIELDS]).

Even when receiver-side re-encoding for the purpose of Repr-Digest validation is technically possible, it might not be practical for certain kinds of environments. For instance, browsers tend to provide built-in support for transparent decoding but little support for encoding; while this could be done via the use of additional libraries, it would create work in JavaScript that could contend with other activities. Even on the server side, the re-encoding of received data might not be acceptable; some coding algorithms are optimized towards efficient decoding at the cost of complex encoding. This is all made more complex if the Content-Encoding field value indicates a series of encodings.

A more complex example involves HTTP Range Requests (Section 14 of [HTTP]), where a client fetches multiple partial representations from different origins and "stitches" them back into a whole. Unfortunately, if the origins apply different content codings, the Repr-Digest field will vary by the server's selected encoding (i.e., the Content-Encoding header field, Section 8.4 of [HTTP]). This presents a challenge for a client: in order to verify the integrity of the pieced-together whole, it would need to remove the encoding of each part, combine them, and then encode the result in order to compare against one or more Repr-Digests.

The Accept-Encoding header field (Section 12.5.3 of [HTTP]) provides the means to indicate preferences for content coding. It is possible for an endpoint to indicate a preference for no encoding, for example by sending the "identity" token. However, codings often provide data compression that is advantageous. Disabling content coding in order to simplify integrity checking is possibly an unacceptable trade-off.

For a variety of reasons, decoding and re-encoding content in order to benefit from HTTP integrity fields is not preferable. This specification defines the Identity-Digest and Want-Identity-Digest fields to support a simpler validation workflow in some scenarios where content coding is applied. These fields complement the other integrity fields defined in [DIGEST-FIELDS].

2. Conventions and Definitions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

This document uses the Augmented BNF defined in [RFC5234] and updated by [RFC7405]. This includes the rule LF (line feed).

This document uses the following terminology from Section 3 of [STRUCTURED-FIELDS] to specify syntax and parsing: Byte Sequence, Dictionary, Integer, and List.

The definitions "representation", "selected representation", "representation data", "representation metadata", "user agent" and "content" in this document are to be interpreted as described in [HTTP].

Integrity fields: collective term for Content-Digest, Repr-Digest, and Identity-Digest

Integrity preference fields: collective term for Want-Repr-Digest, Want-Content-Digest, and Want-Identity-Digest

3. Complementary Integrity Fields

The following examples illustrate how Integrity fields can be used in combination to address different and complementary needs, particularly the cases described in Section 1. The unencoded data used in the examples is the string "An unexceptional string" followed by an LF character.

When a response message is not conveying partial or encoded representation data, all Integrity fields contain the same value, making validation trivial and identical.

GET /boringstring HTTP/1.1
Host: example.org

Figure 1: Simple GET request
NOTE: '\' line wrapping per RFC 8792

HTTP/1.1 200 OK
Content-Length: 24
Content-Digest: \
  sha-256=:5Bv3NIx05BPnh0jMph6v1RJ5Q7kl9LKMtQxmvc9+Z7Y=:
Repr-Digest: \
  sha-256=:5Bv3NIx05BPnh0jMph6v1RJ5Q7kl9LKMtQxmvc9+Z7Y=:
Identity-Digest: \
  sha-256=:5Bv3NIx05BPnh0jMph6v1RJ5Q7kl9LKMtQxmvc9+Z7Y=:

An unexceptional string
Figure 2: Response to GET request
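
As an informal illustration of how the field values above are formed, the following Python sketch computes a digest over the unencoded example data and formats it as a Dictionary member with a Byte Sequence value. The helper name digest_member and the small algorithm table are assumptions made for this sketch; they are not defined by this document.

```python
import base64
import hashlib

def digest_member(algorithm: str, data: bytes) -> str:
    """Return one Dictionary member, for example 'sha-256=:...:'."""
    # Hypothetical mapping from field algorithm keys to hashlib names;
    # only two algorithms are sketched here.
    hashlib_names = {"sha-256": "sha256", "sha-512": "sha512"}
    raw = hashlib.new(hashlib_names[algorithm], data).digest()
    # A Structured Fields Byte Sequence is base64 text wrapped in colons.
    return f"{algorithm}=:{base64.b64encode(raw).decode('ascii')}:"

# The example data: the string followed by an LF character (24 bytes).
unencoded = b"An unexceptional string\n"
value = digest_member("sha-256", unencoded)

# With no content coding and no range, one value serves all three fields.
for name in ("Content-Digest", "Repr-Digest", "Identity-Digest"):
    print(f"{name}: {value}")
```

Because the message carries the complete, unencoded representation, the same bytes are the input to all three calculations.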

When a response message conveys complete encoded content, the Content-Digest and the Repr-Digest are the same, while the Identity-Digest is different.

GET /boringstring HTTP/1.1
Host: example.org
Accept-Encoding: gzip

Figure 3: GET request with content negotiation
NOTE: '\' line wrapping per RFC 8792

HTTP/1.1 200 OK
Content-Encoding: gzip
Content-Digest: \
  sha-256=:XyjvEuFb1P5rqc2le3vQm7M96DwZhvmOwqHLu2xVpY4=:
Repr-Digest: \
  sha-256=:XyjvEuFb1P5rqc2le3vQm7M96DwZhvmOwqHLu2xVpY4=:
Identity-Digest: \
  sha-256=:5Bv3NIx05BPnh0jMph6v1RJ5Q7kl9LKMtQxmvc9+Z7Y=:

1f 8b 08 00 79 1f 08 64 00 ff
73 cc 53 28 cd 4b ad 48 4e 2d
28 c9 cc cf 4b cc 51 28 2e 29
ca cc 4b e7 02 00 7e af 07 44
18 00 00 00
Figure 4: Response with gzip encoding
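
The difference between the fields in this case can be sketched as follows. This is a non-normative Python illustration: the function names are hypothetical, and the bytes produced by gzip.compress will generally differ from the body shown in Figure 4 because gzip output depends on the encoder and its header fields.

```python
import base64
import gzip
import hashlib

def sha256_member(data: bytes) -> str:
    """Format a sha-256 digest as a Dictionary member ('sha-256=:...:')."""
    b64 = base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")
    return f"sha-256=:{b64}:"

def check_gzip_response(content: bytes, repr_digest: str,
                        identity_digest: str) -> dict:
    """Validate a complete gzip-coded response against two fields."""
    decoded = gzip.decompress(content)
    return {
        # Repr-Digest covers the representation with content coding applied.
        "repr_ok": sha256_member(content) == repr_digest,
        # Identity-Digest covers the representation with no content coding.
        "identity_ok": sha256_member(decoded) == identity_digest,
    }

# Hypothetical sender side: encode the data and compute both digests.
unencoded = b"An unexceptional string\n"
content = gzip.compress(unencoded)
result = check_gzip_response(
    content, sha256_member(content), sha256_member(unencoded)
)
```

A recipient that is only handed the transparently decoded data can still perform the Identity-Digest comparison, since it needs no access to the encoded bytes.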

Finally, when a response message contains partial and encoded content, all Integrity fields vary. The Content-Digest can be used to validate the integrity of the received part. Repr-Digest or Identity-Digest can be used later, after reconstruction. The choice of which to use is left to the application, which would consider a range of factors outside the scope of discussion.

GET /boringstring HTTP/1.1
Host: example.org
Accept-Encoding: gzip
Range: bytes=0-9

Figure 5: Range request with content negotiation
NOTE: '\' line wrapping per RFC 8792

HTTP/1.1 206 Partial Content
Content-Encoding: gzip
Content-Digest: \
  sha-256=:SotB7Pa5A7iHSBdh9mg1Ev/ktAzrxU4Z8ldcCIUyfI4=:
Repr-Digest: \
  sha-256=:XyjvEuFb1P5rqc2le3vQm7M96DwZhvmOwqHLu2xVpY4=:
Identity-Digest: \
  sha-256=:5Bv3NIx05BPnh0jMph6v1RJ5Q7kl9LKMtQxmvc9+Z7Y=:

1f 8b 08 00 79 1f 08 64 00 ff
Figure 6: Partial response with gzip encoding
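
One possible receiver workflow for partial, encoded content is sketched below in Python. It is an illustration under stated assumptions rather than a required procedure: the helper names are hypothetical, the parts are assumed to be contiguous and keyed by their first-byte offset, and the split point is chosen arbitrarily.

```python
import base64
import gzip
import hashlib

def sha256_member(data: bytes) -> str:
    """Format a sha-256 digest as a Dictionary member ('sha-256=:...:')."""
    b64 = base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")
    return f"sha-256=:{b64}:"

def reassemble(parts: dict) -> bytes:
    """Join parts keyed by first-byte offset; rejects gaps and overlaps."""
    whole = b""
    for offset in sorted(parts):
        if offset != len(whole):
            raise ValueError(f"gap or overlap at offset {offset}")
        whole += parts[offset]
    return whole

# Hypothetical sender: gzip-code the data and serve it as two ranges.
unencoded = b"An unexceptional string\n"
encoded = gzip.compress(unencoded)
responses = [
    (0, encoded[:10], sha256_member(encoded[:10])),    # first 10 bytes
    (10, encoded[10:], sha256_member(encoded[10:])),   # remainder
]
identity_digest = sha256_member(unencoded)

# Receiver: check each part against its Content-Digest as it arrives.
parts = {}
for offset, part, content_digest in responses:
    if sha256_member(part) != content_digest:
        raise ValueError("part failed Content-Digest validation")
    parts[offset] = part

# After reconstruction, decode once and check Identity-Digest.
decoded = gzip.decompress(reassemble(parts))
identity_ok = sha256_member(decoded) == identity_digest
```

Checking Identity-Digest after decoding avoids the re-encoding step that a Repr-Digest comparison could require when the parts were obtained with different content codings.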

4. The Identity-Digest Field

The Identity-Digest HTTP field can be used in requests and responses to communicate digests that are calculated using a hashing algorithm applied to the representation with no content coding (a.k.a. an identity encoding). Apart from the content coding concerns, it behaves similarly to Repr-Digest.

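Identity-Digest is a Dictionary (see Section 3.2 of [STRUCTURED-FIELDS]) where each key conveys the hashing algorithm used to compute the digest, and each value is a Byte Sequence that conveys the byte output produced by the digest calculation.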

For example:

NOTE: '\' line wrapping per RFC 8792

Identity-Digest: \
  sha-512=:YMAam51Jz/jOATT6/zvHrLVgOYTGFy1d6GJiOHTohq4yP+pgk4vf2aCs\
  yRZOtw8MjkM7iw7yZ/WkppmM44T3qg==:

The Dictionary type can be used, for example, to attach multiple digests calculated using different hashing algorithms.

NOTE: '\' line wrapping per RFC 8792

Identity-Digest: \
  sha-256=:d435Qo+nKZ+gLcUHn7GQtQ72hiBVAgqoLsZnZPiTGPk=:,\
  sha-512=:YMAam51Jz/jOATT6/zvHrLVgOYTGFy1d6GJiOHTohq4yP+pgk4vf2aCs\
  yRZOtw8MjkM7iw7yZ/WkppmM44T3qg==:

A recipient MAY ignore any or all digests. This allows the recipient to choose which hashing algorithm(s) to use for validation instead of verifying every digest.
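
The recipient behavior described above might look like the following Python sketch. It is deliberately simplified: parse_digest_field is a hypothetical helper that handles only members of the form key=:base64: and is not a complete Structured Fields parser, and the preference order in SUPPORTED is an assumption of this example. Field values that were line wrapped per RFC 8792 for presentation need to be unwrapped first.

```python
import base64
import hashlib
from typing import Optional

# Algorithms this hypothetical recipient supports, most preferred first.
SUPPORTED = {"sha-512": hashlib.sha512, "sha-256": hashlib.sha256}

def parse_digest_field(value: str) -> dict:
    """Simplified parse of 'alg=:b64:, alg=:b64:' into {alg: bytes}."""
    members = {}
    for item in value.split(","):
        # The key precedes the first '='; base64 padding comes later.
        key, _, rest = item.strip().partition("=")
        if len(rest) >= 2 and rest.startswith(":") and rest.endswith(":"):
            members[key] = base64.b64decode(rest[1:-1])
    return members

def validate_identity_digest(value: str, unencoded: bytes) -> Optional[bool]:
    """Check the most preferred supported digest; None if none is usable."""
    members = parse_digest_field(value)
    for algorithm, hash_fn in SUPPORTED.items():
        if algorithm in members:
            return hash_fn(unencoded).digest() == members[algorithm]
    return None  # every digest was ignored, which is permitted

data = b"An unexceptional string\n"
b64 = base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")
field = f"unknown-alg=:AAAA:, sha-256=:{b64}:"
outcome = validate_identity_digest(field, data)
```

Here the unrecognized member is skipped and only the sha-256 digest is verified, which reflects the freedom to ignore any or all digests.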

A sender MAY send a digest without knowing whether the recipient supports a given hashing algorithm, or even knowing that the recipient will ignore it.

Identity-Digest can be sent in a trailer section. In this case, Identity-Digest MAY be merged into the header section; see Section 6.5.1 of [HTTP].

5. The Want-Identity-Digest Field

Want-Identity-Digest indicates that the sender would like to receive a representation digest on messages associated with the request URI and representation metadata where no content coding is applied, using the Identity-Digest field.

If Want-Identity-Digest is used in a response, it indicates that the server would like the client to provide the Identity-Digest field on future requests.

Want-Identity-Digest is only a hint. The receiver of the field can ignore it and send an Integrity field using any algorithm or omit fields entirely. It is not a protocol error if preferences are ignored. Applications that use Integrity fields and Integrity preferences can define expectations or constraints that operate in addition to this specification.

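Want-Identity-Digest is of type Dictionary where each key conveys the hashing algorithm, and each value is an Integer that conveys a relative preference for that algorithm, in the same manner as the Integrity preference fields defined in [DIGEST-FIELDS].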

Examples:

Want-Identity-Digest: sha-256=1
Want-Identity-Digest: sha-512=3, sha-256=10, unixsum=0
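
A sender that chooses to honor the preference could select an algorithm along the lines of the Python sketch below. The function names are hypothetical. The sketch assumes the preference semantics of the Integrity preference fields in [DIGEST-FIELDS], namely Integer values from 0 to 10 where a higher value is more preferred and 0 means "not acceptable"; those semantics are an assumption of this example, not a statement made by this section.

```python
from typing import Optional

def parse_want(value: str) -> dict:
    """Simplified parse of 'alg=N, alg=N' into {alg: int}."""
    prefs = {}
    for item in value.split(","):
        key, _, weight = item.strip().partition("=")
        # Assumed range, following the existing preference fields: 0 to 10.
        if weight.isascii() and weight.isdigit() and int(weight) <= 10:
            prefs[key] = int(weight)
    return prefs

def choose_algorithm(want_value: str, supported: set) -> Optional[str]:
    """Pick the most preferred supported algorithm, or None."""
    prefs = parse_want(want_value)
    candidates = [
        (weight, algorithm)
        for algorithm, weight in prefs.items()
        if weight > 0 and algorithm in supported  # 0 means not acceptable
    ]
    return max(candidates)[1] if candidates else None

supported = {"sha-256", "sha-512"}
first = choose_algorithm("sha-256=1", supported)
second = choose_algorithm("sha-512=3, sha-256=10, unixsum=0", supported)
```

Since the field is only a hint, returning None and omitting Identity-Digest, or using a different algorithm, remains acceptable behavior.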

6. Security Considerations

The considerations in [DIGEST-FIELDS] apply. There are no known additional considerations.

7. IANA Considerations

This document has no IANA actions (yet).

8. Normative References

[DIGEST-FIELDS]
Polli, R. and L. Pardue, "Digest Fields", RFC 9530, DOI 10.17487/RFC9530, February 2024, <https://www.rfc-editor.org/rfc/rfc9530>.
[HTTP]
Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, DOI 10.17487/RFC9110, June 2022, <https://www.rfc-editor.org/rfc/rfc9110>.
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/rfc/rfc2119>.
[RFC5234]
Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, DOI 10.17487/RFC5234, January 2008, <https://www.rfc-editor.org/rfc/rfc5234>.
[RFC7405]
Kyzivat, P., "Case-Sensitive String Support in ABNF", RFC 7405, DOI 10.17487/RFC7405, December 2014, <https://www.rfc-editor.org/rfc/rfc7405>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.
[STRUCTURED-FIELDS]
Nottingham, M. and P. Kamp, "Structured Field Values for HTTP", RFC 9651, DOI 10.17487/RFC9651, September 2024, <https://www.rfc-editor.org/rfc/rfc9651>.

Acknowledgments

Early drafts of [DIGEST-FIELDS] included a mechanism to support the exchange of digests where no content coding is applied, which was removed before publication. While the design here is different, it is motivated by discussion of the previous design in the HTTP WG. The motivating use cases still mostly apply identically.

Authors' Addresses

Lucas Pardue
Cloudflare
Mike West
Google