@@ -235,6 +235,40 @@ JSON schema for wheel metadata has been produced.
235235This schema will be updated with each revision to the wheel metadata
236236specification. The schema is available in :ref:`0819-wheel-json-schema`.
237237
238+ Handling of Integer and Float Values in JSON Package Metadata
239+ -------------------------------------------------------------
240+
241+ No core metadata or wheel metadata values are currently encoded as integers
242+ or floats. Nevertheless, when decoding a JSON file, integer and float values
243+ should be decoded as strings for both core metadata and wheel metadata. This
244+ avoids compatibility issues caused by differences in the precision and
245+ representation of integers and floats between languages and parsers. It also
246+ mitigates the risk of integer parsing denial of service attacks such as
247+ `CVE-2020-10735 <https://github.com/advisories/GHSA-6jr7-xr67-mgxw>`__.
248+
249+ If a future field of core metadata or wheel metadata needs to be encoded as an
250+ integer or float, the field MUST be decoded lazily after loading the JSON
251+ document. This reduces the risk of denial of service attacks by minimizing
252+ the integer parsing performed during the deserialization process.
253+
254+ If using the Python :mod:`!json` module, integers and floats can be parsed as
255+ strings by passing :class:`str` as the ``parse_int`` and ``parse_float``
256+ keyword arguments of :func:`json.load` or :func:`json.loads`.
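
As a minimal sketch of the approach described above, the following decodes
numbers as strings and converts one of them to an integer only when it is
needed. The ``build_number`` and ``ratio`` fields, the ``MAX_DIGITS`` limit,
and the ``to_int_lazily`` helper are hypothetical names used for illustration;
none of them are defined by the metadata specifications.

```python
import json

# Hypothetical document: no real metadata field is currently numeric.
raw = '{"name": "example", "build_number": 12, "ratio": 1.50}'

# Every JSON integer and float is handed to str(), so no numeric
# conversion happens while the document is being deserialized.
data = json.loads(raw, parse_int=str, parse_float=str)

# Assumed upper bound on digits; a real tool would pick its own limit.
MAX_DIGITS = 20


def to_int_lazily(value: str, max_digits: int = MAX_DIGITS) -> int:
    """Convert a string-decoded JSON integer only when it is actually used."""
    if len(value) > max_digits:
        raise ValueError(f"integer literal longer than {max_digits} characters")
    return int(value)


# The conversion is deferred until the field is read.
build_number = to_int_lazily(data["build_number"])
```

Checking the length before calling ``int`` keeps the cost of the conversion
bounded regardless of the interpreter's own integer parsing limits.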
257+
258+ Handling of Duplicate Keys in JSON Package Metadata
259+ ---------------------------------------------------
260+
261+ JSON does not define semantics for duplicate keys in a JSON document, and
262+ different parsers treat duplicate keys differently. Tools SHOULD NOT generate
263+ duplicate keys in JSON package metadata. However, duplicate keys are likely
264+ to be generated anyway, so tools consuming JSON package metadata should handle
265+ them gracefully. In the interest of compatibility and matching the
266+ behavior of the Python :mod:`!json` module, if duplicate keys are encountered,
267+ the value of the last duplicate key should be used as the data for that key.
268+ This matches the behavior of many JSON parsers such as those in Python, Rust,
269+ Go, and the ECMAScript Standard. Tools MAY warn about duplicate keys in JSON
270+ package metadata.
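
One way a consuming tool might keep the later value while still warning about
duplicates is sketched below, using the ``object_pairs_hook`` argument of
``json.loads``. The ``keep_last_and_warn`` function name and the warning text
are illustrative assumptions, not part of this specification.

```python
import json
import warnings


def keep_last_and_warn(pairs):
    """Build a dict from (key, value) pairs; later duplicates overwrite earlier ones."""
    result = {}
    for key, value in pairs:
        if key in result:
            # Optional diagnostic; the later value is still the one kept.
            warnings.warn(f"duplicate key in JSON package metadata: {key!r}")
        result[key] = value
    return result


raw = '{"name": "first", "name": "second"}'

# Default behavior of the json module: the later value silently wins.
default_result = json.loads(raw)

# Same result, but the duplicate is reported through the warnings machinery.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    checked_result = json.loads(raw, object_pairs_hook=keep_last_and_warn)
```

The hook is called once per JSON object, including nested ones, so duplicates
at any depth are reported.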
271+
238272Deprecation of the ``METADATA``, ``PKG-INFO``, and ``WHEEL`` Files
239273------------------------------------------------------------------
240274
@@ -272,25 +306,13 @@ or ``WHEEL`` files.
272306Security Implications
273307=====================
274308
275- One attack vector with JSON encoded core metadata is if the JSON payload is
276- designed to consume excessive memory or CPU resources in a denial of service
277- (DoS) attack. While this attack is not likely to affect users whom can cancel
278- resource-intensive interactive operations, it may be an issue for package
279- indexes.
280-
281- There are several mitigations that can be made to prevent this:
282-
283- #. The length of the JSON payload can be restricted to a reasonable size.
284- #. The reader may use a :class:`~json.JSONDecoder` to omit parsing :class:`int`
285- and :class:`float` values to avoid quadratic number parsing time complexity
286- attacks.
287- #. I plan to contribute a change to :class:`~json.JSONDecoder` in Python
288- 3.15+ that will allow it to be configured to restrict the nesting of JSON
289- payloads to a reasonable depth. Core metadata currently has a maximum depth
290- of 2 to encode mapping and list fields.
291-
292- With these mitigations in place, concerns about denial of service attacks with
293- JSON encoded core metadata are minimal.
309+ JSON encoded core metadata and wheel metadata have the potential for a denial
310+ of service attack due to the quadratic time complexity of parsing integers.
311+ This PEP mitigates this risk by requiring that integers and floats be
312+ parsed as strings, and only lazily parsed into integers or floats after the
313+ initial deserialization of the JSON document. With these mitigations in place,
314+ concerns about denial of service attacks with JSON encoded package metadata are
315+ considered minimal.
294316
295317
296318Reference Implementation