From 9341c40842278bfab8ac1009d2b081ef9b49734f Mon Sep 17 00:00:00 2001
From: Aaran McGuire
Date: Tue, 5 May 2026 19:56:54 +0100
Subject: [PATCH] Fix incorrect snprintf format specifier and undefined
 bsearch comparator
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

1. spss/readstat_sav_parse.c: %*s should be %.*s

snprintf uses %*s (minimum field width: pads with spaces) where %.*s
(precision: truncates to N chars) is intended. This means that if
temp_val is longer than str_len, the full string is copied rather than
a str_len-character prefix.

The same function also does memcpy(temp_val, str_start, str_len) where
temp_val is char[65]. If str_len > 64, the memcpy overflows the stack
buffer. Added a bounds check before the memcpy.

Note: this file is generated by Ragel from readstat_sav_parse.rl. The
fix is applied to the generated .c and should be applied to the .rl
source as well.

2. stata/readstat_dta_read.c: dta_compare_strls returns uint64 diff

dta_compare_strls returns key->o - target->o where both are uint64_t.
When the difference does not fit in an int, the truncated value can
have the wrong sign, causing bsearch to navigate in the wrong
direction, potentially returning NULL (strL silently replaced with an
empty string) or a wrong entry (silent data corruption).

Replaced with a safe three-way comparison: (a>b)-(a<b)

[...]

-    if (key->o == target->o)
-        return key->v - target->v;
-
-    return key->o - target->o;
+    if (key->o != target->o)
+        return (key->o > target->o) - (key->o < target->o);
+    return (key->v > target->v) - (key->v < target->v);
 }
 
 static dta_strl_t dta_interpret_strl_vo_bytes(dta_ctx_t *ctx, const unsigned char *vo_bytes) {
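
For reviewers, the two issues described in the commit message can be
reproduced outside ReadStat with a small standalone sketch. Everything
in it is illustrative rather than taken from the library: strl_key_t is
a hypothetical stand-in that borrows only the v and o field names from
the patch (the field types are assumed), and the helper names
copy_padded, copy_prefix, compare_by_subtraction, and compare_three_way
are made up for this example.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the strL key; field types are assumed. */
typedef struct {
    uint16_t v;
    uint64_t o;
} strl_key_t;

/* "%*s" takes a minimum field width: short strings are left-padded with
 * spaces, but longer strings are never truncated. */
static int copy_padded(char *dst, size_t dst_size, const char *src, int width) {
    return snprintf(dst, dst_size, "%*s", width, src);
}

/* "%.*s" takes a precision: at most max_chars characters of src are
 * written, which is what a length-limited copy needs. */
static int copy_prefix(char *dst, size_t dst_size, const char *src, int max_chars) {
    return snprintf(dst, dst_size, "%.*s", max_chars, src);
}

/* Subtraction-style comparator, similar in shape to the removed code.
 * Converting a uint64_t difference that does not fit in int is
 * implementation-defined; common compilers keep only the low 32 bits,
 * so the sign (or even zero vs. non-zero) can be wrong. */
static int compare_by_subtraction(const strl_key_t *key, const strl_key_t *target) {
    if (key->o == target->o)
        return key->v - target->v;
    return (int)(key->o - target->o);
}

/* Three-way comparator: each relational operator yields 0 or 1, so the
 * result is always exactly -1, 0, or 1 regardless of magnitude. */
static int compare_three_way(const strl_key_t *key, const strl_key_t *target) {
    if (key->o != target->o)
        return (key->o > target->o) - (key->o < target->o);
    return (key->v > target->v) - (key->v < target->v);
}
```

With keys whose o values differ by a large amount (for example
UINT64_C(1) << 32 versus 0), compare_three_way still reports the
correct ordering, whereas the result of compare_by_subtraction depends
on how the compiler narrows the difference to int.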