Commit d8b7b8e
docs(jina): JinaV5 docstring — point at existing precision-path primitives
User directive: item 11 should reference existing code, NOT duplicate it.
"Only document, use, don't duplicate."
Updated the ModelSource::JinaV5 variant docstring to:
1. Correct "Qwen3 base" → "Qwen 3.5 base" (per user's Qwopus/Qwen3.5
clarification; Qwopus and Jina v5 share the Qwen 3.x family)
2. Add the Reader-LM v3 alias explicitly: "Also known as Reader-LM v3 (same
model, alternate name; Qwen 3.x architecture lineage, NOT the older
Qwen2-based Reader-LM 1.5B/v1/v2)"
3. Document the canonical precision path by CITING EXISTING PRIMITIVES
with file:line references. No new code, no duplicated conversion logic:
- crate::hpc::gguf::read_tensor_f32 (src/hpc/gguf.rs:188) —
F16/F32/BF16/Q8_0 → Vec<f32> loader, handles F16 source to F32
transient upcast in a single call
- crate::hpc::gguf::f16_to_f32 (src/hpc/gguf.rs:417) — scalar
per-element F16 → F32 primitive (used internally by read_tensor_f32)
- crate::hpc::quantized::f32_to_bf16_rounded (src/hpc/quantized.rs:80) —
F32 working format → BF16 storage conversion
- crate::hpc::quantized::f32_vec_to_bf16 — slice variant of the above
- crate::hpc::quantized::bf16_gemm_f32 (src/hpc/quantized.rs:108) —
BF16 GEMM with F32 accumulation (the actual BF16 compute primitive)
- crate::simd::F32x16::mul_add / F32x8 / F64x8 (src/simd.rs:206) —
hardware FMA primitive (the "add_mul" the user was referencing).
Compiles to VFMADD213PS (AVX-FMA) or VDPBF16PS (AVX-512-BF16).
4. Explicit anti-patterns:
- Never F16 → BF16 direct (loses 3 mantissa bits, 10 → 7; F16 max ~65504
overflows long before reaching BF16 range)
- Never 8-bit quantization as compute precision (only as final
calibrated storage format)
- No F32 in hot loops (F32 is strictly a transient upcast pipe)
5. Referenced the external calibration path for completeness:
lance-graph/crates/bgz-tensor/src/gamma_phi.rs::calibrate_gamma
(HDR-TV-style per-role normalizer, not an ndarray-internal primitive)
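For orientation, the precision path in item 3 (F16 source → F32 transient → BF16 storage → BF16 compute with F32 accumulation) can be sketched in standalone Rust. This is a minimal illustration only: every function name below is a hypothetical stand-in written for this note, not the signature of the crate's read_tensor_f32, f16_to_f32, f32_to_bf16_rounded or bf16_gemm_f32, and the round-to-nearest-even choice is an assumption suggested by the "_rounded" suffix rather than something this commit confirms.

```rust
// Hypothetical stand-ins for the cited primitives; raw bit patterns are
// used so the sketch needs no external crates.

/// F16 (IEEE 754 binary16 bits) -> F32. Exact: every F16 value fits in F32.
fn f16_bits_to_f32(h: u16) -> f32 {
    let sign = ((h as u32) & 0x8000) << 16;
    let exp = ((h as u32) >> 10) & 0x1f;
    let man = (h as u32) & 0x03ff;
    let bits = match exp {
        // Zero or subnormal: magnitude is man * 2^-24, exact in F32.
        0 => (man as f32 * (1.0 / 16_777_216.0)).to_bits() | sign,
        // Infinity or NaN: all-ones exponent, payload shifted into place.
        0x1f => sign | 0x7f80_0000 | (man << 13),
        // Normal: rebias the exponent (127 - 15 = 112), widen the mantissa.
        _ => sign | ((exp + 112) << 23) | (man << 13),
    };
    f32::from_bits(bits)
}

/// F32 -> BF16 bits with round-to-nearest-even (assumed rounding mode).
fn f32_to_bf16_bits(x: f32) -> u16 {
    let bits = x.to_bits();
    if x.is_nan() {
        return ((bits >> 16) as u16) | 0x0040; // keep the NaN quiet
    }
    let bias = 0x7fff + ((bits >> 16) & 1);
    (bits.wrapping_add(bias) >> 16) as u16
}

/// BF16 bits -> F32. Exact: BF16 is the upper half of an F32.
fn bf16_bits_to_f32(b: u16) -> f32 {
    f32::from_bits((b as u32) << 16)
}

/// Dot product over BF16 inputs with an F32 accumulator, using the
/// standard library's fused multiply-add (f32::mul_add).
fn bf16_dot_f32(a: &[u16], b: &[u16]) -> f32 {
    a.iter()
        .zip(b)
        .fold(0.0f32, |acc, (&x, &y)| {
            bf16_bits_to_f32(x).mul_add(bf16_bits_to_f32(y), acc)
        })
}

fn main() {
    // 0x3C01 is the F16 value 1 + 2^-10: the upcast to F32 keeps it
    // exactly, while BF16's 7 mantissa bits round it back to 1.0.
    let wide = f16_bits_to_f32(0x3C01);
    let stored = f32_to_bf16_bits(wide);
    println!("f32 = {wide}, bf16 bits = {stored:#06x}");

    // [1.0, 2.0] . [3.0, 0.5] accumulated in F32.
    let dot = bf16_dot_f32(&[0x3F80, 0x4000], &[0x4040, 0x3F00]);
    println!("dot = {dot}");
}
```

The first example in main makes the mantissa loss concrete: the value survives the F16 → F32 upcast exactly, and the rounding to BF16 happens once, explicitly, at the storage boundary.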
Verified before commit (per "verify assumed validity" rule):
- cargo check --lib: clean, pre-existing warnings only
- cargo test --lib hpc::jina::runtime: 11 tests pass, including
test_jina_runtime_loads and test_jina_v4_explicit_route (both still
assert JinaV4 because JINA still loads v4 bytes pre-bake)
- All cited symbols verified to exist at the file:line references via grep:
* src/hpc/gguf.rs:188 read_tensor_f32 ✓
* src/hpc/gguf.rs:417 f16_to_f32 ✓
* src/hpc/quantized.rs:80 f32_to_bf16_rounded ✓ (confirmed wrapper line)
* src/hpc/quantized.rs:108 bf16_gemm_f32 ✓
* src/simd.rs:206 mul_add ✓
Pure docstring change, no code behavior change, no new dependencies,
no new functions. Fully additive.
https://claude.ai/code/session_019RzHP8tpJu55ESTxhfUy1A1
parent 2a7f89e, commit d8b7b8e
1 file changed
Lines changed: 48 additions & 5 deletions