The native parser is designed to capture file descriptors directly in C++ so data never has to flow through the Node.js streams layer before reaching native code. This provides significant performance benefits by:
- Bypassing the Node.js streams layer: data goes directly from the kernel to C++
- Running I/O on a background thread: reading happens on a separate C++ thread
- Using zero-copy transfer: raw buffers are passed to JS without intermediate copies
- Reducing overhead: no JavaScript event loop involvement in the I/O path
┌─────────────────────────────────────────────────────────────┐
│ Node.js Main Thread │
│ │
│ const fd = fs.openSync('/path/to/file', 'r'); │
│ const parser = createJsonParserNativeFromFd(fd, {...}); │
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Native Addon (N-API) │ │
│ │ │ │
│ │ 1. Receives fd from JS │ │
│ │ 2. Duplicates fd: fd_dup = dup(fd) │ │
│ │ 3. Starts C++ background thread │ │
│ └──────────────────────────────────────────────────────┘ │
│ │ │
│ │ fd_dup │
│ ▼ │
└─────────────────────────────────────────────────────────────┘
│
│ Direct syscall
▼
┌─────────────────────────────────────────────────────────────┐
│ C++ Background Thread (std::thread) │
│ │
│ while (!stop) { │
│ ssize_t n = read(fd_dup, buf, BUF_SZ); // Direct read │
│ // Process data... │
│ // Send batches to main thread via TSFN │
│ } │
│ │
│ Data path: Kernel → C++ buffer → Zero-copy → JS │
└─────────────────────────────────────────────────────────────┘
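The handoff sketched above (a background producer posting batches of records to the main thread) can be mimicked in pure Node.js with worker_threads. This is an analogy only: the real parser uses a C++ std::thread and an N-API threadsafe function (TSFN), and every name below is illustrative rather than part of the library.

```typescript
import {Worker} from 'worker_threads';

// The worker stands in for the C++ reader thread: it "reads" input and
// posts a whole batch at once, the way a TSFN callback delivers a batch.
async function demoBatchHandoff(): Promise<object[]> {
  const worker = new Worker(
    `const {parentPort} = require('worker_threads');
     parentPort.postMessage(['{"a":1}', '{"b":2}']); // one batch`,
    {eval: true}
  );
  const batch: string[] = await new Promise((resolve) =>
    worker.once('message', resolve)
  );
  await worker.terminate();
  // The main thread consumes the batch; with the native parser, the
  // framing work would already have happened off the main thread.
  return batch.map((s) => JSON.parse(s));
}

demoBatchHandoff().then((objs) => console.log(objs)); // → [ { a: 1 }, { b: 2 } ]
```

Delivering batches rather than individual records amortizes the cost of each cross-thread callback.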
1. File Descriptor Duplication
// native/json-native-parser.cc
static void parser_thread_main(ParserInstance* inst) {
// Duplicate fd so stop() can close it to break a blocking read
inst->fd_dup = dup(inst->fd);
// Now we have our own copy of the fd in C++ land
// This allows the background thread to read independently
}
2. Direct System Call Reading
// Read directly from fd using syscall
// No Node.js stream layer involved!
const size_t BUF_SZ = 256 * 1024; // 256KB buffer
std::vector<char> buf(BUF_SZ);
while (!inst->stop.load()) {
ssize_t n = read(inst->fd_dup, buf.data(), BUF_SZ);
// Data flows: Kernel → C++ buffer (no JS involved)
if (n == 0) break; // EOF
if (n < 0) {
  if (errno == EINTR) continue; // interrupted syscall: retry the read
  break;                        // real read error: stop the loop
}
// Process data in C++...
// Send to JS via zero-copy buffers
}
3. Zero-Copy Buffer Transfer
// When passRawBuffers: true (default)
// Allocate buffer owned by C++
item.external_data = std::make_unique<uint8_t[]>(candidate.size());
std::memcpy(item.external_data.get(), candidate.data(), candidate.size());
// Create external buffer (zero-copy)
// JS side gets a Buffer that points to C++ memory
napi_create_external_buffer(env, it.byte_count, it.external_data.get(),
                            nullptr, nullptr, &buffer);
stdin is file descriptor 0, so you can pass it directly:
import {createJsonParserNativeFromFd} from '@oresoftware/json-stream-parser';
// stdin is fd 0
const parser = createJsonParserNativeFromFd(0, {
delimiter: '\n',
batchSize: 2048
});
parser.on('data', (obj) => {
// Data flows: stdin (fd 0) → C++ background thread → JS
// No Node.js stream layer involved!
console.log('Parsed:', obj);
});
Benefits:
- ✅ Data goes directly from kernel to C++ (no Node.js stream overhead)
- ✅ Reading happens on background thread (doesn't block main thread)
- ✅ Zero-copy buffers to JS (no intermediate copies)
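Under the hood, the C++ thread splits the incoming byte stream on the configured delimiter, carrying any incomplete trailing record over to the next chunk. Conceptually (in plain TypeScript, with illustrative names, not the library's API) the framing looks like this:

```typescript
// Minimal sketch of delimiter framing: accumulate input, split on the
// delimiter, and keep the trailing partial record for the next chunk.
function makeLineFramer(delimiter = '\n') {
  let carry = '';
  return (chunk: string): string[] => {
    const parts = (carry + chunk).split(delimiter);
    carry = parts.pop() ?? ''; // last piece may be an incomplete record
    return parts.filter((p) => p.length > 0);
  };
}

const frame = makeLineFramer();
// A record split across two chunks is still framed correctly:
const records = [...frame('{"a":1}\n{"b"'), ...frame(':2}\n')];
console.log(records.map((r) => JSON.parse(r))); // → [ { a: 1 }, { b: 2 } ]
```

The native implementation does the equivalent on raw bytes in C++, so no JavaScript runs per chunk.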
You can extract the file descriptor from a Node.js socket and pass it to the native parser:
import * as net from 'net';
import {createJsonParserNativeFromFd} from '@oresoftware/json-stream-parser';
const server = net.createServer((socket) => {
// Get the underlying file descriptor from the socket
// Note: This uses internal Node.js API (_handle.fd)
const fd = (socket as any)._handle?.fd;
if (fd !== undefined && fd >= 0) {
// Pass fd directly to native parser
// Data flows: Socket → C++ background thread → JS
// Bypasses Node.js stream layer!
const parser = createJsonParserNativeFromFd(fd, {
delimiter: '\n',
batchSize: 2048
});
parser.on('data', (obj) => {
console.log('Received:', obj);
});
parser.on('error', (err) => {
console.error('Parse error:', err);
});
} else {
// Fallback to stream-based parser if fd not available
const {JSONParser} = require('@oresoftware/json-stream-parser');
socket.pipe(new JSONParser()).on('data', (obj) => {
console.log('Received:', obj);
});
}
});
server.listen(6970, () => {
console.log('Server listening on port 6970');
});
Notes:
- socket._handle.fd is an internal Node.js API and may change between versions
- Works on Unix-like systems (Linux, macOS, BSD)
- Windows sockets work differently (may need socket._handle._socket)
- The socket must remain open while the parser is reading
- Closing the socket will cause the parser to stop (EOF)
- The parser duplicates the fd internally, so it can read independently
- The native parser only reads from the fd
- You can still write to the socket using socket.write()
- The parser doesn't interfere with socket writes
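If you extract socket fds in several places, the internal-API poking is worth isolating in one small helper. getSocketFd below is a hypothetical function (not part of @oresoftware/json-stream-parser) that returns null whenever the internal handle is missing or unusable:

```typescript
// Hypothetical helper: best-effort extraction of the fd backing a
// net.Socket. Returns null on platforms or Node.js versions where the
// internal _handle shape differs, so callers can fall back cleanly.
function getSocketFd(socket: unknown): number | null {
  const fd = (socket as any)?._handle?.fd;
  return typeof fd === 'number' && fd >= 0 ? fd : null;
}

// Usage sketch: branch once, up front.
// const fd = getSocketFd(socket);
// const parser = fd !== null
//   ? createJsonParserNativeFromFd(fd, {delimiter: '\n'})
//   : socket.pipe(new JSONParser());
```

Centralizing the check means a future Node.js internals change only breaks one function.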
Stream-based parser:
Data flow: Kernel → Node.js Stream → JS Transform → JS Event Loop
Overhead: stream buffering + JS event loop + Transform processing
Native parser:
Data flow: Kernel → C++ Background Thread → Zero-copy Buffer → JS
Overhead: minimal (just the TSFN callback)
Performance Benefits:
- ~30-50% faster for large files
- Better under load (background I/O doesn't block main thread)
- Lower memory overhead (zero-copy buffers)
- Reduced CPU usage (no JS stream processing)
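The zero-copy point refers to napi_create_external_buffer wrapping C++ memory without duplicating it. The same view semantics can be observed with plain Node.js Buffers, where Buffer.from over an ArrayBuffer creates a window rather than a copy:

```typescript
// Buffer.from(arrayBuffer, offset, length) creates a view, not a copy:
// writes through the view are visible in the backing memory, which is
// the same property the parser relies on when exposing C++ memory to JS.
const backing = new ArrayBuffer(16);
const view = Buffer.from(backing, 4, 8); // window into bytes 4..11

view[0] = 42; // write through the view
console.log(new Uint8Array(backing)[4]); // → 42 (shared memory, no copy)
```

The difference is that the parser's external buffers point at memory allocated and owned by the C++ side.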
Use the native parser when:
- You have a file descriptor (files, stdin, inherited FDs)
- You want maximum performance
- Main thread is busy (background I/O helps)
- Processing large files or high-throughput streams
Use the stream-based parser when:
- You have a Node.js stream (TCP, HTTP, child process stdout)
- You can't access the underlying fd
- You need compatibility across Node.js versions
- Socket fd access is not reliable on your platform
import * as net from 'net';
import {createJsonParserNativeFromFd} from '@oresoftware/json-stream-parser';
const server = net.createServer((socket) => {
console.log('Client connected');
// Try to get fd for direct C++ access
const fd = (socket as any)._handle?.fd;
if (fd !== undefined && fd >= 0) {
console.log(`Using native parser with fd ${fd}`);
const parser = createJsonParserNativeFromFd(fd, {
delimiter: '\n',
batchSize: 2048,
emitNonJSON: true
});
parser.on('data', (obj) => {
console.log('Received:', obj);
// Echo back (socket.write still works!)
socket.write(JSON.stringify({
echo: obj,
timestamp: Date.now()
}) + '\n');
});
parser.on('string', (line) => {
console.log('Non-JSON:', line);
});
parser.on('error', (err) => {
console.error('Parse error:', err);
});
socket.on('close', () => {
console.log('Client disconnected');
});
} else {
// Fallback to stream parser
console.log('Falling back to stream parser');
const {JSONParser} = require('@oresoftware/json-stream-parser');
socket.pipe(new JSONParser()).on('data', (obj) => {
console.log('Received:', obj);
socket.write(JSON.stringify({echo: obj}) + '\n');
});
}
});
server.listen(6970, () => {
console.log('TCP JSON server listening on port 6970');
});
The native parser is designed to capture file descriptors directly in C++ to bypass Node.js stream overhead:
- File descriptors are duplicated in C++ (dup())
- Reading happens on a C++ background thread using the read() syscall
- Data flows: Kernel → C++ buffer → Zero-copy → JS
- No Node.js stream layer involved in the data path
- Better performance, especially under load
This architecture provides significant performance benefits by eliminating JavaScript stream processing overhead and moving I/O to a dedicated background thread.