173 changes: 173 additions & 0 deletions docs/dev/hotfile-replication.md
@@ -0,0 +1,173 @@
# Hot File Replication

Hot file replication automatically creates additional replicas of files that are being read
frequently (i.e. "hot" files), distributing load across the pool fleet.

## How It Works

### Protocol-Agnostic Design

All dCache read protocols (DCap, NFS/pNFS, WebDAV, xrootd, FTP, HTTP, pool-to-pool) send a
`PoolIoFileMessage` to the pool when a client requests a file. The pool dispatches that message
to `PoolV4.ioFile()`, which is therefore a single, protocol-independent entry point for every
read request. Hot file monitoring is implemented at that point:

```
Any Door (DCap / NFS / WebDAV / xrootd / FTP / HTTP / …)
  → sends PoolIoFileMessage
    → Pool.messageArrived(PoolIoFileMessage)
      → PoolV4.ioFile()    ← monitoring happens here
          → queues mover (protocol-specific)
          → FileRequestMonitor.reportFileRequest(pnfsId, currentCount, protocolInfo)
```

Because the counting and triggering happen in `PoolV4.ioFile()`, no protocol-specific code
changes are required to benefit from this feature.

### Request Counting

`PoolV4.ioFile()` reads the concurrent mover count for the file from `IoQueueManager`:

```java
long requestCount = _ioQueue.numberOfRequestsFor(message.getPnfsId());
_fileRequestMonitor.reportFileRequest(message.getPnfsId(), requestCount,
        message.getProtocolInfo());
```

When `requestCount` reaches or exceeds the configured threshold,
`MigrationModule.reportFileRequest()` creates a migration job named
`hotfile-<pnfsId>` that replicates the file to additional pools.
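
The threshold rule can be sketched in isolation as follows. This is a minimal illustration, not the `MigrationModule` code: the class `HotfileThresholdSketch`, its set-based job registry, and the use of a plain string for the pnfsId are hypothetical stand-ins. Only the `numberOfRequests < threshold` guard and the `hotfile-<pnfsId>` job name come from the text above; suppressing a second job for the same file is an assumption based on the "already exists" log message documented further down.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical stand-in for the threshold check described above; not dCache code. */
class HotfileThresholdSketch {
    private final long threshold;
    private final Set<String> jobIds = new LinkedHashSet<>();

    HotfileThresholdSketch(long threshold) {
        this.threshold = threshold;
    }

    /**
     * Returns the id of the job created for this report, or null when the
     * request count is below the threshold or a job for the file already exists.
     */
    synchronized String reportFileRequest(String pnfsId, long numberOfRequests) {
        if (numberOfRequests < threshold) {
            return null; // not hot yet
        }
        String jobId = "hotfile-" + pnfsId;
        // Set.add returns false when the id is already registered (assumed dedup rule).
        return jobIds.add(jobId) ? jobId : null;
    }

    public static void main(String[] args) {
        HotfileThresholdSketch monitor = new HotfileThresholdSketch(3);
        System.out.println(monitor.reportFileRequest("0000A1", 2)); // prints null
        System.out.println(monitor.reportFileRequest("0000A1", 3)); // prints hotfile-0000A1
        System.out.println(monitor.reportFileRequest("0000A1", 5)); // prints null
    }
}
```

With a threshold of 3, the second report is the first to reach the threshold and is the only one that yields a job id.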

### Pool Selection

The migration job selects target pools by querying PoolManager via `PoolMgrQueryPoolsMsg`.
It derives `protocolUnit` from the triggering request's `ProtocolInfo` (e.g., `"DCap/3"`) and
`netUnitName` from the client's IP address when available (e.g., `"192.168.1.10"`). When the
client IP is not available (non-IP protocol or unknown address), `netUnitName` is set to an
empty string, which is intended to make PoolManager match any network unit. When `ProtocolInfo`
is null (e.g., for internal pool-to-pool transfers), `protocolUnit` falls back to `"*/*"` and
`netUnitName` to `""`, so that selection is based solely on the file's storage group and
pool-group read preferences.
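
As a worked example of the `protocolUnit` format, the sketch below builds the string from a protocol name and version numbers. It is an illustration only: `ProtocolUnitSketch` and its plain `String`/`int` parameters are hypothetical stand-ins for the `ProtocolInfo` accessors, and a `null` protocol name is used here to model a missing `ProtocolInfo`.

```java
/** Hypothetical illustration of the protocol-unit format; not dCache code. */
class ProtocolUnitSketch {

    /** Builds "<protocol>/<major>" and appends ".<minor>" only when minor is non-zero. */
    static String protocolUnit(String protocol, int major, int minor) {
        if (protocol == null) {
            return "*/*"; // wildcard used when no protocol information is available
        }
        String unit = protocol + "/" + major;
        return minor != 0 ? unit + "." + minor : unit;
    }

    public static void main(String[] args) {
        System.out.println(protocolUnit("DCap", 3, 0));   // prints DCap/3
        System.out.println(protocolUnit("xrootd", 2, 1)); // prints xrootd/2.1
        System.out.println(protocolUnit(null, 0, 0));     // prints */*
    }
}
```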

`PoolMgrQueryPoolsMsg.getPools()` returns a `List<String>[]` where index 0 is the highest
read-preference level. `PoolListByPoolMgrQuery` selects **only** the first non-empty
preference level, so the file is always replicated to the best available pools:

```java
// Only take the first non-empty preference level (highest read preference)
for (int i = 0; i < poolLists.length; i++) {
    List<String> poolList = poolLists[i];
    if (poolList != null && !poolList.isEmpty()) {
        selectedPools = poolList;
        break;
    }
}
```

Prior to this change, all preference levels were flattened into a union, causing files to be
replicated to pools from lower-preference groups (e.g. flush pools) instead of the intended
read-only pools.
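
The difference is easiest to see with sample data. The following self-contained sketch applies the same first-non-empty rule to a hypothetical set of levels; the class name, the `demo()` helper, and the pool names are made up for illustration and do not come from `PoolListByPoolMgrQuery`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical, self-contained illustration; not the PoolListByPoolMgrQuery source. */
class PreferenceLevelSketch {

    /** Returns the first non-null, non-empty level, or an empty list if there is none. */
    static List<String> firstNonEmptyLevel(List<String>[] poolLists) {
        if (poolLists != null) {
            for (List<String> level : poolLists) {
                if (level != null && !level.isEmpty()) {
                    return level;
                }
            }
        }
        return Collections.emptyList();
    }

    /** Sample input: level 0 is empty, level 1 holds read pools, level 2 a flush pool. */
    @SuppressWarnings("unchecked")
    static List<String> demo() {
        List<String>[] levels = new List[] {
            Collections.emptyList(),
            Arrays.asList("read-1", "read-2"),
            Arrays.asList("flush-1")
        };
        return firstNonEmptyLevel(levels);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [read-1, read-2]
    }
}
```

A flattened union of the same input would also contain `flush-1`, which is the behaviour the change removes.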

### Job Housekeeping

To prevent unbounded memory growth, `MigrationModule` keeps at most 50 hotfile jobs. When a
new job would exceed that limit, the oldest jobs that have reached a terminal state
(`FINISHED`, `CANCELLED`, `FAILED`) are pruned first.
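
One way to picture this pruning rule is an insertion-ordered map that is trimmed before each insert. The sketch below is an assumption-laden illustration, not the `MigrationModule` implementation: the class, its methods, and the single `RUNNING` non-terminal state are hypothetical, and it assumes that a new job is still added when no terminal job can be pruned, which the text above does not specify.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of bounded job housekeeping; not dCache code. */
class HotfileJobHousekeepingSketch {
    enum State {
        RUNNING, FINISHED, CANCELLED, FAILED;

        boolean isTerminal() {
            return this != RUNNING;
        }
    }

    private final int maxJobs;
    // Insertion-ordered, so iteration visits the oldest job first.
    private final Map<String, State> jobs = new LinkedHashMap<>();

    HotfileJobHousekeepingSketch(int maxJobs) {
        this.maxJobs = maxJobs;
    }

    /** Prunes the oldest terminal jobs until there is room, then registers the new job. */
    void addJob(String jobId) {
        Iterator<Map.Entry<String, State>> it = jobs.entrySet().iterator();
        while (jobs.size() >= maxJobs && it.hasNext()) {
            if (it.next().getValue().isTerminal()) {
                it.remove();
            }
        }
        jobs.put(jobId, State.RUNNING);
    }

    /** Marks an existing job as FINISHED; unknown ids are ignored. */
    void markFinished(String jobId) {
        if (jobs.containsKey(jobId)) {
            jobs.put(jobId, State.FINISHED);
        }
    }

    boolean contains(String jobId) {
        return jobs.containsKey(jobId);
    }

    int size() {
        return jobs.size();
    }

    public static void main(String[] args) {
        HotfileJobHousekeepingSketch jobs = new HotfileJobHousekeepingSketch(2);
        jobs.addJob("hotfile-a");
        jobs.addJob("hotfile-b");
        jobs.markFinished("hotfile-a");
        jobs.addJob("hotfile-c"); // prunes hotfile-a, the oldest terminal job
        System.out.println(jobs.contains("hotfile-a")); // prints false
        System.out.println(jobs.size());                // prints 2
    }
}
```

The example uses a limit of 2 to keep the output short; the documented limit is 50.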

## Configuration

| Property | Default | Description |
|---|---|---|
| `pool.hotfile.replication.enable` | `false` | Enable/disable hot file monitoring. **Must be `true` to activate.** |
| `pool.migration.hotfile.threshold` | `50` | Number of concurrent read movers required to trigger replication |
| `pool.migration.hotfile.replicas` | `1` | Number of additional replicas to create |
| `pool.migration.concurrency.default` | `1` | Number of files the migration job migrates concurrently |

Example (`dcache.conf` or pool layout file):

```ini
pool.hotfile.replication.enable = true
pool.migration.hotfile.threshold = 3
pool.migration.hotfile.replicas = 3
pool.migration.concurrency.default = 1
```

> **Note:** The feature is disabled by default. A pool restart is required after any
> configuration change.

## Key Source Files

| File | Role |
|---|---|
| `modules/dcache/src/main/java/org/dcache/pool/classic/PoolV4.java` | Entry point; checks enable flag, calls `FileRequestMonitor` |
| `modules/dcache/src/main/java/org/dcache/pool/migration/MigrationModule.java` | Implements `FileRequestMonitor`; counts requests, creates and manages migration jobs |
| `modules/dcache/src/main/java/org/dcache/pool/migration/PoolListByPoolMgrQuery.java` | Queries PoolManager for eligible target pools; selects highest-preference level only |
| `modules/dcache/src/test/java/org/dcache/pool/classic/HotfileMonitoringTest.java` | Spring-context integration test for enable/disable behaviour |
| `modules/dcache/src/test/java/org/dcache/pool/migration/MigrationModuleTest.java` | Unit tests for `reportFileRequest`, threshold, housekeeping |
| `modules/dcache/src/test/java/org/dcache/pool/migration/PoolListByPoolMgrQueryTest.java` | Unit tests for pool selection, preference-level handling, unknown net unit, and wildcard protocol |
| `skel/share/defaults/pool.properties` | Canonical defaults for all `pool.hotfile.*` and `pool.migration.hotfile.*` properties |

## Diagnostics

### Log Messages

With the default log level the following INFO messages are emitted by `MigrationModule`:

```
Hot file monitoring: pnfsId=<id>, requests=<n>, threshold=<t>
Hot file detected! Triggering replication for pnfsId=<id>
Created migration job with id hotfile-<id> for pnfsId <id> with <n> replicas and concurrency <c>
Starting migration job hotfile-<id> for pnfsId <id>
Successfully started migration job hotfile-<id> for pnfsId <id>
Job hotfile-<id> already exists with state <STATE>
```
Comment on lines +113 to +122
Copilot AI Mar 20, 2026

This documentation claims specific INFO-level log messages from MigrationModule (e.g. "Hot file detected!" / "Successfully started migration job...") but the current implementation appears to log hotfile job creation/start at DEBUG and does not emit these exact messages. Please update the documented messages/levels to match the actual log statements, or adjust the code to emit the documented INFO logs.

`PoolV4` emits at INFO:

```
PoolV4.ioFile: Received IO request for pnfsId=<id>, hotFileEnabled=<bool>, monitorSet=<bool>
PoolV4.ioFile: Calling reportFileRequest for pnfsId=<id>, count=<n>
```

And at ERROR if the monitor is not wired:

```
PoolV4.ioFile: Hot file replication enabled but FileRequestMonitor is NULL!
```
Comment on lines +124 to +135
Copilot AI Mar 20, 2026

The documentation states PoolV4 emits INFO messages like "Received IO request..." and an ERROR when FileRequestMonitor is NULL, but PoolV4.ioFile currently logs only "moverId ... received request" at DEBUG and calls _fileRequestMonitor.reportFileRequest(...) without a null-check/logging. Please align this section with the actual PoolV4 logging/behaviour (or add the described logging/null handling if that’s the intended UX).

`PoolListByPoolMgrQuery` emits at DEBUG when a preference level is selected:

```
Selected preference level <i> with <n> pools for <pnfsId>: [pool1, pool2, …]
```

### Runtime Log Level Adjustment

```
# In the pool's admin shell
log set org.dcache.pool.classic.PoolV4 DEBUG
log set org.dcache.pool.migration.MigrationModule DEBUG
```

### Checking Job and Replica Status

```bash
# List active migration jobs on all pools
ssh -p <admin-port> admin@<host> '\s <pool-pattern> migration ls'

# Inspect a specific job
ssh -p <admin-port> admin@<host> '\s <pool-name> migration info hotfile-<pnfsId>'

# List replicas of a file
ssh -p <admin-port> admin@<host> '\sl <pnfsId> rep ls <pnfsId>'
```

### Interpreting Absence of Log Messages

| Symptom | Likely Cause |
|---|---|
| No `PoolV4.ioFile` messages | IO requests are not reaching the pool, or the feature is disabled (`hotFileEnabled=false`) |
| `monitorSet=false` in PoolV4 log | `FileRequestMonitor` not wired — check Spring context startup errors |
| `requests` stays at 1 | IoQueue not counting movers correctly |
| "Hot file detected" but no job created | Exception during job creation — check ERROR lines for a stack trace |
| Job created but not started | `MigrationModule` not started — run `migration start` in the admin interface |
| "Job already exists" repeating | Previous job is stuck in a non-terminal state — inspect job state |
@@ -1,6 +1,8 @@
package org.dcache.pool.classic;

import diskCacheV111.util.PnfsId;
import diskCacheV111.vehicles.ProtocolInfo;
import javax.annotation.Nullable;

/**
* Abstract interface for monitoring file requests in the pool.
@@ -12,7 +14,9 @@ public interface FileRequestMonitor {
*
* @param pnfsId the file identifier
* @param numberOfRequests the number of requests for this file
* @param protocolInfo the protocol info of the request, may be {@code null} if unknown
*/
void reportFileRequest(PnfsId pnfsId, long numberOfRequests);
void reportFileRequest(PnfsId pnfsId, long numberOfRequests,
@Nullable ProtocolInfo protocolInfo);
}

@@ -756,7 +756,8 @@ private void ioFile(CellMessage envelope, PoolIoFileMessage message) {
message.getPnfsId());
if (_hotFileReplicationEnabled) {
_fileRequestMonitor.reportFileRequest(message.getPnfsId(),
_ioQueue.numberOfRequestsFor(message.getPnfsId()));
_ioQueue.numberOfRequestsFor(message.getPnfsId()),
message.getProtocolInfo());
}
message.setSucceeded();
} catch (OutOfDateCacheException e) {
@@ -12,7 +12,9 @@
import diskCacheV111.util.PnfsId;
import diskCacheV111.util.RetentionPolicy;
import diskCacheV111.vehicles.PoolManagerGetPoolMonitor;
import diskCacheV111.vehicles.IpProtocolInfo;
import diskCacheV111.vehicles.PoolManagerPoolInformation;
import diskCacheV111.vehicles.ProtocolInfo;
import dmg.cells.nucleus.CellCommandListener;
import dmg.cells.nucleus.CellInfoProvider;
import dmg.cells.nucleus.CellLifeCycleAware;
@@ -25,6 +27,7 @@
import dmg.util.command.Option;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
@@ -40,6 +43,7 @@
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.annotation.Nullable;
import javax.annotation.concurrent.GuardedBy;
import org.dcache.cells.CellStub;
import org.dcache.pool.PoolDataBeanProvider;
@@ -57,6 +61,7 @@
import org.dcache.util.expression.TypeMismatchException;
import org.dcache.util.expression.UnknownIdentifierException;
import org.dcache.util.pool.CostModuleTagProvider;
import org.dcache.vehicles.FileAttributes;
import org.parboiled.Parboiled;
import org.parboiled.parserunners.ReportingParseRunner;
import org.parboiled.support.ParsingResult;
@@ -1244,7 +1249,8 @@ public Object messageArrived(CellMessage envelope, PoolMigrationJobCancelMessage
* new job.
*/
@Override
public synchronized void reportFileRequest(PnfsId pnfsId, long numberOfRequests) {
public synchronized void reportFileRequest(PnfsId pnfsId, long numberOfRequests,
ProtocolInfo protocolInfo) {
if (numberOfRequests < hotFileThreshold) {
return;
}
@@ -1268,11 +1274,28 @@ public synchronized void reportFileRequest(PnfsId pnfsId, long numberOfRequests)
_context.getPoolManagerStub(),
Collections.singletonList(_context.getPoolName()));
sourceList.refresh();

// Get file attributes from repository for pool selection
CacheEntry cacheEntry;
try {
cacheEntry = _context.getRepository().getEntry(pnfsId);
Copilot AI Mar 20, 2026

Catching generic Exception from repository.getEntry(...) will also catch InterruptedException; the current code logs and returns without restoring the thread's interrupt status. Please either catch InterruptedException explicitly and call Thread.currentThread().interrupt(), or rethrow it, so interruptions are not silently swallowed.

Suggested change
cacheEntry = _context.getRepository().getEntry(pnfsId);
cacheEntry = _context.getRepository().getEntry(pnfsId);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
LOGGER.warn("Failed to get cache entry for {}: {}", pnfsId, e.getMessage(), e);
return;
} catch (Exception e) {
LOGGER.warn("Failed to get cache entry for {}: {}", pnfsId, e.getMessage(), e);
return;
}
FileAttributes fileAttributes = cacheEntry.getFileAttributes();

String protocolUnit = deriveProtocolUnit(protocolInfo);
String netUnitName = deriveNetUnitName(protocolInfo);

Collection<Pattern> excluded = new HashSet<>();
excluded.add(Pattern.compile(Pattern.quote(_context.getPoolName())));
RefreshablePoolList basePoolList = new PoolListFilter(
new PoolListByPoolGroupOfPool(_context.getPoolManagerStub(),
_context.getPoolName()),
new PoolListByPoolMgrQuery(_context.getPoolManagerStub(),
pnfsId,
fileAttributes,
protocolUnit,
netUnitName),
excluded,
FALSE_EXPRESSION,
Collections.emptySet(),
@@ -1454,6 +1477,37 @@ public boolean isActive(PnfsId id) {
return _context.isActive(id);
}

/**
* Returns the PSU protocol-unit string for the given request, e.g. {@code "DCap/3"} or
* {@code "xrootd/2.1"}. When {@code protocolInfo} is {@code null} (e.g. an internal
* pool-to-pool transfer), returns the PSU wildcard that matches any protocol unit.
*/
private static String deriveProtocolUnit(@Nullable ProtocolInfo protocolInfo) {
if (protocolInfo == null) {
return "*/*";
}
String unit = protocolInfo.getProtocol() + "/" + protocolInfo.getMajorVersion();
return protocolInfo.getMinorVersion() != 0
? unit + "." + protocolInfo.getMinorVersion()
: unit;
}

/**
* Returns the client IP address string for use as the PSU net-unit name, e.g.
* {@code "10.0.0.5"}. Returns an empty string whenever the address is unavailable
* (non-IP protocol, null socket address, or null {@code protocolInfo}), which causes
* the PSU to match any network unit.
*/
private static String deriveNetUnitName(@Nullable ProtocolInfo protocolInfo) {
if (!(protocolInfo instanceof IpProtocolInfo)) {
return "";
}
InetSocketAddress addr = ((IpProtocolInfo) protocolInfo).getSocketAddress();
return (addr != null && addr.getAddress() != null)
? addr.getAddress().getHostAddress()
: "";
Comment on lines +1497 to +1508
Copilot AI Mar 20, 2026

deriveNetUnitName returns an empty string when the client address is unavailable. PoolSelectionUnitV2 ultimately resolves the provided netUnitName via InetAddress.getByName(netUnitName) (see NetHandler.match(String)), so an empty string can resolve to a concrete address (often loopback) and unintentionally apply a net-unit constraint instead of disabling it. Consider using an explicit wildcard net-unit value (e.g. "0.0.0.0/0.0.0.0" or "::/0") for the 'match any net' case, or otherwise ensure PoolManager treats this as 'no net constraint'.

Suggested change
* {@code "10.0.0.5"}. Returns an empty string whenever the address is unavailable
* (non-IP protocol, null socket address, or null {@code protocolInfo}), which causes
* the PSU to match any network unit.
*/
private static String deriveNetUnitName(@Nullable ProtocolInfo protocolInfo) {
if (!(protocolInfo instanceof IpProtocolInfo)) {
return "";
}
InetSocketAddress addr = ((IpProtocolInfo) protocolInfo).getSocketAddress();
return (addr != null && addr.getAddress() != null)
? addr.getAddress().getHostAddress()
: "";
* {@code "10.0.0.5"}. When the client address is unavailable (non-IP protocol,
* null socket address, or null {@code protocolInfo}), returns a wildcard net-unit
* value that is interpreted by the PoolManager as "no net constraint" (match any
* network).
*/
private static final String ANY_NET_UNIT = "0.0.0.0/0.0.0.0";
private static String deriveNetUnitName(@Nullable ProtocolInfo protocolInfo) {
if (!(protocolInfo instanceof IpProtocolInfo)) {
return ANY_NET_UNIT;
}
InetSocketAddress addr = ((IpProtocolInfo) protocolInfo).getSocketAddress();
return (addr != null && addr.getAddress() != null)
? addr.getAddress().getHostAddress()
: ANY_NET_UNIT;
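
The concern in this comment rests on standard JDK behaviour that can be checked in isolation: `InetAddress.getByName` returns a loopback address when given an empty host name. The sketch below exercises only that JDK call. Whether PoolManager actually resolves `netUnitName` this way is taken from the review comment and is not verified here, and the class name is hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical check of how the JDK resolves an empty host name; not dCache code. */
class EmptyHostLookupSketch {

    /** Returns true if an empty host name resolves to a loopback address. */
    static boolean emptyNameIsLoopback() {
        try {
            return InetAddress.getByName("").isLoopbackAddress();
        } catch (UnknownHostException e) {
            // Not expected for an empty name; wrapped so callers need no checked handling.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(emptyNameIsLoopback()); // prints true
    }
}
```

If the net-unit lookup does go through this call, an empty `netUnitName` would act as a concrete loopback address rather than as "no constraint", which is the case the suggested wildcard value is meant to avoid.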
}

// Hot file replication parameters
public int getNumReplicas() {
return hotFileReplicaCount;