feat(migration): add OpenClaw memory and transcript migration tool #1432

Open
yeyitech wants to merge 3 commits into volcengine:main from yeyitech:feat/openclaw-memory-migration

Conversation

@yeyitech
Contributor

Closes #1011

Summary

  • add a reusable openviking.migration.openclaw module for discovering and importing legacy OpenClaw data
  • add a runnable example CLI for migrating native OpenClaw memory files and historical transcript sessions
  • add a formal content.import_memory API across embedded and HTTP clients so migration can target stable memory URIs directly
  • fix import_memory(wait=false) lock behavior so back-to-back imports do not fail with a "resource is busy" error

What this supports

  • native OpenClaw memory markdown import into OpenViking memory categories
  • transcript replay into OpenViking sessions followed by commit-triggered memory extraction
  • memory / transcript / all migration modes
  • dry-run previews and overwrite control
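
The dry-run and overwrite controls listed above can be pictured as a small decision function. This is only a sketch: the status strings mirror those in the migrate_openclaw excerpt quoted in the review below, while the function name decide_status and its parameters are hypothetical and do not exist in the PR.

```python
def decide_status(*, dry_run: bool, is_empty: bool, exists: bool, overwrite: bool) -> str:
    """Return the status a migration record would get (hypothetical helper).

    The check order follows the quoted excerpt: dry-run first, then empty
    content, then the existence/overwrite check, otherwise the import runs.
    """
    if dry_run:
        return "planned"          # preview only, nothing is written
    if is_empty:
        return "skipped_empty"    # nothing to import
    if exists and not overwrite:
        return "skipped_exists"   # target already present, left untouched
    return "imported"             # import is attempted (replacing if it exists)


if __name__ == "__main__":
    print(decide_status(dry_run=True, is_empty=False, exists=True, overwrite=False))
    print(decide_status(dry_run=False, is_empty=False, exists=True, overwrite=True))
```

Keeping the decision separate from the client calls would make the skip rules easy to unit-test without a running OpenViking instance.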

Validation

  • ruff check openviking/async_client.py openviking/client/local.py openviking/server/routers/content.py openviking/service/fs_service.py openviking/storage/content_write.py openviking/sync_client.py openviking_cli/client/base.py openviking_cli/client/http.py openviking_cli/client/sync_http.py openviking/migration/openclaw.py tests/client/test_filesystem.py tests/client/test_http_client_local_upload.py tests/server/test_api_content_write.py tests/server/test_content_write_service.py tests/migration/test_openclaw.py examples/openclaw-migration/migrate.py
  • python -m compileall openviking openviking_cli tests/migration examples/openclaw-migration
  • custom embedded smoke with mocked AGFS/embedder/VLM covering direct import_memory() plus migrate_openclaw(mode="all")

Notes

  • targeted pytest is currently blocked in this repo by the existing pytest-asyncio collection error: AttributeError: 'Package' object has no attribute 'obj'

@github-actions

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🎫 Ticket compliance analysis 🔶

1011 - Partially compliant

Compliant requirements:

  • Add migration script/CLI command to import existing OpenClaw conversation history
  • Read existing OpenClaw conversation transcripts from local data directory
  • Feed through OpenViking's session API (create sessions, add messages, commit)
  • Support native OpenClaw memory markdown import
  • Support transcript replay into OpenViking sessions
  • Add dry-run previews and overwrite control

Non-compliant requirements:

  • (No explicit progress reporting implemented)

Requires further human verification:

  • Resume capability (uses deterministic URIs/session IDs for reruns)
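
To illustrate why deterministic IDs make reruns resumable, here is a minimal sketch of deriving a stable target session ID from the agent ID and the source session ID. The PR's actual helper (_stable_target_session_id) is not shown on this page, so the hashing scheme, the "openclaw" prefix, and the function name below are all assumptions.

```python
import hashlib


def stable_session_id(agent_id: str, session_id: str, prefix: str = "openclaw") -> str:
    """Derive a deterministic ID from the source identifiers (assumed scheme).

    The same (agent_id, session_id) pair always maps to the same target ID,
    so a rerun can detect an already-migrated session and skip or overwrite it.
    """
    # A NUL separator keeps ("a", "bc") and ("ab", "c") from colliding.
    raw = f"{agent_id}\x00{session_id}".encode("utf-8")
    digest = hashlib.sha256(raw).hexdigest()[:16]
    return f"{prefix}-{digest}"


if __name__ == "__main__":
    first = stable_session_id("agent-1", "session-42")
    second = stable_session_id("agent-1", "session-42")
    print(first == second)
```

Whether this amounts to a full resume capability still depends on how partially written sessions are handled, which is the part flagged for human verification.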
⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🏅 Score: 85
🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: feat: add content.import_memory API

Relevant files:

  • openviking/async_client.py
  • openviking/client/local.py
  • openviking/server/routers/content.py
  • openviking/service/fs_service.py
  • openviking/storage/content_write.py
  • openviking/sync_client.py
  • openviking_cli/client/base.py
  • openviking_cli/client/http.py
  • openviking_cli/client/sync_http.py
  • tests/client/test_filesystem.py
  • tests/client/test_http_client_local_upload.py
  • tests/server/test_api_content_write.py
  • tests/server/test_content_write_service.py

Sub-PR theme: feat: add OpenClaw migration tool

Relevant files:

  • openviking/migration/__init__.py
  • openviking/migration/openclaw.py
  • examples/openclaw-migration/migrate.py
  • examples/openclaw-migration/README.md
  • tests/migration/test_openclaw.py

⚡ Recommended focus areas for review

Potential Async Client Compatibility

The migrate_openclaw function is synchronous but accepts a client of type Any. If an async client is passed, calls like client.import_memory() return un-awaited coroutines instead of results, so a record can be marked as imported even though no import actually ran.

def migrate_openclaw(
    client: Any,
    openclaw_dir: str | Path,
    *,
    mode: str = "memory",
    dry_run: bool = False,
    overwrite: bool = False,
    wait: bool = True,
    timeout: float = 300.0,
    poll_interval: float = 1.0,
    agent_ids: Sequence[str] | None = None,
    category_override: str | None = None,
) -> dict[str, Any]:
    """Run an OpenClaw -> OpenViking migration.

    Returns a structured summary containing per-item records for both import
    paths. The caller can print or persist the records as needed.
    """

    if mode not in {"memory", "transcript", "all"}:
        raise ValueError(f"unsupported migration mode: {mode}")

    memory_records: list[dict[str, Any]] = []
    transcript_records: list[dict[str, Any]] = []

    if mode in {"memory", "all"}:
        for artifact in discover_openclaw_memory_artifacts(
            openclaw_dir, category_override=category_override
        ):
            record = {
                "kind": artifact.kind,
                "source_path": str(artifact.source_path),
                "category": artifact.category,
                "uri": artifact.uri,
            }
            if dry_run:
                record["status"] = "planned"
                memory_records.append(record)
                continue

            content = artifact.source_path.read_text(encoding="utf-8").strip()
            if not content:
                record["status"] = "skipped_empty"
                memory_records.append(record)
                continue
            if not overwrite and _uri_exists(client, artifact.uri):
                record["status"] = "skipped_exists"
                memory_records.append(record)
                continue

            result = client.import_memory(
                artifact.uri,
                content,
                mode="replace",
                wait=wait,
                timeout=timeout,
                telemetry=False,
            )
            record["status"] = "imported"
            record["result"] = result
            memory_records.append(record)

    if mode in {"transcript", "all"}:
        sessions = discover_openclaw_transcript_sessions(openclaw_dir, agent_ids=agent_ids)
        for session in sessions:
            target_session_id = _stable_target_session_id(session.agent_id, session.session_id)
            record = {
                "agent_id": session.agent_id,
                "session_key": session.session_key,
                "source_path": str(session.transcript_path),
                "target_session_id": target_session_id,
            }
            messages = parse_openclaw_transcript(session.transcript_path)
            record["message_count"] = len(messages)
            if dry_run:
                record["status"] = "planned"
                transcript_records.append(record)
                continue
            if not messages:
                record["status"] = "skipped_empty"
                transcript_records.append(record)
                continue
            if _session_exists(client, target_session_id):
                if not overwrite:
                    record["status"] = "skipped_exists"
                    transcript_records.append(record)
                    continue
                client.delete_session(target_session_id)

            client.create_session(target_session_id)
            for message in messages:
                client.add_message(
                    target_session_id,
                    message.role,
                    content=message.content,
                    created_at=message.created_at,
                )
            commit_result = client.commit_session(target_session_id, telemetry=False)
            record["commit_result"] = commit_result
            task_id = commit_result.get("task_id")
            if wait and task_id:
                record["task"] = _wait_for_task(
                    client,
                    task_id,
                    timeout=timeout,
                    poll_interval=poll_interval,
                )
            record["status"] = "imported"
            transcript_records.append(record)

    return {
        "mode": mode,
        "memory": {
            "records": memory_records,
            "summary": _summarize_records(memory_records),
        },
        "transcript": {
            "records": transcript_records,
            "summary": _summarize_records(transcript_records),
        },
    }
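
One possible way to address the async-client concern is to fail fast before any migration work starts. The sketch below checks whether the methods that the excerpt calls on the client are coroutine functions. The helper name ensure_sync_client is hypothetical and not part of the PR; the method names are taken from the calls in the excerpt above.

```python
import inspect
from typing import Any

# Client methods invoked by migrate_openclaw in the excerpt above.
_REQUIRED_SYNC_METHODS = (
    "import_memory",
    "create_session",
    "add_message",
    "commit_session",
    "delete_session",
)


def ensure_sync_client(client: Any) -> None:
    """Raise TypeError if the client exposes async versions of these methods.

    Hypothetical guard: missing attributes are ignored here, so it only
    detects coroutine functions and does not validate the full interface.
    """
    async_methods = [
        name
        for name in _REQUIRED_SYNC_METHODS
        if inspect.iscoroutinefunction(getattr(client, name, None))
    ]
    if async_methods:
        raise TypeError(
            "migrate_openclaw requires a synchronous client; "
            f"async methods found: {', '.join(async_methods)}"
        )
```

Calling such a guard at the top of migrate_openclaw would turn a passed-in async client into an immediate, explicit error. A narrower type annotation than Any, for example a typing.Protocol describing the sync client, would be a complementary option.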

@github-actions

PR Code Suggestions ✨

No code suggestions found for the PR.


Labels

None yet

Projects

Status: Backlog

Development

Successfully merging this pull request may close these issues.

[Feature]: Add migration tool to import existing OpenClaw conversation history into OpenViking

1 participant