Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
70 changes: 70 additions & 0 deletions DEV-LOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,75 @@
# DEV-LOG

## Computer Use Windows 增强:窗口绑定截图 + UI Automation + OCR (2026-04-03)

在三平台基础实现之上,利用 Windows 原生 API 增强 Computer Use 的 Windows 专属能力。

**新增文件:**

| 文件 | 行数 | 说明 |
|------|------|------|
| `src/utils/computerUse/win32/windowCapture.ts` | — | `PrintWindow` 窗口绑定截图,支持被遮挡/后台窗口 |
| `src/utils/computerUse/win32/windowEnum.ts` | — | `EnumWindows` 精确窗口枚举(HWND + PID + 标题) |
| `src/utils/computerUse/win32/uiAutomation.ts` | — | `IUIAutomation` UI 元素树读取、按钮点击、文本写入、坐标识别 |
| `src/utils/computerUse/win32/ocr.ts` | — | `Windows.Media.Ocr` 截图+文字识别(英语+中文) |

**修改文件:**

| 文件 | 变更 |
|------|------|
| `packages/@ant/computer-use-swift/src/backends/win32.ts` | `listRunning` 改用 EnumWindows;新增 `captureWindowTarget` 窗口级截图 |

**验证结果(Windows x64):**
- 窗口枚举:38 个可见窗口 ✅
- 窗口截图:VS Code 2575x1415, 444KB ✅(PrintWindow, 即使被遮挡)
- UI Automation:坐标元素识别 ✅
- OCR:识别 VS Code 界面文字,34 行 ✅

---

## Enable Computer Use — macOS + Windows + Linux (2026-04-03)

恢复 Computer Use 屏幕操控功能。参考项目仅 macOS,本次扩展为三平台支持。

**Phase 1 — MCP server stub 替换:**
从参考项目复制 `@ant/computer-use-mcp` 完整实现(12 文件,6517 行)。

**Phase 2 — 移除 src/ 中 8 处 macOS 硬编码:**

| 文件 | 改动 |
|------|------|
| `src/main.tsx:1605` | 去掉 `getPlatform() === 'macos'` |
| `src/utils/computerUse/swiftLoader.ts` | 移除 darwin-only throw |
| `src/utils/computerUse/executor.ts` | 平台守卫扩展为 darwin+win32+linux;剪贴板按平台分发(pbcopy→PowerShell→xclip);paste 快捷键 command→ctrl |
| `src/utils/computerUse/drainRunLoop.ts` | 非 darwin 直接执行 fn() |
| `src/utils/computerUse/escHotkey.ts` | 非 darwin 返回 false(Ctrl+C fallback) |
| `src/utils/computerUse/hostAdapter.ts` | 非 darwin 权限检查返回 granted |
| `src/utils/computerUse/common.ts` | platform + screenshotFiltering 动态化 |
| `src/utils/computerUse/gates.ts` | enabled:true + hasRequiredSubscription→true |

**Phase 3 — input/swift 包 dispatcher + backends 三平台架构:**

```
packages/@ant/computer-use-{input,swift}/src/
├── index.ts ← dispatcher
├── types.ts ← 共享接口
└── backends/
├── darwin.ts ← macOS AppleScript(原样拆出,不改逻辑)
├── win32.ts ← Windows PowerShell
└── linux.ts ← Linux xdotool/scrot/xrandr/wmctrl
```

**编译开关:** `CHICAGO_MCP` 加入 DEFAULT_FEATURES + DEFAULT_BUILD_FEATURES

**验证结果(Windows x64):**
- `isSupported: true` ✅
- 鼠标定位 + 前台窗口信息 ✅
- 双显示器检测 2560x1440 × 2 ✅
- 全屏截图 3MB base64 ✅
- `bun run build` 463 files ✅

---

## Enable Remote Control / BRIDGE_MODE (2026-04-03)

**PR**: [claude-code-best/claude-code#60](https://github.com/claude-code-best/claude-code/pull/60)
Expand Down
2 changes: 1 addition & 1 deletion build.ts
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ rmSync(outdir, { recursive: true, force: true });

// Default features that match the official CLI build.
// Additional features can be enabled via FEATURE_<NAME>=1 env vars.
const DEFAULT_BUILD_FEATURES = ["AGENT_TRIGGERS_REMOTE"];
const DEFAULT_BUILD_FEATURES = ["AGENT_TRIGGERS_REMOTE", "CHICAGO_MCP"];

// Collect FEATURE_* env vars → Bun.build features
const envFeatures = Object.keys(process.env)
Expand Down
Loading