diff --git a/content/docs/ai-prompts/prompts.mdx b/content/docs/ai-prompts/prompts.mdx new file mode 100644 index 0000000..516cee8 --- /dev/null +++ b/content/docs/ai-prompts/prompts.mdx @@ -0,0 +1,90 @@ +--- +title: AI Prompts for OACP +last_edited: '2026-04-11T00:00:00.000Z' +tocIsHidden: false +--- + +Copy-paste prompts for Claude, ChatGPT, or any AI assistant. These help you plan integrations, write `oacp.json`, and generate receiver code. + +## Prompt 1: Plan OACP integration + +``` +I want to add OACP (Open App Capability Protocol) to my Android app. + +Read the OACP documentation at https://oacp.dev and plan how to add voice control capabilities to my app. + +My app is: [describe your app and what it does] + +For each feature that could be voice-controlled, suggest: +- A capability ID (snake_case) +- Whether it should be a broadcast (background) or activity (foreground) action +- What parameters it needs +- Good aliases and examples for voice matching + +Output a complete oacp.json file and an OacpReceiver implementation. +``` + +## Prompt 2: Review oacp.json + +``` +Review this oacp.json file for completeness and voice matching quality. + +Check: +- Are there enough aliases? (aim for 5+ per capability) +- Are the examples realistic user utterances? +- Are disambiguationHints present for similar-sounding actions? +- Are parameter types and constraints correct? +- Is __APPLICATION_ID__ used consistently? +- Are confirmation semantics appropriate? + +[paste your oacp.json here] +``` + +## Prompt 3: Generate OacpReceiver + +``` +Generate a Kotlin OacpReceiver for these OACP capabilities. 
+ +Use the OACP Android SDK (org.oacp.android): +- Extend OacpReceiver +- Override onAction() +- Use OacpParams for type-safe parameter access +- Return OacpResult.success() or OacpResult.error() +- Match actions with endsWith() for build variant safety + +Capabilities: +[paste your oacp.json capabilities array here] +``` + +## Prompt 4: Write OACP.md context + +``` +Write an OACP.md file for my Android app. This file gives AI assistants context about what the app does and how to disambiguate between capabilities. + +My app: [describe your app] +My capabilities: [paste capability IDs and descriptions] + +The file should: +- Start with the app name and a one-line description +- List each capability with plain-English usage notes +- Include disambiguation rules for similar actions +- Note any edge cases or defaults +``` + +## Prompt 5: Add OACP to a specific open source app + +``` +I want to add OACP support to [app name], an open-source Android app. + +The source code is at: [GitHub URL] + +Read the app's source code and: +1. Identify 3-5 features that could be voice-controlled +2. For each, determine if it's a foreground (needs UI) or background action +3. Write a complete oacp.json with proper aliases, examples, and parameters +4. Write an OacpReceiver that handles each action +5. Show the AndroidManifest.xml additions needed +6. Write an OACP.md context file + +Follow the OACP v0.3 protocol. Use __APPLICATION_ID__ for all package-name-dependent values. +``` diff --git a/content/docs/contributing/contributing.mdx b/content/docs/contributing/contributing.mdx new file mode 100644 index 0000000..5497956 --- /dev/null +++ b/content/docs/contributing/contributing.mdx @@ -0,0 +1,81 @@ +--- +title: Contributing +last_edited: '2026-04-11T00:00:00.000Z' +tocIsHidden: false +--- + +Three ways to contribute, ordered by impact. + +## 1. Add OACP to an app + +This is the single most impactful thing you can do. 
Every app that ships `oacp.json` makes the entire ecosystem more useful. + +Pick any open-source Android app. Fork it. Add OACP. Open a PR. + +Here's the process: + +1. Pick an app (ideally one you already use) +2. Follow the [Add OACP guide](/docs/get-started/add-oacp) +3. Fork the app on GitHub +4. Add the OACP integration (`oacp.json`, ContentProvider, BroadcastReceiver) +5. Test with Hark or `adb` +6. Open a PR against the original repo +7. If accepted, let us know and we'll add it to the [ecosystem](/docs/ecosystem/apps) + +Not sure where to start? The [AI Prompts](/docs/ai-prompts/prompts) page has copy-paste prompts that help you plan an integration, write `oacp.json`, and generate a receiver. + +## 2. Test Hark and file issues + +Install Hark, try voice commands with [ecosystem apps](/docs/ecosystem/apps), and file issues when things break. + +Report issues at: [github.com/OpenAppCapabilityProtocol/hark/issues](https://github.com/OpenAppCapabilityProtocol/hark/issues) + +What to report: + +- Discovery failures (app not found) +- Wrong action matched +- Parameter extraction errors +- Crash logs +- UX suggestions + +## 3. Protocol feedback + +Open issues on the protocol repo: [github.com/OpenAppCapabilityProtocol/oacp/issues](https://github.com/OpenAppCapabilityProtocol/oacp/issues) + +Things we want feedback on: + +- Missing fields in `oacp.json` +- Edge cases in discovery +- Confirmation semantics +- Entity provider patterns +- Cross-platform concerns (iOS, desktop) + +## Development setup (for Hark contributors) + +If you want to work on Hark itself: + +```bash +git clone https://github.com/OpenAppCapabilityProtocol/hark +``` + +Requirements: + +- Flutter 3.29+ +- Android SDK 35 +- Kotlin 2.0 +- An arm64 Android device (emulator won't work for on-device models) + +Build and run: + +```bash +flutter run +``` + +Key conventions: + +- **State management**: Riverpod 3.x. No codegen, no ChangeNotifier. +- **Platform bridge**: Pigeon. 
Regenerate with: + ```bash + dart run pigeon --input packages/hark_platform/pigeons/messages.dart + ``` +- **Branching**: Always branch from main. Never commit directly to main. diff --git a/content/docs/ecosystem/apps.mdx b/content/docs/ecosystem/apps.mdx new file mode 100644 index 0000000..1067251 --- /dev/null +++ b/content/docs/ecosystem/apps.mdx @@ -0,0 +1,37 @@ +--- +title: OACP Apps +last_edited: '2026-04-11T00:00:00.000Z' +tocIsHidden: false +--- + +Every app below supports OACP. Install it alongside Hark, and you can control it by voice. Each demonstrates a different OACP pattern -- from simple foreground launches to background queries with async results and parameter extraction. + +| App | What it does | Capabilities | Source | +|-----|-------------|-------------|--------| +| OACP Test App | Counter and battery demo | Increment, decrement, set, reset counter; get battery, get counter | [GitHub](https://github.com/OpenAppCapabilityProtocol/oacp) | +| Breezy Weather | Weather forecasts | Weather queries | [GitHub](https://github.com/OpenAppCapabilityProtocol/breezy-weather) | +| Binary Eye | QR/barcode scanner | Scan, create barcodes | [GitHub](https://github.com/OpenAppCapabilityProtocol/BinaryEye) | +| Voice Recorder | Audio recording | Start/stop recording | [GitHub](https://github.com/OpenAppCapabilityProtocol/Voice-Recorder) | +| Libre Camera | Camera app | Take photos | [GitHub](https://github.com/OpenAppCapabilityProtocol/librecamera) | +| Wikipedia | Encyclopedia | Search articles, open pages | [GitHub](https://github.com/OpenAppCapabilityProtocol/apps-android-wikipedia) | +| ArchiveTune | Internet Archive music | Play music, search | [GitHub](https://github.com/OpenAppCapabilityProtocol/ArchiveTune) | +| Live Darbar | Sikh devotional streaming | Play live stream | [GitHub](https://github.com/0xharkirat/live_darbar) | + +## What each app demonstrates + +| App | OACP pattern | +|-----|-------------| +| **OACP Test App** | Foreground + background 
actions, integer parameters, async results. The full kitchen sink. | +| **Breezy Weather** | Async background query. The app never opens -- it broadcasts the forecast back to Hark. | +| **Binary Eye** | Foreground camera launch for scanning, plus parameter-driven barcode creation. | +| **Voice Recorder** | Simple foreground action. One command starts recording. | +| **Wikipedia** | Entity providers (language selection) and foreground navigation to articles. | +| **ArchiveTune** | Parameter extraction. Hark pulls song name and artist from a single utterance. | +| **Libre Camera** | Simple foreground action. Opens the camera to take a photo. | +| **Live Darbar** | Simple foreground action. Starts a live devotional stream. | + +## Add your app + +Got an app with OACP support? We want to list it here. + +Check out the [Contributing guide](/docs/contributing/contributing) for how to add OACP to an open-source app and get it listed. You can also use the [AI Prompts](/docs/ai-prompts/prompts) page to get help writing your `oacp.json` and receiver. diff --git a/content/docs/faq.mdx b/content/docs/faq.mdx new file mode 100644 index 0000000..1a5f010 --- /dev/null +++ b/content/docs/faq.mdx @@ -0,0 +1,59 @@ +--- +title: FAQ +last_edited: '2026-04-11T00:00:00.000Z' +tocIsHidden: false +--- + +### Does OACP require Google Play Services? + +No. OACP uses standard Android APIs: ContentProvider for discovery, BroadcastReceiver and Intent for dispatch. No proprietary dependencies. + +### Does it work on custom ROMs? + +Yes. OACP relies only on standard AOSP APIs. If your ROM supports ContentProvider and BroadcastReceiver (all of them do), OACP works. + +### What Android versions are supported? + +The OACP Android SDK supports Android 5.0+ (API 21). Hark itself requires Android 8.0+ (API 26) because of speech recognition and overlay features. + +### Is Hark on the Play Store? + +Not yet. Hark is distributed as an APK from GitHub releases. Play Store publishing is planned. 
+ +### Can I use OACP without Hark? + +Yes, in principle. Hark is currently the only assistant that supports OACP. But the protocol is open and assistant-agnostic by design. + +Google Assistant, Claude, ChatGPT, Perplexity, or any other assistant could adopt OACP. The protocol just needs a ContentProvider scanner and an Intent dispatcher. If apps ship `oacp.json` and assistants implement discovery, OACP works without Hark. + +Hark will always remain the open-source reference assistant. + +### Does OACP work with Flutter apps? + +Yes. Place `oacp.json` in `android/app/src/main/assets/oacp.json` (not in Flutter's `assets/` folder). The ContentProvider and receiver are native Android components, so you write them in Kotlin even in a Flutter project. + +### Does OACP work with React Native apps? + +Yes, same approach as Flutter. The OACP integration is entirely on the native Android side. + +### What about iOS? + +Under evaluation. The protocol concepts (capability manifest, discovery, invocation) can map to iOS, but the transport layer would be different. There is no ContentProvider or BroadcastReceiver on iOS. + +### Is the protocol stable? + +OACP is at v0.3. The schema is functional and apps are using it. v1.0 is the stability target with schema lock, compatibility guarantees, and long-term IDs. Breaking changes before v1.0 are possible but will be documented. + +### How is OACP different from Android 15 AppFunctions? + +AppFunctions is Google's vendor-specific API shipping with Android 15+. OACP works on Android 5.0+, is open and assistant-agnostic, and provides richer metadata (aliases, examples, disambiguation hints, entity providers). + +The two may converge. OACP is exploring an annotation processor that generates both `oacp.json` and AppFunctions metadata from a single source. + +### How big is the SDK? + +The AAR is 27 KB. Six classes, zero transitive dependencies beyond `androidx.annotation`. + +### Does Hark send data to the cloud? + +No. 
All AI inference runs on-device using EmbeddingGemma (308M) for intent matching and Qwen3 (0.6B) for parameter extraction. No data leaves the device. diff --git a/content/docs/get-started/add-oacp.mdx b/content/docs/get-started/add-oacp.mdx new file mode 100644 index 0000000..1c09ae1 --- /dev/null +++ b/content/docs/get-started/add-oacp.mdx @@ -0,0 +1,198 @@ +--- +title: Add OACP to Your App +last_edited: '2026-04-11T00:00:00.000Z' +tocIsHidden: false +--- + +**Five minutes. Three files. One dependency.** + +You are going to add voice control to your Android app. When you are done, Hark (or any OACP-compatible assistant) can discover your app's capabilities and invoke them by voice. + +## What you'll add + +| File | Purpose | +|------|---------| +| `oacp-android-release.aar` | SDK dependency (provider + helpers) | +| `assets/oacp.json` | Declares what your app can do | +| `assets/OACP.md` | Plain-English context for the AI | +| `OacpActionReceiver.kt` | Handles incoming voice commands | + +## Step 1: Add the SDK + +Download `oacp-android-release.aar` from the [latest release](https://github.com/OpenAppCapabilityProtocol/oacp-android-sdk/releases). Copy it to `app/libs/`. + +```kotlin +// app/build.gradle.kts +dependencies { + implementation(files("libs/oacp-android-release.aar")) + implementation("androidx.annotation:annotation:1.7.1") +} +``` + +Sync Gradle. The SDK gives you `OacpReceiver`, `OacpMetadataProvider`, and parameter-parsing helpers. + +## Step 2: Create `oacp.json` + +Create `app/src/main/assets/oacp.json`. This manifest tells assistants what your app can do. 
+ +```json +{ + "oacpVersion": "0.3", + "appId": "__APPLICATION_ID__", + "displayName": "My App", + "capabilities": [ + { + "id": "toggle_dark_mode", + "description": "Toggle dark mode on or off", + "aliases": ["switch theme", "dark theme", "light mode"], + "examples": ["turn on dark mode", "switch to light theme"], + "keywords": ["dark", "theme", "mode"], + "parameters": [], + "confirmation": "never", + "visibility": "public", + "invoke": { + "android": { + "type": "broadcast", + "action": "__APPLICATION_ID__.ACTION_TOGGLE_DARK_MODE" + } + } + } + ] +} +``` + +`__APPLICATION_ID__` is replaced with your real package name at build time. Don't hardcode it. + +More aliases and examples means better voice matching. A capability with 5 aliases and 5 examples resolves much more reliably than one with just a description. + +> **Flutter apps:** place the file at `android/app/src/main/assets/oacp.json`, NOT in Flutter's `assets/` folder. + +## Step 3: Create `OACP.md` + +Create `app/src/main/assets/OACP.md`. This gives the AI assistant context about your app in plain English. + +```markdown +# My App - OACP Context + +My App is a settings manager. It lets users customize their device experience. + +## Capabilities + +- **toggle_dark_mode**: Switches between dark and light theme. No parameters needed. + The app applies the change immediately. No confirmation required. +``` + +Keep it short. Write it like you are explaining the app to a colleague. 
+ +## Step 4: Create a BroadcastReceiver + +Create a receiver that extends `OacpReceiver` from the SDK: + +```kotlin +// OacpActionReceiver.kt +package com.example.myapp + +import android.content.Context +import android.content.Intent +import com.oacp.sdk.OacpReceiver + +class OacpActionReceiver : OacpReceiver() { + + override fun onOacpAction(context: Context, intent: Intent) { + when (intent.action) { + "${context.packageName}.ACTION_TOGGLE_DARK_MODE" -> { + // Your logic here + toggleDarkMode(context) + setResultSuccess("Dark mode toggled") + } + else -> setResultError("Unknown action: ${intent.action}") + } + } + + private fun toggleDarkMode(context: Context) { + // Toggle your app's theme + } +} +``` + +`setResultSuccess()` and `setResultError()` are helpers from `OacpReceiver`. The assistant uses the result to give the user feedback. + +## Step 5: Register in AndroidManifest.xml + +```xml + + + + + + +``` + +The SDK's `OacpMetadataProvider` is registered automatically via manifest merging. It exposes your `oacp.json` through a ContentProvider so assistants can discover your app. + +## Step 6: Verify + +Build and install your app, then check that the manifest is discoverable: + +```bash +adb shell content read \ + --uri content://com.example.myapp.oacp/manifest +``` + +Replace `com.example.myapp` with your actual package name. You should see your `oacp.json` contents printed back. + +If you have [Hark installed](/docs/get-started/try-hark), long-press Home and say "turn on dark mode." Hark should find your app and dispatch the action. + +## Common patterns + +Here are three patterns you will see across OACP apps. Each one maps to a different `completionMode` and invocation style. + +### Background query (no UI) + +The app answers a question without ever becoming visible. The result comes back as a broadcast. 
+ +```json +{ + "id": "get_weather", + "description": "Get current weather conditions", + "invoke": { "android": { "type": "broadcast", "action": "__APPLICATION_ID__.ACTION_GET_WEATHER" } }, + "completionMode": "async_result", + "resultTransport": { "android": { "type": "broadcast", "action": "org.oacp.ACTION_RESULT" } } +} +``` + +### Foreground action (open UI) + +The app opens and takes over the screen. No result is expected. + +```json +{ + "id": "take_photo", + "description": "Open the camera and take a photo", + "invoke": { "android": { "type": "activity", "action": "__APPLICATION_ID__.ACTION_TAKE_PHOTO" } }, + "completionMode": "foreground_handoff" +} +``` + +### Action with parameters + +The assistant extracts structured values from the user's voice and passes them as Intent extras. + +```json +{ + "id": "set_timer", + "description": "Set a countdown timer", + "parameters": [ + { "name": "minutes", "type": "integer", "required": true, "minimum": 1, "maximum": 120 } + ], + "invoke": { "android": { "type": "broadcast", "action": "__APPLICATION_ID__.ACTION_SET_TIMER" } } +} +``` + +## What's next + +- **[oacp.json reference](/docs/oacp/oacp-json)** - Full schema with parameters, confirmation modes, and visibility options +- **[Kotlin SDK guide](/docs/sdks/kotlin/quick-start)** - Async results, parameter parsing, and advanced patterns +- **[Complete integration guide](/docs/oacp/getting-started)** - Multi-capability apps, Activities vs Broadcasts, and Flutter integration diff --git a/content/docs/get-started/try-hark.mdx b/content/docs/get-started/try-hark.mdx new file mode 100644 index 0000000..936a505 --- /dev/null +++ b/content/docs/get-started/try-hark.mdx @@ -0,0 +1,120 @@ +--- +title: Try It Out +last_edited: '2026-04-11T00:00:00.000Z' +tocIsHidden: false +--- + +OACP is early. Most apps don't support it yet. The **Test App** is a playground built specifically for trying OACP with Hark. Install both, speak a command, see it work. Five minutes. 
+ +## What you need + +- An Android device running Android 8.0 (API 26) or higher +- About 5 minutes for setup, plus ~2 minutes for Hark's first-launch model download + +## 1. Install Hark + +Download the latest APK from [GitHub Releases](https://github.com/OpenAppCapabilityProtocol/hark/releases) and sideload it. + +## 2. Install the OACP Test App + +Download the latest APK from [GitHub Releases](https://github.com/OpenAppCapabilityProtocol/oacp/releases). Or clone the repo and build it yourself. + +## 3. Set Hark as the default assistant + +1. Open **Settings** on your device +2. Go to **Apps** → **Default apps** → **Digital assistant app** +3. Select **Hark** + +This lets you launch Hark by long-pressing the Home button, just like Google Assistant. + +## 4. Try your first command + +Long-press the Home button. Hark opens and listens. Say: + +> "increment the counter" + +The Test App opens and the counter goes up by one. That's OACP working end to end. + +## 5. Try more commands + +| Say this | What happens | +|----------|-------------| +| "increment the counter by 5" | Counter goes up by 5 | +| "what's the battery level" | Hark reads back battery percentage (no app opens) | +| "reset the counter" | Counter resets to zero | +| "set the counter to 42" | Counter jumps to 42 | + +## What you can say with ecosystem apps + +Install any of these apps alongside Hark and try real voice commands across different OACP patterns. + +| App | Voice command | What happens | +|-----|--------------|-------------| +| Breezy Weather | "What's the weather?" | Returns current conditions | +| Binary Eye | "Open the QR scanner" | Launches camera scanner | +| Voice Recorder | "Start recording" | Begins audio recording | +| Wikipedia | "Search Wikipedia for Flutter" | Opens article | +| ArchiveTune | "Play Lonely by Akon" | Plays music | +| Libre Camera | "Take a photo" | Opens camera | + +Install any of these from the [ecosystem page](/docs/ecosystem/apps) to try more commands. 
+ +## What just happened + +Here is the full chain, all running on your device: + +1. **Voice capture** - Hark recorded your speech and transcribed it on-device (Whisper) +2. **Discovery** - Hark queried every installed app for an `oacp.json` manifest via Android ContentProviders +3. **Intent resolution** - EmbeddingGemma matched your utterance to the right capability from all discovered capabilities +4. **Parameter extraction** - Qwen3 pulled structured parameters from your sentence ("5", "42", etc.) +5. **Dispatch** - Hark fired an Android broadcast Intent with the action and parameters. The Test App's receiver handled it. + +No cloud. No API key. No account. Everything ran locally on the device. + +## For developers: test with adb + +You don't need to use your voice every time. Send commands directly via adb: + +```bash +# Increment the counter +adb shell am broadcast \ + -a com.oacp.testapp.ACTION_INCREMENT \ + --ei amount 1 + +# Set the counter to a specific value +adb shell am broadcast \ + -a com.oacp.testapp.ACTION_SET_COUNTER \ + --ei value 42 + +# Reset the counter +adb shell am broadcast \ + -a com.oacp.testapp.ACTION_RESET + +# Check battery (returns result via Hark, not useful standalone) +adb shell am broadcast \ + -a com.oacp.testapp.ACTION_GET_BATTERY +``` + +This is the same mechanism Hark uses. The broadcast actions and extras match what the Test App declares in its `oacp.json`. + +## Troubleshooting + +**Nothing happens when I long-press Home** + +Hark is not set as the default assistant. Go to Settings → Apps → Default apps → Digital assistant app and select Hark. + +**Hark says it can't find any apps** + +The Test App is not installed, or it was installed after Hark's last discovery scan. Reinstall the Test App and try again. + +**First command is slow** + +Hark downloads AI models on first launch. This takes about 2 minutes depending on your connection. Subsequent launches are fast. 
+ +**Hark opens but doesn't hear me** + +Check that Hark has microphone permission. Go to Settings → Apps → Hark → Permissions → Microphone. + +--- + +The more apps that adopt OACP, the more useful Hark becomes. That's where you come in. **[Add OACP to your app →](/docs/get-started/add-oacp)** diff --git a/content/docs/hark/overview.mdx b/content/docs/hark/overview.mdx index a47bd39..9708821 100644 --- a/content/docs/hark/overview.mdx +++ b/content/docs/hark/overview.mdx @@ -1,6 +1,6 @@ --- title: Hark Overview -last_edited: '2026-04-09T00:00:00.000Z' +last_edited: '2026-04-11T00:00:00.000Z' tocIsHidden: false --- @@ -62,10 +62,12 @@ Hark works with any app that implements OACP. These are tested and working today | [OACP Test App](https://github.com/OpenAppCapabilityProtocol/oacp/tree/main/examples/example_oacp_test) | "Increment counter" / "What's the counter at?" | | [Wikipedia](https://github.com/OpenAppCapabilityProtocol/apps-android-wikipedia) | "Search Wikipedia for Flutter" | | [ArchiveTune](https://github.com/OpenAppCapabilityProtocol/ArchiveTune) | "Play Lonely by Akon" - music playback by voice | +| [Libre Camera](https://github.com/OpenAppCapabilityProtocol/librecamera) | "Take a photo" - camera launch by voice | +| [Live Darbar](https://github.com/0xharkirat/live_darbar) | "Play live kirtan" - Sikh devotional streaming | Each is a fork showing exactly what was added to support OACP. Check the diff against upstream to see how simple the integration is. -Want to add OACP to your own app? See [Getting started with OACP](/docs/oacp/getting-started). +See the full list on the [ecosystem page](/docs/ecosystem/apps). Want to add OACP to your own app? See [Add OACP to Your App](/docs/get-started/add-oacp). 
## Getting started diff --git a/content/docs/hark/wake-word.mdx b/content/docs/hark/wake-word.mdx new file mode 100644 index 0000000..b138aff --- /dev/null +++ b/content/docs/hark/wake-word.mdx @@ -0,0 +1,68 @@ +--- +title: Wake Word +last_edited: '2026-04-11T00:00:00.000Z' +tocIsHidden: false +--- + +Hark supports "Hey Hark" wake word detection using openWakeWord, an open-source library (Apache 2.0) running entirely on-device. Say "Hey Hark" and the microphone activates automatically. + +The name "Hark" means "to listen." It's also my name. The coincidence was too good to ignore. + +## How it works + +The pipeline: AudioRecord (16kHz mono) feeds into Silero VAD (voice activity gate), then into openWakeWord (mel-spectrogram, speech embeddings, keyword detection), across the Pigeon bridge to ChatNotifier, which auto-starts the mic. + +Components: + +- **openWakeWord**: ONNX-based wake word engine. Runs a mel-spectrogram model and an embedding model to detect the keyword. +- **Silero VAD**: Voice Activity Detection pre-filter. Saves battery by only running wake word inference when speech is detected. +- **Custom model**: `hey_harkh.onnx` (201 KB), trained via the openWakeWord Google Colab notebook. + +## Architecture + +``` +AudioRecord (16kHz) + → Silero VAD (is someone speaking?) + → openWakeWord engine (is it "Hey Hark"?) + → WakeWordDetector.kt (Kotlin wrapper) + → HarkPlatformPlugin (Pigeon bridge) + → HarkResultFlutterApi.onWakeWordDetected() + → ChatNotifier (auto-starts microphone) +``` + +Detection threshold: 0.3. Cooldown: 1500ms between detections. + +## AudioRecord mutual exclusion + +Android only allows one AudioRecord at a time. Wake word detection and speech recognition (SpeechRecognizer) both need the mic. Hark handles this by: + +1. Wake word detected. Pause wake word engine (fully stops AudioRecord). +2. STT starts. Speech is transcribed. +3. STT finishes. Resume wake word engine (restarts AudioRecord). 
+ +The pause/resume cycle causes a roughly 25-second buffer rebuild delay. openWakeWord needs about 10 seconds of audio context in its embedding buffer before it can reliably detect again. This is a known limitation being addressed. + +## Current status: Phase 1 (shipped) + +In-app wake word detection is working. When Hark is open, "Hey Hark" activates the mic. + +## Phase 2 (planned): Background service + +A foreground service would run wake word detection even when Hark is closed. "Hey Hark" from any screen would launch the overlay and start listening. + +## Phase 3 (planned): Continuous listening session + +After "Hey Hark" starts a session, the mic stays open for the entire conversation. No need to say the wake word again. The user can interrupt while Hark is speaking (barge-in). This requires acoustic echo cancellation to prevent Hark from hearing its own TTS output. + +## Dependencies + +- `xyz.rementia:openwakeword:0.1.4` (Maven Central) +- `com.github.gkonovalov.android-vad:silero:2.0.10` (JitPack) +- ONNX Runtime 1.23.0 + +## Preprocessing models + +Two models must be at the root of the assets directory (the library hardcodes these paths): + +- `melspectrogram.onnx` (mel-spectrogram extraction) +- `embedding_model.onnx` (speech embedding) diff --git a/content/docs/index.mdx b/content/docs/index.mdx index f9edc9f..3cc4a1e 100644 --- a/content/docs/index.mdx +++ b/content/docs/index.mdx @@ -1,48 +1,70 @@ --- title: Documentation -last_edited: '2026-04-09T00:00:00.000Z' +last_edited: '2026-04-11T00:00:00.000Z' tocIsHidden: true --- +**OACP is MCP for mobile apps.** + +Apps describe what they can do in a JSON file. An AI assistant (Hark) discovers those capabilities. Voice commands become Android Intents. Everything runs on-device. No cloud. + +[Hark](/docs/hark/overview) is the open-source assistant that brings OACP to life. The name means "to listen." It's also my name. Eight apps support it today, with more coming. 
+ -## Recommended reading order +## How it works + +``` +App ships oacp.json → Assistant discovers via ContentProvider +User speaks a command → On-device AI matches to the right capability +Assistant fires Intent → App handles the action, returns a result +``` + +No server. No API key. Everything runs on the device. + +## Where to start -- Start with **[What is OACP?](/docs/oacp/what-is-oacp)** if you need the protocol model. -- Read **[Hark Overview](/docs/hark/overview)** if you want to see the open-source assistant end to end. -- Jump to **[Kotlin Quick Start](/docs/sdks/kotlin/quick-start)** if you want to expose capabilities from an Android app now. -- Use **[Roadmap](/docs/roadmap)** only after the basics are clear. +| You want to... | Start here | +|---|---| +| See it working | [Try It Out](/docs/get-started/try-hark) | +| Add voice control to your app | [Add OACP to Your App](/docs/get-started/add-oacp) | +| Understand the protocol | [What is OACP?](/docs/oacp/what-is-oacp) | +| Read the full oacp.json schema | [oacp.json Reference](/docs/oacp/oacp-json) | +| Learn about Hark's architecture | [Hark Overview](/docs/hark/overview) | +| See which apps support OACP | [Ecosystem](/docs/ecosystem/apps) | +| Contribute | [Contributing Guide](/docs/contributing/contributing) | +| Use AI to help you integrate | [AI Prompts](/docs/ai-prompts/prompts) | ## Quick links -- **Protocol spec**: [github.com/OpenAppCapabilityProtocol/oacp](https://github.com/OpenAppCapabilityProtocol/oacp) -- **Kotlin SDK**: [github.com/OpenAppCapabilityProtocol/oacp-android-sdk](https://github.com/OpenAppCapabilityProtocol/oacp-android-sdk) -- **Hark**: [github.com/OpenAppCapabilityProtocol/hark](https://github.com/OpenAppCapabilityProtocol/hark) -- **Organization**: [github.com/OpenAppCapabilityProtocol](https://github.com/OpenAppCapabilityProtocol) +- [github.com/OpenAppCapabilityProtocol/oacp](https://github.com/OpenAppCapabilityProtocol/oacp) - Protocol spec +- 
[github.com/OpenAppCapabilityProtocol/oacp-android-sdk](https://github.com/OpenAppCapabilityProtocol/oacp-android-sdk) - Android SDK +- [github.com/OpenAppCapabilityProtocol/hark](https://github.com/OpenAppCapabilityProtocol/hark) - Hark assistant +- [github.com/OpenAppCapabilityProtocol](https://github.com/OpenAppCapabilityProtocol) - Organization diff --git a/content/docs/oacp/entity-providers.mdx b/content/docs/oacp/entity-providers.mdx new file mode 100644 index 0000000..86d020c --- /dev/null +++ b/content/docs/oacp/entity-providers.mdx @@ -0,0 +1,214 @@ +--- +title: Entity Providers +last_edited: '2026-04-11T00:00:00.000Z' +tocIsHidden: false +--- + +Entity providers serve dynamic data to the assistant at runtime. Playlists, alarms, bookmarks, contacts, saved views. Anything where the set of valid values changes based on user data. + +Without entity providers, the assistant can only match against static strings in `oacp.json`. That works for a language picker with 20 entries. It does not work for a music library with 500 playlists. + +## When to use them + +Use an entity provider when: + +- The valid values change at runtime (playlists, alarms, bookmarks) +- The list is too large to inline in `oacp.json` (contacts, files) +- The data is user-specific (custom tags, saved searches) + +Use an [entity snapshot](#entity-snapshots) instead when: + +- The list is small and static (languages, categories, difficulty levels) +- The values never change between app updates + +## Declaring entity types + +Define custom entity types at the top level of `oacp.json`. + +```json +"entityTypes": [ + { + "id": "playlist", + "displayName": "Playlist", + "description": "A user-created music playlist" + } +] +``` + +## Declaring entity providers + +Entity providers tell the assistant where to query for entities of a given type. 
+ +```json +"entityProviders": [ + { + "id": "user_playlists", + "entityType": "playlist", + "transport": "provider", + "uri": "content://com.example.music.oacp/entities/playlist" + } +] +``` + +The `uri` points to a `ContentProvider` in your app. The assistant queries it when it needs to resolve a parameter of this entity type. + +## Referencing in parameters + +Link a parameter to an entity type with `entityRef`. + +```json +{ + "name": "playlist", + "type": "string", + "description": "The playlist to play.", + "entityRef": { + "entityType": "playlist", + "resolution": "required", + "entityDisambiguationPrompt": "Which playlist?" + } +} +``` + +When `resolution` is `"required"`, the assistant must resolve the user's input to a known entity before invoking the capability. If the input is ambiguous, the assistant shows the `entityDisambiguationPrompt`. + +When `resolution` is `"optional"`, the assistant tries to resolve but accepts free-form input as a fallback. + +## Implementing OacpEntitySource + +Your app needs a Kotlin class that implements `OacpEntitySource`. This is the code that runs when the assistant queries your provider. + +```kotlin +class MusicEntitySource : OacpEntitySource { + override fun queryEntities( + context: Context, + entityType: String, + query: String? + ): List<OacpEntity> { + if (entityType != "playlist") return emptyList() + return playlistDb.getAll() + .filter { query == null || it.name.startsWith(query, ignoreCase = true) } + .map { + OacpEntity( + it.id, + it.name, + aliases = listOf(), + description = "${it.trackCount} tracks" + ) + } + } +} +``` + +Register it in your `Application.onCreate()`: + +```kotlin +class MyApp : Application() { + override fun onCreate() { + super.onCreate() + OacpProvider.setEntitySource(MusicEntitySource()) + } +} +``` + +The `query` parameter is a prefix string. When the user says "play my rock playlist", the assistant queries with `query = "rock"`. Return all matches. 
The assistant handles ranking and disambiguation. + +## Entity snapshots + +For small, static sets of values, skip the provider and inline them directly in the parameter definition. + +```json +{ + "name": "language", + "type": "string", + "description": "Target language for the article.", + "entitySnapshot": [ + { "id": "en", "displayName": "English" }, + { "id": "es", "displayName": "Spanish", "aliases": ["espanol"] }, + { "id": "fr", "displayName": "French", "aliases": ["francais"] }, + { "id": "de", "displayName": "German", "aliases": ["deutsch"] }, + { "id": "ja", "displayName": "Japanese" } + ] +} +``` + +Each snapshot entry has: + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `id` | string | Yes | Stable identifier passed as the parameter value. | +| `displayName` | string | Yes | Human-readable name shown to the user. | +| `aliases` | string[] | No | Alternate names that resolve to this entity. | + +The assistant matches the user's utterance against `displayName` and `aliases`. The resolved `id` is sent as the parameter value. + +## Testing with adb + +Query your entity provider directly from the command line. + +List all entities: + +```bash +adb shell content query --uri "content://com.example.music.oacp/entities/playlist" +``` + +Search with a prefix: + +```bash +adb shell content query --uri "content://com.example.music.oacp/entities/playlist?q=rock" +``` + +You should see rows with `id`, `displayName`, `aliases`, and `description` columns. If you get no results, check that your `Application.onCreate()` calls `OacpProvider.setEntitySource()` and that the `uri` in `oacp.json` matches your provider's authority. + +## Example: Wikipedia language selection + +A Wikipedia reader app lets users switch languages by voice. The app has ~300 supported languages, but only ~30 are commonly requested. An entity snapshot works here because the list is bounded and changes only with app updates. 
+ +The capability in `oacp.json`: + +```json +{ + "id": "switch_language", + "description": "Switch Wikipedia to a different language edition.", + "aliases": ["change language", "set language", "use another language"], + "examples": [ + "switch to Spanish Wikipedia", + "show me the French version", + "change language to Japanese" + ], + "keywords": ["language", "edition", "translate"], + "parameters": [ + { + "name": "language_code", + "type": "string", + "description": "ISO language code for the target Wikipedia edition.", + "required": true, + "prompt": "Which language?", + "entitySnapshot": [ + { "id": "en", "displayName": "English" }, + { "id": "es", "displayName": "Spanish", "aliases": ["espanol", "castellano"] }, + { "id": "fr", "displayName": "French", "aliases": ["francais"] }, + { "id": "de", "displayName": "German", "aliases": ["deutsch"] }, + { "id": "ja", "displayName": "Japanese", "aliases": ["nihongo"] }, + { "id": "zh", "displayName": "Chinese", "aliases": ["mandarin", "zhongwen"] }, + { "id": "pt", "displayName": "Portuguese", "aliases": ["portugues"] }, + { "id": "ru", "displayName": "Russian", "aliases": ["russkiy"] } + ] + } + ], + "confirmation": "never", + "completionMode": "foreground_handoff", + "invoke": { + "android": { + "type": "activity", + "action": "__APPLICATION_ID__.ACTION_SWITCH_LANGUAGE", + "extrasMapping": { + "language_code": "__APPLICATION_ID__.EXTRA_LANGUAGE_CODE" + } + } + } +} +``` + +User says "Spanish Wikipedia". The assistant matches "Spanish" to the snapshot entry with `displayName: "Spanish"`. It resolves `language_code` to `"es"` and invokes the capability. + +User says "show me the espanol version". The assistant matches "espanol" to the `aliases` of the Spanish entry. Same result: `language_code = "es"`. 
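The snapshot resolution described in the Wikipedia example can be sketched in plain Kotlin. This is an illustration only, assuming simple whole-word matching; `SnapshotEntry` and `resolveSnapshot` are hypothetical names that are not part of the OACP SDK, and a real assistant may use fuzzy or semantic matching instead.

```kotlin
// Hypothetical model of one `entitySnapshot` entry (id, displayName, aliases).
data class SnapshotEntry(
    val id: String,
    val displayName: String,
    val aliases: List<String> = emptyList()
)

// Returns the id of the first entry whose displayName or alias occurs as a
// whole word in the utterance, or null when nothing matches.
// Assumption: single-word, case-insensitive matching; multi-word names are not handled.
fun resolveSnapshot(utterance: String, entries: List<SnapshotEntry>): String? {
    val words = utterance.lowercase()
        .split(Regex("\\W+"))
        .filter { it.isNotEmpty() }
        .toSet()
    return entries.firstOrNull { entry ->
        (listOf(entry.displayName) + entry.aliases).any { it.lowercase() in words }
    }?.id
}
```

With an entry `SnapshotEntry("es", "Spanish", listOf("espanol", "castellano"))`, both "Spanish Wikipedia" and "show me the espanol version" resolve to `"es"` under this sketch, which mirrors the two utterances walked through above.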
diff --git a/content/docs/oacp/getting-started.mdx b/content/docs/oacp/getting-started.mdx index 79e1766..4aa9036 100644 --- a/content/docs/oacp/getting-started.mdx +++ b/content/docs/oacp/getting-started.mdx @@ -1,6 +1,6 @@ --- -title: Getting Started -last_edited: '2026-04-09T00:00:00.000Z' +title: Integration Guide +last_edited: '2026-04-11T00:00:00.000Z' tocIsHidden: false --- diff --git a/content/docs/oacp/how-it-works.mdx b/content/docs/oacp/how-it-works.mdx index 723c5c2..4a99d14 100644 --- a/content/docs/oacp/how-it-works.mdx +++ b/content/docs/oacp/how-it-works.mdx @@ -1,6 +1,6 @@ --- title: How OACP Works -last_edited: '2026-04-09T00:00:00.000Z' +last_edited: '2026-04-11T00:00:00.000Z' tocIsHidden: false --- @@ -10,24 +10,16 @@ OACP treats installed apps as self-describing toolkits rather than hardcoded int Actions like opening a camera, scanning a QR code, or navigating to a screen require the app to be visible. The assistant launches the app directly. -``` -User speaks - | - v -STT (Speech-to-text) - | - v -EmbeddingGemma matches utterance to the right capability -using semantic similarity against oacp.json metadata - | - v -Qwen3 0.6B extracts parameters (numbers, names, values) - | - v -startActivity() dispatched to target app - | - v -App opens and performs the action +```mermaid +sequenceDiagram + participant U as User + participant H as Hark + participant A as App + U->>H: "Open the QR scanner" + H->>H: STT → EmbeddingGemma → match + H->>H: Qwen3 → extract params + H->>A: startActivity(ACTION_SCAN) + A->>U: App opens, shows camera ``` No result is expected from the target app. The assistant confirms based on the action it dispatched. @@ -36,29 +28,17 @@ No result is expected from the target app. The assistant confirms based on the a Actions like weather queries, reading counter values, or checking battery level do not need the app to open. The target app processes the request in the background and `sendBroadcast()`s a result back. 
-``` -User speaks - | - v -STT -> EmbeddingGemma -> Qwen3 0.6B -> sendBroadcast() - | | - | +-------------------------------------+ - | | - | v - | Target app receives broadcast - | | - | v - | App processes request in background - | | - | v - | App broadcasts org.oacp.ACTION_RESULT - | | (includes org.oacp.extra.REQUEST_ID for correlation) - | | - | v - | Assistant receives result - | - v -Chat bubble shows result + TTS reads it aloud +```mermaid +sequenceDiagram + participant U as User + participant H as Hark + participant A as App + U->>H: "What's the battery level?" + H->>H: STT → EmbeddingGemma → match + H->>A: sendBroadcast(ACTION_GET_BATTERY) + A->>A: Read battery level + A->>H: broadcast(ACTION_RESULT, "85%") + H->>U: "Battery is at 85 percent" ``` The user never leaves the assistant. Results appear inline in the conversation and the assistant speaks the answer. @@ -70,6 +50,20 @@ At startup (and on app install), the assistant scans for `ContentProvider`s matc - `/manifest` - the `oacp.json` file, parsed into capability objects that drive resolution. - `/context` - the `OACP.md` file, fetched and validated for presence. Currently reserved for future models with larger context windows that can consume the extra semantic detail. +```mermaid +sequenceDiagram + participant H as Hark + participant PM as PackageManager + participant CP as ContentProvider + H->>PM: Query installed apps + PM->>H: List of packages + loop Each package + H->>CP: content://pkg.oacp/manifest + CP->>H: oacp.json + end + H->>H: Parse capabilities, embed descriptions +``` + ## Why two models? Hark uses one model for matching and one model for extraction: diff --git a/content/docs/oacp/oacp-json.mdx b/content/docs/oacp/oacp-json.mdx new file mode 100644 index 0000000..c040802 --- /dev/null +++ b/content/docs/oacp/oacp-json.mdx @@ -0,0 +1,268 @@ +--- +title: oacp.json Reference +last_edited: '2026-04-11T00:00:00.000Z' +tocIsHidden: false +--- + +Complete schema reference for `oacp.json`. 
Every field, every type, every default. + +Place the file at `app/src/main/assets/oacp.json`. For Flutter apps, use `android/app/src/main/assets/oacp.json`. + +## Top-level fields + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `oacpVersion` | string | Yes | Protocol version. Currently `"0.3"`. | +| `appId` | string | Yes | Always `"__APPLICATION_ID__"`. Replaced at runtime with the package name. | +| `displayName` | string | Yes | Human-readable app name shown to the user. | +| `appDomains` | string[] | No | Categories the app belongs to (e.g. `"weather"`, `"music"`, `"productivity"`). | +| `appKeywords` | string[] | No | App-level keywords used for ranking during intent resolution. | +| `appAliases` | string[] | No | Alternate names for the app (e.g. `["VLC", "VLC Player"]`). | +| `capabilities` | Capability[] | Yes | Array of capability objects. See below. | +| `entityProviders` | EntityProvider[] | No | Dynamic entity sources queried at runtime. See [Entity Providers](/docs/oacp/entity-providers). | +| `entityTypes` | EntityType[] | No | Custom entity type definitions referenced by parameters. | + +## Capability fields + +Each object in the `capabilities` array has these fields. + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `id` | string | Yes | Unique identifier. Use `snake_case`. | +| `description` | string | Yes | What this capability does. One sentence. | +| `domain` | string | No | Category (e.g. `"media"`, `"navigation"`). | +| `aliases` | string[] | No | Alternate phrasings that mean the same thing. | +| `examples` | string[] | No | Real user utterances that should trigger this capability. | +| `keywords` | string[] | No | Extra ranking signals for intent resolution. | +| `disambiguationHints` | string[] | No | Rules for choosing between similar capabilities. | +| `parameters` | Parameter[] | No | Input parameters extracted from the utterance. 
| +| `parametersSchema` | object | No | JSON Schema for parameters. Alternative to the `parameters` array. | +| `confirmation` | string | No | One of `"never"`, `"always"`, `"if_destructive"`. Default `"never"`. | +| `confirmationMessage` | string | No | What to ask the user before executing. Used with `"always"` or `"if_destructive"`. | +| `executionMessage` | string | No | What to say when the action runs (e.g. `"Setting timer..."`). | +| `visibility` | string | No | `"public"` or `"trusted_only"`. Default `"public"`. | +| `requiresAuth` | boolean | No | Whether the user must be authenticated. Default `false`. | +| `requiresForeground` | boolean | No | Whether the app must be in the foreground. Default `false`. | +| `completionMode` | string | No | One of `"fire_and_forget"`, `"foreground_handoff"`, `"async_result"`. | +| `sensitivity` | string | No | `"low"`, `"medium"`, or `"high"`. Affects confirmation behavior. | +| `sideEffects` | string | No | Description of what changes (e.g. `"Deletes the selected file"`). | +| `idempotent` | boolean | No | Whether calling this twice has the same effect as calling it once. | +| `requiresUnlock` | boolean | No | Whether the device must be unlocked. | +| `resultSchema` | object | No | JSON Schema describing the result payload. | +| `errorCodes` | string[] | No | Known error codes the capability can return. | +| `resultTransport` | ResultTransport | No | How results are delivered back to the assistant. | +| `supportsCancellation` | boolean | No | Whether the action can be cancelled mid-execution. | +| `cancelCapabilityId` | string | No | ID of the capability that cancels this one. | +| `invoke` | InvokeConfig | Yes | Platform-specific invocation config. | + +## Parameter fields + +Each object in the `parameters` array has these fields. + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `name` | string | Yes | Parameter name. Used as the key in extras. 
| +| `description` | string | Yes | Semantic explanation of what this parameter represents. | +| `extractionHint` | string | No | Instruction for the slot-filling model (e.g. `"extract the number of minutes"`). | +| `type` | string | Yes | One of `"string"`, `"integer"`, `"boolean"`, `"enum"`. | +| `required` | boolean | No | Whether the parameter must be present. Default `false`. | +| `enum` | string[] | No | Valid values when `type` is `"enum"`. | +| `examples` | string[] | No | Example values for this parameter. | +| `aliases` | object | No | Maps enum values to alternate phrases (e.g. `{"en": ["english", "eng"]}`). | +| `prompt` | string | No | What to ask the user if the parameter is missing (e.g. `"How many minutes?"`). | +| `default` | any | No | Default value when not provided. | +| `minimum` | number | No | Minimum value for `"integer"` type. | +| `maximum` | number | No | Maximum value for `"integer"` type. | +| `pattern` | string | No | Regex pattern for validation. | +| `entityRef` | EntityRef | No | Links this parameter to a dynamic entity provider. | +| `entitySnapshot` | EntitySnapshot[] | No | Small static set of valid entities. Inline alternative to a provider. | + +## Invoke config + +The `invoke` object tells the assistant how to launch the capability on each platform. + +```json +"invoke": { + "android": { + "type": "broadcast", + "action": "__APPLICATION_ID__.ACTION_FLASHLIGHT_ON", + "extrasMapping": { + "duration_minutes": "__APPLICATION_ID__.EXTRA_DURATION_MINUTES" + } + } +} +``` + +`type` is either `"broadcast"` or `"activity"`. + +- `"broadcast"`: sends an Android `Intent` via `sendBroadcast()`. The app handles it in a `BroadcastReceiver`. Best for background operations. +- `"activity"`: starts an `Activity` via `startActivity()`. Best when the user needs to see UI. + +`action` is the intent action string. Use `__APPLICATION_ID__` as the prefix so it resolves to your actual package name. 
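The placeholder handling can be pictured with a small sketch. It assumes the substitution is a plain string replacement and that receivers compare actions by suffix, as the SDK docs recommend for build variants; `resolvePlaceholders` and `isAction` are hypothetical helpers, not SDK functions.

```kotlin
const val PLACEHOLDER = "__APPLICATION_ID__"

// Replaces every placeholder occurrence with the given package name.
// Assumption: a plain text substitution; the SDK's own implementation may differ.
fun resolvePlaceholders(text: String, packageName: String): String =
    text.replace(PLACEHOLDER, packageName)

// Suffix comparison, so one check covers debug, staging, and release package names.
fun isAction(action: String, suffix: String): Boolean =
    action.endsWith(suffix)
```

For a debug build with the illustrative package name `com.example.torch.debug`, the action `__APPLICATION_ID__.ACTION_FLASHLIGHT_ON` would become `com.example.torch.debug.ACTION_FLASHLIGHT_ON`, and `isAction(resolved, ".ACTION_FLASHLIGHT_ON")` still matches.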
+ +`extrasMapping` maps parameter names to intent extra keys. Each parameter value is placed into the intent extras under the corresponding key. + +## Confirmation semantics + +The `confirmation` field controls whether the assistant asks before executing. + +| Value | Behavior | +|-------|----------| +| `"never"` | Invoke immediately. Show `executionMessage` if provided. | +| `"always"` | Ask the user before invoking. Show `confirmationMessage` if provided. | +| `"if_destructive"` | Confirm only if `sideEffects` indicates a destructive action (e.g. deletion). | + +## Completion modes + +The `completionMode` field tells the assistant what to expect after invocation. + +| Value | Behavior | +|-------|----------| +| `"fire_and_forget"` | No result expected. The assistant moves on. | +| `"foreground_handoff"` | The app opens and the user sees the result in the app's UI. | +| `"async_result"` | The app runs in the background and broadcasts a result via `org.oacp.ACTION_RESULT`. | + +## Result transport + +When `completionMode` is `"async_result"`, define how the result comes back. + +```json +"resultTransport": { + "android": { + "type": "broadcast", + "action": "org.oacp.ACTION_RESULT" + } +} +``` + +The app sends a broadcast with this action after completing the work. The result payload follows `resultSchema` if defined. + +## Entity types + +Define custom entity types at the top level of `oacp.json`. + +```json +"entityTypes": [ + { + "id": "playlist", + "displayName": "Playlist", + "description": "A user-created music playlist" + } +] +``` + +Parameters reference these types via `entityRef`. See [Entity Providers](/docs/oacp/entity-providers) for the full pattern. + +## Entity refs + +Link a parameter to an entity type so the assistant can resolve dynamic values. + +```json +{ + "name": "playlist", + "type": "string", + "entityRef": { + "entityType": "playlist", + "resolution": "required", + "entityDisambiguationPrompt": "Which playlist?" 
+ } +} +``` + +`resolution` can be `"required"` (must resolve to a known entity) or `"optional"` (free-form input is accepted as a fallback). + +## Full example + +Two capabilities from the OACP Test App. `increment_counter` is a foreground action. `get_battery` is a background action that returns a result. + +```json +{ + "oacpVersion": "0.3", + "appId": "__APPLICATION_ID__", + "displayName": "OACP Test App", + "appDomains": ["utility"], + "appKeywords": ["test", "counter", "battery"], + "capabilities": [ + { + "id": "increment_counter", + "description": "Increment the on-screen counter by a given amount.", + "aliases": ["add to counter", "increase counter", "bump counter"], + "examples": [ + "increment the counter by 5", + "add 3 to the counter", + "bump the counter" + ], + "keywords": ["counter", "increment", "add"], + "parameters": [ + { + "name": "amount", + "description": "How much to add to the counter.", + "type": "integer", + "required": false, + "default": 1, + "minimum": 1, + "maximum": 1000, + "prompt": "How much should I add?" + } + ], + "confirmation": "never", + "executionMessage": "Incrementing the counter...", + "completionMode": "foreground_handoff", + "requiresForeground": true, + "idempotent": false, + "invoke": { + "android": { + "type": "activity", + "action": "__APPLICATION_ID__.ACTION_INCREMENT", + "extrasMapping": { + "amount": "__APPLICATION_ID__.EXTRA_AMOUNT" + } + } + } + }, + { + "id": "get_battery", + "description": "Get the current battery level as a percentage.", + "aliases": ["check battery", "battery status", "how much battery"], + "examples": [ + "what's the battery level?", + "check the battery", + "how much battery is left?" 
+ ], + "keywords": ["battery", "charge", "power"], + "parameters": [], + "confirmation": "never", + "completionMode": "async_result", + "resultSchema": { + "type": "object", + "properties": { + "level": { "type": "integer", "description": "Battery percentage 0-100" }, + "charging": { "type": "boolean", "description": "Whether the device is charging" } + } + }, + "resultTransport": { + "android": { + "type": "broadcast", + "action": "org.oacp.ACTION_RESULT" + } + }, + "invoke": { + "android": { + "type": "broadcast", + "action": "__APPLICATION_ID__.ACTION_GET_BATTERY" + } + } + } + ] +} +``` + +## Tips for better voice matching + +**More aliases = more reliable matching.** The intent resolver uses aliases alongside descriptions and examples. Think about how real people phrase things. "Turn on the light", "switch on the torch", "flashlight please" are all the same intent. + +**Examples should be real utterances, not descriptions.** Write what a person would actually say out loud. `"play my workout playlist"` is a good example. `"Plays a playlist selected by the user"` is a description, not an example. + +**Keywords help with ranking.** When two capabilities from different apps match equally well, keywords break the tie. Include the nouns and verbs specific to your capability. + +**disambiguationHints prevent wrong matches.** If your app has "play song" and "play playlist", add hints like `"Use this when the user mentions a specific song title"` to help the resolver pick the right one. diff --git a/content/docs/oacp/what-is-oacp.mdx b/content/docs/oacp/what-is-oacp.mdx index 054cd71..1c0a73e 100644 --- a/content/docs/oacp/what-is-oacp.mdx +++ b/content/docs/oacp/what-is-oacp.mdx @@ -1,6 +1,6 @@ --- title: What is OACP? 
-last_edited: '2026-04-09T00:00:00.000Z' +last_edited: '2026-04-11T00:00:00.000Z' tocIsHidden: false --- @@ -61,6 +61,24 @@ Use a broadcast action when the app can do the work without opening UI: The assistant sends a broadcast, the app does the work, and the app can return an async result on `org.oacp.ACTION_RESULT`. +## How OACP compares + +| | OACP | Google Assistant Actions | Android 15 AppFunctions | Siri Intents | +|---|---|---|---|---| +| **Min Android** | 5.0 (API 21) | varies | 15 (API 35) | N/A (iOS) | +| **Open protocol** | Yes (Apache 2.0) | No | No | No | +| **Assistant-agnostic** | Yes | Google only | Google only | Apple only | +| **Discovery** | ContentProvider | Play Console | System API | Info.plist | +| **Metadata richness** | High (aliases, examples, hints, entities) | Medium | Low | Medium | +| **On-device only** | Yes | No (cloud) | Yes | Partial | +| **Entity providers** | Yes | No | No | Yes (via Shortcuts) | + +OACP and AppFunctions may converge. We are exploring an annotation processor that generates both oacp.json and AppFunctions metadata from a single source. + +## Who is using it + +Eight apps support OACP today, with more coming. See the [ecosystem](/docs/ecosystem/apps) for the full list. + ## Start building -If you want to add OACP to an app today, start with [Getting Started](/docs/oacp/getting-started). If you want to see the protocol in action first, read [Hark Overview](/docs/hark/overview). +If you want to see OACP working, start with [Try It Out](/docs/get-started/try-hark). If you want to add OACP to your app, jump to [Add OACP to Your App](/docs/get-started/add-oacp). For the full integration reference, see the [Integration Guide](/docs/oacp/getting-started). 
diff --git a/content/docs/roadmap.mdx b/content/docs/roadmap.mdx index a650626..d66d724 100644 --- a/content/docs/roadmap.mdx +++ b/content/docs/roadmap.mdx @@ -1,6 +1,6 @@ --- title: Roadmap -last_edited: '2026-04-09T00:00:00.000Z' +last_edited: '2026-04-11T00:00:00.000Z' tocIsHidden: false --- @@ -58,6 +58,8 @@ Other SDKs are not documented as installable products until there is code to ins ## How to help -- **Add OACP to an app.** The single most impactful thing you can do. Fork, integrate, open a PR against the ecosystem list. +See the [Contributing Guide](/docs/contributing/contributing) for full details. The short version: + +- **Add OACP to an app.** The single most impactful thing you can do. Use the [AI Prompts](/docs/ai-prompts/prompts) to help plan your integration. - **Test Hark on your device and file issues.** Edge cases in discovery, dispatch, and async results are where the protocol hardens. - **Protocol feedback.** Open an issue on the [oacp repo](https://github.com/OpenAppCapabilityProtocol/oacp) if you find a limitation. diff --git a/content/docs/sdks/kotlin/api-reference.mdx b/content/docs/sdks/kotlin/api-reference.mdx index 27af02d..ab1ab8f 100644 --- a/content/docs/sdks/kotlin/api-reference.mdx +++ b/content/docs/sdks/kotlin/api-reference.mdx @@ -1,6 +1,6 @@ --- title: API Reference -last_edited: '2026-04-09T00:00:00.000Z' +last_edited: '2026-04-11T00:00:00.000Z' tocIsHidden: false --- @@ -15,10 +15,22 @@ Full reference for the public surface of the OACP Android SDK. 
| `Oacp.EXTRA_STATUS` | `"org.oacp.extra.STATUS"` | | `Oacp.EXTRA_MESSAGE` | `"org.oacp.extra.MESSAGE"` | | `Oacp.EXTRA_RESULT` | `"org.oacp.extra.RESULT"` | +| `Oacp.EXTRA_CAPABILITY_ID` | `"org.oacp.extra.CAPABILITY_ID"` | +| `Oacp.EXTRA_SOURCE_PACKAGE` | `"org.oacp.extra.SOURCE_PACKAGE"` | +| `Oacp.EXTRA_ERROR_CODE` | `"org.oacp.extra.ERROR_CODE"` | +| `Oacp.EXTRA_ERROR_MESSAGE` | `"org.oacp.extra.ERROR_MESSAGE"` | +| `Oacp.EXTRA_ERROR_RETRYABLE` | `"org.oacp.extra.ERROR_RETRYABLE"` | +| `Oacp.STATUS_ACCEPTED` | `"accepted"` | +| `Oacp.STATUS_STARTED` | `"started"` | | `Oacp.STATUS_COMPLETED` | `"completed"` | | `Oacp.STATUS_FAILED` | `"failed"` | -| `Oacp.ERROR_NOT_FOUND` | `"not_found"` | +| `Oacp.STATUS_CANCELLED` | `"cancelled"` | +| `Oacp.ERROR_INVALID_PARAMETERS` | `"invalid_parameters"` | | `Oacp.ERROR_MISSING_PARAMETERS` | `"missing_parameters"` | +| `Oacp.ERROR_NOT_FOUND` | `"not_found"` | +| `Oacp.ERROR_NOT_AUTHENTICATED` | `"not_authenticated"` | +| `Oacp.ERROR_INTERNAL` | `"internal_error"` | +| `Oacp.VERSION` | `"0.3"` | See [`Oacp.kt`](https://github.com/OpenAppCapabilityProtocol/oacp-android-sdk/blob/main/oacp-android/src/main/kotlin/org/oacp/android/Oacp.kt) for the full list. 
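As a rough illustration of how these constants could fit together, the sketch below collects the extras a failed result might carry. The string values are copied from the table; `OacpKeys` and `failedResultExtras` are hypothetical, and which extras the SDK actually attaches to a failure is an assumption here, not documented behavior.

```kotlin
// Local copies of a few values from the constants table.
// In an app, reference the SDK's `Oacp` object instead of redefining them.
object OacpKeys {
    const val EXTRA_STATUS = "org.oacp.extra.STATUS"
    const val EXTRA_ERROR_CODE = "org.oacp.extra.ERROR_CODE"
    const val EXTRA_ERROR_MESSAGE = "org.oacp.extra.ERROR_MESSAGE"
    const val EXTRA_ERROR_RETRYABLE = "org.oacp.extra.ERROR_RETRYABLE"
    const val STATUS_FAILED = "failed"
    const val ERROR_NOT_FOUND = "not_found"
}

// Hypothetical helper: builds the key/value pairs for a failed result.
// Assumption: a failure carries a status plus the three error extras.
fun failedResultExtras(code: String, message: String, retryable: Boolean): Map<String, Any> =
    mapOf(
        OacpKeys.EXTRA_STATUS to OacpKeys.STATUS_FAILED,
        OacpKeys.EXTRA_ERROR_CODE to code,
        OacpKeys.EXTRA_ERROR_MESSAGE to message,
        OacpKeys.EXTRA_ERROR_RETRYABLE to retryable
    )
```

In practice `OacpResult.error(...)` builds and sends the result for you; the map form is only meant to show which keys and values are involved.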
@@ -65,10 +77,57 @@ The receiver automatically: - Wraps exceptions in `ERROR_INTERNAL` results - Uses `requestId` to correlate responses with the original request +## `OacpActivity` - abstract activity + +Subclass and override `onOacpAction` for foreground actions: + +```kotlin +abstract fun onOacpAction(action: String, params: OacpParams) +``` + +Protected members: + +| Member | Type | Description | +|--------|------|-------------| +| `requestId` | `String?` | Extracted from `org.oacp.extra.REQUEST_ID` | +| `capabilityId` | `String?` | Derived from action string suffix | +| `params` | `OacpParams` | Type-safe parameter accessor | +| `sendResult(result, targetPackage?)` | method | Broadcasts an `OacpResult` back to the assistant | + +Handles both `onCreate` and `onNewIntent` for `singleTask` launch mode. + +## `OacpEntitySource` - entity interface + +Implement to serve dynamic entity data: + +```kotlin +interface OacpEntitySource { + fun queryEntities(context: Context, entityType: String, query: String?): List<OacpEntity> +} +``` + +Register with `OacpProvider.setEntitySource(source)` in your `Application.onCreate()`. + +## `OacpEntity` - entity data class + +```kotlin +data class OacpEntity( + val id: String, + val displayName: String, + val aliases: List<String> = emptyList(), + val description: String? = null, + val keywords: List<String> = emptyList(), + val metadata: Map = emptyMap() +) +``` + ## `OacpProvider` Auto-registered via manifest merger at the authority `${applicationId}.oacp`. You never need to instantiate or subclass it directly.
It: - Serves `oacp.json` at `content://${applicationId}.oacp/manifest` - Serves `OACP.md` at `content://${applicationId}.oacp/context` +- Serves entity data at `content://${applicationId}.oacp/entities/<entityType>?q=query` - Substitutes `__APPLICATION_ID__` placeholders with the real package name at runtime, so the same manifest works for `debug`, `staging`, and `release` variants without modification + +Static method: `OacpProvider.setEntitySource(source: OacpEntitySource?)` registers the entity handler. diff --git a/content/docs/sdks/kotlin/core-components.mdx b/content/docs/sdks/kotlin/core-components.mdx index 2ad283e..e5f8680 100644 --- a/content/docs/sdks/kotlin/core-components.mdx +++ b/content/docs/sdks/kotlin/core-components.mdx @@ -1,6 +1,6 @@ --- title: Core Components -last_edited: '2026-04-09T00:00:00.000Z' +last_edited: '2026-04-11T00:00:00.000Z' tocIsHidden: false --- @@ -14,6 +14,9 @@ The OACP Android SDK gives you a small set of drop-in classes that cover everyth | `OacpReceiver` | Abstract `BroadcastReceiver` base class. Parses parameters, extracts request IDs, sends async results. You implement one method. | | `OacpParams` | Type-safe parameter access. `params.getString("query")`, `params.getInt("volume")`. Matches by suffix for build-variant safety. | | `OacpResult` | Result builder. `OacpResult.success("Done!")`, `OacpResult.error("not_found", "...")`. Auto-broadcasts to the assistant. | +| `OacpActivity` | Abstract `Activity` base class for foreground actions. Auto-extracts request ID and parameters. Provides `sendResult()` helper. | +| `OacpEntitySource` | Interface for serving dynamic entity data (playlists, contacts, bookmarks). Register with `OacpProvider.setEntitySource()`. | +| `OacpEntity` | Data class for entity items returned by `OacpEntitySource.queryEntities()`. | | `Oacp` | Protocol constants - standard extra keys, status values, error codes.
| ## Reading parameters @@ -43,9 +46,25 @@ OACP is designed for real Android projects with multiple build variants: - **Receiver code** - match with `action.endsWith(".ACTION_X")` - works for `.debug`, `.staging`, etc. - **`OacpParams`** - resolves parameters by suffix, so `EXTRA_QUERY` matches regardless of prefix -## Foreground actions +## Foreground actions with OacpActivity -For actions that need UI (camera, editor, etc.), use an `Activity` instead of a `BroadcastReceiver`. +For actions that need UI (camera, editor, etc.), extend `OacpActivity`: + +```kotlin +class CameraActivity : OacpActivity() { + override fun onOacpAction(action: String, params: OacpParams) { + val mode = params.getString("mode") ?: "photo" + setupCamera(mode) + } + + private fun onPhotoTaken(path: String) { + sendResult(OacpResult.success("Photo saved", mapOf("path" to path))) + finish() + } +} +``` + +`OacpActivity` auto-extracts `requestId` and `params` from the intent. It handles both `onCreate` and `onNewIntent` for `singleTask` activities. Call `sendResult()` to broadcast results back to the assistant. **In `oacp.json`:** @@ -68,7 +87,34 @@ For actions that need UI (camera, editor, etc.), use an `Activity` instead of a ``` -Parse parameters from the launching intent in your Activity's `onCreate`. +You can also skip `OacpActivity` and parse the intent directly in any Activity's `onCreate`. `OacpActivity` is a convenience, not a requirement. + +## Dynamic entities with OacpEntitySource + +If your app has dynamic data that the assistant needs to reference (playlists, alarms, bookmarks), implement `OacpEntitySource`: + +```kotlin +class MyEntitySource : OacpEntitySource { + override fun queryEntities( + context: Context, + entityType: String, + query: String? 
+ ): List<OacpEntity> { + if (entityType != "alarm") return emptyList() + return alarmDb.getAll() + .filter { query == null || it.label.startsWith(query, ignoreCase = true) } + .map { OacpEntity(it.id.toString(), it.label, description = it.time) } + } +} +``` + +Register it in your `Application.onCreate()`: + +```kotlin +OacpProvider.setEntitySource(MyEntitySource()) +``` + +The assistant can then query your entities at `content://${applicationId}.oacp/entities/alarm?q=morning`. See [Entity Providers](/docs/oacp/entity-providers) for the full guide. ## Testing with adb diff --git a/content/navigation-bar/docs-navigation-bar.json b/content/navigation-bar/docs-navigation-bar.json index ffc4eaf..2a5052d 100644 --- a/content/navigation-bar/docs-navigation-bar.json +++ b/content/navigation-bar/docs-navigation-bar.json @@ -15,6 +15,21 @@ } ] }, + { + "title": "Get Started", + "items": [ + { + "title": "Try It Out", + "slug": "content/docs/get-started/try-hark.mdx", + "_template": "item" + }, + { + "title": "Add OACP to Your App", + "slug": "content/docs/get-started/add-oacp.mdx", + "_template": "item" + } + ] + }, { "title": "OACP Protocol", "items": [ @@ -24,18 +39,23 @@ "_template": "item" }, { - "title": "How OACP Works", + "title": "How It Works", "slug": "content/docs/oacp/how-it-works.mdx", "_template": "item" }, { - "title": "Getting Started", - "slug": "content/docs/oacp/getting-started.mdx", + "title": "oacp.json Reference", + "slug": "content/docs/oacp/oacp-json.mdx", + "_template": "item" + }, + { + "title": "Entity Providers", + "slug": "content/docs/oacp/entity-providers.mdx", "_template": "item" }, { - "title": "OACP Integration", - "slug": "content/docs/oacp/non-flutter-integration.mdx", + "title": "Integration Guide", + "slug": "content/docs/oacp/getting-started.mdx", "_template": "item" } ] @@ -44,7 +64,7 @@ "title": "Hark", "items": [ { - "title": "Hark Overview", + "title": "Overview", "slug": "content/docs/hark/overview.mdx", "_template": "item" }, @@ -54,68
+74,80 @@ "_template": "item" }, { - "title": "OACP Test App Demo", + "title": "NLU Pipeline", + "slug": "content/docs/hark/nlu-architecture.mdx", + "_template": "item" + }, + { + "title": "Wake Word", + "slug": "content/docs/hark/wake-word.mdx", + "_template": "item" + }, + { + "title": "Test App Demo", "slug": "content/docs/hark/demo-walkthrough.mdx", "_template": "item" } ] }, { - "title": "SDKs", + "title": "Android SDK", "items": [ { - "title": "Kotlin SDK", - "items": [ - { - "title": "Quick Start", - "slug": "content/docs/sdks/kotlin/quick-start.mdx", - "_template": "item" - }, - { - "title": "Core Components", - "slug": "content/docs/sdks/kotlin/core-components.mdx", - "_template": "item" - }, - { - "title": "Async Results", - "slug": "content/docs/sdks/kotlin/async-results.mdx", - "_template": "item" - }, - { - "title": "API Reference", - "slug": "content/docs/sdks/kotlin/api-reference.mdx", - "_template": "item" - } - ], - "_template": "items" + "title": "Quick Start", + "slug": "content/docs/sdks/kotlin/quick-start.mdx", + "_template": "item" }, { - "title": "Flutter SDK (under development)", - "items": [ - { - "title": "Under development", - "slug": "content/docs/sdks/flutter/overview.mdx", - "_template": "item" - } - ], - "_template": "items" + "title": "Core Components", + "slug": "content/docs/sdks/kotlin/core-components.mdx", + "_template": "item" }, { - "title": "React Native SDK (under development)", - "items": [ - { - "title": "Under development", - "slug": "content/docs/sdks/react-native/overview.mdx", - "_template": "item" - } - ], - "_template": "items" + "title": "Async Results", + "slug": "content/docs/sdks/kotlin/async-results.mdx", + "_template": "item" + }, + { + "title": "API Reference", + "slug": "content/docs/sdks/kotlin/api-reference.mdx", + "_template": "item" + } + ] + }, + { + "title": "Ecosystem", + "items": [ + { + "title": "OACP Apps", + "slug": "content/docs/ecosystem/apps.mdx", + "_template": "item" } ] }, { - "title": 
"Roadmap", + "title": "Contributing", "items": [ + { + "title": "Contributing Guide", + "slug": "content/docs/contributing/contributing.mdx", + "_template": "item" + }, + { + "title": "AI Prompts", + "slug": "content/docs/ai-prompts/prompts.mdx", + "_template": "item" + } + ] + }, + { + "title": "Resources", + "items": [ + { + "title": "FAQ", + "slug": "content/docs/faq.mdx", + "_template": "item" + }, { "title": "Roadmap", "slug": "content/docs/roadmap.mdx", diff --git a/src/app/layout.tsx b/src/app/layout.tsx index 8302a27..8bbabd5 100644 --- a/src/app/layout.tsx +++ b/src/app/layout.tsx @@ -47,8 +47,8 @@ export default function RootLayout({ )} {isThemeSelectorEnabled && } diff --git a/src/app/page.tsx b/src/app/page.tsx index b2b6209..2dbbc61 100644 --- a/src/app/page.tsx +++ b/src/app/page.tsx @@ -31,6 +31,12 @@ export default function LandingPage() { > Docs + + Get Started + +

+ MCP for mobile apps. +

- Extend any Android app's capabilities to any on-device AI assistant. + Give your Android app agentic powers.

- OACP gives apps a simple way to describe what they can do, and gives - assistants a standard way to discover and invoke those actions on - device. + Apps describe what they can do. An AI assistant (Hark) discovers and invokes those + capabilities. On-device. Open protocol.

- Read the docs + Try It Out - Integrate your app + Add OACP to Your App
+

+ Hark means "to listen." It's also short for my name "Harkirat". +

{/* Three-card grid */} @@ -99,7 +110,7 @@ export default function LandingPage() { -
-            {` ┌──────────────────────────┐        ┌──────────────────────────┐
- │  Any Android app         │        │  Hark (voice assistant)  │
- │  + OACP Kotlin SDK       │◀──────▶│  on-device AI pipeline   │
- │  + oacp.json / OACP.md   │  OACP  │  discovers + dispatches  │
- └──────────────────────────┘        └──────────────────────────┘
-              ▲                                    ▲
-              └──────── OACP Protocol ─────────────┘
-                   (content providers + intents)`}
-          
+
+
+

+ Background action (round trip) +

+
+{` "Hey Hark, what's the weather?"
+        │
+        ▼
+ ┌─────────────┐  broadcast   ┌─────────────────┐
+ │    Hark     │─────────────►│  Weather App    │
+ │  on-device  │   (OACP)     │                 │
+ │     AI      │◄─────────────│  fetches data   │
+ └─────────────┘ ACTION_RESULT└─────────────────┘
+        │        
+        ▼
+ "Currently 22°, partly cloudy"`}
+              
+
+
+

+ Foreground action (one way) +

+
+{` "Hey Hark, take a picture with
+  front camera in 2 seconds"
+        │
+        ▼
+ ┌─────────────┐   (OACP)     ┌─────────────────┐
+ │    Hark     │─────────────►│  Camera App     │
+ │  on-device  │  activity    │                 │
+ │     AI      │              │  opens camera,  │
+ └─────────────┘              │  2s countdown   │
+                              └─────────────────┘`}
+              
+
+
+ + + + {/* Ecosystem */} +
+
+

+ Ecosystem +

+

+ 8 apps and counting. Breezy Weather, Binary Eye, Wikipedia, and + more. Each one ships an oacp.json and works with any OACP assistant + out of the box. +

+ + See all apps +
@@ -179,13 +237,11 @@ export default function LandingPage() {

- What Is Being Cooked + What's next

- A fuller open-source AI assistant with on-device AI, a stronger - protocol, more SDKs, and more real apps using OACP. The current docs - stay a little unfinished on purpose so the next contributors can see - where help is needed. + Wake word detection has shipped. Background listening, self-hosted + inference, and disambiguation UI are next. Everything is open source.

Docs + + Ecosystem + Roadmap + + Contributing +