
feat(agent): agent for article fact checking #348

Merged
e06084 merged 19 commits into MigoXLab:dev from seancoding-day:feature/add-article-check-scenario
Mar 4, 2026

Conversation

@seancoding-day
Collaborator

No description provided.

seancoding-day and others added 17 commits February 26, 2026 17:11
Implement ArticleFactChecker using Agent-First architecture pattern
with LangChain ReAct agent for autonomous claim extraction and
verification. Features include:

- Thread-safe context passing between eval() and aggregate_results()
- Dual-layer EvalDetail.reason: text summary + structured report dict
- Intermediate artifact saving (claims, verification details, report)
- Claims extraction from tool_calls and per-claim verification merging
- PromptTemplates with OUTPUT_FORMAT for structured agent responses

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
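The thread-safe context passing and dual-layer reason listed above could be sketched roughly as follows. This is only an illustration that assumes a `threading.local` store; the class name `ContextTrackingChecker`, the method bodies, and the result keys are hypothetical and not taken from the PR.

```python
import threading


class ContextTrackingChecker:
    """Hypothetical stand-in for an evaluator with eval() and aggregate_results()."""

    # threading.local gives every thread its own independent attributes.
    _ctx = threading.local()

    @classmethod
    def eval(cls, article: str) -> dict:
        # Stash per-call context so a later step on the same thread can read it.
        cls._ctx.article = article
        return cls.aggregate_results({"claims_checked": 0})

    @classmethod
    def aggregate_results(cls, results: dict) -> dict:
        article = getattr(cls._ctx, "article", "")
        # Dual-layer reason: a text summary first, then a structured report dict.
        summary = (
            f"Checked {results['claims_checked']} claims "
            f"in {len(article)} characters"
        )
        report = {"article_length": len(article), **results}
        return {"reason": [summary, report]}
```

Because the context lives in thread-local storage, two evaluations running on different threads cannot overwrite each other's article text.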
Add test data files for fact-checking scenarios:
- blog_article.md: tech blog about PaddleOCR-VL with institutional claims
- news_article_excerpt.md: news article excerpt for testing
- product_review_excerpt.md: product review with statistical claims

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comprehensive test coverage for ArticleFactChecker including:
- PromptTemplates validation and output format
- Claims extraction from tool_calls
- Per-claim verification merging
- Structured report generation
- Dual-layer EvalDetail.reason output
- File saving operations (article, claims, verification, report)
- News and product review article type tests
- Blog article real-world integration test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add comprehensive test suites for agent tools:
- test_arxiv_search.py: ArxivSearchTool unit and integration tests
- test_claims_extractor.py: ClaimsExtractor with type filtering, dedup
- verify_setup.py: Environment verification script for agent setup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
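For readers unfamiliar with what "type filtering, dedup" covers, here is a minimal sketch of that kind of post-processing. The function name `filter_and_dedup` and the `claim` / `claim_type` keys are assumptions made for illustration; the actual ClaimsExtractor may structure its output differently.

```python
from typing import Iterable, List, Optional, Set


def filter_and_dedup(
    claims: Iterable[dict], allowed_types: Optional[Set[str]] = None
) -> List[dict]:
    """Keep claims of the allowed types, dropping repeated claim texts."""
    seen = set()
    kept = []
    for claim in claims:
        if allowed_types is not None and claim.get("claim_type") not in allowed_types:
            continue  # Type filter: skip claim types the caller did not ask for.
        # Normalize case and whitespace so trivially different copies collide.
        key = " ".join(claim.get("claim", "").lower().split())
        if not key or key in seen:
            continue
        seen.add(key)
        kept.append(claim)
    return kept
```

The first occurrence of each claim wins, so the original extraction order is preserved.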
…fact_check

Remove 3 duplicate TestArxivSupport classes that incorrectly tested
AgentFactCheck for arxiv_search support. AgentFactCheck only has
tavily_search; arxiv_search is specific to ArticleFactChecker and
is properly tested in test_article_fact_checker.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Demonstrate ArticleFactChecker usage with InputArgs + Executor pattern:
- JSONL temp file creation for article-level input
- Complete agent_config with claims_extractor, arxiv, tavily tools
- Dual-layer result display (text summary + structured report)
- Intermediate artifact output configuration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
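The JSONL temp-file step can be illustrated with the standard library alone. This sketch assumes the article is stored under a `content` key, which is a guess rather than the field name the example script uses, and it leaves out the InputArgs and Executor wiring entirely.

```python
import json
import os
import tempfile


def write_article_jsonl(article_text: str) -> str:
    """Write one article as a single JSONL record and return the file path."""
    # delete=False keeps the file after the with-block; the caller removes it.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".jsonl", delete=False, encoding="utf-8"
    ) as fh:
        # "content" is an assumed field name, not taken from the PR.
        fh.write(json.dumps({"content": article_text}, ensure_ascii=False) + "\n")
        return fh.name


path = write_article_jsonl("# Title\n\nFirst paragraph of the article.")
try:
    with open(path, encoding="utf-8") as fh:
        records = [json.loads(line) for line in fh]
finally:
    os.remove(path)  # Clean up the temp file even if reading fails.
```

`json.dumps` escapes the newlines inside the article, so the whole article stays on one JSONL line.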
Add comprehensive documentation for article fact-checking:
- agent_architecture.md: Agent-First vs Custom architecture patterns
- article_fact_checking_guide.md: Complete usage guide with API reference
- quick_start_article_fact_checking.md: 5-minute quick start guide
- agent_development_guide.md: fix missing fields key in mix example

All docs use correct JSONL format and EvalPipline config structure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- _get_output_dir() now auto-generates outputs/article_factcheck_<ts>_<uuid>/
  when no explicit output_path is configured, eliminating the need to manually
  specify artifact_output_path in examples and user configs
- Add save_artifacts=false opt-out to disable artifact saving entirely
- Add base_output_path config to override the auto-generate base directory
- Append uuid suffix to prevent timestamp collision in concurrent evaluations
- Fix agent_cfg None guard and empty base_output_path fallback
- Update example to remove manual path config and add try/finally cleanup
- Update docs to document all three output path options (priority order)
- Update tests: replace old None-when-unconfigured test with two new tests
  covering auto-generate and save_artifacts=false opt-out behaviors

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
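One way to read the output-path behavior described in this commit is the resolution order sketched below. The ordering (opt-out first, then an explicit path, then the auto-generated directory under a base) is an interpretation of the commit message, and `get_output_dir` here is a standalone illustration rather than the PR's `_get_output_dir()`.

```python
import os
import time
import uuid
from typing import Optional


def get_output_dir(agent_cfg: Optional[dict]) -> Optional[str]:
    """Resolve the artifact directory, or None when saving is disabled."""
    cfg = agent_cfg or {}  # Guard against a missing config.
    if cfg.get("save_artifacts", True) is False:
        return None  # Explicit opt-out.
    explicit = cfg.get("artifact_output_path")
    if explicit:
        return explicit  # An explicit path is used as-is.
    # An empty or missing base_output_path falls back to "outputs".
    base = cfg.get("base_output_path") or "outputs"
    stamp = time.strftime("%Y%m%d_%H%M%S")
    suffix = uuid.uuid4().hex[:8]  # Distinguishes runs within the same second.
    return os.path.join(base, f"article_factcheck_{stamp}_{suffix}")
```

The function only computes a path; creating the directory is left to the caller.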
- Remove quick_start_article_fact_checking.md (redundant with
  article_fact_checking_guide.md Quick Start section)
- Trim agent_architecture.md from 1055 to 598 lines by removing
  Implementation Patterns, Configuration, and Examples sections
  (all fully covered in agent_development_guide.md)
- Update agent_development_guide.md: refresh _get_output_dir pattern
  to show new three-priority chain; update test count 82->88
- Fix 5 outdated references in article_fact_checking_guide.md from
  'only when output_path is set' to reflect new auto-save default
- Stage dingo/model/llm/agent/__init__.py (previously uncommitted)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Trim module and class docstrings, removing content duplicated between the two
- Add _write_jsonl_file() helper to deduplicate identical JSONL save logic
- Replace manual dict-counting with collections.Counter
- Remove redundant hasattr/getattr double-check in _get_system_prompt()
- Replace decorative === section dividers with concise --- headers
- Extract intermediate variable in example's reason display

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
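The two refactors that matter most here, a shared JSONL writer and `collections.Counter` in place of manual counting, look roughly like this. The function names and the `verdict` key are illustrative assumptions, not copies of the PR's helpers.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Iterable, List


def write_jsonl_file(path: str, records: Iterable[dict]) -> None:
    """Write each record as one JSON object per line."""
    with Path(path).open("w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def count_verdicts(verifications: List[dict]) -> Counter:
    """Tally verdict labels; a missing verdict is counted as UNKNOWN."""
    return Counter(v.get("verdict", "UNKNOWN") for v in verifications)
```

A `Counter` returns 0 for labels it has never seen, which removes the need for the `if key not in counts` branches that manual dict-counting requires.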
…syncio

Replace single-agent sequential path (~669s for 15 claims) with a
two-phase async architecture targeting 100-150s (4-6x speedup):

  Phase 1: ClaimsExtractor direct call via run_in_executor (~30s)
  Phase 2: asyncio.gather + Semaphore(max_concurrent=5) parallel
            mini-agents, one per claim (~80-120s)
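
Phase 2 follows a common asyncio pattern: launch one task per claim with `asyncio.gather` and bound the number in flight with a `Semaphore`. The sketch below shows only that pattern; `verify_claim` is a placeholder that sleeps instead of calling an agent, and the verdict labels are made up.

```python
import asyncio
from typing import List


async def verify_claim(claim: str) -> dict:
    """Placeholder for one per-claim mini-agent call."""
    await asyncio.sleep(0.01)  # Stands in for network and LLM latency.
    return {"claim": claim, "verdict": "UNVERIFIABLE"}


async def verify_all(claims: List[str], max_concurrent: int = 5) -> List[dict]:
    """Verify claims in parallel with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(claim: str) -> dict:
        async with semaphore:
            return await verify_claim(claim)

    # return_exceptions=True hands back failures as values instead of raising
    # the first one, so each claim can be reported individually.
    results = await asyncio.gather(
        *(bounded(c) for c in claims), return_exceptions=True
    )
    return [
        {"claim": c, "verdict": "ERROR", "error": str(r)}
        if isinstance(r, BaseException)
        else r
        for c, r in zip(claims, results)
    ]


results = asyncio.run(verify_all([f"claim {i}" for i in range(15)]))
```

`asyncio.gather` returns results in the order the awaitables were passed, so `results` lines up with the input claims even though completion order varies.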

Changes:
- agent_wrapper: add async_invoke_and_format(); extract shared
  _format_agent_result() and _make_error_result() helpers to
  eliminate duplication between sync/async invoke paths
- agent_article_fact_checker: rewrite eval() with asyncio.run()
  bridge and ThreadPoolExecutor fallback; add _async_eval(),
  _async_extract_claims(), _async_verify_single_claim(), and
  aggregation helpers; add PER_CLAIM_VERIFICATION_PROMPT and
  max_concurrent_claims=5 config option
- Fix pre-existing NoneType bug in _build_eval_detail_from_verification
- Add test_async_article_fact_checker.py (16 tests, all passing)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
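The `asyncio.run()` bridge with a ThreadPoolExecutor fallback addresses a well-known constraint: `asyncio.run()` raises if the calling thread already has a running event loop. A minimal sketch of such a bridge, with hypothetical names (`eval_sync`, a stub `_async_eval`) and no claim to match the PR's implementation:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def _async_eval(article: str) -> dict:
    """Stub standing in for the real async pipeline."""
    await asyncio.sleep(0)
    return {"length": len(article)}


def eval_sync(article: str) -> dict:
    """Run the async pipeline from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run() is safe.
        return asyncio.run(_async_eval(article))
    # A loop is already running (for example in a notebook), where
    # asyncio.run() would raise. Run the coroutine on a fresh loop
    # in a worker thread and wait for its result.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, _async_eval(article)).result()
```

The fallback blocks the caller until the worker thread finishes, which keeps the synchronous `eval()` contract at the cost of stalling the outer loop for the duration of the call.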
@seancoding-day seancoding-day self-assigned this Mar 4, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a robust ArticleFactChecker agent, significantly enhancing the platform's ability to perform autonomous, article-level fact-checking. The agent employs a novel two-phase asynchronous architecture, allowing for efficient parallel verification of claims extracted from long-form articles. It integrates new specialized tools, ClaimsExtractor and ArxivSearch, alongside existing web search capabilities, to systematically verify factual statements, institutional attributions, and other claim types. This feature provides a comprehensive, structured report of verification findings, including detailed evidence and sources, and saves intermediate artifacts for transparency and debugging.

Highlights

  • Introduced ArticleFactChecker Agent: A new agent designed for comprehensive, article-level fact-checking, utilizing a two-phase asynchronous architecture for parallel claim verification.
  • Added New Fact-Checking Tools: Integrated ClaimsExtractor for LLM-based claim identification and ArxivSearch for academic paper verification.
  • Enhanced AgentWrapper for Async Operations: Updated the core AgentWrapper to support asynchronous invocation, enabling parallel processing of claims.
  • Expanded Agent Documentation: New guides and architectural overviews were added to detail the agent framework, its patterns, and specific usage of the ArticleFactChecker.
Changelog
  • dingo/model/llm/agent/__init__.py
    • Imported and exposed the new ArticleFactChecker class.
  • dingo/model/llm/agent/agent_article_fact_checker.py
    • Implemented the ArticleFactChecker agent, featuring a two-phase asynchronous architecture for parallel claim verification, robust JSON parsing, and artifact saving.
  • dingo/model/llm/agent/agent_wrapper.py
    • Added an asynchronous invocation method (async_invoke_and_format) and refactored agent result formatting into a shared static method.
  • dingo/model/llm/agent/tools/arxiv_search.py
    • Added the ArxivSearch tool, enabling academic paper search by ID, DOI, title, or author, with rate limiting and result formatting.
  • dingo/model/llm/agent/tools/claims_extractor.py
    • Developed the ClaimsExtractor tool, which uses an LLM to extract atomic, verifiable claims from long-form text, supporting 8 claim types.
  • docs/agent_architecture.md
    • Updated the agent architecture overview to reflect the new ArticleFactChecker pattern, its workflow, and the newly added tools.
  • docs/agent_development_guide.md
    • Expanded the agent development guide to introduce a new "Agent-First with Context Tracking" pattern and updated the decision tree for choosing agent implementation patterns.
  • docs/article_fact_checking_guide.md
    • Added a new comprehensive guide detailing the usage, configuration, and workflow of the ArticleFactChecker agent.
  • examples/agent/agent_article_fact_checking_example.py
    • Provided a new example script demonstrating the usage of the ArticleFactChecker agent for comprehensive article fact-checking.
  • setup.cfg
    • Added new pytest markers (slow, external, integration) to categorize tests.
  • test/data/blog_article.md
    • Added a new markdown file containing a blog article for testing purposes.
  • test/data/blog_article_full.md
    • Added a new markdown file containing a full blog article for comprehensive testing.
  • test/data/news_article_excerpt.md
    • Added a new markdown file containing a news article excerpt for testing.
  • test/data/product_review_excerpt.md
    • Added a new markdown file containing a product review excerpt for testing.
  • test/scripts/model/llm/agent/test_agent_fact_check.py
    • Updated the test suite to include patches for AgentWrapper and _check_langchain_available.
  • test/scripts/model/llm/agent/test_article_fact_checker.py
    • Added a new test suite for the ArticleFactChecker agent, covering basic functionality, result structure, file saving, error handling, and verdict consistency.
  • test/scripts/model/llm/agent/test_article_fact_checker_news.py
    • Added a new test suite specifically for ArticleFactChecker's handling of news articles, including temporal, attribution, and monetary claims.
  • test/scripts/model/llm/agent/test_article_fact_checker_product.py
    • Added a new test suite specifically for ArticleFactChecker's handling of product reviews, including technical, comparative, and monetary claims.
  • test/scripts/model/llm/agent/test_async_article_fact_checker.py
    • Added a new test suite focusing on the asynchronous aspects of ArticleFactChecker, including parallel execution, JSON parsing robustness, and error handling.
  • test/scripts/model/llm/agent/test_tool_registry.py
    • Modified the test setup and teardown to ensure the tool registry state is properly managed between tests.
  • test/scripts/model/llm/agent/tools/test_arxiv_search.py
    • Added a new test suite for the ArxivSearch tool, covering configuration, pattern detection, execution, rate limiting, and error handling.
  • test/scripts/model/llm/agent/tools/test_claims_extractor.py
    • Added a new test suite for the ClaimsExtractor tool, including tests for missing API keys.
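
The changelog mentions that ArxivSearch accepts an ID, DOI, title, or author and that its tests cover pattern detection. As an illustration of what such detection can look like, the sketch below classifies a query with regular expressions based on the public arXiv identifier and DOI formats. The function name and return labels are hypothetical, and the real tool may use different rules.

```python
import re

# New-style arXiv IDs such as 2301.12345 or 2301.12345v2.
ARXIV_NEW = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")
# Old-style IDs such as hep-th/9901001 or math.GT/0309136.
ARXIV_OLD = re.compile(r"^[a-z-]+(\.[A-Z]{2})?/\d{7}(v\d+)?$")
# DOIs start with "10.", then a registrant code, a slash, and a suffix.
DOI = re.compile(r"^10\.\d{4,9}/\S+$")


def detect_query_type(query: str) -> str:
    """Classify a query as 'arxiv_id', 'doi', or free 'text'."""
    q = query.strip()
    if q.lower().startswith("arxiv:"):
        q = q[len("arxiv:"):]  # Accept the common "arXiv:" prefix.
    if ARXIV_NEW.match(q) or ARXIV_OLD.match(q):
        return "arxiv_id"
    if DOI.match(q):
        return "doi"
    # Titles and author names cannot be told apart by pattern alone.
    return "text"
```

Anything that is not an identifier falls through to `text`, leaving the title-versus-author decision to the caller or to the search backend.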

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This is a substantial and well-executed feature addition, introducing the ArticleFactChecker agent. The new two-phase asynchronous architecture is a significant performance improvement, and the new tools (claims_extractor, arxiv_search) are well-designed with robust features like flexible parsing and rate limiting. The accompanying documentation is comprehensive and very helpful. I have a couple of suggestions for improvement regarding some leftover code and test coverage to further enhance the quality of this contribution.

Note: Security Review did not run due to the size of the PR.

@seancoding-day
Collaborator Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new ArticleFactChecker agent with a two-phase asynchronous architecture for parallel claim verification. The changes include the agent itself, new tools for arXiv search and claims extraction, extensive documentation, and a comprehensive test suite. The implementation demonstrates robust patterns for handling LLM interactions, including detailed prompts and fallback parsing logic. My review identifies one potential issue regarding state management in what appears to be a legacy code path.
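
The fallback parsing logic mentioned in this review is a common need when an LLM is asked for JSON but wraps it in prose or a Markdown fence. A minimal sketch of the general idea, using a hypothetical `parse_json_lenient` helper that is not taken from the PR:

```python
import json
import re
from typing import Optional

FENCE = "`" * 3  # Three backticks, built at runtime to keep this example readable.
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def parse_json_lenient(text: str) -> Optional[dict]:
    """Try progressively looser ways to read a JSON object from model output."""
    candidates = [text]  # 1. The whole reply.
    match = FENCED_BLOCK.search(text)
    if match:
        candidates.append(match.group(1))  # 2. Contents of a fenced block.
    start, end = text.find("{"), text.rfind("}")
    if 0 <= start < end:
        candidates.append(text[start:end + 1])  # 3. Outermost brace span.
    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None  # Nothing parseable; the caller decides how to report it.
```

Returning `None` instead of raising lets the caller turn an unparseable reply into an explicit error result for that claim.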

Note: Security Review did not run due to the size of the PR.

@e06084 e06084 merged commit 1be2ea1 into MigoXLab:dev Mar 4, 2026
2 checks passed