Skip to content

Latest commit

 

History

History
317 lines (247 loc) · 9.05 KB

File metadata and controls

317 lines (247 loc) · 9.05 KB

Substrate Project Updates

This file tracks all significant changes, additions, and milestones in the Substrate project.


🚀 Recent Updates

2025-10-25: Major data infrastructure upgrade - Comprehensive data management system with library science methodology


2025-10 - Data Infrastructure Revolution

Dataset Additions (5 New Authoritative Datasets)

Knowledge Worker Global Salaries (DS-00005)

  • Added: 2025-10-18
  • Coverage: Global compensation data for knowledge workers
  • Validation: 2025-10-25 validation check completed
  • Status: Active

Pulitzer Prize Winners - Arts & Letters (DS-00004)

  • Added: 2025-10-07
  • Coverage: 1918-2024 (249 winners across Poetry, Drama, General/Special awards)
  • Source: Wikidata
  • Update: 2025-10-25 refresh
  • Quality: High-quality, complete coverage of selected categories
  • Rationale: Focused on Arts & Letters for quality over breadth

Bay Area COVID-19 Wastewater Surveillance (DS-00003)

  • Added: 2025-10-07
  • Coverage: 2022-07-09 to 2025-08-02 (161 weekly data points)
  • Source: California Department of Public Health (CDPH)
  • Type: Leading health indicator (population-level surveillance)
  • Geographic: Statewide California serving as Bay Area proxy

U.S. Gross Domestic Product (DS-00002)

  • Added: 2025-10-16
  • Coverage: Annual 1929-2024 (96 years) + Quarterly Q1 1947 - Q2 2025 (314 quarters)
  • Source: Federal Reserve Economic Data (FRED) / Bureau of Economic Analysis (BEA)
  • Update: 2025-10-25 refresh
  • Significance: Primary measure of U.S. economic activity
  • Quality: Gold standard indicator with three-stage quarterly revision process
  • Research: Created through comprehensive 10-agent parallel research across Perplexity, Claude WebSearch, and Gemini

U.S. Consumer Price Index - Inflation (DS-00001)

  • Added: 2025-10-06
  • Coverage: 1947-2025 (945 monthly data points)
  • Source: Federal Reserve Economic Data (FRED) / Bureau of Labor Statistics (BLS)
  • Update: 2025-10-25 refresh
  • Type: CPI-U (Consumer Price Index for All Urban Consumers)
  • Significance: Gold standard inflation measure for the United States

Data Management System

Library Science Methodology Implementation

  • Eight-Dimension Source Evaluation Framework:

    1. Authority & Credibility
    2. Currency & Timeliness
    3. Accuracy & Reliability
    4. Coverage & Scope
    5. Objectivity & Bias
    6. Accessibility
    7. Documentation Quality
    8. Provenance & Citation
  • Metadata Standards: Dublin Core, MARC, SDMX, DDI

  • Source Classification: Primary, Secondary, Tertiary

  • Quality Assurance: Research-grade evaluation for each dataset

Technical Infrastructure

  • Runtime: Bun (TypeScript)
  • Auto-Discovery: Orchestrator automatically detects all DS-* directories
  • Update Scripts: TypeScript scripts with error handling, retry logic, rate limiting
  • Central Logging: Aggregated logs from all sources
  • Dashboard Generation: Auto-generated README with system health metrics
  • Git Integration: Automated version control
  • Data Formats: Raw JSON + Pipe-delimited (Substrate standard)

Documentation Suite

  • GETTING_STARTED.md - Complete setup and usage guide (536 lines)
  • PROJECT_SUMMARY.md - Technical architecture overview (475 lines)
  • QUICK_REFERENCE.md - Command cheatsheet
  • Data/README.md - Data directory documentation
  • Individual Data/*/UPDATES.md - Dataset-specific change logs
  • Individual Data/*/README.md - Dataset documentation with research methodology
  • README-LIBRARY-SCIENCE.md - Library science framework explanation

Migration from Data-Sources to Data

  • Completed: 2025-10-16
  • Reason: Simplified directory naming, clearer structure
  • Impact: All references updated, old directory removed
  • Documentation: MIGRATION-GUIDE.md and MIGRATION-COMPLETE.md created

2025-10 - GitHub Automation

GitHub Actions

Claude Code Review Workflow

  • Added: 2025-10-06
  • Updated: 2025-10-06
  • Function: Automated code review using Claude
  • Status: Active

Claude PR Assistant Workflow

  • Added: 2025-10-06
  • Updated: 2025-10-06
  • Function: Automated PR assistance and analysis
  • Status: Active

2025-10 - Community Contributions

Problems

Brazil - São Paulo Mental Health

  • Contributor: @ktfth
  • Added: 2025-10-06
  • PR: #30
  • Impact: Expanded geographic coverage of mental health issues

Various Problem Updates

  • Contributor: @DesertEaglePWN
  • Added: 2025-10-06
  • PR: #28, #31
  • Impact: Problem database refinement

Arguments

New Arguments

  • Contributor: @DesertEaglePWN
  • Added: 2025-10-06
  • PR: #31
  • Impact: Expanded argumentation framework

AI Understanding Argument

  • Contributor: @JaymanW
  • Added: 2024-09-25
  • PR: #21
  • Content: Arguments about AI comprehension and understanding

Values

Values Framework

  • Contributor: @karai114
  • Added: 2024-09-25
  • PR: #22
  • Impact: Established values taxonomy for Substrate

Claims

Initial Claims

  • Contributor: @ThatNateGuy
  • Added: 2024-04-25
  • PR: #13
  • Claims Added:
    • Anthropogenic climate change
    • Everettian Interpretation of Quantum Mechanics
    • Supernaturalism
    • Atavistic Model of Cancer
    • Holographic Universe theory

2024-07 - Project Foundation

Repository Consolidation

Single-Repo Structure

  • Date: 2024-07-27
  • Change: Moved from multi-repo to single-repo structure
  • Benefit: Easier management and contribution
  • Impact: Simplified development workflow

Core Components

Initial Object Types Created:

  • Problems
  • Solutions
  • Ideas
  • Plans
  • Experiments
  • Results
  • Models
  • Arguments
  • Claims
  • Values
  • Organizations
  • People
  • Projects
  • Funding Sources
  • Outcomes
  • Risks
  • Threats

Documentation

  • README.md with project vision
  • Introduction video (YouTube)
  • Blog post announcement

Project Milestones

Phase 1: Foundation (July 2024)

✅ Single-repo structure ✅ Core object types defined ✅ Basic directory structure ✅ Initial documentation ✅ Public launch

Phase 2: Community Building (Aug-Sep 2024)

✅ First community contributions ✅ Claims framework established ✅ Arguments and Values added ✅ Multi-contributor ecosystem

Phase 3: Data Infrastructure (Oct 2025)

✅ Five authoritative datasets ✅ Library science methodology ✅ Data management system ✅ TypeScript automation ✅ Comprehensive documentation ✅ GitHub Actions integration

Phase 4: Future (Planned)

  • Web-based contribution interface
  • Interactive data visualizations
  • API for programmatic access
  • Additional authoritative datasets
  • Cross-reference linking system
  • Evidence-based problem/solution matching
  • Community-driven dataset requests

Dataset Update History

For detailed dataset-specific updates, see:

  • Data/UPDATES.md - Central data directory updates
  • Data/US-GDP/UPDATES.md - GDP dataset updates
  • Data/US-Inflation/UPDATES.md - Inflation dataset updates
  • Data/Bay-Area-COVID-Wastewater/UPDATES.md - COVID wastewater updates
  • Data/Pulitzer-Prize-Winners/UPDATES.md - Pulitzer Prize updates

Breaking Changes

2025-10-16: Data-Sources → Data Directory Rename

  • Impact: Directory path changed from Data-Sources/ to Data/
  • Migration: Automatic, all references updated
  • Documentation: See Data/MIGRATION-GUIDE.md

Statistics

Project Scale (as of 2025-10-27)

Datasets:

  • Total: 5 authoritative datasets
  • Total Data Points: 1,700+ (GDP quarterly + monthly inflation + COVID weekly + Pulitzer winners + salary data)
  • Historical Coverage: 1918-2025 (107 years maximum span)
  • Geographic Coverage: Global (U.S.-focused with expanding international data)

Documentation:

  • Lines of Markdown: 8,000+ lines
  • Lines of TypeScript: 1,000+ lines
  • Documentation Files: 25+ files

Community:

  • Contributors: 6+ community members
  • Pull Requests Merged: 10+
  • Issues Addressed: Multiple

Infrastructure:

  • GitHub Actions: 2 workflows
  • Update Scripts: TypeScript with Bun
  • Data Formats: CSV, JSON, Markdown, Pipe-delimited
  • Version Control: Full git integration

Acknowledgments

Major Contributors:

  • Daniel Miessler - Project creator and maintainer
  • @ThatNateGuy - Claims framework
  • @JaymanW - Arguments on AI understanding
  • @karai114 - Values framework
  • @DesertEaglePWN - Problems and Arguments updates
  • @ktfth - Brazil mental health problems

Special Thanks:

  • Jonathan Dunn - Similar goals and inspiration
  • Joel Parish - Structure wisdom
  • Joseph Thacker - Constant flow of ideas

How to Track Updates

Watch This File: UPDATES.md for project-wide changes Watch Data Updates: Data/UPDATES.md for dataset-specific changes Watch GitHub: Releases and commit history Watch Individual Datasets: Each dataset has its own UPDATES.md file


Last Updated: 2025-10-27 Update Frequency: As changes occur Format: Reverse chronological (newest first)