[Data Quality] 23 Cyclic rdfs:subClassOf Relationships + 9 Orphan Nodes in OntoMathPRO v8 #3

@park-kyungchan

Summary

During our migration of OntoMathPRO v8 to a Neo4j graph database, we identified 32 data quality issues:

  • 23 cyclic rdfs:subClassOf relationships (causing circular hierarchies)
  • 9 orphan nodes (missing parent relationships)

Impact: These issues prevent loading the class hierarchy directly as a DAG (Directed Acyclic Graph), which hierarchy traversal in a graph database and ontology reasoning systems both rely on.


Issue #1: Cyclic rdfs:subClassOf Relationships (23 cycles)

Severity

HIGH - Prevents proper hierarchy traversal and reasoning

Description

Cycles occur when rdfs:subClassOf relationships form circular paths, violating the DAG property required for proper ontology hierarchies.

Example cycle:

  • A rdfs:subClassOf B
  • B rdfs:subClassOf C
  • C rdfs:subClassOf A ← Cycle!
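A minimal sketch of how such a cycle can be detected outside of Neo4j, assuming the `rdfs:subClassOf` statements have already been parsed into a plain dictionary that maps each class to its declared superclasses. The function name `find_cycle` and the toy data are illustrative and not part of our migration code.

```python
from typing import Dict, List, Optional

def find_cycle(parents: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one subClassOf cycle as a closed path, or None if the graph is a DAG."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    color: Dict[str, int] = {}
    stack: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        stack.append(node)
        for parent in parents.get(node, []):
            state = color.get(parent, WHITE)
            if state == GRAY:
                # Back edge: the parent is already on the current path.
                return stack[stack.index(parent):] + [parent]
            if state == WHITE:
                found = visit(parent)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for start in list(parents):
        if color.get(start, WHITE) == WHITE:
            found = visit(start)
            if found:
                return found
    return None

# Toy hierarchy from the example above: A -> B -> C -> A
edges = {"A": ["B"], "B": ["C"], "C": ["A"]}
print(find_cycle(edges))
```

The search is recursive, so a very deep hierarchy might need an iterative variant; it also stops at the first cycle rather than enumerating all of them.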

Detected Major Cycles

  1. E34 Cycle (Length: 5)

    • Path: E34 → E1660 → E4830 → E5122 → E6214 → E34
    • Concepts: Mathematical knowledge object chain
  2. E2844 Cycle (Length: 3)

    • Path: E2844 → E1660 → E34 → E2844
    • Concepts: Element of mathematical analysis chain
  3. Matrix Cycle (Length: 3)

    • Path: MatrixOperation → SquareMatrix → DiagonalMatrix → MatrixOperation

Plus 20 more cycles (mostly 2-node bidirectional relationships)

Reproduction Steps

Using Protégé:

  1. Open `ontomathpro_v8.owl` in Protégé
  2. Select "Tools" → "Reasoner" → "HermiT"
  3. Run "Start Reasoner"
  4. Navigate to E34 class
  5. Expand "SubClass Of" hierarchy
  6. Observe circular reference

Using Neo4j Cypher (after import):
```cypher
// Match closed paths along GENERALIZES edges.
// The unbounded * can be slow on large graphs.
MATCH path = (n:ObjectType)-[:GENERALIZES*]->(n)
RETURN [node IN nodes(path) | node.name] AS cycle_path,
       length(path) AS cycle_length
ORDER BY cycle_length DESC
```
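Because the pattern above can match from any node on a cycle, the same cycle may come back once per starting node. A small post-processing sketch, assuming the `cycle_path` values have been fetched into Python lists and that each cycle visits a node only once; the helper names and the sample rows are hypothetical, not actual query output.

```python
from typing import Iterable, List, Set, Tuple

def canonical_cycle(path: List[str]) -> Tuple[str, ...]:
    """Drop the repeated end node and rotate so the smallest name comes first."""
    nodes = path[:-1] if len(path) > 1 and path[0] == path[-1] else list(path)
    pivot = nodes.index(min(nodes))
    return tuple(nodes[pivot:] + nodes[:pivot])

def unique_cycles(paths: Iterable[List[str]]) -> Set[Tuple[str, ...]]:
    """Collapse rotations of the same cycle into a single entry."""
    return {canonical_cycle(p) for p in paths}

# Three rotations of the E2844 cycle, as the query might report them.
rows = [
    ["E2844", "E1660", "E34", "E2844"],
    ["E1660", "E34", "E2844", "E1660"],
    ["E34", "E2844", "E1660", "E34"],
]
print(unique_cycles(rows))
```

Counting the deduplicated set rather than the raw rows gives the number of distinct cycles.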

Recommended Fix

For the E34 cycle specifically, we recommend removing the E34 → E1660 relationship:

Rationale:

  • "Mathematical knowledge object" (E34) should NOT be a subclass of "Value" (E1660)
  • Counter-examples: Theorem, Operator, Formula are not values
  • Keeping E1660 → E34 (Value is-a Mathematical knowledge object) is semantically correct

General approach:

  1. Analyze semantic correctness of each rdfs:subClassOf in the cycle
  2. Remove the weakest relationship (least semantically justified)
  3. Re-validate hierarchy
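The re-validation step can be sketched as follows, assuming the hierarchy is available as a set of `(subclass, superclass)` pairs. The edges below are read off the E34 cycle path listed earlier in this issue; `is_acyclic` is an illustrative helper using Kahn's algorithm, not an existing tool.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]  # (subclass, superclass)

def is_acyclic(edges: Iterable[Edge]) -> bool:
    """Kahn's algorithm: a DAG can be fully consumed in topological order."""
    edge_set = set(edges)
    nodes = {n for edge in edge_set for n in edge}
    indegree: Dict[str, int] = {n: 0 for n in nodes}
    successors: Dict[str, List[str]] = {n: [] for n in nodes}
    for sub, sup in edge_set:
        successors[sub].append(sup)
        indegree[sup] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    consumed = 0
    while queue:
        node = queue.popleft()
        consumed += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    # Nodes left unconsumed sit on, or behind, a cycle.
    return consumed == len(nodes)

e34_cycle = {
    ("E34", "E1660"), ("E1660", "E4830"), ("E4830", "E5122"),
    ("E5122", "E6214"), ("E6214", "E34"),
}
print(is_acyclic(e34_cycle))                       # cycle still present
print(is_acyclic(e34_cycle - {("E34", "E1660")}))  # after removing one edge
```

Running the check after each removal confirms that the chosen edge actually breaks the cycle and shows whether other cycles remain.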

Issue #2: Orphan Nodes (9 nodes)

Severity

MEDIUM - Reduces hierarchy completeness

Description

9 nodes have no parent relationships due to encoding/naming mismatches in the OWL file.
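A sketch of how such orphans can surface, assuming a set of declared class names and a mapping from each class to the parent names it references. A parent reference counts only if it resolves to a declared class, so a spelling mismatch leaves the child without a parent. `find_orphans`, `RootClass`, and the sample data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

def find_orphans(
    classes: Set[str],
    parents: Dict[str, List[str]],
    roots: FrozenSet[str] = frozenset({"owl:Thing"}),
) -> List[str]:
    """Return classes (other than roots) with no parent that resolves to a declared class."""
    orphans = []
    for cls in sorted(classes):
        if cls in roots:
            continue
        resolved = [p for p in parents.get(cls, []) if p in classes]
        if not resolved:
            orphans.append(cls)
    return orphans

# The parent is declared with an en dash (U+2013) but referenced with a hyphen.
classes = {"RootClass", "Emden\u2013FowlerEquation", "EmdenEquation"}
parents = {
    "Emden\u2013FowlerEquation": ["RootClass"],
    "EmdenEquation": ["Emden-FowlerEquation"],
}
print(find_orphans(classes, parents, roots=frozenset({"RootClass"})))
```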

Detected Orphans

Emden-Fowler Family (5 nodes):

  • `Emden–FowlerEquation` (expected parent: E1897)
  • `Emden–FowlerTypeEquation`
  • `EmdenEquation`
  • `Thomas–FermiEquation`
  • `Euler–Poisson–DarbouxEquation`

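Root Cause: Encoding mismatch (en dash `–`, which shows up as mojibake beginning with `â` when its UTF-8 bytes are misdecoded, vs ASCII hyphen `-`)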

ElementMatrices Family (4 nodes):

  • `ElementMatriсesTheory` (Cyrillic 'с')
  • `DeterminantMatrix`
  • `MatrixOperation`
  • `TraceMatrix`

Root Cause: Cyrillic character in parent name (`с` instead of `c`)
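Look-alike characters such as these are hard to spot by eye. A small diagnostic sketch using only Python's standard `unicodedata` module lists every non-ASCII character in a name together with its code point; the helper name `non_ascii_chars` is illustrative.

```python
import unicodedata
from typing import List, Tuple

def non_ascii_chars(name: str) -> List[Tuple[int, str, str]]:
    """Return (index, code point, Unicode name) for each non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(name)
        if ord(ch) > 127
    ]

# Escapes are used so the suspicious characters are explicit in the source.
print(non_ascii_chars("ElementMatri\u0441esTheory"))
print(non_ascii_chars("Emden\u2013FowlerEquation"))
```

Running this over all class names gives a short list of identifiers to review by hand.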

Recommended Fix

  1. Normalize encoding: Replace the en dashes (`–`, U+2013) in class names with regular hyphens (`-`, U+002D)
  2. Fix Cyrillic characters: Replace Cyrillic 'с' (U+0441) with Latin 'c' (U+0063) in `ElementMatricesTheory`
  3. Add missing relationships:
    ```xml
    <owl:Class rdf:about="EmdenEquation">
      <rdfs:subClassOf rdf:resource="Emden-FowlerEquation"/>
    </owl:Class>
    ```
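Steps 1 and 2 could be scripted roughly as below. This is a sketch under the assumption that only class identifiers are normalized, since a blanket replacement would corrupt any legitimate Cyrillic labels; the replacement table and function names are illustrative.

```python
from typing import Dict, Iterable

# Characters to normalize in identifiers: en dash, em dash, Cyrillic small es.
REPLACEMENTS = {"\u2013": "-", "\u2014": "-", "\u0441": "c"}
_TABLE = str.maketrans(REPLACEMENTS)

def normalize_name(name: str) -> str:
    """Return the identifier with look-alike characters replaced by ASCII."""
    return name.translate(_TABLE)

def rename_plan(names: Iterable[str]) -> Dict[str, str]:
    """Map each identifier that would change to its normalized form."""
    plan = {}
    for name in names:
        fixed = normalize_name(name)
        if fixed != name:
            plan[name] = fixed
    return plan

names = ["Emden\u2013FowlerEquation", "ElementMatri\u0441esTheory", "TraceMatrix"]
for old, new in rename_plan(names).items():
    print(f"{old!r} -> {new!r}")
```

Reviewing the rename plan before applying it helps catch cases where a normalized name would collide with an existing class.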

Impact Analysis

Current State

  • Total Classes: 4,052
  • With cycles: 23 classes affected
  • Orphaned: 9 classes
  • Effective completeness: ~99.2%
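One reading that reproduces the completeness figure treats each of the 23 cycle findings and 9 orphans as a single affected class out of the 4,052 total. The arithmetic, as a quick check:

```python
total_classes = 4052
affected = 23 + 9  # cycle findings + orphans, counted once each (assumption)

completeness = (total_classes - affected) / total_classes
print(f"{completeness:.1%}")
```

If a cycle touches several classes, the number of affected classes is higher and the figure correspondingly lower.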

Consequences

  • ❌ Cannot be loaded into Neo4j as an acyclic hierarchy without manual fixes
  • ❌ Reasoners may produce incorrect inferences
  • ❌ Hierarchy visualization tools fail
  • ❌ SPARQL queries return incomplete results

Full Details

For complete analysis including all 23 cycles, reproduction scripts, and detailed recommendations, see our full report:

Repository: [Our internal analysis repository]
Report File: `palantir/docs/ontomathpro_issues_report.md`


Environment

  • OntoMathPRO Version: v8 (`ontomathpro_v8.owl`)
  • Detection Method: Neo4j graph database migration + Python OWL parsing
  • Analysis Date: 2025-11-08
  • Reporter: Math Ontology Migration Team

We're happy to provide additional details or collaborate on fixes. Thank you for maintaining this valuable resource!
