SchemaView: format, lint autofix, add typing + docs #389

Merged

kevinschaper merged 3 commits into linkml:main from ialarmedalien:schemaview_tidy

May 9, 2025

Collaborator

ialarmedalien commented May 6, 2025

upstream_repo: dalito/linkml
upstream_branch: issue2578-fix-uri-in-snapshot

Note: upstream tests will fail until linkml/linkml#2648 is merged, which relies on a new release of linkml-runtime being made.

I have a minor change to make to the SchemaView package so wanted to spruce up SV and its tests first.

I ran ruff format and ruff check --fix to fix "safe" linting errors. See my local ruff config.

The changes look rather daunting, but they are all fairly simple. I recommend viewing the commits separately if you have any worries.

commit 1: run ruff format on the files. Alter schemaview.py to import specific classes from linkml_runtime.linkml_model.meta instead of using a * import

commit 2: run the ruff linter auto fixes

commit 3: add in missing typing and basic doc strings. A couple of minor fixes, which I will add comments to.

There is an existing PR that runs formatting and linting auto fixes on the whole codebase, so I have not put in any ruff infrastructure here.

ialarmedalien added 3 commits

May 6, 2025 12:11

Use specific imports from linkml_runtime.linkml_model.meta rather than * import

Run ruff format on schemaview and schemaview tests

556b035


          Run ruff check --fix to fix simple linter errors

ef9b902


          Adding typing and basic docstring where missing

c7e1c3a

codecov bot commented May 6, 2025

Codecov Report

Attention: Patch coverage is 83.60656% with 30 lines in your changes missing coverage. Please review.

Project coverage is 63.79%. Comparing base (00abef0) to head (c7e1c3a).
Report is 4 commits behind head on main.

Files with missing lines	Patch %	Lines
linkml_runtime/utils/schemaview.py	82.85%	23 Missing and 7 partials ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #389   +/-   ##
=======================================
  Coverage   63.79%   63.79%           
=======================================
  Files          63       63           
  Lines        8946     8938    -8     
  Branches     2587     2584    -3     
=======================================
- Hits         5707     5702    -5     
+ Misses       2633     2629    -4     
- Partials      606      607    +1


ialarmedalien commented

linkml_runtime/index/object_index.py

               from collections.abc import Mapping, Iterator
-              from linkml_runtime import SchemaView
+              from linkml_runtime.utils.schemaview import SchemaView

Collaborator Author

ialarmedalien May 6, 2025

use full path for import (this was mostly to make it easier for me to find places where weird stuff was being imported from SchemaView)

ialarmedalien commented

linkml_runtime/utils/schemaview.py

		@@ -1,37 +1,68 @@
		"""SchemaView, a virtual schema layered on top of a schema plus its import closure."""

		from __future__ import annotations

Collaborator Author

ialarmedalien May 6, 2025

allows nicer ways of expressing typing (e.g. X | Y)
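
As a minimal sketch of what this import allows (the function `first_or_none` is made up for illustration and is not part of SchemaView): with postponed evaluation, annotations are stored as strings instead of being evaluated when the function is defined, so `X | Y` can be written in annotations even on interpreters where that expression is not valid at runtime.

```python
from __future__ import annotations


def first_or_none(items: list[str] | None) -> str | None:
    """Return the first item, or None when there is nothing to return."""
    if not items:
        return None
    return items[0]


# With the __future__ import, annotations are kept as plain strings and are
# not evaluated at function definition time.
print(first_or_none.__annotations__["return"])
```

This only covers annotations; an `X | Y` expression used outside an annotation (for example in an `isinstance` call or a type alias assignment) is still evaluated normally.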

ialarmedalien commented

linkml_runtime/utils/schemaview.py

                       if ordered_by in (OrderedBy.LEXICAL, OrderedBy.LEXICAL.value):
                           return self._order_lexically(elements)
-                      elif ordered_by in (OrderedBy.RANK, OrderedBy.RANK.value):
+                      if ordered_by in (OrderedBy.RANK, OrderedBy.RANK.value):

Collaborator Author

ialarmedalien May 6, 2025

no need for else or elif after a return
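
A small sketch of why the two forms behave the same. The `OrderedBy` enum and both functions below are simplified stand-ins written for this illustration, not the real SchemaView code: once a branch returns, control never reaches the next test, so `elif`/`else` add nesting without changing the result.

```python
from enum import Enum


class OrderedBy(Enum):
    # Hypothetical stand-in for the enum used in the diff above.
    LEXICAL = "lexical"
    RANK = "rank"
    PRESERVE = "preserve"


def describe_with_elif(ordered_by: OrderedBy) -> str:
    if ordered_by is OrderedBy.LEXICAL:
        return "sort by name"
    elif ordered_by is OrderedBy.RANK:  # redundant: the branch above already returned
        return "sort by rank"
    else:
        return "keep original order"


def describe_flat(ordered_by: OrderedBy) -> str:
    # Same logic with plain `if` statements and a final fall-through return.
    if ordered_by is OrderedBy.LEXICAL:
        return "sort by name"
    if ordered_by is OrderedBy.RANK:
        return "sort by rank"
    return "keep original order"
```

Both functions return identical results for every member of the enum; the flat version simply keeps the happy path at one indentation level.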

ialarmedalien commented

linkml_runtime/utils/schemaview.py

                               candidate = clist[i]
                               can_add = False
-                              if candidate.is_a is None:
+                              if candidate.is_a is None or candidate.is_a in [p.name for p in slist]:

Collaborator Author

ialarmedalien May 6, 2025

collapse / combine if conditions

ialarmedalien commented

linkml_runtime/utils/schemaview.py

                       classes = copy(self._get_dict(CLASSES, imports))
-                      classes = self.ordered(classes, ordered_by=ordered_by)
-                      return classes
+                      return self.ordered(classes, ordered_by=ordered_by)

Collaborator Author

ialarmedalien May 6, 2025

return directly, no need to assign to a variable

ialarmedalien commented

linkml_runtime/utils/schemaview.py

-                      slots = self.ordered(slots, ordered_by=ordered_by)
-                      return slots
+                      return self.ordered(slots, ordered_by=ordered_by)

Collaborator Author

ialarmedalien May 6, 2025

return directly

ialarmedalien commented

linkml_runtime/utils/schemaview.py

                       :return: all enums in schema view
                       """
-                      return self._get_dict(ENUMS, imports)
+                      return self.all_enums(imports)

Collaborator Author

ialarmedalien May 6, 2025

redirect to the non-deprecated function
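
A rough sketch of the delegation pattern, under stated assumptions: `TinyView` and `legacy_all_enums` are hypothetical names, and the standard-library `warnings` module is used here only to keep the example self-contained (schemaview.py itself imports a `deprecated` decorator). The point is that the deprecated method holds no logic of its own, so the two code paths cannot drift apart.

```python
import warnings


class TinyView:
    """Hypothetical stand-in for SchemaView, for illustration only."""

    def __init__(self, enums: dict[str, str]) -> None:
        self._enums = enums

    def all_enums(self) -> dict[str, str]:
        """Supported accessor: the single place where the lookup lives."""
        return dict(self._enums)

    def legacy_all_enums(self) -> dict[str, str]:
        """Deprecated alias: warn, then delegate to the supported method."""
        warnings.warn(
            "legacy_all_enums is deprecated; use all_enums",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.all_enums()
```

Callers of the old name keep working and see a `DeprecationWarning`, while any later fix to `all_enums` automatically applies to both entry points.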

ialarmedalien commented

linkml_runtime/utils/schemaview.py

-                      all_subsets = self.all_subsets(imports=imports)
-                      # {**a,**b} syntax merges dictionary a and b into a single dictionary, removing duplicates.
-                      return {**all_classes, **all_slots, **all_enums, **all_types, **all_subsets}
+                      return self.all_elements(imports)

Collaborator Author

ialarmedalien May 6, 2025

redirect to non-deprecated function

ialarmedalien commented

linkml_runtime/utils/schemaview.py

-                      return _closure(lambda x: self.class_parents(x, imports=imports, mixins=mixins, is_a=is_a),
-                                      class_name,
-                                      reflexive=reflexive, depth_first=depth_first)
+                      return _closure(

Collaborator Author

ialarmedalien May 6, 2025

reformatted

ialarmedalien commented

linkml_runtime/utils/schemaview.py

-                                      if not is_empty(v2):
-                                          v = v2
-                                          logger.debug(f'{v} takes precedence over {v2} for {induced_slot.name}.{metaslot_name}')
+                              elif metaslot_name in COMBINE:

Collaborator Author

ialarmedalien May 6, 2025

else + if => elif

ialarmedalien commented

linkml_runtime/utils/schemaview.py

                       :return:
                       """
-                      self.schema.subsets[subset.name] = type
+                      self.schema.subsets[subset.name] = subset

Collaborator Author

ialarmedalien May 6, 2025

oops
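
One plausible reading of why this slipped through unnoticed, sketched with hypothetical stand-in classes (`TinySchema`, `Subset`) rather than the real LinkML model: since no local variable called `type` exists in the function, the name resolves to Python's built-in `type`, so nothing raises and the wrong object is stored silently.

```python
class TinySchema:
    """Hypothetical stand-in for a schema object that keeps subsets by name."""

    def __init__(self) -> None:
        self.subsets: dict[str, object] = {}


class Subset:
    """Hypothetical stand-in for a subset definition."""

    def __init__(self, name: str) -> None:
        self.name = name


def add_subset_buggy(schema: TinySchema, subset: Subset) -> None:
    # `type` is not defined locally, so it resolves to the builtin `type`
    # class: no NameError is raised and the builtin is stored instead.
    schema.subsets[subset.name] = type


def add_subset_fixed(schema: TinySchema, subset: Subset) -> None:
    # Store the object that was actually passed in.
    schema.subsets[subset.name] = subset
```

After the buggy call, the entry under the subset's name is the `type` builtin itself; after the fixed call it is the `Subset` instance. A test that only checks for the key's presence would pass in both cases.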

ialarmedalien commented

linkml_runtime/utils/schemaview.py

    
                       :return:
                       """
-                      self.schema.types[type.name] = type
+                      self.schema.types[type_def.name] = type_def

Collaborator Author

ialarmedalien May 6, 2025

type is a built-in name in python, so avoid shadowing it

ialarmedalien commented

tests/test_utils/test_schemaview.py

-                  schema_path = os.path.join(INPUT_DIR, "schemaview_is_inlined.yaml")
-                  sv = SchemaView(schema_path)
-                  cases = [
+              @pytest.mark.parametrize(

Collaborator Author

ialarmedalien May 6, 2025

use parametrize instead of iterating through a list

ialarmedalien commented

tests/test_utils/test_schemaview.py

               @pytest.fixture
-              def view():
+              def schema_view_with_imports() -> SchemaView:

Collaborator Author

ialarmedalien May 6, 2025

clearer name

ialarmedalien commented

tests/test_utils/test_schemaview.py

Comment on lines +60 to +61

		@pytest.fixture(scope="session")
		def schema_view_attributes() -> SchemaView:

Collaborator Author

ialarmedalien May 6, 2025

add a fixture for this since it's used in several tests

ialarmedalien commented

tests/test_utils/test_schemaview.py

-                  """
-                  view = SchemaView(os.path.join(INPUT_DIR, "attribute_edge_cases.yaml"))
-                  expected = [
+              @pytest.mark.parametrize(

Collaborator Author

ialarmedalien May 6, 2025

parametrize instead of iterating through a list

Contributor

Silvanoc commented May 7, 2025

@ialarmedalien do I understand it correctly? The conversations that you've opened in this PR are mostly justifying some of the changes, right? So a reviewer that holds those changes as meaningful, can simply comment "makes sense" or so or "thumb-up" your comment and close those conversations, right?

Contributor

Silvanoc commented May 7, 2025

@ialarmedalien I'm trying to run "ruff" myself just to be sure that I don't look at changes directly applied by the tool and I'm failing. I'm getting an error message.

[Edit]: I've seen that linkml does not declare ruff as a project-level dependency but only within the tox configuration, and tox takes care of installing it in the virtualenv it creates for the run.

Doing so I've realized that we are not specifying ruff as a development requirement. The issue was with the version of ruff I have installed locally (a prehistoric v0.0.289). I've added ruff to the dev dependencies (which gives me v0.11.8), and now it runs.

I've created this PR to your branch to add it. It might be a good idea to have separate commits to add whatever other ruff linters you think would be meaningful for the project.

Contributor

Silvanoc commented May 7, 2025

@ialarmedalien I've realized that you are running ruff with different linters than those specified in the pyproject.toml. I could at least identify isort, and I've therefore added it to my PR. Could you please add any others that you are using?

I'm trying to review this PR, but I'm having difficulty separating the changes introduced by ruff from those you made by hand. According to the commit messages, it looks as if you didn't make any manual changes in the first two commits, but I'm failing to reproduce them. Could you please provide the commands/configurations that you used?

[Edit]: I've just found out that you are running ruff check --select=D --fix --unsafe-fixes. We have some ruff configuration in the linkml repo which seems to be a bit outdated (notice the deprecation notes when running make lint-fix). Perhaps it would make sense to clean-up both in that repo and here.

These are the commands to reproduce my environment:

$ git clone --quiet https://github.com/linkml/linkml-runtime
$ cd linkml-runtime
$ git remote add silvanoc https://github.com/Silvanoc/linkml-runtime
$ git fetch --quiet silvanoc
$ git cherry-pick silvanoc/add-ruff-as-dev-dependency~1
[main a2a0a4e] chore: add ruff to dev dependencies
 Date: Wed May 7 12:41:47 2025 +0200
 2 files changed, 31 insertions(+), 1 deletion(-)
$ git cherry-pick silvanoc/add-ruff-as-dev-dependency
[main b376d8e] chore(lint): add import sorting
 Date: Wed May 7 13:15:32 2025 +0200
 1 file changed, 2 insertions(+), 1 deletion(-)
$ poetry install --quiet --only=dev

If I let ruff apply its own fixes this is what I get:

$ poetry run ruff check --fix linkml_runtime/utils/schemaview.py --config pyproject.toml
Found 2 errors (2 fixed, 0 remaining).
$ git diff
diff --git i/linkml_runtime/utils/schemaview.py w/linkml_runtime/utils/schemaview.py
index 1be70a3..3c8145e 100644
--- i/linkml_runtime/utils/schemaview.py
+++ w/linkml_runtime/utils/schemaview.py
@@ -1,25 +1,25 @@
+import collections
+import logging
 import os
 import sys
 import uuid
-import logging
-import collections
-from functools import lru_cache
-from copy import copy, deepcopy
+import warnings
 from collections import defaultdict, deque
+from collections.abc import Mapping
+from copy import copy, deepcopy
+from enum import Enum
+from functools import lru_cache
 from pathlib import Path, PurePath
 from typing import Optional, TypeVar
-from collections.abc import Mapping
-import warnings
-from urllib.parse import urlparse
 
-from linkml_runtime.utils.namespaces import Namespaces
 from deprecated.classic import deprecated
-from linkml_runtime.utils.context_utils import parse_import_map, map_import
-from linkml_runtime.utils.formatutils import camelcase, is_empty, sfx, underscore
-from linkml_runtime.utils.pattern import PatternResolver
-from linkml_runtime.linkml_model.meta import *
+
 from linkml_runtime.exceptions import OrderingError
-from enum import Enum
+from linkml_runtime.linkml_model.meta import *
+from linkml_runtime.utils.context_utils import map_import, parse_import_map
+from linkml_runtime.utils.formatutils import camelcase, is_empty, sfx, underscore
+from linkml_runtime.utils.namespaces import Namespaces
+from linkml_runtime.utils.pattern import PatternResolver
 
 logger = logging.getLogger(__name__)

The same can be done for the tests file and also with ruff format.

Collaborator Author

ialarmedalien commented May 7, 2025

do I understand it correctly? The conversations that you've opened in this PR are mostly justifying some of the changes, right? So a reviewer that holds those changes as meaningful, can simply comment "makes sense" or so or "thumb-up" your comment and close those conversations, right?

The comments are to explain what I did, yes, to make life easier for the reviewer. There's no need to respond unless you particularly want to! 🙂

Contributor

Silvanoc commented May 7, 2025

IMO the whole linting, ruff, tox,... set-up in both linkml (especially there) and linkml-runtime is a mess. I would seriously consider cleaning it up...

Looking at linkml, we can find ruff configurations in both pyproject.toml (which applies when running ruff with Poetry) and tox.ini (which applies when running tox; no idea whether pyproject.toml is also considered...) 😵‍💫

Perhaps doing it "right" here first and using it as a pattern for the linkml repo would be the easiest approach...

Collaborator Author

ialarmedalien commented May 7, 2025

@Silvanoc re: ruff linting: I think it would be better if you made a direct PR to the linkml-runtime repo to update ruff and add it to the dev dependencies so that it gets the attention of the linkml-runtime maintainers.

The main reason I didn't add my config into this PR is that I have pretty much all the linters enabled locally, which I would not necessarily suggest for a repo like linkml-runtime.

This is my local config:

[tool.ruff]
line-length = 120
target-version = "py39"

[tool.ruff.lint]
select = [
    # core
    "F", # Pyflakes
    "E", # pycodestyle errors
    "W", # pycodestyle warnings
    "C90", # mccabe +
    "I", # isort
    "N", # pep8-naming
    "D", # pydocstyle
    "UP", # pyupgrade
    # extensions
    "YTT", # flake8-2020
    "ANN", # flake8-annotations
    "ASYNC", # flake8-async
    "S", # flake8-bandit
    "BLE", # flake8-blind-except
    "FBT", # flake8-boolean-trap
    "B", # flake8-bugbear
    "A", # flake8-builtins
    # "COM", # flake8-commas
    # "CPY", # flake8-copyright
    "C4", # flake8-comprehensions
    "DTZ", # flake8-datetimez
    "T10", # flake8-debugger
    # "DJ", # flake8-django
    "EM", # flake8-errmsg
    "EXE", # flake8-executable
    "FA", # flake8-future-annotations
    "ISC", # flake8-implicit-str-concat
    "ICN", # flake8-import-conventions
    "G", # flake8-logging-format
    "INP", # flake8-no-pep420
    "PIE", # flake8-pie
    "T20", # flake8-print
    "PYI", # flake8-pyi
    "PT", # flake8-pytest-style
    "Q", # flake8-quotes
    "RSE", # flake8-raise
    "RET", # flake8-return
    "SLF", # flake8-self
    "SLOT", # flake8-slots
    "SIM", # flake8-simplify
    "TID", # flake8-tidy-imports
    "TCH", # flake8-type-checking
    "INT", # flake8-gettext
    "ARG", # flake8-unused-arguments
    "PTH", # flake8-use-pathlib
    "TD", # flake8-todos
    "FIX", # flake8-fixme
    "ERA", # eradicate
    "PD", # pandas-vet
    "PGH", # pygrep-hooks
    "PL", # Pylint
    "TRY", # tryceratops
    "FLY", # flynt
    "NPY", # NumPy-specific rules
    "AIR", # Airflow
    "PERF", # Perflint
    "FURB", # refurb
    "LOG", # flake8-logging
    "RUF", # Ruff-specific rules
]

# Allow autofix for all enabled rules (when `--fix` is provided).
fixable = ["ALL"]
unfixable = []

# D203: one-blank-line-before-class (conflicts with D211)
# D213: multi-line-summary-second-line (conflicts with D212)
# E203: whitespace before ',', ';', or ':'
# E501: line length
# ISC001: conflicts with Ruff's formatter
# FBT001: boolean-typed positional argument
# FBT002: boolean default value for a positional argument
ignore = [
    "D203",
    "D213",
    "E203",
    "E501",
    "ISC001",
    "FBT001",
    "FBT002"
]

[tool.ruff.lint.per-file-ignores]
"tests/**/*.py" = ["S101"] # use of assert

[tool.ruff.lint.mccabe]
# Flag errors (`C901`) whenever the complexity level exceeds 15.
max-complexity = 15

I just ran ruff check --fix; I did not include the unsafe fixes.

I would love to format and lint auto-fix all the files in this repo, but there is an existing PR (#347) that does this and I didn't want to redo work that has already been done.

Collaborator Author

ialarmedalien commented May 7, 2025

IMO the whole linting, ruff, tox,... set-up in both linkml (especially there) and linkml-runtime is a mess. I would seriously consider cleaning it up...

I agree completely! I did some work on the linkml-map repo to introduce standardised ruff formatting and linting, which was fairly easy since the repo is low traffic. This repo would probably be the next good target before tackling the main linkml codebase.

Contributor

Silvanoc commented May 7, 2025

The main reason I didn't add my config into this PR is that I have pretty much all the linters enabled locally, which I would not necessarily suggest for a repo like linkml-runtime.

The problem without the ruff configuration (thanks for providing it) is that I wanted to look at this PR and was unable to separate the chaff (whatever changes ruff makes) from the wheat (the changes made by you).

Anyway, you can simply ignore/close my PR on your branch.

I would love to format and lint auto-fix all the files in this repo, but there is an existing PR (#347) that does this and I didn't want to redo work that has already been done.

True! I saw it in the past and forgot it! I'll have a look at it.

Collaborator Author

ialarmedalien commented May 7, 2025

@Silvanoc Apologies for unintentionally giving you a load of detective work to do in figuring out my ruff config! I would have included it as a comment on the PR if I had known.

If you look at the last commit of the PR, it contains the edits that I made manually or via VSCode's ruff plugin code fixing feature. Hopefully that will be a little easier to review...

Contributor

Silvanoc commented May 7, 2025

@Silvanoc Apologies for unintentionally giving you a load of detective work to do in figuring out my ruff config! I would have included it as a comment on the PR if I had known.

No worries.

If you look at the last commit of the PR, it contains the edits that I made manually or via VSCode's ruff plugin code fixing feature. Hopefully that will be a little easier to review...

I'm proposing here some changes to streamline linting and formatting. That way we have a clear definition of what we expect, we also enforce it and we enable developers to have it locally.

ialarmedalien force-pushed the schemaview_tidy branch from 4c7db80 to c7e1c3a

May 8, 2025 16:57

kevinschaper approved these changes

Contributor

kevinschaper left a comment

thank you for the big cleanup! (and especially catching that add_subset bug!)

kevinschaper merged commit b2206a4 into linkml:main

23 of 30 checks passed

Contributor

Silvanoc commented May 9, 2025

Hmm, I'm not sure this PR was expected to get merged before #347. @ialarmedalien can you please confirm?

