December schema release v1.15.0 #423
Merged
Conversation
Fixes transposed text mentioned in #409.
- Configure uv workspace
- Add Makefile for cross-package testing
- Add basic README with project overview
- Add .gitignore
- Add uv.lock for reproducible dependencies
- Implement model registration system infrastructure
- Add geometry wrapper with Shapely integration
- Support dual serialization (Shapely for Python, GeoJSON for JSON)
- Introduce OvertureFeature as the base class for all features
Constraint-based validations (a minimal sketch follows the list):
- Base constraint classes (BaseConstraint, StringConstraint, CollectionConstraint)
- String pattern validation (language tags, country codes, ISO dates, JSON pointers)
- Collection validation (uniqueness, size constraints, composite uniqueness)
- Numeric constraints (confidence scores, zoom levels, non-negative values)
- Conditional validation (required-if, mutually exclusive, at-least-one-of)
- Mixin-based validation system with decorators for model-level validation
- JSON Schema generation using constraint metadata
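The shape of that constraint system can be pictured roughly as follows. This is a sketch for orientation only: the class names `BaseConstraint` and `StringConstraint` come from the list above, but the method signatures and the country-code example are assumptions, not the actual overture-schema-core API.

```python
# Minimal sketch of the constraint idea; signatures are assumptions, not the real API.
from abc import ABC, abstractmethod
import re


class BaseConstraint(ABC):
    """A reusable validation rule that can also describe itself as JSON Schema."""

    @abstractmethod
    def validate(self, value):
        """Return the value if valid, otherwise raise ValueError."""

    @abstractmethod
    def json_schema(self) -> dict:
        """Return the JSON Schema fragment contributed by this constraint."""


class StringConstraint(BaseConstraint):
    """Validates a string against a regular expression pattern."""

    def __init__(self, pattern: str):
        self.pattern = re.compile(pattern)

    def validate(self, value):
        if not isinstance(value, str) or not self.pattern.fullmatch(value):
            raise ValueError(f"value does not match pattern {self.pattern.pattern!r}")
        return value

    def json_schema(self) -> dict:
        return {"type": "string", "pattern": self.pattern.pattern}


# Hypothetical usage: an ISO 3166-1 alpha-2 country-code constraint.
country_code = StringConstraint(r"[A-Z]{2}")
country_code.validate("US")
```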
- Implement a test harness for example and counterexample parsing and validation that supports both GeoJSON and flattened (Parquet-like) features
- Implement an Address model with fields and constraint-based validation
- Enable address examples and counterexamples
- Introduce models for all types in the base theme
- Enable base theme examples and counterexamples
- Introduce Building and BuildingPart models
- Enable buildings examples and counterexamples
- Fix datetime validation by importing the proper ISO8601DateTime constraint
- Introduce models for the divisions theme
- Add ParentDivisionValidator
- Enable divisions examples and counterexamples

The divisions theme introduces complex geographical and administrative boundary validation with conditional parent-child relationships.
- Add Place model and supporting types
- Enable places examples and counterexamples
- Introduce transportation models with supporting types and validations
- Introduce scoping system to overture-schema-core with support for geometric, temporal, directional, and vehicle-specific rules

The scoping system uses a mix-in architecture enabling precise conditional application of transportation rules based on time, direction, vehicle type, and geometric position along segments.
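To make the mix-in idea concrete, here is a hedged sketch of how scoped transportation rules could compose. The class and field names (TemporalScope, GeometricScope, SpeedLimitRule, `when`, `between`) are illustrative assumptions, not the actual overture-schema models.

```python
# Illustrative sketch of scope mix-ins; names and fields are assumptions.
from typing import Optional

from pydantic import BaseModel


class TemporalScope(BaseModel):
    """Mix-in: restricts a rule to a time window (e.g. an opening-hours expression)."""

    when: Optional[str] = None


class GeometricScope(BaseModel):
    """Mix-in: restricts a rule to a linearly referenced portion of a segment."""

    between: Optional[tuple[float, float]] = None  # fractional positions along the segment


class SpeedLimitRule(GeometricScope, TemporalScope):
    """A transportation rule that applies only within whatever scopes are set."""

    max_speed_kph: int


# A rule that applies to the first half of the segment, at any time.
rule = SpeedLimitRule(max_speed_kph=30, between=(0.0, 0.5))
```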
- Enable common theme examples and counterexamples
- Enable general validation counterexamples (bad-theme, empty-object, etc.)
Pydantic generates different names and structures, so we remove the specific paths from the error messages so that they work with both JSON Schemas. Note: some error message modifications are not backward-compatible with Overture's historical JSON Schema due to differences in the way that structures are modeled (although the effect matches).
This commit introduces comprehensive JSON Schema generation testing and baseline validation:
- Add baseline test files for all theme packages to catch schema generation regressions
overture-schema-core re-exports types, decorators, and mixins intended for public consumption. As a result, we can drop the explicit dependency on overture-schema-validation from most theme packages. (Type definitions exist in overture-schema-validation to facilitate tests.)
"Overture" can be inferred from the context (including the module name).
The assumption that models have a `properties` field containing other fields is incorrect (and derives from a GeoJSON view of the world).
Ensures that at least 1 property is provided and generates `minProperties` in the JSON Schema.
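As one illustration of how `minProperties` shows up in generated JSON Schema (not necessarily how this commit implements it), Pydantic maps a `min_length` constraint on a dict-valued field to `minProperties`:

```python
# Illustrative only: a dict field with min_length=1 requires at least one property
# and emits minProperties in the generated JSON Schema.
from pydantic import BaseModel, Field


class Names(BaseModel):
    common: dict[str, str] = Field(min_length=1)


schema = Names.model_json_schema()
assert schema["properties"]["common"]["minProperties"] == 1
```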
JSON Schema generation is now implemented in the validator directly rather than requiring changes to `ConstraintValidatedModel`.
Note: Still waiting on some information from Jennings and Jonah to improve the documentation for `land`, `land_cover`, and `water`.
This type was unnecessary and misleading. The misleading part comes from the fact that, historically, the Overture feature in JSON Schema land had an `update_time` property. We eventually removed that property, making a reusable type called `FeatureUpdateTime` wholly redundant, but it seems the type was reused by the `update_time` property in the sources construct. This led to misleadingness not only at the type name/intent level, but also at the documentation level, since the documentation inherited by the update time field in sources was:

> Timestamp when the feature was last updated

which assuredly made no sense for sources. As a result of this commit, that documentation is now updated to:

> Last update time of the source data record

which makes more sense (and hopefully is also correct).
This was losing a description and adding a Pydantic warning when running the tests.
The biggest subtle bug was that `level_rules` in the transportation segment were ending up with a default value of `0`, so that you could just omit `value` altogether. This was not intended; it's bizarre, and it would be better not to create the rules at all. Meanwhile, `level` in all the other features did NOT have a default of zero, which was again not intended. As a result of the `Stacked` model overriding the default value that is set on `Level` in its `level` field, we were also getting a Pydantic warning that was polluting the test run output. By fixing the underlying issue, this commit makes that warning go away.
This should hopefully prevent recurrence of the Pydantic warning we were seeing for a very long time, before I fixed it in the preceding commit.
This is based on querying the `v2025_10_22_0` release and finding that the range of levels in the rest of the dataset is -9,999 to 1,940. And really, do you need billions of levels? That's not the intention for Z-order stacking.
This commit also fixes one definite bug, and one weird thing that was arguably a bug.

Definite bug: `Prominence` was modeled as `lt=100`, which would indicate a peculiar-sized 99-element range from 1 to 99 inclusive, but the actual data in the latest Overture release, `v2025_10_22_0`, ranges from 1 to 100 inclusive (see the sketch below).

Weird thing: sort key had a default of zero, and zero is also supposed to be the highest rendering priority. This seems weird. It would mean that you can't use the sort key to give any feature higher rendering priority than a feature that has no rendering priority set. This won't work as expected, so I removed the default.

To fix another day: prominence and sort key are reversed. A high number means high prominence, but a high number for sort key means a low rendering priority.
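For the `Prominence` fix specifically, the difference is just an exclusive versus an inclusive upper bound on the field constraint; a sketch (class names are illustrative):

```python
# The off-by-one: lt=100 excludes 100, but the data contains 100; le=100 fixes it.
from pydantic import BaseModel, Field


class BeforeFix(BaseModel):
    prominence: int = Field(ge=1, lt=100)  # rejects prominence == 100


class AfterFix(BaseModel):
    prominence: int = Field(ge=1, le=100)  # accepts the full observed 1..100 range
```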
As of this commit, running `make docformat` detects there are 22 remaining errors, less than one terminal full. They are all of the following two types:

1. D100: Missing docstring in public module
2. D101: Missing docstring in public class

Moreover, all the remaining errors pertain to modules and/or classes that are slated for further refactoring. Therefore, I'm going to leave finishing them for another day.

As part of this commit, I edited the `make docformat` rule in the `Makefile` to suppress four new rules. The rules suppressed are listed below along with the description of the rule and the reason for suppressing it.

1. D102: Missing docstring in public method. This is suppressed because `pydocstyle` is dumb and doesn't traverse inheritance hierarchies or even look at the `@override` decorator, so it needlessly flags overridden methods for which documentation is superfluous and a waste of time.
2. D200: One-line docstring should fit on one line with quotes. This is suppressed because `pydocstyle` is dumb and isn't aware of reasonable line length limits or wrapping.
3. D205: 1 blank line required between summary line and description. This is suppressed for the same reason as D200.
4. D400: First line should end with a period. This is suppressed for the same reason as D200 and D205.
The issues are that `docformatter` doesn't understand NumPy-style documentation, which is how most of the API docs are written today, and that it edits code in the opposite way to what `pydocstyle` expects. Both `docformatter` and `pydocstyle` are somewhat underwhelming tools with very few configuration options and a very heavy-handed approach to what they do, but as `pydocstyle` understands more documentation "dialects", including NumPy, it seems reasonable to keep it and drop the other, at least for now.
This was a subtle bug, and it slipped through only because we weren't previously exercising this functionality at the `OvertureFeature` level via unit test. Basically, the Pydantic decorators like `@model_validator` break in an inheritance context if the derived class overrides a validator method that was used in a base class. This makes sense and isn't really Pydantic's fault: it'll be trying to call `instance.validator_method()`, and Python is going to resolve which code to invoke. If you overrode the base class's `validator_method()`, then yes, it stands to reason that only the derived version will be called.

As you can see from the diff, the wrap validator in `Feature` that had been fixing up the input data for GeoJSON conversion was called `validate_model`, and I had created a method of the same name in `OvertureFeature` to handle the soon-to-be-deprecated `ext_*` field exception. So, of course, `Feature`'s version got overridden, and with it the whole GeoJSON-to-flat-model transform that enabled the parsing to continue.

I fixed the issue by giving the methods in both classes much more specific names and tried to make them a bit obscure with dunder-naming, which won't obfuscate them but will at least hopefully reduce the chances of name collisions. Also added a test, because the lack of one is the cause of this whole fiasco in the first place!
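A minimal reproduction of the failure mode, outside the real `Feature`/`OvertureFeature` code (class names and the fix-up logic here are illustrative):

```python
# When a subclass defines a validator with the same name as one in its base class,
# only the subclass version runs; the base class's fix-up silently disappears.
from pydantic import BaseModel, model_validator


class Base(BaseModel):
    x: int = 0

    @model_validator(mode="before")
    @classmethod
    def validate_model(cls, data):
        data = dict(data)
        data.setdefault("x", 1)  # the fix-up that quietly stops running
        return data


class Derived(Base):
    @model_validator(mode="before")
    @classmethod
    def validate_model(cls, data):  # same name: shadows Base.validate_model
        return data


assert Base().x == 1      # base fix-up ran
assert Derived().x == 0   # base fix-up never ran for the subclass
```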
Bug Fix
-------
At least one bug is fixed in this commit: the class
`EnhancedJsonSchemaGenerator` (now rechristened with the more accurate
if more unwieldy name `GenerateOmitNullableOptionalJsonSchema`) was
eliminating the `null` option on nullable fields that were required,
which subtly changed the meaning of the schema.
Where before you had:

    {
        "required": ["foo"],
        "properties": {
            "foo": {
                "anyOf": [
                    {"type": "string"},
                    {"type": "null"}
                ]
            }
        }
    }
The anyOf/null option was being removed but the required remained, which changed the meaning from "you have to put something, but it can be a string or the value `null`" to "you have to put a string". That's not an equivalent schema.
Re-Homing of `json_schema`
--------------------------
The function `json_schema` and its entourage, the JSON Schema generator
class `EnhancedJsonSchemaGenerator`, are promoted from core to system.
This is because they are very general-purpose functionality
that isn't really tied to the Overture schema per se.
Longer term, I think there's a case to be made that we should drop
`json_schema` and `GenerateOmitNullableOptionalJsonSchema` and just rely
on the existing Pydantic functionality, which is rich, along with the
`Omitable` type that's already in system. I'd like to reopen this topic.
As part of this work, I gave `GenerateOmitNullableOptionalJsonSchema`
some very thorough documentation including doctests.
Re-Imagining of Parsing
-----------------------
I killed the `parse_feature` function as I don't see what value it
brings above and beyond the native Pydantic parsing. Now that I have
fully fixed the `Feature` and `OvertureFeature` Pydantic integrations
to correctly parse the GeoJSON in all cases, it's hard to see why we
need a function that "can parse an Overture feature from GeoJSON" when
the existing Pydantic functions can already do that.
I renamed the `parse` function in the overture-schema package to
`validate` and split it into two functions, `validate` and
`validate_json`, mirroring the fairly consistent way that Pydantic
likes to break it down. I used the new GeoJSON-compatible discriminated
union functionality from `Feature` (see earlier commit) and brought the
functions more into line with how Pydantic generally works. Basically
now they're "do the Pydantic thing, but with the union of all the
discovered models" functions. Also bugs are fixed, docs are added, and
the docs and code are hopefully closer to being in harmony...
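The split mirrors Pydantic's `validate_python`/`validate_json` pair. A hedged sketch of the shape (the toy model and helper here are stand-ins, not the actual overture-schema code, where the adapter would wrap the union of all discovered models):

```python
# Sketch only: one toy model stands in for the union of all discovered models.
from pydantic import BaseModel, TypeAdapter


class Building(BaseModel):
    height: float | None = None


_adapter = TypeAdapter(Building)


def validate(data: dict) -> Building:
    """Validate an already-parsed Python object, mirroring Pydantic's validate_python."""
    return _adapter.validate_python(data)


def validate_json(data: str | bytes) -> Building:
    """Validate raw JSON text, mirroring Pydantic's validate_json."""
    return _adapter.validate_json(data)


feature = validate_json('{"height": 12.5}')
```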
Enhance the model discovery system to support multiple namespaces and capture the fully qualified class name for each registered model.

The ModelKey dataclass now contains (sketched below):
- namespace: distinguishes "overture" from extensions like "annex"
- theme: optional, as some models may not belong to a theme
- type: the feature type name
- class_name: the entry point value for introspection

The entry point format changes from "theme.type" to "namespace:theme:type" (or "namespace:type" for non-themed models). This enables third-party schema extensions to register models without conflicting with core Overture types. `discover_models()` now accepts an optional namespace filter parameter.
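A hedged sketch of the described key shape; only the four listed fields are grounded in the description above, and the `entry_point()` helper is an assumption:

```python
# Sketch of ModelKey; the real dataclass and any helpers may differ.
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelKey:
    namespace: str        # e.g. "overture", or an extension such as "annex"
    theme: Optional[str]  # None for models that do not belong to a theme
    type: str             # the feature type name
    class_name: str       # the entry point value, used for introspection

    def entry_point(self) -> str:
        """Render "namespace:theme:type", or "namespace:type" for non-themed models."""
        parts = [self.namespace, self.theme, self.type]
        return ":".join(p for p in parts if p is not None)
```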
Introduce a command-line interface for working with Overture schema
models. The CLI provides tools for schema introspection, validation,
and JSON Schema generation.
Commands:

    list-types [--theme THEME] [--detailed]
        List registered Overture types, optionally filtered by theme.
        With --detailed, shows model descriptions from docstrings.

    json-schema [--theme THEME] [--type TYPE]
        Generate JSON Schema for specified themes/types or all models.

    validate [--theme THEME] [--type TYPE] [--show-field FIELD] FILE
        Validate GeoJSON features against Overture schemas. Supports:
        - Single features and FeatureCollections
        - Heterogeneous collections (mixed types)
        - JSONL input from stdin (use '-' as FILE)
        - Automatic type detection via discriminator fields
        - Rich error display with data context windows
Type Resolution:
When --type is not specified, the validator builds a discriminated
union from registered models and uses Pydantic's tagged union support
to identify the most likely type. For heterogeneous collections, each
feature is validated against its detected type independently.
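A hedged sketch of that tagged-union resolution; the toy models and the "type" discriminator field are assumptions standing in for the registered models and their actual discriminator:

```python
# Sketch of discriminated-union type resolution over a heterogeneous batch.
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter


class Building(BaseModel):
    type: Literal["building"]


class Place(BaseModel):
    type: Literal["place"]


AnyFeature = Annotated[Union[Building, Place], Field(discriminator="type")]
adapter = TypeAdapter(AnyFeature)

# A heterogeneous batch: each record is routed to its detected type independently.
features = [{"type": "building"}, {"type": "place"}]
validated = [adapter.validate_python(f) for f in features]
```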
Error Display:
Validation errors show surrounding data context to help locate issues.
The --show-field option pins specific fields (e.g., id) in the display
header for easier identification in large datasets.
Pipeline Support:
The validate command accepts JSONL on stdin for integration with tools
like jq and gpq:
    gpq convert file.geoparquet --to geojson | \
        jq -c '.features[]' | \
        overture-schema validate --type building -
Module Structure:
- commands.py: Click command definitions
- type_analysis.py: Union type construction and discriminator handling
- error_formatting.py: Validation error processing and display
- data_display.py: Context window and field extraction
- output.py: Rich console output helpers
* Initial commit of new places taxonomy
* Fix TX region
* Add operating_status to bad-categories-value
* Fix mistaken change of alternate -> alternates
* Revert change to bad categories counterexample
* Add pydantic model for new taxonomy
* Fix up place baseline schema json
* Copy examples/counterexamples to references. Update docs
* Update main place.yaml file to reflect unique/minItems constraints
* Add backtick and reference to Places doc page
* Update baseline json
Documentation for release
vcschapp approved these changes Dec 17, 2025
jenningsanderson approved these changes Dec 17, 2025
jenningsanderson (Collaborator) left a comment:
15!
Labels
change type - minor 🤏
Minor schema change. See https://lf-overturemaps.atlassian.net/wiki/x/GgDa
December schema release. Version: v1.15.0
Changes: