schema mismatch in glob (when copying all divisions to a local parquet file) #438

@chle-work

I tried the example from https://docs.overturemaps.org/guides/divisions/#data-access-and-retrieval to export all division data to a local parquet file.

```sql
LOAD spatial; -- noqa
LOAD httpfs; -- noqa
-- Access the data on AWS in this example
SET s3_region='us-west-2';

COPY (
    SELECT
        *
    FROM
        read_parquet('s3://overturemaps-us-west-2/release/2025-10-22.0/theme=divisions/*/*')
) TO 'all_divisions.parquet';
```

This stopped after about 10% with the following error:

Failed to read file "s3://overturemaps-us-west-2/release/2025-10-22.0/theme=divisions/type=division_area/part-00000-1899532f-3d83-4e2a-81b5-7095ffbff954-c000.zstd.parquet": schema mismatch in glob: column "cartography" was read from the original file "s3://overturemaps-us-west-2/release/2025-10-22.0/theme=divisions/type=division/part-00000-05305dd6-f53c-4d6b-a0c8-16c87b88d42d-c000.zstd.parquet", but could not be found in file "s3://overturemaps-us-west-2/release/2025-10-22.0/theme=divisions/type=division_area/part-00000-1899532f-3d83-4e2a-81b5-7095ffbff954-c000.zstd.parquet".
Candidate names: id, geometry, bbox, country, version, sources, subtype, class, names, is_land, is_territorial, region, division_id
If you are trying to read files with different schemas, try setting union_by_name=True

I also tried excluding the mentioned `cartography` column, but then the same error appeared for the column `wikidata`.
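To see where the per-type schemas diverge, the column lists can be compared directly. This is only a sketch: it assumes the `type=division` and `type=division_area` prefixes that appear in the error message, and it globs every file under each prefix.

```sql
LOAD httpfs;
SET s3_region='us-west-2';

-- Columns of the division files (per the error message, these include "cartography")
DESCRIBE SELECT * FROM read_parquet(
    's3://overturemaps-us-west-2/release/2025-10-22.0/theme=divisions/type=division/*'
);

-- Columns of the division_area files, for comparison
DESCRIBE SELECT * FROM read_parquet(
    's3://overturemaps-us-west-2/release/2025-10-22.0/theme=divisions/type=division_area/*'
);
```

Any column listed for one type but not the other would trigger the same mismatch when both types are read through a single glob.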

If enabling `union_by_name` is really required here, I propose updating the example in the docs accordingly.
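A possible variant of the docs example, following the hint in the error message, is sketched below. It has not been verified against the full release. `union_by_name` is the `read_parquet` option named in the error; with it, DuckDB matches columns across files by name and fills columns missing from a file with NULL.

```sql
LOAD spatial;
LOAD httpfs;
SET s3_region='us-west-2';

COPY (
    SELECT
        *
    FROM
        read_parquet(
            's3://overturemaps-us-west-2/release/2025-10-22.0/theme=divisions/*/*',
            -- Combine the differing per-type schemas by column name
            union_by_name = true
        )
) TO 'all_divisions.parquet';
```

The resulting file would contain the union of all columns across the division types, so rows from a type that lacks a column (for example `cartography` in `division_area`) would hold NULL there.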
