Add schema.org structured data to events page #739
base: lean4
This change adds structured semantic data (JSON-LD) to the events page, making event information machine-readable for search engines and other tools.

Changes:
- Add city, country, state, and venue fields to Event dataclass
- Generate schema.org Event markup with PostalAddress for physical events
- Use VirtualLocation for virtual events
- Compute location display strings from structured fields
- Add structured data to sample events in events.yaml
- Integrate validation in GitHub Actions using structured-data-testing-tool
- Stub project downloads when NODOWNLOAD=1 to avoid rate limits

The semantic data includes proper geographic information using PostalAddress with addressLocality (city), addressCountry (country), and addressRegion (state) fields, making events more discoverable and accessible.
When a venue is specified, the Place name should be just the venue (e.g., 'Spielfeld') rather than the full computed location string (e.g., 'Spielfeld, Berlin, Germany'). The full address details are already captured in the PostalAddress structure. For events without a venue, the city name is used as the Place name.
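The venue-or-city rule described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the helper name `build_place` and its parameters are hypothetical, and only the schema.org property names come from the commit message.

```python
from typing import Optional


def build_place(city: str, country: str,
                state: Optional[str] = None,
                venue: Optional[str] = None) -> dict:
    """Build a schema.org Place dict (hypothetical helper, not the PR's code)."""
    address = {"@type": "PostalAddress", "addressLocality": city}
    if state:
        # addressRegion is only emitted when a state is known.
        address["addressRegion"] = state
    address["addressCountry"] = country
    return {
        "@type": "Place",
        # Use the venue alone when given; otherwise fall back to the city.
        "name": venue or city,
        "address": address,
    }
```

With a venue, `build_place("Berlin", "Germany", venue="Spielfeld")` names the Place "Spielfeld" and leaves the city and country to the PostalAddress; without one, the city doubles as the Place name.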
Convert all events from unstructured 'location' strings to structured fields (city, country, state, venue). This enables proper schema.org PostalAddress markup with addressLocality, addressCountry, and addressRegion.

Changes:
- 54 events updated with structured location data
- State abbreviations expanded (RI, CA, MA, CO, PA, etc.)
- Country names standardized (NL → Netherlands, UK → United Kingdom)
- Venue abbreviations preserved (ICERM, ICTS, MSRI, etc.)
- Virtual events unchanged (location: virtual)
- Fixed typo: Tblisi → Tbilisi

The Place name in schema.org now uses venue (if available) or city (as fallback), rather than the full location string. PostalAddress provides structured address details.

All 70 events pass validation (100%).
Add hybrid event support to distinguish events that offer both in-person and virtual attendance options.

Changes:
- Add 'hybrid: bool' field to Event dataclass
- Add validation: hybrid events must have city and country
- Update schema.org generation for MixedEventAttendanceMode
- Location for hybrid events is an array containing both:
  - Place with PostalAddress (physical location)
  - VirtualLocation (online access)
- Mark 'Learning Mathematics with Lean' as hybrid event

The schema.org location field now properly represents three modes:
- OnlineEventAttendanceMode: VirtualLocation only
- OfflineEventAttendanceMode: Place with PostalAddress only
- MixedEventAttendanceMode: Array with both Place and VirtualLocation

All 70 events pass validation (100%).
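The three attendance modes listed above could be selected roughly like this. It is a sketch under stated assumptions rather than the PR's implementation: the function name and signature are hypothetical, and using the event URL as the VirtualLocation `url` is an assumption.

```python
from typing import Optional, Tuple, Union

SCHEMA = "https://schema.org/"


def attendance_mode_and_location(
    url: str,
    place: Optional[dict] = None,
    hybrid: bool = False,
) -> Tuple[str, Union[dict, list]]:
    """Return (eventAttendanceMode, location) for one event (hypothetical helper)."""
    # Assumption: the event's own URL stands in for the online access point.
    virtual = {"@type": "VirtualLocation", "url": url}
    if hybrid:
        if place is None:
            raise ValueError("hybrid events need a physical place")
        # Mixed mode: an array holding both the Place and the VirtualLocation.
        return SCHEMA + "MixedEventAttendanceMode", [place, virtual]
    if place is None:
        return SCHEMA + "OnlineEventAttendanceMode", virtual
    return SCHEMA + "OfflineEventAttendanceMode", place
```

Returning the mode and the location together keeps the two fields from drifting apart, since both are derived from the same branch.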
Update addressCountry values to use standard two-letter country codes (e.g., US, GB, DE) instead of full country names for better machine readability and compliance with schema.org recommendations.
- Remove duplicate description field from event JSON-LD (was just repeating the title)
- Fix Formalization class to respect NODOWNLOAD environment variable
- Prevents GitHub API rate limiting during local development with NODOWNLOAD=1
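One way to respect NODOWNLOAD is to short-circuit before any network call. The sketch below is illustrative only: `downloads_disabled`, `fetch_project_metadata`, the stub payload, and the exact check (`== "1"`) are all assumptions, not the repository's code.

```python
import os
from typing import Callable, Mapping, Optional


def downloads_disabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when NODOWNLOAD=1 is set (assumed check; the real one may differ)."""
    env = os.environ if env is None else env
    return env.get("NODOWNLOAD") == "1"


def fetch_project_metadata(repo: str,
                           fetch: Callable[[str], dict],
                           env: Optional[Mapping[str, str]] = None) -> dict:
    """Return stub data instead of calling `fetch` when downloads are disabled."""
    if downloads_disabled(env):
        # Hypothetical stub payload: enough to render a page without the API.
        return {"repo": repo, "stub": True}
    return fetch(repo)
```

Passing the environment and the fetch function as parameters keeps the check testable without touching the real GitHub API.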
Move Node.js and npm setup earlier in the workflow and remove conditional checks. This makes the workflow simpler and allows validation to run as a standard part of the build process rather than an optional step.

Changes:
- Move setup-node action to run right after Python setup
- Move npm ci to run before build step
- Remove conditional checks from Node.js setup and validation steps
- Simplify validation step to just run npm command
- Use version tag (v4.1.0) instead of commit hash for setup-node action
Introduce an is_fully_remote() method to centralize the logic for detecting virtual/online events, replacing duplicate inline checks throughout the code. Also extend validation to ensure fully remote events don't have physical location fields (city, state, country), maintaining data consistency.

Changes:
- Add Event.is_fully_remote() method
- Update compute_location() to use new method
- Update generate_schema_org_json() to use new method
- Add validation check for fully remote events
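A minimal sketch of the method and the accompanying validation might look like this. The field layout and the `validation_errors` helper are assumptions for illustration; only the method name `is_fully_remote`, the 'location: virtual' convention, and the rules themselves come from the commit messages.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    # Assumed subset of fields; the real dataclass has more.
    title: str
    location: Optional[str] = None
    city: Optional[str] = None
    state: Optional[str] = None
    country: Optional[str] = None
    venue: Optional[str] = None
    hybrid: bool = False

    def is_fully_remote(self) -> bool:
        """True for online-only events, marked as 'location: virtual'."""
        return not self.hybrid and (self.location or "").strip().lower() == "virtual"

    def validation_errors(self) -> List[str]:
        """Collect consistency problems (hypothetical helper)."""
        errors = []
        if self.is_fully_remote():
            for name in ("city", "state", "country"):
                if getattr(self, name):
                    errors.append(f"{self.title}: fully remote event must not set {name}")
        if self.hybrid and not (self.city and self.country):
            errors.append(f"{self.title}: hybrid event must set city and country")
        return errors
```

Centralizing the check in one method means the display-string code and the JSON-LD generator cannot disagree about what counts as virtual.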
Change generate_schema_org_json to raise ValueError exceptions with helpful error messages instead of silently returning empty strings when dates cannot be parsed. This makes debugging easier and ensures all events have valid dates.

Changes:
- Combine date parsing and ISO 8601 formatting into single lines
- Use separate try/except blocks for start and end dates
- Raise ValueError with specific error messages indicating which event and which date field is invalid
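The error-handling pattern described above can be sketched as follows. The helper name `iso_dates` is hypothetical, and the sketch assumes the dates arrive as ISO-formatted strings; the real code may parse a different input format.

```python
from datetime import date
from typing import Tuple


def iso_dates(title: str, start: str, end: str) -> Tuple[str, str]:
    """Parse and re-format both dates, naming the event and field on failure."""
    try:
        # Parse and format in one line; a bad value raises here.
        start_iso = date.fromisoformat(start).isoformat()
    except (TypeError, ValueError) as exc:
        raise ValueError(f"Event {title!r}: invalid start date {start!r}") from exc
    try:
        end_iso = date.fromisoformat(end).isoformat()
    except (TypeError, ValueError) as exc:
        raise ValueError(f"Event {title!r}: invalid end date {end!r}") from exc
    return start_iso, end_iso
```

Separate try/except blocks let the message say which of the two fields is at fault, which a single combined block could not do.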
Co-authored-by: jessealama <56691+jessealama@users.noreply.github.com>
Example of the generated markup for an event:
<li> <a href="https://pitmonticone.github.io/ItaLean2025/">ItaLean 2025: Bridging Formal Mathematics and AI</a> (Bologna, IT. December 9–12, 2025)
<small class="align-middle">
<span class="badge badge-secondary align-middle event-tag-conference">conference</span>
</small>
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Event",
"name": "ItaLean 2025: Bridging Formal Mathematics and AI",
"url": "https://pitmonticone.github.io/ItaLean2025/",
"startDate": "2025-12-09",
"endDate": "2025-12-12",
"eventAttendanceMode": "https://schema.org/OfflineEventAttendanceMode",
"location": {
"@type": "Place",
"name": "Bologna",
"address": {
"@type": "PostalAddress",
"addressLocality": "Bologna",
"addressCountry": "IT"
}
},
"eventStatus": "https://schema.org/EventScheduled"
}
</script>
</li>
Indeed, the NODOWNLOAD env var is mentioned in our README; the other option is to pass a GitHub token. I hadn't heard of this before, but it looks like it's relatively widely used on other such community pages (cf. Rust Foundation events, Python events), so I'm not opposed to this. I am a little wary of introducing a Node dependency, even if only in CI. Is there a comparable Python package for validation that we could run as part of
Thanks for taking a look! I realize this is probably a bit of a curveball PR. I'll take a look at equivalent Python tools to see what can be done there; surely there's an equivalent.
Converting to draft while I investigate Python equivalents for JavaScript's structured data test tool.
889c0bd to 9e1644f
I found pydantic2-schemaorg. It's a neat package that does just what we want, and it could be used elsewhere, too. I've removed the NPM/JS setup.
Add structured location data to Swiss Math Soc Spring Meeting event.
ping @bryangingechen
Adds JSON-LD structured data to the events page for better search engine indexing using schema.org/Event markup. Fixes #738
Notes and potential areas for feedback
- data/events.yaml now allows city, country, and venue fields, all optional. location is freeform text. For the country field we have no checks, so it still remains freeform, but one might imagine defining a check. Another approach would be to render e.g. "DE" as "Germany".
- NODOWNLOAD environment variable. I'm not sure if this is quite right; any feedback welcome.