@poikilotherm
Contributor

@poikilotherm poikilotherm commented Nov 24, 2025

What this PR does / why we need it:

Which issue(s) this PR closes:

Special notes for your reviewer:
To be done.

Suggestions on how to test this:
Execute the migration DB tests with `mvn verify -PskipUnitTests -Dit.groups=migration`.
Edit: This PR absolutely needs to be tested with a real deployment on a real production database. Among other things, please test and confirm that any configured controlled vocabularies are still working, on the off chance that the `:CVocConf` setting is going to be treated differently. (L.A.)

Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Nope

Is there a release notes update needed for this change?:
Nope

Additional documentation:
This introduces proper migration tests for the first time. They will need discussion and documentation; the docs may be beyond the scope of this PR.
Happy to talk about the idea during tech hours.

…nge upstream)

Upgraded Testcontainers to `2.0.2` and updated related artifact names for consistency (`testcontainers-junit-jupiter`, `testcontainers-postgresql`, etc.). Bumped `testcontainers-keycloak` to `4.0.0` which is compatible with TC2.
…guration

Before, the Postgres server version was only present in the container profile. Moving it into the parent POM and aligning it with our documented settings allows reuse in tests, too.

Using Maven Failsafe's ability to expose the server version to JUnit tests allows us to use it with Testcontainers, deploying a Postgres container for migration tests as documented in our guides.
Expanded integration test tags in `pom.xml` to include `migration` and added DBUnit as a test-scoped dependency to support database migration testing. Updated `Tags.java` with the new tag, which marks migration tests so they can be easily excluded or run as necessary.
…gration tests

Added a reusable PostgresContainer to optimize resource usage in database migration tests. Ensures a single container instance is shared across all migration test cases, with automatic cleanup on JVM shutdown.
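
For readers unfamiliar with the pattern, the shared-container idea can be sketched roughly as below. This is a minimal illustration and not the `PostgresContainer` class from this PR: `FakeContainer` is a hypothetical stand-in for a real Testcontainers container, and all class and method names are assumptions made for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a "start once, share everywhere" test resource.
final class SharedResourceSketch {

    // Stand-in for an expensive resource such as a database container.
    static final class FakeContainer implements AutoCloseable {
        static final AtomicInteger STARTS = new AtomicInteger();

        FakeContainer() {
            STARTS.incrementAndGet(); // count how often the resource is started
        }

        @Override
        public void close() {
            // a real container would be stopped here
        }
    }

    // Initialization-on-demand holder: the JVM creates INSTANCE lazily,
    // exactly once, and in a thread-safe way on first access.
    private static final class Holder {
        static final FakeContainer INSTANCE = start();

        private static FakeContainer start() {
            FakeContainer container = new FakeContainer();
            // Clean up when the JVM shuts down, after all test classes ran.
            Runtime.getRuntime().addShutdownHook(new Thread(container::close));
            return container;
        }
    }

    private SharedResourceSketch() {
    }

    // Every caller receives the same instance.
    static FakeContainer get() {
        return Holder.INSTANCE;
    }
}
```

Each test class would call `SharedResourceSketch.get()` instead of declaring its own container, so the startup cost is paid once per JVM.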
…ration`

Introduced comprehensive tests for `V6_8_0_1__SettingsDataMigration` using Testcontainers and DBUnit. Includes scenarios for migrating settings to new formats, handling null and invalid values, and verifying JSON transformations.

This is a reproducer for an issue discovered after merging PR #11654 by @landreev. See also #11654 (comment)
The migration script `V6.8.0.1.sql` failed with a `PSQLException` stating "there is no unique or exclusion constraint matching the ON CONFLICT specification".

This occurred because the script used a partial index syntax for the conflict target:
`ON CONFLICT (name) WHERE lang IS NULL`

However, PostgreSQL's `ON CONFLICT` inference requires the target to syntactically match an existing unique index or constraint. The actual index on the `setting` table is a functional index defined as:
`CREATE UNIQUE INDEX unique_settings ON setting (name, (COALESCE(lang, '')))`

While `WHERE lang IS NULL` and `COALESCE(lang, '')` logically handle nulls similarly for uniqueness in this context, Postgres does not treat them as interchangeable for arbiter inference.

This commit updates the migration script to explicitly target the functional index expression:
`ON CONFLICT (name, (coalesce(lang, '')))`
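
A hedged sketch of how an upsert with the corrected conflict target might be issued from Java via JDBC. Only the conflict target and the index definition are taken from the description above; the `content` column, the assumption that `id` is generated by the database, and the class and method names are made up for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper, not part of the migration script in this PR.
final class SettingUpsertSketch {

    // The conflict target repeats the expression of the unique index
    // unique_settings ON setting (name, (COALESCE(lang, ''))),
    // so Postgres can infer that index as the arbiter.
    static final String UPSERT_SQL = """
            INSERT INTO setting (name, lang, content)
            VALUES (?, NULL, ?)
            ON CONFLICT (name, (coalesce(lang, '')))
            DO UPDATE SET content = EXCLUDED.content
            """;

    private SettingUpsertSketch() {
    }

    // Inserts a non-localized setting or updates its content if it already exists.
    static int upsert(Connection conn, String name, String content) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(UPSERT_SQL)) {
            ps.setString(1, name);
            ps.setString(2, content);
            return ps.executeUpdate();
        }
    }
}
```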

References:
- PostgreSQL 16 Documentation on INSERT: https://www.postgresql.org/docs/16/sql-insert.html#SQL-ON-CONFLICT
@pdurbin pdurbin added this to the 6.9 milestone Nov 25, 2025
@pdurbin pdurbin moved this to In Progress 💻 in IQSS Dataverse Project Nov 25, 2025
…ak client is autoclosed

Removing the try-with-resources avoids auto-closing the client, which would trigger a logout. As we reuse the token in multiple tests, we don't want that.
@coveralls

coveralls commented Nov 26, 2025

Coverage Status

coverage: 24.174% (+0.004%) from 24.17%
when pulling 8ddd5b8 on 11996-fix-settings
into f2a250f on develop.


…res container in migration tests

Extracted DBUnit helper methods to a new `DBUnitHelper` class for reusability and cleaner test code. Updated `V6_8_0_1__SettingsDataMigrationIT` to utilize these helpers and the shared Postgres container for better maintainability and reduced redundancy in test setup.
… simplify access methods #11996

Replaced hard-coded workflow keys with structured enum-based keys in `TriggerType`. Updated `WorkflowServiceBean` and `SettingsServiceBean` to use consistent key resolution methods, improving readability and maintainability. Updated related database migration script to align with new key naming schema.

This adds the keys that have been missing since the introduction of naming restrictions in #11654. The config keys for the default workflows will no longer be cleansed from the database during deployment.

Expanded documentation to include new Settings API options for managing workflows. Added references for `WorkflowsAdminIpWhitelist`, `PrePublishDatasetWorkflowId`, and `PostPublishDatasetWorkflowId` with usage examples. Enhanced developer guide with a new "Administration" section for workflow-related settings.
…ings

Enhanced `V6_8_0_1__SettingsDataMigrationIT` to include tests for new workflow settings (`PrePublishDatasetWorkflowId` and `PostPublishDatasetWorkflowId`). Updated expected data assertions and display names for better clarity and completeness.

@poikilotherm poikilotherm marked this pull request as ready for review November 28, 2025 11:22
…ve validation and logging

Enhanced `getTabularIngestSizeLimits` to accept numeric values in addition to strings for size limits. Improved validation to handle invalid types, numbers, or decimal values. Updated related tests and configuration documentation.

This was done because the restriction to long numbers encoded as strings was artificial. Users can now choose whichever form they prefer. Also, the data migration writes numbers, so the old restriction led to errors.
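
As a rough illustration of accepting both forms, the sketch below normalizes already-parsed limit values to `long`. It is not the actual `getTabularIngestSizeLimits` code: JSON parsing is left out, the class and method names are hypothetical, and rejecting any value with a fractional part is an assumption of this example.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the values are assumed to come from an already-parsed JSON object.
final class IngestSizeLimitSketch {

    private IngestSizeLimitSketch() {
    }

    // Converts one limit (a String such as "1048576" or a Number such as 1048576)
    // to a long. Throws IllegalArgumentException for anything else.
    static long toLimit(Object raw) {
        if (raw instanceof String s) {
            try {
                return Long.parseLong(s.trim());
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Not a whole number: " + s, e);
            }
        }
        if (raw instanceof Number n) {
            try {
                // longValueExact rejects fractional parts and values outside the long range
                return new BigDecimal(n.toString()).longValueExact();
            } catch (ArithmeticException | NumberFormatException e) {
                throw new IllegalArgumentException("Not a whole number: " + n, e);
            }
        }
        throw new IllegalArgumentException("Unsupported limit value: " + raw);
    }

    // Normalizes a whole map of format name -> limit, keeping the entry order.
    static Map<String, Long> normalize(Map<String, ?> raw) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (Map.Entry<String, ?> entry : raw.entrySet()) {
            result.put(entry.getKey(), toLimit(entry.getValue()));
        }
        return result;
    }
}
```

With this approach, a map holding `"1048576"` and one holding `1048576` normalize to the same limits, while a value such as `1.5` is rejected.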
Member

@qqmyers qqmyers left a comment

Update to allow numeric values in JSON looks fine, unit tests pass.

@github-project-automation github-project-automation bot moved this from In Progress 💻 to Ready for QA ⏩ in IQSS Dataverse Project Dec 4, 2025
…tSizeLimits

Refactored `getTabularIngestSizeLimits` to utilize try-with-resources for safer JSON parsing. Enhanced logging with lambdas and improved iteration over JSON entries.

@pdurbin pdurbin moved this from Reviewed but Frozen ❄️ to Ready for QA ⏩ in IQSS Dataverse Project Dec 5, 2025
@landreev
Contributor

landreev commented Dec 5, 2025

Just to have it on record here, both forms of the new JSON setting are now parsed properly, `{"Rdata": "1048576", "default": "157286400"}` and `{"Rdata": 1048576, "default": 157286400}`.

@landreev
Contributor

landreev commented Dec 5, 2025

But please note that these ingest-related RestAssured tests are still failing:

	Test Result (8 failures / ±0)

    edu.harvard.iq.dataverse.api.DatasetsIT.testSemanticMetadataAPIs
    edu.harvard.iq.dataverse.api.FilesIT.testIngestSizeLimits
    edu.harvard.iq.dataverse.api.FilesIT.testDataSizeInDataverse
    edu.harvard.iq.dataverse.api.FilesIT.testAddFileToDatasetSkipTabIngest
    edu.harvard.iq.dataverse.api.FilesIT.testIngestWithAndWithoutVariableHeader
    edu.harvard.iq.dataverse.api.FilesIT.testUningestFileViaApi
    edu.harvard.iq.dataverse.api.FilesIT.test_006_ReplaceFileGoodTabular
    edu.harvard.iq.dataverse.api.FilesIT.testValidateDDI_issue6027

https://jenkins.dataverse.org/job/IQSS-Dataverse-Develop-PR/view/change-requests/job/PR-12002/14/#showFailuresLink

(although I'm not 100% sure if the 1st one on the list is "ingest-related")

@landreev landreev self-assigned this Dec 5, 2025
@landreev landreev moved this from Ready for QA ⏩ to QA ✅ in IQSS Dataverse Project Dec 5, 2025
@landreev
Contributor

landreev commented Dec 5, 2025

I'll take a closer look at the logs from the last Jenkins runs where all these failures happened.
As a very random thought - SettingsServiceBean is not a singleton; is there a scenario where we can have an instance of it that's stuck with a stale cached copy of a setting, when we keep changing it in a busy Dataverse instance?

@qqmyers
Member

qqmyers commented Dec 5, 2025

FWIW: I created #12026 since I think we've seen the semantic web fail before.

@landreev
Contributor

landreev commented Dec 5, 2025

You may want to completely disregard my guess about "stale settings".
But, could you tell me what's going on in this test: (starting line 1335 in FilesIT.java)

I mean, 987654321 is no longer an invalid value (?).
Is this passing simply because you are not giving ingest enough time to finish before asking for the datatable?
In Jenkins run number 14 this test failed further down, in line 1376. And that would in turn be consistent with the upload not succeeding on account of the ingest lock still being present. Locally, I was able to reproduce the failure in that spot only 2 out of 20 or so times. And that would be consistent with the ingest of this tiny file happening fast enough on most runs, such that the lock is gone by that point; and only failing intermittently when the ingest decides to take longer.
Again, just a guess/may be missing something... etc.

        // As the guides say, you MUST provide a string, not a JSON number.
        // That is, `"123"` in quotes rather than `123` with no quotes.
        // If you provide a number (no quotes) rather than a string,
        // all ingest will be disabled and you'll see an error in server.log
        // about how the system is misconfigured.
        String invalidNonString = """
{
  "default": 987654321
}
""";

        setLimit = UtilIT.setSetting(SettingsServiceBean.Key.TabularIngestSizeLimit, invalidNonString);
        setLimit.then().assertThat().statusCode(OK.getStatusCode());

        uploadFile = UtilIT.uploadFileViaNative(datasetId.toString(), pathToDataFile.toString(), apiToken);
        uploadFile.prettyPrint();
        uploadFile.then().assertThat()
                .statusCode(OK.getStatusCode())
                .body("data.files[0].label", equalTo("data-4.csv"));

        String fileId5 = JsonPath.from(uploadFile.body().asString()).getString("data.files[0].dataFile.id");

        getTabularFails = UtilIT.getFileDataTables(fileId5, apiToken);
        getTabularFails.prettyPrint();
        getTabularFails.then().assertThat()
                .statusCode(BAD_REQUEST.getStatusCode())
                .body("message", equalTo(BundleUtil.getStringFromBundle("files.api.only.tabular.supported")));

I did not investigate the other failures in similar depth.
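
If the intermittent failure really comes from the ingest lock outliving the upload call, one possible way to make such a test less timing-dependent is to poll for a condition with a timeout before the next request. The sketch below is generic and hypothetical: `waitUntil` and the condition passed to it are assumptions for illustration, not existing helpers from the test suite.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical polling helper for tests that depend on asynchronous work.
final class LockWaitSketch {

    private LockWaitSketch() {
    }

    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition became true, false on timeout or interruption.
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }
}
```

In a test, the condition would be something like "the dataset no longer reports an ingest lock", checked before uploading the next file or asking for the data table.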

@landreev
Contributor

landreev commented Dec 5, 2025

(that makes me cautiously optimistic re: possibility that these are something finicky in the tests themselves, vs. an indication of any issues in the functionality)

…zed approach #11639

Simplified and improved test structure by introducing nested classes, parameterized tests, and utility methods for TabularIngestSizeLimit configurations. Enhanced code readability and reusability.
…ric value cases #11639

Added more test cases to validate support for various numeric formats, including integers, strings, and decimals, in TabularIngestSizeLimit configurations.

@github-actions

github-actions bot commented Dec 9, 2025

📦 Pushed preview images as

ghcr.io/gdcc/dataverse:11996-fix-settings
ghcr.io/gdcc/configbaker:11996-fix-settings

🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name.

@landreev
Contributor

landreev commented Dec 9, 2025

I ran yet another Jenkins job by hand, and everything passed this time around. Merging.

@landreev landreev merged commit 70b6e26 into develop Dec 9, 2025
21 checks passed
@github-project-automation github-project-automation bot moved this from QA ✅ to Merged 🚀 in IQSS Dataverse Project Dec 9, 2025
@scolapasta scolapasta moved this from Merged 🚀 to Done 🧹 in IQSS Dataverse Project Dec 10, 2025
@landreev landreev removed their assignment Dec 10, 2025

Labels

FY26 Sprint 11 (2025-11-20 - 2025-12-03)
FY26 Sprint 12 (2025-12-03 - 2025-12-17)
Size: 10 (a percentage of a sprint; 7 hours)
Type: Bug (a defect)

Projects

Status: Done 🧹

Development

Successfully merging this pull request may close these issues.

Pre/Post Publication workflows broken as of #11654

7 participants