Alphabetical order in testing environment different than in regular R environment?

I'm developing a package for project-specific data processing. One step is checking whether a number of names are really distinct, or whether similar names refer to the same person. For this I first generate from a database a `data.table` of pairs that are similar based on string similarity, and compare this to a `data.table` of pairs that I have manually checked to see whether they refer to the same person. If all similar-sounding names are covered in my manually compiled list, the test passes.

I do this via a negative join with data.table: 

```
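# Anti-join: keep only the similar pairs that are not in the manually checked list
dt_redux <- dt_pairs_from_db[!dt_manually_checked_pairs, on = .(name1, name2)]
# Every similar pair must already have been checked by hand
expect_true(nrow(dt_redux) == 0)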
```

This test passed when calling `test_all` or `build_install_test`, but failed in `R CMD check`.

After some searching I tracked it down to the name order in `dt_pairs_from_db`. The pairs there come from a string similarity function that creates two entries for each couple (name1, name2 and name2, name1). To avoid having to check each couple twice, I only cover the cases where `name1 > name2`. However, for one couple, "İnan Kıraç" and "Suna Kıraç", the alphabetical order differs between the normal R environment and the testing environment: in the normal R environment, `expect_true("İnan Kıraç" > "Suna Kıraç")` fails, but in the testing environment (in my `test_package.R` file), the same expectation passes.
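A minimal sketch for checking whether the collation locale is involved, which is only an assumption at this point. It evaluates the same comparison under the session's current `LC_COLLATE` and under `"C"`, then restores the original setting. The variable names are illustrative, and the results depend on the platform and locale, so no expected output is given.

```r
# The two names, written with \u escapes so the file encoding does not matter
a <- "\u0130nan K\u0131ra\u00e7"  # "İnan Kıraç"
b <- "Suna K\u0131ra\u00e7"       # "Suna Kıraç"

# Comparison under whatever collation the session currently uses
old <- Sys.getlocale("LC_COLLATE")
cat("current LC_COLLATE:", old, "\n")
cat("a > b under current collation:", a > b, "\n")

# Same comparison under the C locale (assumed to be a candidate cause)
Sys.setlocale("LC_COLLATE", "C")
in_c <- a > b
cat("a > b under C collation:", in_c, "\n")

# Restore the original collation so the rest of the session is unaffected
Sys.setlocale("LC_COLLATE", old)
```

If the two printed results differ, running the same snippet inside the test file and in an interactive session should show which collation each environment uses.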

This difference in alphabetical order led to a `dt_pairs_from_db` whose pair order didn't match the order of the pairs in my `dt_manually_checked_pairs`, which caused the test to fail.
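One possible way to make the `name1 > name2` filter independent of the session locale, sketched with base R only. `gt_c` is a hypothetical helper, not something from the package, and it relies on `order(..., method = "radix")` sorting character vectors in C-locale order, as described in `?sort`.

```r
# Hypothetical helper: TRUE where x sorts strictly after y in C-locale order,
# regardless of the session's LC_COLLATE setting.
gt_c <- function(x, y) {
  mapply(
    # order() returns c(2, 1) when the second element sorts first
    function(a, b) a != b && order(c(a, b), method = "radix")[1] == 2L,
    x, y,
    USE.NAMES = FALSE
  )
}

# Small ASCII example: "b" sorts after "a", equal strings are not "greater"
res <- gt_c(c("b", "a", "a"), c("a", "b", "a"))
print(res)  # TRUE FALSE FALSE
```

Both the generated pairs and the manually checked pairs would need to be filtered with the same rule for the anti-join to line up.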

I've now fixed it by just adding this particular couple in both comparisons to my `dt_manually_checked_pairs`, but I'm curious what caused this; any ideas?
