Discussion: Custom delimiters for unflatten #454

I notice that some CSVs uploaded to the OCDS Data Review Tool use semicolons as the delimiter.

With commas:

```csv
ocid,id,date,tag,initiationType,tender/id
ocds-1234567-abc,ocds-1234567-abc-1,2000-01-02T00:00:00Z,tender,tender,abc
```

With semicolons:

```csv
ocid;id;date;tag;initiationType;tender/id
ocds-1234567-abc;ocds-1234567-abc-1;2000-01-02T00:00:00Z;tender;tender;abc
```

Some possible behaviors:

1. Leave as is. With the semicolon example above, the whole header row is read in as a single field, "ocid;id;date;tag;initiationType;tender", which shows up under additional fields.
1. Allow a [dialect](https://docs.python.org/3/library/csv.html#csv.Dialect) to be passed in. This defers all responsibility to the calling code.
1. Add a `sniff` boolean argument. If enabled, flatten-tool [sniffs](https://docs.python.org/3/library/csv.html#csv.Sniffer) the dialect. The sample size and/or possible delimiters could also be passed in.
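To make the three behaviors concrete, here is a minimal sketch using only Python's standard `csv` module. It is not flatten-tool code: `read_rows` and `read_rows_sniffed` are hypothetical helper names, and the `sample_size` and `delimiters` defaults are assumptions chosen for illustration.

```python
import csv
import io

SEMICOLON_CSV = (
    "ocid;id;date;tag;initiationType;tender/id\n"
    "ocds-1234567-abc;ocds-1234567-abc-1;2000-01-02T00:00:00Z;tender;tender;abc\n"
)


def read_rows(text, dialect="excel", **fmtparams):
    """Behavior 2 (hypothetical helper): the caller picks the dialect
    or individual formatting parameters such as the delimiter."""
    return list(csv.reader(io.StringIO(text), dialect, **fmtparams))


def read_rows_sniffed(text, sample_size=1024, delimiters=",;\t"):
    """Behavior 3 (hypothetical helper): guess the dialect from a sample.

    csv.Sniffer.sniff raises csv.Error if it cannot determine a delimiter,
    so real code would need a fallback.
    """
    dialect = csv.Sniffer().sniff(text[:sample_size], delimiters=delimiters)
    return list(csv.reader(io.StringIO(text), dialect))


# Behavior 1: the default comma dialect sees each line as a single field.
default_rows = read_rows(SEMICOLON_CSV)

# Behavior 2: the caller states the delimiter explicitly.
explicit_rows = read_rows(SEMICOLON_CSV, delimiter=";")

# Behavior 3: the delimiter is detected from the data.
sniffed_rows = read_rows_sniffed(SEMICOLON_CSV)
```

Note that sniffing is a heuristic, so restricting the candidate delimiters (as the `delimiters` argument does here) makes the guess more predictable than letting the sniffer consider every character.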

For CoVEs, flatten-tool's `unflatten` is called within lib-cove's `convert_spreadsheet`, which is in turn called by a CoVE's view. The `flattentool_options` are derived from arguments to `convert_spreadsheet`, except for the paths, `encoding` (utf-8-sig, cp1252, latin_1), `metatab_vertical_orientation` (True) and `convert_titles` (True). So any new arguments added to `unflatten` will also need to be added to `convert_spreadsheet`.

I think (2) is best, as it gives the most flexibility to the calling code.
