tacl diff is better done as ratios #23

@ajenhl

Description

Rather than a binary result (an n-gram either is or is not unique to a sub-corpus), it would be useful to provide a graded result, based on, perhaps, frequency of occurrence. This would largely avoid the problem whereby a single instance of an n-gram in a single witness, among potentially thousands of other texts in a sub-corpus, ensures that the n-gram is excluded from the results even though it otherwise appears solely in another sub-corpus.
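
As a rough illustration of what a graded result might look like, here is a minimal sketch. It is not tacl's API: the function name `graded_diff`, the input shape (a mapping from sub-corpus label to a `Counter` of n-gram counts summed over all witnesses), and the `threshold` parameter are all assumptions made for the example. It scores each n-gram by the share of its total occurrences that fall in one sub-corpus and keeps those above the threshold.

```python
from collections import Counter


def graded_diff(counts_by_label, target, threshold=0.95):
    """Score n-grams by the share of their occurrences found in `target`.

    counts_by_label: hypothetical input mapping each sub-corpus label to a
    Counter of n-gram counts summed over every witness in that sub-corpus.
    Returns {ngram: share} for n-grams whose share is >= threshold.
    """
    # Total occurrences of each n-gram across all sub-corpora.
    totals = Counter()
    for counts in counts_by_label.values():
        totals.update(counts)

    result = {}
    for ngram, target_count in counts_by_label[target].items():
        if target_count <= 0:
            continue  # skip explicit zero entries; also avoids dividing by zero
        share = target_count / totals[ngram]
        if share >= threshold:
            result[ngram] = share
    return result


# Illustrative counts only: "abc" occurs once in B, which a binary diff
# would treat as disqualifying it from being distinctive of A.
counts = {
    "A": Counter({"abc": 200, "bcd": 50}),
    "B": Counter({"abc": 1, "bcd": 45}),
}
scores = graded_diff(counts, "A")
```

Here "abc" is retained for sub-corpus A despite its single occurrence in B, while "bcd", which is split fairly evenly, is not. Raw counts favour larger sub-corpora, so a real implementation would probably want to normalise by sub-corpus size or by the number of witnesses containing the n-gram.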
