Description
Overlapping Ratio
Currently, find_overlap returns a non-empty (truthy) set as soon as the two ranges share even a single position.
nervaluate/src/nervaluate/evaluate.py
Line 330 in df0e695
    if find_overlap(true_range, pred_range) and true not in true_which_overlapped_with_pred:
nervaluate/src/nervaluate/utils.py
Lines 85 to 104 in df0e695
    def find_overlap(true_range: range, pred_range: range) -> set:
        """Find the overlap between two ranges

        Find the overlap between two ranges. Return the overlapping values if
        present, else return an empty set().

        Examples:
        >>> find_overlap((1, 2), (2, 3))
        2
        >>> find_overlap((1, 2), (3, 4))
        set()
        """
        true_set = set(true_range)
        pred_set = set(pred_range)
        overlaps = true_set.intersection(pred_set)
        return overlaps
However, in most cases it would be more useful to count a match only when the overlap ratio exceeds a threshold.
Something like this:
    pred = {'start': 10, 'end': 15}
    label = {'start': 12, 'end': 18}
    # calculate union and intersection
    union = {'start': 10, 'end': 18}
    intersection = {'start': 12, 'end': 15}
    # calculate ratio: intersection length / union length = 3 / 8
    ratio = (15 - 12) / (18 - 10)
    # the pair counts as overlapping only above a user-chosen threshold
    is_overlap = ratio > threshold

The current find_overlap uses set operations to find overlaps, which seems time-inefficient. The overlap can be obtained directly from the start and end values.
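To illustrate the point about set operations, here is a minimal sketch comparing the two ways of measuring an overlap. The function names are hypothetical and not part of nervaluate, and the sketch assumes half-open offsets (the end is exclusive). The set-based version materialises every position in both spans, so its cost grows with span length, while the offset-based version is a constant number of comparisons.

```python
def overlap_len_sets(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Overlap length via set intersection (the style the current code uses)."""
    return len(set(range(a_start, a_end)) & set(range(b_start, b_end)))


def overlap_len_offsets(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Overlap length computed from the offsets alone, in constant time."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


# Both agree on the worked example: pred = [10, 15), label = [12, 18).
assert overlap_len_sets(10, 15, 12, 18) == overlap_len_offsets(10, 15, 12, 18) == 3
# Disjoint spans give zero either way.
assert overlap_len_sets(1, 3, 3, 5) == overlap_len_offsets(1, 3, 3, 5) == 0
```

If the end offset is meant to be inclusive instead, both helpers would need `end + 1`, which is the question raised further down in this issue.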
Here's my implementation:
    def find_overlap(self, true: dict[str, int | str], pred: dict[str, int | str]) -> bool:
        # bounds of the intersection of the two spans
        start_max = max(true['start'], pred['start'])
        end_min = min(true['end'], pred['end'])
        if start_max >= end_min:
            # the spans do not overlap at all
            return False
        # bounds of the union of the two spans
        start_min = min(true['start'], pred['start'])
        end_max = max(true['end'], pred['end'])
        overlap_ratio = (end_min - start_max) / (end_max - start_min)
        return overlap_ratio > self.overlap_ratio_threshold

Last Character excluded
I wonder why the last token is included in the range, which is counter-intuitive. This behaviour comes from #32. Maybe @aflueckiger could explain the reasoning? Does the end offset in your data include the last character?
I think that for most data, start and end are offsets into the original text string, used as text[start:end], which means the character at index end is excluded. For example, text[1:3] and text[3:5] do not overlap.
nervaluate/src/nervaluate/evaluate.py
Lines 294 to 296 in df0e695
    # overlapping needs to take into account last token as well
    pred_range = range(pred["start"], pred["end"] + 1)
    true_range = range(true["start"], true["end"] + 1)
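A small sketch of the consequence, assuming spans are stored as slice-style offsets. The `find_overlap` body mirrors the utils.py excerpt quoted earlier; the `true` and `pred` dictionaries are made-up examples. With the `+ 1`, two adjacent spans that share no character are still reported as overlapping.

```python
def find_overlap(true_range: range, pred_range: range) -> set:
    # Same set intersection as the utils.py excerpt quoted earlier.
    return set(true_range).intersection(set(pred_range))


true = {"start": 1, "end": 3}  # covers text[1:3]
pred = {"start": 3, "end": 5}  # covers text[3:5]

# End treated as inclusive (the "+ 1" in evaluate.py): position 3 is shared.
inclusive = find_overlap(
    range(true["start"], true["end"] + 1),
    range(pred["start"], pred["end"] + 1),
)

# End treated as exclusive (plain slice offsets): nothing is shared.
exclusive = find_overlap(
    range(true["start"], true["end"]),
    range(pred["start"], pred["end"]),
)

print(inclusive)  # {3}
print(exclusive)  # set()
```

Whether the inclusive result is a bug depends on the annotation format: it is correct if `end` points at the last character or token of the entity, and a false overlap if `end` is a slice offset.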
Any support for Hugging Face Evaluate?
Would the maintainers consider following the Hugging Face Evaluate standard? That would mean inheriting from evaluate.Metric and pushing the metric to the Hugging Face Hub. Afterwards, users could load it directly with metric = evaluate.load('{hub_url}').
Example: https://huggingface.co/spaces/evaluate-metric/glue/blob/main/glue.py