This repository was archived by the owner on Feb 19, 2020. It is now read-only.


@hiroshinoji

NewLineSentenceSegmenter did not trim each segmented sentence, so it always printed an error, as in this example:

$ echo I live in Osaka . | java -Xmx4g -cp assembly.jar epic.parser.ParseText --model parsers/SpanModel-300.parser --sentences newline --tokens whitespace
(TOP (S (NP (PRP He) ) (VP (VBZ lives)  (PP (IN in)  (NP (NNP Osaka) )))))
### Could not tag Vector(), because No parse for Vector(): infinite partition... epic.parser.projections.ChartProjector$class.project(ChartProjector.scala:36);epic.parser.projections.AnchoredRuleMarginalProjector.project(EnumeratedAnchoring.scala:78)

I added a filter for empty sentences, following MLSentenceSegmenter, which avoids the problem by trimming every sentence. The error no longer appears.
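
A minimal sketch of the idea, not the actual epic code: the object and method names below are hypothetical, and the real segmenter may work on spans rather than strings. It only illustrates trimming each newline-delimited piece and dropping the empty ones, so that the trailing newline added by echo does not produce an empty sentence.

```scala
// Hypothetical stand-in for illustration only; not the epic implementation.
object NewlineSegmentSketch {

  // Split on newlines, trim each piece, and drop pieces that are empty
  // after trimming. The limit -1 keeps trailing empty pieces in the split
  // result, so the filter is what removes them.
  def segment(text: String): IndexedSeq[String] =
    text.split("\n", -1).map(_.trim).filter(_.nonEmpty).toIndexedSeq

  def main(args: Array[String]): Unit = {
    // echo appends a trailing newline to its output
    val input = "I live in Osaka .\n"
    segment(input).foreach(println)
  }
}
```

Without the filter, the piece after the final newline would reach the parser as an empty sentence, which matches the Vector() in the error above.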
