Idea for tool: resegment into well-formed sentences #87

@bricksdont

Hi, this is a great library!

I added one more tool in my fork that does automatic sentence segmentation: bricksdont#1

It redistributes the text across subtitle segments so that each subtitle is exactly one well-formed (and complete) sentence. It's not perfect, since a machine learning model is involved.

Here is an example:

# Input

10:01:23,880 --> 10:01:27,640
Regelmässig nimmt er an Veranstaltungen
von FRAGILE Suisse teil,

23
10:01:27,720 --> 10:01:31,840
der Patientenorganisation
für Menschen mit Hirnverletzungen.

# Output

10:01:23,880 --> 10:01:31,840
Regelmässig nimmt er an Veranstaltungen
von FRAGILE Suisse teil, der Patientenorganisation
für Menschen mit Hirnverletzungen.
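
To make the merging step concrete, here is a minimal sketch of the idea. It is not the code from my fork: the `Segment` class and `resegment` function are hypothetical names for illustration, and a naive "ends in `.`, `!` or `?`" check stands in for the machine learning model. It only merges consecutive segments (it never splits a segment that contains several sentences) and it does not re-wrap lines.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: str  # SRT timestamp, e.g. "10:01:23,880"
    end: str
    text: str


def ends_sentence(text):
    # Assumption: naive punctuation heuristic in place of a real sentence segmenter.
    return text.rstrip().endswith((".", "!", "?"))


def resegment(segments):
    """Merge consecutive segments until the accumulated text ends a sentence."""
    merged = []
    pending = None
    for seg in segments:
        text = " ".join(seg.text.split())  # collapse line breaks and extra spaces
        if pending is None:
            pending = Segment(seg.start, seg.end, text)
        else:
            # Keep the start time of the first piece, take the end time of the latest.
            pending.end = seg.end
            pending.text = pending.text + " " + text
        if ends_sentence(pending.text):
            merged.append(pending)
            pending = None
    if pending is not None:
        # Trailing text without final punctuation is kept as its own segment.
        merged.append(pending)
    return merged


segments = [
    Segment("10:01:23,880", "10:01:27,640",
            "Regelmässig nimmt er an Veranstaltungen\nvon FRAGILE Suisse teil,"),
    Segment("10:01:27,720", "10:01:31,840",
            "der Patientenorganisation\nfür Menschen mit Hirnverletzungen."),
]
result = resegment(segments)
for seg in result:
    print(f"{seg.start} --> {seg.end}")
    print(seg.text)
```

On the two segments from the example, this yields a single segment spanning the first start time to the last end time. A punctuation check like this breaks on abbreviations and unpunctuated subtitles, which is why a model is used in practice.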

Would you be interested in a PR for this?
