
Hindi Subtitle (srt File) Parsing Issue


NSRegularExpression on iOS fails to parse an entire Hindi SRT file with the regular expression below:

(\\d+)\\n([\\d:,.]+)\\s+-{2}\\>\\s+([\\d:,.]+)\\n([\\s\\p{P}]*?(?=\\n{2,}|$))

This expression works fine with English subtitles. With Hindi subtitles, however, the call

let matches = regex.matches(in:<SubtitleStringToParse>, options: NSRegularExpression.MatchingOptions(rawValue: 0), range: NSMakeRange(0, <SubtitleStringToParse.count>))

returns fewer matches than expected. For example, if there should be 10 matches, it returns only 8 and the last 2 are missing. The longer the Hindi subtitle file, the more matches are missing towards the end.

Is there a way to resolve this issue, and what should be done to parse subtitles in other languages so that every match is found?

Is there any alternative?


Answer

Try this regular expression:

"((\\d+)\\n([\\d:,.]+)\\s+-{2}\\>\\s+[\\d:,.]+\\n[\\s\\S]*?(?=\\n{2,}|$))"
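
Separately from the pattern, it may be worth checking the range passed to matches(in:options:range:). Swift's String.count counts Characters (extended grapheme clusters), while NSRange is measured in UTF-16 code units. Devanagari text uses combining vowel signs and viramas, so count is smaller than the UTF-16 length, and a range built from count stops short of the end of the string. That would be consistent with matches going missing towards the end, and with longer files losing more of them. The sketch below is only an illustration under that assumption: subtitleBlocks and sample are hypothetical names, and the pattern is the one given above.

```swift
import Foundation

// Pattern copied from the answer above.
let pattern = "((\\d+)\\n([\\d:,.]+)\\s+-{2}\\>\\s+[\\d:,.]+\\n[\\s\\S]*?(?=\\n{2,}|$))"

// Hypothetical helper: returns the text of every cue the pattern matches in `srt`.
func subtitleBlocks(in srt: String) throws -> [String] {
    let regex = try NSRegularExpression(pattern: pattern)
    // Build the NSRange from the string's own indices so its length is in
    // UTF-16 code units, not in Characters as `srt.count` would be.
    let fullRange = NSRange(srt.startIndex..<srt.endIndex, in: srt)
    return regex.matches(in: srt, options: [], range: fullRange).compactMap { match in
        // Convert each match's NSRange back into a Swift range before slicing.
        Range(match.range, in: srt).map { String(srt[$0]) }
    }
}

// Made-up two-cue sample with Hindi text, for illustration only.
let sample = "1\n00:00:01,000 --> 00:00:02,000\nनमस्ते\n\n2\n00:00:03,000 --> 00:00:04,000\nधन्यवाद"

// The two lengths differ for this sample, which is why the choice of range matters.
print(sample.count, sample.utf16.count)
print(try subtitleBlocks(in: sample).count)
```

NSMakeRange(0, srt.utf16.count) produces the same range; the NSRange(_:in:) initializer is used here because it makes the unit conversion explicit.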
source: stackoverflow.com