Microsoft Technology Licensing, LLC (20250131035). CLUSTERING-BASED RECOGNITION OF TEXT IN VIDEOS
CLUSTERING-BASED RECOGNITION OF TEXT IN VIDEOS
Organization Name
Microsoft Technology Licensing, LLC
Inventor(s)
Maayan Yedidia of Ramat Gan IL
This abstract first appeared for US patent application 20250131035 titled 'CLUSTERING-BASED RECOGNITION OF TEXT IN VIDEOS'.
Original Abstract Submitted
Systems and methods for spatial-textual clustering-based recognition of text in videos are disclosed. A method includes performing textual clustering on a first subset of a set of predictions that correspond to numeric characters only and performing spatial-textual clustering on a second subset of the set of predictions that correspond to alphabetical characters only. The method includes, for each cluster of predictions associated with the first subset of the set of predictions, choosing a first cluster representative to correct any errors in each cluster of predictions associated with the first subset of the set of predictions and outputting any recognized numeric characters. The method includes, for each cluster of predictions associated with the second subset of the set of predictions, choosing a second cluster representative to correct any errors in each cluster of predictions associated with the second subset of the set of predictions and outputting any recognized alphabetical characters.
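The flow described in the abstract can be illustrated with a minimal sketch. The abstract does not name a clustering algorithm, a distance measure, or a rule for choosing the cluster representative, so everything below is an assumption made for illustration: the `Prediction` structure, greedy seed-based clustering, Levenshtein edit distance as the textual measure, box-center offset as the spatial measure, majority vote as the representative, and the `max_edits` and `max_offset` thresholds are all hypothetical and are not taken from the application.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """One per-frame OCR prediction (hypothetical structure)."""
    text: str
    x: float  # bounding-box center, assumed normalized to [0, 1]
    y: float


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def greedy_cluster(preds, is_close):
    """Put each prediction in the first cluster whose seed it is close to."""
    clusters = []
    for p in preds:
        for c in clusters:
            if is_close(c[0], p):
                c.append(p)
                break
        else:
            clusters.append([p])  # no match: p seeds a new cluster
    return clusters


def representative(cluster):
    """Assumed rule: the most frequent text in the cluster wins."""
    return Counter(p.text for p in cluster).most_common(1)[0][0]


def recognize(preds, max_edits=1, max_offset=0.05):
    # Split into numeric-only and alphabetical-only subsets.
    # Mixed strings match neither test and are ignored in this sketch.
    numeric = [p for p in preds if p.text.isdigit()]
    alpha = [p for p in preds if p.text.isalpha()]

    def textual(a, b):
        return edit_distance(a.text, b.text) <= max_edits

    def spatial_textual(a, b):
        near = abs(a.x - b.x) <= max_offset and abs(a.y - b.y) <= max_offset
        return near and textual(a, b)

    # Numeric predictions: textual clustering only (position is ignored).
    numbers = [representative(c) for c in greedy_cluster(numeric, textual)]
    # Alphabetical predictions: position and text must both agree.
    words = [representative(c) for c in greedy_cluster(alpha, spatial_textual)]
    return numbers, words
```

In this sketch, a misread such as "2824" that lands in the same cluster as several "2024" predictions is replaced by the majority text, which is one simple way a cluster representative could correct per-frame errors. Identical words at distant screen positions stay in separate clusters because the alphabetical subset also requires spatial proximity.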