Lip movement detection under surgical mask
To identify the speaker on a video call when the speaker is wearing a mask, I created a dataset of surgical mask images and used Mask R-CNN for segmentation. I then used sparse optical flow to get the flow vectors of the features of interest. The system detected a speaker over a webcam at a speaking rate of 120 WPM.
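As a rough illustration of the last step, the sketch below turns sparse optical flow output into a per-frame "speaking" decision. It assumes the feature points have already been tracked between two consecutive frames (for example with OpenCV's `cv2.calcOpticalFlowPyrLK`) and are passed in as lists of (x, y) pairs. The function names, the mean-magnitude rule, and the threshold of 1.0 pixel are hypothetical choices for this example, not the values used in the project.

```python
import math


def flow_vectors(prev_pts, next_pts):
    """Return the displacement (dx, dy) of each tracked point between two frames."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(prev_pts, next_pts)]


def mean_flow_magnitude(vectors):
    """Average Euclidean length of the flow vectors; 0.0 if there are none."""
    if not vectors:
        return 0.0
    return sum(math.hypot(dx, dy) for dx, dy in vectors) / len(vectors)


def is_speaking(prev_pts, next_pts, threshold=1.0):
    """Flag a frame pair as 'speaking' when the mean motion exceeds the threshold.

    The threshold (in pixels) is an assumed value and would need tuning
    against real webcam footage.
    """
    return mean_flow_magnitude(flow_vectors(prev_pts, next_pts)) > threshold


if __name__ == "__main__":
    # Two hypothetical feature points on the mask region in consecutive frames.
    prev_pts = [(0.0, 0.0), (10.0, 10.0)]
    next_pts = [(3.0, 4.0), (10.0, 10.0)]
    print(is_speaking(prev_pts, next_pts))
```

In practice a single frame pair is noisy, so a decision like this would usually be smoothed over a short window of frames before a participant is labeled as the speaker.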
