Duties: Train speech recognition (acoustic, language, and punctuation), speech-to-text translation (AST), and speech-to-speech (S2S) translation models; Develop and maintain speech processing blocks and tools (alignment, segmentation, normalization, etc.); Improve processes for speech data processing, augmentation, filtering, and training set preparation; Measure and benchmark model performance; Gather know-how on speech datasets for training and evaluation
Requirements: Master’s degree (or equivalent experience) or PhD in Computer Science, Electrical Engineering, Artificial Intelligence, or Applied Math with 5+ years of experience; Excellent programming skills in Python; Strong fundamentals in programming, optimization, and software design; Hands-on experience with speech technologies such as automatic speech recognition, speech command detection, text-to-speech, speaker recognition and identification, speaker diarization, noise robustness techniques, voice activity detection, and end-of-utterance detection; Strong knowledge of RNN-T, CTC, and transformer decoders