www.acad.jobs : academic jobs worldwide – and the best jobs in industry
                
     
Position: Research Scientist (Machine Learning Training Systems), TikTok Applied Machine Learning
Institution: TikTok
Location: Singapore
Duties: Research and develop our machine learning systems, including heterogeneous computing architecture, management, scheduling, and monitoring; Manage cross-layer optimisation across systems, AI algorithms, and hardware for machine learning (GPU, ASIC); Implement both general-purpose training framework features and model-specific optimisations (e.g. LLM, diffusion models); Improve the efficiency and stability of extremely large-scale distributed training jobs; Plan and lead the development of new and advanced data analytics techniques, methodologies, and analytical solutions, from design through prototyping and testing; Identify and develop core data and AI science components for the delivery of projects, architect specialised database and computing environments, and explore and visualise complex data sets to provide incremental business value
Requirements: Bachelor's degree or above, with a grounding in distributed and parallel computing principles and knowledge of recent advances in computing, storage, networking, and hardware technologies; Familiarity with machine learning algorithms, platforms, and frameworks such as PyTorch and JAX; Basic understanding of how GPUs and/or ASICs work; Expertise in at least one or two programming languages in a Linux environment: C/C++, CUDA, Python
   