Project Proposal
In this project, we will develop a machine learning pipeline that trains on and classifies single-molecule fluorescence data generated by an electron-multiplying charge-coupled device (EMCCD). Each dataset consists of thousands of image files, where each frame is a snapshot of single fluorescent molecules. The signal signature changes over time and is representative of the oligomerization state of the molecule. Currently, chemistry experts manually classify each signal signature into different categories, which takes a tremendous amount of time.
We will develop a machine learning-based data analysis platform, running on Microsoft's cloud-based system, for single-molecule studies of in vivo protein oligomerization. We will use a convolutional neural network (CNN) with multilayer classification, together with other machine learning techniques, to make the platform rapid and accurate.
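As a rough illustration of the kind of classifier described above, the following minimal sketch runs the forward pass of a tiny one-dimensional CNN over a single intensity trace: convolution, ReLU, global max pooling, a dense layer, and softmax. Treating each molecule's signal as a 1D time series is an assumption, as are the function names and the kernels and weights passed in; the actual network architecture is not specified here, and no training is shown.

```python
import math


def conv1d(trace, kernel, bias=0.0):
    """Valid (no padding) 1D cross-correlation of a trace with one kernel."""
    k = len(kernel)
    return [
        sum(trace[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(trace) - k + 1)
    ]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_trace(trace, kernels, dense_weights, dense_bias):
    """Return (predicted class index, class probabilities) for one trace.

    kernels, dense_weights and dense_bias are hypothetical parameters;
    in practice they would come from training.
    """
    # One feature per kernel: convolution -> ReLU -> global max pooling.
    features = [max(relu(conv1d(trace, k))) for k in kernels]
    # Dense layer: one score per class.
    scores = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(dense_weights, dense_bias)
    ]
    probs = softmax(scores)
    return probs.index(max(probs)), probs
```

For example, with the hand-picked kernels `[[1.0, -1.0], [-1.0, 1.0]]` and an identity dense layer, a trace that only steps downward activates the first kernel more strongly and is assigned class 0. A production model would more likely be built with a deep learning framework running on the GPU.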
As we collect datasets, a local machine will pre-process the captured images and data with fixed parameters to produce a clean dataset, and then transfer the pre-processed data to an Azure VM running our custom machine learning code. We plan to use Docker images, such as those from NVIDIA GPU Cloud (NGC), for training and classification in order to get GPU support. Alternatively, we may build custom Azure VM images that contain all the artifacts required to run our machine learning pipeline. Once we have trained a solid machine learning model, we will write a script that researchers can execute using the Azure CLI to automatically classify newly collected data, which will significantly reduce the time and effort previously needed to process the data manually.
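The local pre-processing step might look something like the sketch below, which turns a stack of frames into a normalized intensity trace for one molecule using fixed parameters. The parameter set (`PreprocessParams`), the square region of interest, the constant background level, and the representation of frames as nested lists are all assumptions made for illustration; the actual pre-processing parameters and image format are not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreprocessParams:
    """Hypothetical fixed parameters shared by every dataset."""
    roi_row: int       # top row of the square region of interest
    roi_col: int       # left column of the region of interest
    roi_size: int      # side length of the region, in pixels
    background: float  # constant background level to subtract


def roi_intensity(frame, params):
    """Sum background-subtracted pixel values inside the region of interest."""
    rows = frame[params.roi_row : params.roi_row + params.roi_size]
    total = 0.0
    for row in rows:
        for pixel in row[params.roi_col : params.roi_col + params.roi_size]:
            # Clip at zero so noise below the background does not go negative.
            total += max(0.0, pixel - params.background)
    return total


def extract_trace(frames, params):
    """Return one intensity value per frame, scaled so the peak equals 1."""
    raw = [roi_intensity(frame, params) for frame in frames]
    peak = max(raw, default=0.0)
    if peak == 0.0:
        return [0.0 for _ in raw]
    return [value / peak for value in raw]
```

Keeping the parameters in a single immutable object is one way to guarantee that every dataset is cleaned identically before it is transferred to the cloud for training or classification.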
Brief Bio
Yongwook Song is a Computational Research Engineer in the Center for Computational Science at the University of Kentucky, where he has been since 2008. He received an M.S. in Computer Science from the University of Kentucky in 2008 and is pursuing a Ph.D. in Computer Science at the same university. His research interests focus on machine learning for network analytics. During his career at the University of Kentucky, he has developed expertise in image processing, computer graphics, autostereoscopic display systems, network monitoring, machine learning, and AR/VR software development. He also has a personal interest in developing mobile applications and has publicly released educational mobile applications.