Projects

Interpretable factors in scRNA-seq with disentangled generative models

New microfluidic platforms have enabled bioinformatics at an unprecedented level of resolution for single cells, but interpreting the resulting vast arrays of data poses new challenges. Principal component analysis is a common tool for dimension reduction and interpretation, but it does not perform well in this context because of nonlinear coordination between genes. I helped develop a method that has it both ways, modelling nonlinearities with variational autoencoders while remaining interpretable.

Hierarchical matrix factorization for cell-surface proteins

In my initial foray into probabilistic models for single-cell data, I applied an efficient matrix factorization method to arrays of cell-surface protein counts for thousands of cells. Previous work had applied this method to counts of mRNA for cells (scRNA-seq). I compared data simulated from the posterior parameters with the raw data and found that the distributions did not match under a variety of settings and diagnostics. I concluded that this was a poor model choice for cell-surface proteins.

Lo-res data visualization methods

I worked with Soft Monitor to come up with simple and intuitive methods for summarizing streaming environmental data that could be displayed in textiles.