Modern molecular biology data, such as those from microarrays, present major challenges for statistical methodology: they require multiple testing procedures and, increasingly, empirical Bayes or similar methods that share information across all observations to improve inference. The latest explosion of data comes from technologies that sequence millions of DNA or RNA molecules simultaneously, commonly known as next-generation sequencing (NGS). For microarrays, the abundance of a particular species is measured as a fluorescence intensity, effectively a continuous response, whereas for NGS data the expression is observed as a count. Therefore, the procedures that are used for microarray data are not directly applicable. Instead, we developed an approach based on the negative binomial model using weighted conditional likelihood. We show that it outperforms standard methods for finding differentially expressed genes.
About the speaker: Dr Mark Robinson has a PhD in Medical Biology. His interests include statistical bioinformatics, genomic data analysis, epigenomics, empirical Bayes methods, small-sample procedures, approximate conditional inference, robust methods, and statistical software design and applications.