Distinguished Speaker Series: De novo transcriptome reconstruction from long reads

Prof. Paul Medvedev, Associate Professor in the Department of Computer Science and Engineering and the Department of Biochemistry and Molecular Biology, Director of the Center for Computational Biology and Bioinformatics, Pennsylvania State University
26 June 2019, 12:00 
Schreiber 006, Computer Science, TAU 

Abstract: Long-read sequencing of transcripts with PacBio Iso-Seq and Oxford Nanopore Technologies has proven to be central to the study of complex isoform landscapes in many organisms. However, current de novo transcript reconstruction algorithms for long-read data are limited, leaving the potential of these technologies unfulfilled. A common bottleneck is the dearth of scalable and accurate algorithms for clustering long reads according to their gene family of origin. A second bottleneck is distinguishing sequencing errors from true variation within these families. I will present two recent methods that address these challenges. The first is IsoCon (Sahlin et al., 2018, Nat Comm), a method to determine the full-length transcripts of multicopy gene families at nucleotide-level precision from PacBio data. I will show how IsoCon was applied to Y chromosome ampliconic gene families, each of which contains many nearly identical gene copies. The second is isONclust (Sahlin & Medvedev, RECOMB 2019), a clustering algorithm that can assign Nanopore reads to their gene family of origin.


Host: Prof. Ron Shamir, Computer Science School, TAU.




