GROUP LEADER: Prof. HG Patterton
Office: 2010 AJ Perold Building
PhD, University of Cape Town, 1991
Epigenomics and Bioinformatics
The evolution of eukaryotes and multicellular organisms was accompanied by a significant increase in genome size. This increase was facilitated by the compaction of the poly-anionic DNA molecule into repetitive structures known as nucleosomes, and the packaging of nucleosomes into the higher-order structures of chromatin. If the DNA complement of a human cell is unravelled, and the DNA from the chromosomes laid end-to-end, it will span approximately 2 m. Remarkably, this length of DNA fits into a cell nucleus approximately 10 µm in diameter. Although the compaction of DNA in chromatin solved the structural problem of fitting the DNA into a nucleus, it also introduced a new problem: access to the DNA molecule. In eukaryotes a complex system has evolved whereby access to the DNA molecule is regulated by ATP-dependent chromatin remodelling enzymes, by reversible modifications of the core histones in the nucleosome, which provide molecular landmarks and modulate the local structural properties of the chromatin, by chemical modifications of the DNA molecule itself, and by the exchange of histone isotypes in nucleosomes according to the functional requirements of the DNA molecule. The research field that studies the role of chromatin and chromatin modifications in the genetic function of DNA is epigenetics and, when performed on a genome-wide scale, epigenomics. The epigenome is the interface through which DNA function is regulated.
The analysis of epigenomic data often requires significant computational resources, and research questions often require the development of new bioinformatics methods and tools. These include tools for generic analyses, such as the processing of ChIP-seq data, as well as novel programs developed and coded for specific research questions. A significant proportion of epigenomics research therefore involves bioinformatics and bioinformatics research.
Some current projects in the Patterton group:
Epigenomic landscape of Trypanosoma brucei
T. brucei is a unicellular eukaryote of the Excavata supergroup that diverged from the Amorphea supergroup, containing animals, fungi and amoebozoa, some 2 billion years ago. T. brucei is a dixenous parasite that is transmitted to humans by a tsetse fly during a blood meal, and causes human African trypanosomiasis. We studied the epigenome of T. brucei in its human bloodstream form (BF) and insect procyclic form (PF) life cycle stages by mapping genome-wide nucleosome positions by MNase-seq, and identified surprising differences in the nucleosomal organization of RNA polymerase II initiation regions compared to that of members of the Amorphea supergroup. We have also identified novel modification patterns on the N-terminal tail of histone H3 by LC-MS/MS, and mapped the distribution of these modifications in the genome of T. brucei by ChIP-seq. This project gave a comprehensive insight into the epigenome of BF and PF stage T. brucei, and the degree to which the epigenome regulates the passage of the parasite between the two hosts.
Role of the epigenome in lifespan
Caloric restriction has been shown to extend chronological lifespan in numerous model organisms. Many effectors of lifespan extension also exert their effect through the sirtuins, orthologues of S. cerevisiae Sir2, an NAD+-dependent histone deacetylase. We were interested in studying the role of the epigenome in chronological lifespan in S. cerevisiae, and quantified, by barcode sequencing, the survival rate of all non-essential histone mutants in a mixed batch culture. We identified many epigenetic modification mimics associated with extended as well as shortened lifespans. The residues associated with extended lifespans were situated on the side surface of the nucleosome, a region implicated in the binding of Sir3, a protein associated with the formation of transcriptionally repressive heterochromatin at telomeres and elsewhere. We performed an RNA-seq analysis of the various long-lived histone mutants, and identified two main transcriptomic pathways involved in lifespan extension. We also performed an iTRAQ quantitation of the proteome by LC-MS/MS and identified proteins involved in longevity.
NucPos is a suite of C++ programs developed for the analysis of genome-wide nucleosome positions. The individual programs include NucFrag, which generates a text data file of the number of co-aligned nucleosome dyads at each sequence position in the genome from the SAM/BAM files of reads mapped with Bowtie2. The suite also includes programs to convert the dyad data to enclosed bases (electronic "footprints"), to perform sequence preference analyses, and to calculate the Fourier amplitude of dinucleotide distributions.
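The dyad-counting step described above can be illustrated with a minimal sketch. This is not the NucFrag implementation, and it is written in Python rather than C++ for brevity. It assumes paired-end reads in SAM text format, takes the dyad as the midpoint of each mapped fragment, and uses a hypothetical size window of 120 to 180 bp to select mononucleosome-sized fragments; the function name and its parameters are illustrative only.

```python
from collections import Counter


def dyad_counts(sam_lines, min_len=120, max_len=180):
    """Count inferred nucleosome dyads per (chromosome, position).

    Assumptions (not taken from NucFrag): paired-end data, the dyad is
    the fragment midpoint, and min_len/max_len are illustrative bounds.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # a SAM record has 11 mandatory fields
            continue
        flag = int(fields[1])
        chrom = fields[2]
        pos = int(fields[3])              # 1-based leftmost mapped position
        tlen = int(fields[8])             # signed template (fragment) length
        # Keep properly paired (0x2), mapped (not 0x4) reads. TLEN > 0 selects
        # the leftmost mate, so each fragment is counted exactly once.
        if not flag & 0x2 or flag & 0x4 or tlen <= 0:
            continue
        if not (min_len <= tlen <= max_len):
            continue
        dyad = pos + (tlen - 1) // 2      # midpoint of the fragment
        counts[(chrom, dyad)] += 1
    return counts


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.6",
        "r1\t99\tchr1\t100\t42\t50M\t=\t197\t147\tACGT\tIIII",
        "r1\t147\tchr1\t197\t42\t50M\t=\t100\t-147\tACGT\tIIII",
    ]
    for (chrom, position), n in sorted(dyad_counts(example).items()):
        print(chrom, position, n, sep="\t")
```

Counting only the mate with a positive template length avoids double-counting a fragment, and the resulting per-position counts are the kind of table from which footprints or periodicity analyses could then be derived.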
Transcriptomic studies that we performed in S. cerevisiae suggested that some genes that were adjacent or physically close in the genome were co-regulated. We therefore developed a program, Pyxis, to identify statistically significant clusters of co-regulated genes. We showed that some proteins involved in repression acted in extended domains, and that clusters of genes became activated during exit from stationary phase.
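One simple way to ask whether co-regulated genes cluster more than expected by chance is a permutation test on gene order. The sketch below is an illustration of that general idea only; it is not the algorithm used by Pyxis. It assumes a single linear chromosome, a hypothetical list of flags marking which genes (in genomic order) are co-regulated, and uses the number of adjacent flagged pairs as the clustering statistic.

```python
import random


def adjacent_pairs(flags):
    """Number of neighbouring gene pairs in which both genes are flagged."""
    return sum(1 for a, b in zip(flags, flags[1:]) if a and b)


def clustering_p_value(flags, n_perm=2000, seed=0):
    """Permutation p-value for clustering of flagged genes.

    flags: booleans in genomic order (True = gene is co-regulated).
    Returns the observed statistic and the fraction of shuffled gene
    orders with at least as many adjacent flagged pairs.
    """
    rng = random.Random(seed)             # fixed seed for reproducibility
    observed = adjacent_pairs(flags)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)             # random gene order, same gene count
        if adjacent_pairs(shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Ten regulated genes in one contiguous block among 100 genes.
    genes = [True] * 10 + [False] * 90
    stat, p = clustering_p_value(genes)
    print(f"adjacent pairs = {stat}, permutation p = {p:.4f}")
```

A real analysis would also need to handle multiple chromosomes, physical distance rather than simple adjacency, and correction for multiple testing, none of which is attempted here.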
The group is currently involved in several homology modelling, genome assembly and SNP calling projects, as well as the migration of C++ programs to the Galaxy platform, and the development of new bioinformatics programs.