Previous Analyses

Here is a brief overview of some of the analyses that I have done in the past:

Drosophila melanogaster
quantitative phenotyping, differential expression analysis
Environmental samples
16S rRNA mapping, large-scale BLAST analysis, taxonomy analysis, genome assembly
Homo sapiens
genome mapping, methylation analysis, SNP analysis, STR analysis, population clustering, haplotype analysis, differential expression analysis, mitochondrial analysis, RNA biomarkers
Mus musculus
genome mapping, transcriptome sequencing, differential expression analysis, tumour growth modelling, flow cytometry
Neisseria meningitidis
genome mapping, differential expression analysis
Nippostrongylus brasiliensis
genome mapping, transcriptome mapping, transcriptome assembly, genome assembly, differential expression analysis
Saccharomyces cerevisiae
codon analysis, gradient peak analysis, genome mapping, transcriptome mapping
Schmidtea mediterranea
transcriptome assembly, genome mapping, transcriptome mapping, differential expression analysis

PhD Thesis

A copy of my PhD thesis can be found here. During the course of this PhD project, a number of interesting side projects cropped up that were beyond the scope of the thesis.

Haplotype Blocks

An analysis of gene variation within the ADH gene region on chromosome 4 led to the discovery of interesting patterns of Linkage Disequilibrium within the HapMap Caucasian (CEU) population. With a carefully biased selection of SNPs within the ADH gene, a pattern of haplotype blocks emerged that didn't fit the standard block model:

Three definite blocks can be seen in the image on the left, with 100% LD for all SNPs within each block (each block covering a region of about 100kb). In addition to this, there appears to be a superblock that encompasses Block 1 and Block 2, with partial (but still fairly high) LD between the Block 1 SNPs and the Block 2 SNPs.
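For readers unfamiliar with how pairwise LD is scored, here is a minimal Python sketch; it is not the code used to produce the figure. It assumes phased, biallelic haplotypes coded as 0/1, and the function name `ld_stats` and the toy haplotypes are hypothetical. Both D' and r² are reported, because the text above does not say which measure "100% LD" refers to.

```python
def ld_stats(haplotypes, i, j):
    """Return (D, D', r^2) for the biallelic SNPs in columns i and j.

    haplotypes: sequence of phased haplotypes, each a sequence of 0/1 alleles.
    """
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n          # frequency of allele 1 at SNP i
    p_b = sum(h[j] for h in haplotypes) / n          # frequency of allele 1 at SNP j
    p_ab = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n
    d = p_ab - p_a * p_b                             # raw disequilibrium coefficient

    # D' rescales |D| by the largest value possible for these allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two allele indicators.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Toy data: SNP 0 and SNP 1 always travel together; SNP 2 is a rarer
# allele that only ever occurs on the "1" background of SNP 0.
haps = [(0, 0, 0), (0, 0, 0), (1, 1, 0), (1, 1, 1)]
print(ld_stats(haps, 0, 1))  # complete LD on both measures
print(ld_stats(haps, 0, 2))  # D' is 1 but r^2 is well below 1
```

The second pair shows why the choice of measure matters when describing "partial" LD between blocks: a SNP with a different allele frequency can have D' of 1 and still have a low r².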

Another thing that was discovered over the course of the PhD project (but not shown in this image) is that there also seem to be overlapping haplotype blocks in various places in the genome, where the beginning of one block is placed before the end of the previous block. These situations create a characteristic pattern: a mini triangle of different LD within the overlap region.

The current haplotype block model of human chromosomes assumes that the blocks are composed of discrete segments of DNA (see section 1.2.6 of my thesis). While there are some tentative discussions about haplotype block "holes", further research about the nature of haplotype blocks has essentially been swamped out by genome-wide association studies, next-generation sequencing, and a number of other big-data hypothesis-free analysis technologies.

Note that this is a biased selection of SNPs: SNPs were deliberately excluded when they "looked wrong" based on the pattern observed among the neighbouring SNPs. However, a discussion at the 2016 Queenstown Research Week MapNet satellite (in Nelson) suggested that such exclusion is warranted, at least for the generation of linkage maps for genome size estimation. Linkage can be broken by a number of different processes, including artificial breaks from sequencing error, but also mutation events on the background of a parental haplotype. As sequence lengths increase, the chance of mutations and/or sequencing errors also increases, making the background LD patterns harder to discern and increasing the estimated recombinational size of the genome (in centimorgans) if such filtering is not carried out.
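The inflation effect described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the analysis discussed at MapNet: every simulated gamete is a true non-recombinant copy of one parental haplotype, errors hit each marker independently at a hypothetical rate `error_rate`, and every apparent phase switch is naively counted as a crossover. The function name `apparent_crossovers` is also hypothetical.

```python
import random


def apparent_crossovers(n_markers, error_rate, n_gametes, seed=0):
    """Mean number of apparent parental-phase switches per gamete.

    The true parental origin is 0 at every marker, so there is no real
    recombination; any switch between adjacent markers is spurious.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_gametes):
        # A call of 1 means the marker was mis-assigned to the other parent.
        calls = [1 if rng.random() < error_rate else 0 for _ in range(n_markers)]
        # Count adjacent markers whose assigned parental origin differs.
        total += sum(1 for a, b in zip(calls, calls[1:]) if a != b)
    return total / n_gametes


for n_markers in (1000, 2000, 4000):
    mean_switches = apparent_crossovers(n_markers, error_rate=0.01, n_gametes=200)
    print(n_markers, round(mean_switches, 1))
```

An isolated error in the middle of a haplotype looks like a double crossover, so under these assumptions the number of spurious crossovers grows roughly in proportion to the number of markers. Using the convention that one crossover per meiosis corresponds to 100 cM, multiplying the returned mean by 100 gives the spurious map length that unfiltered errors would add.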

Bootstrap Sub-sampling

Genome-wide Association Studies are carried out on a large number of genetic variants in a large number of people, allowing the detection of small genetic effects that are associated with a trait. Natural variation of genotypes within populations means that any particular sample from the population may not represent the true genotype frequencies within that population. This may lead to the observation of marker-disease associations when no such association exists.

A bootstrap population sub-sampling technique can reduce the influence of allele frequency variation in producing false-positive results for particular samplings of the population. To evaluate its effectiveness on a serious disease, this sub-sampling method was applied to the Type 1 Diabetes dataset from the Wellcome Trust Case Control Consortium. The results were compared with results from both a low-heritability dataset and a high-heritability dataset, suggesting that the usual noise inherent in GWAS might be removable by this method.

The generation of a panel of validated group-specific markers is possible even when using a low-density marker set and small sample group sizes. This method is likely applicable to other genome-wide studies, and provides one way in which false positive associations could be quickly excluded from candidate marker sets.
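The general idea can be sketched in a few lines of Python. This is an illustration under my own assumptions for this example, not the procedure applied to the WTCCC data: genotypes are coded as alternate-allele counts (0, 1 or 2), each round resamples individuals with replacement within each group, markers are ranked by the absolute allele-frequency difference between groups, and a marker is kept only if it ranks in the top `top_k` in at least `min_fraction` of rounds. All names and thresholds here are hypothetical.

```python
import random


def allele_freq(genotypes, indices, marker):
    """Alternate-allele frequency at one marker among the chosen individuals."""
    return sum(genotypes[i][marker] for i in indices) / (2 * len(indices))


def stable_markers(cases, controls, n_rounds=200, top_k=1, min_fraction=0.9, seed=1):
    """Return markers that rank in the top_k in at least min_fraction of rounds."""
    rng = random.Random(seed)
    n_markers = len(cases[0])
    hits = [0] * n_markers
    for _ in range(n_rounds):
        # Resample individuals with replacement, separately for each group.
        case_idx = [rng.randrange(len(cases)) for _ in cases]
        ctrl_idx = [rng.randrange(len(controls)) for _ in controls]
        diffs = [
            abs(allele_freq(cases, case_idx, m) - allele_freq(controls, ctrl_idx, m))
            for m in range(n_markers)
        ]
        ranked = sorted(range(n_markers), key=lambda m: diffs[m], reverse=True)
        for m in ranked[:top_k]:
            hits[m] += 1
    return [m for m in range(n_markers) if hits[m] / n_rounds >= min_fraction]


# Toy data: marker 0 is built to differ between groups; markers 1-4 are
# drawn from the same distribution in both groups, so they are pure noise.
rng = random.Random(42)
cases = [[2] + [rng.choice((0, 1, 2)) for _ in range(4)] for _ in range(20)]
controls = [[0] + [rng.choice((0, 1, 2)) for _ in range(4)] for _ in range(20)]
print(stable_markers(cases, controls))
```

A noise marker can look associated in one particular sample, but it is unlikely to stay at the top of the ranking across many resamplings, which is the property this kind of filter relies on.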

More details about the bootstrap sub-sampling method can be found here.

Monoamine Oxidase A Gene

If you ever want to see a good example of inadequate research becoming ingrained in the scientific record, have a look at the Monoamine Oxidase A gene (MAOA). Even if media-encouraged controversy associated with the gene variant associations is ignored, there is a striking lack of original research about the function of MAOA in humans. Few (if any) recent studies have demonstrated that primary substrates of MAOA in humans actually include serotonin. I am not suggesting that these assumptions are incorrect, just that a confirmation of the assumptions is needed, particularly in light of results that suggest gene variants (and hence probably also expressed proteins) have been selected for in human populations.

Many review articles that talk about MAOA preferentially acting on serotonin do not provide a reference for this statement, treating it as an accepted fact rather than a hypothesis (e.g. Gokturk et al., 2008; Jacob et al., 2005). Some articles do cite studies in mammals other than humans (e.g. Guo et al., 2008), and follow a (usually implicit) transitive argument: because MAOA preferentially acts on serotonin in other animals, it should act similarly in humans. This argument may be incorrect, given that human and rat MAOA have different structures, different polymerisation states, and different catalytic profiles (Son et al., 2008).

Other articles cite original-research papers that only refer to MAOA interactions in the introduction section, so that the research outcome of the cited paper is different from what the citation implies (e.g. Bach et al., 1988, cited in addition to other references in Sabol et al., 1998).

Papers that do cite original research for the interaction between MAOA and serotonin in humans are difficult to find, possibly because research on substrates for monoamine oxidases was mostly carried out in the late 1970s and early 1980s. This research was prior to the discovery that there were two classes of monoamine oxidases, MAOA and MAOB. The best reviews that I have found on MAOA biochemistry have been those of Shih et al. (1999) and Nagatsu (2004).

Peer-reviewed Research Papers

  1. Contrasting signal transduction mechanisms in bacterial and eukaryotic gene transcription
    doi: 10.1111/j.1574-6968.2006.00295.x (2006) [BibTeX citation]
  2. Haplotype analysis at the alcohol dehydrogenase gene region in New Zealand Māori
    doi: 10.1007/s10038-006-0094-1 (2007) [BibTeX citation]
  3. Metastatic susceptibility locus, an 8p hot-spot for tumour progression disrupted in colorectal liver metastases: 13 candidate genes examined at the DNA, mRNA and protein level
    doi: 10.1186/1471-2407-8-187 (2008) [BibTeX citation]
  4. DWP6-1 Identifying a genomic signature for diabetes risk
    doi: 10.1016/S0168-8227(08)70718-6 (2008) [BibTeX citation]
  5. Predicting protein structures with a multiplayer online game (Foldit player)
    doi: 10.1038/nature09304 (2010) [BibTeX citation]
  6. Testing the thrifty gene hypothesis: the Gly482Ser variant in PPARGC1A is associated with BMI in Tongans
    doi: 10.1186/1471-2350-12-10 (2011) [BibTeX citation]
  7. The p.Ala510Val mutation in the SPG7 (paraplegin) gene is the most common mutation causing adult onset neurogenetic disease in patients of British ancestry
    doi: 10.1007/s00415-012-6792-z (2012) [BibTeX citation]
  8. A unique demographic history exists for the MAO-A gene in Polynesians
    doi: 10.1038/jhg.2012.19 (2012) [BibTeX citation]
  9. Complete mitochondrial genome sequencing reveals novel haplotypes in a Polynesian population
    doi: 10.1371/journal.pone.0035026 (2012) [BibTeX citation]
  10. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis
    doi: 10.1038/nprot.2013.084 (2013) [BibTeX citation]
  11. Mapping eQTLs in the Norfolk Island Genetic Isolate Identifies Candidate Genes for CVD Risk Traits
    doi: 10.1016/j.ajhg.2013.11.004 (2013) [BibTeX citation]
  12. β-catenin-dependent control of positional information along the AP body axis in planarians involves a Teashirt family member
    doi: 10.1016/j.celrep.2014.12.018 (2015) [BibTeX citation]
  13. Mitochondrial genome acquisition restores respiratory function and tumorigenic potential of cancer cells without mitochondrial DNA
    doi: 10.1016/j.cmet.2014.12.003 (2015) [BibTeX citation]
  14. An analysis of DNA methylation in human adipose tissue reveals differential modification of obesity genes before and after gastric bypass and weight loss
    doi: 10.1186/s13059-014-0569-x (2015) [BibTeX citation]
  15. MinION nanopore sequencing of an influenza genome
    doi: 10.3389/fmicb.2015.00766 (2015) [BibTeX citation]
  16. 'Mutiny on the Bounty': the genetic history of Norfolk Island reveals extreme gender-biased admixture
    doi: 10.1186/s13323-015-0028-9 (2015) [BibTeX citation]
  17. MinION Analysis and Reference Consortium: Phase 1 data release and analysis
    doi: 10.12688/f1000research.7201.1 (2015) [BibTeX citation]
  18. A Phenomic Scan of the Norfolk Island Genetic Isolate Identifies a Major Pleiotropic Effect Locus Associated with Metabolic and Renal Disorder Markers
    doi: 10.1371/journal.pgen.1005593 (2015) [BibTeX citation]
  19. Serum bilirubin concentration is modified by UGT1A1 Haplotypes and influences risk of Type-2 diabetes in the Norfolk Island genetic isolate
    doi: 10.1186/s12863-015-0291-z (2015) [BibTeX citation]
  20. Th2 responses are primed by skin dendritic cells with distinct transcriptional profiles
    doi: 10.1084/jem.20160470 (2016) [BibTeX citation]
  21. Gene-Centric Analysis Implicates Nuclear Encoded Mitochondrial Protein Gene Variants in Migraine Susceptibility
    doi: 10.1002/mgg3.270 (2016) [BibTeX citation]
  22. Annotated mitochondrial genome with Nanopore R9 signal for Nippostrongylus brasiliensis [version 1; referees: 1 approved, 2 approved with reservations]
    doi: 10.12688/f1000research.10545.1 (2017) [BibTeX citation]
  23. MinION Analysis and Reference Consortium: Phase 2 data release and analysis of R9.0 chemistry
    doi: 10.12688/f1000research.11354.1 (2017) [BibTeX citation]
  24. Investigation of chimeric reads using the MinION [version 2; referees: 2 approved]
    doi: 10.12688/f1000research.11547.2 (2017) [BibTeX citation]
  25. Genome-wide linkage and association analysis of primary open-angle glaucoma endophenotypes in the Norfolk Island isolate
    doi: N/A (2017) [BibTeX citation]
  26. Genomic, Transcriptomic, and Phenotypic Analyses of Neisseria meningitidis Isolates from Disease Patients and Their Household Contacts
    doi: 10.1128/mSystems.00127-17 (2017) [BibTeX citation]
  27. Expression QTL analysis of glaucoma endophenotypes in the Norfolk Island isolate provides evidence that immune-related genes are associated with optic disc size
    doi: 10.1038/s10038-017-0374-y (2017) [BibTeX citation]
  28. De novo assembly of the complex genome of Nippostrongylus brasiliensis using MinION long reads
    doi: 10.1186/s12915-017-0473-4 (2018) [BibTeX citation]
  29. Exome Sequencing Diagnoses X-Linked Moesin-Associated Immunodeficiency in a Primary Immunodeficiency Case
    doi: 10.3389/fimmu.2018.00420 (2018) [BibTeX citation]
  30. Harnessing the MinION: An example of how to establish long‐read sequencing in a laboratory using challenging plant tissue from Eucalyptus pauciflora
    doi: 10.1111/1755-0998.12938 (2018) [BibTeX citation]
  31. Tree Lab: Portable genomics for Early Detection of Plant Viruses and Pests in Sub-Saharan Africa
    doi: 10.3390/genes10090632 (2019) [BibTeX citation]
  32. Associations of autozygosity with a broad range of human phenotypes
    doi: 10.1038/s41467-019-12283-6 (2019) [BibTeX citation]

Acknowledged Contribution (in peer-reviewed papers)

  1. Rates of Chlamydia trachomatis testing and chlamydial infection in pregnant women (mentioned in acknowledgements)
    NZ Medical Journal (2004)
  2. Sequence variants on 17q21 are associated with the susceptibility of asthma in the population of Lahore, Pakistan (mentioned in acknowledgements)
    doi: 10.3109/02770903.2015.1012590 (2015) [BibTeX citation]