Postdoc, University of Waterloo
Modern analytical techniques such as LC-MS let us collect information on thousands of compounds in a single sample at the same time, so we can compare the differences among samples. For example, we could compare LC-MS data from two groups of people, one with cancer and one without, and search among the thousands of compounds for biomarkers of that cancer, which might offer insight into treating patients. However, such a process requires data correction to ensure accuracy. My research employed novel analytical methods, statistical analysis, and interactive data visualization techniques to find and reduce the uncertainty in high-throughput data analysis and in sample preparation.
Abstract: Metabolic fingerprinting is a relatively young scientific discipline requiring robust, yet flexible and fit-for-purpose analytical methods. Here, we introduce a simple approach to select reversed phase LC systems with electrospray MS detection for fingerprinting of polar and amphiphilic plant metabolites. The approach does not rely on isotopic labeling or biological origin of sample constituent and can also be used for non-biological matrices (e.g., oil or sewage sludge) or for other optimization purposes (e.g., mass spectrometric source parameterization). The LC systems varied in column chemistry and temperature, mobile phase pH/additive, gradient steepness/eluotropic strength, and electrospray mode of operation. The systems were evaluated based on the number of features detected using the matchedFilter algorithm from XCMS and the repeatability of this detection across analytical replicates. For negative ion mode detection, the best performances were obtained with an HSS T3 column operated at low pH, which produced a 3-fold increase in the number of reliable features extracted compared with the worst system. The best system for positive ion mode (i.e., the BEH C18 column operated at intermediate pH) only produced a 50 % increase in the number of reliable features. The data also indicate that baseline removal is unavoidable for reliable intensity estimations using peak areas, and that peak heights may be a more robust measure of intensity when baselines cannot be completely removed or in case of coelution, fronting or tailing.
Pub.: 28 Jun '16, Pinned: 28 Aug '17
Abstract: Several natural polyketides (PKs) have been associated with important pharmaceutical properties. Type III polyketide synthases (PKS) that generate aromatic polyketides have been studied extensively for their substrate promiscuity and product diversity. Stilbene synthase-like (STS) enzymes are unique in the type III PKS class as they possess a hydrogen bonding network, furnishing them with thioesterase-like properties, resulting in aldol condensation of the polyketide intermediates formed. Chalcone synthases (CHS), in contrast, lack this hydrogen-bonding network, resulting primarily in the Claisen condensation of the polyketide intermediates formed. We have attempted to expand the chemical space of this interesting class of compounds by creating structure-guided mutants of Vitis vinifera STS. Further, we have utilized a previously established workflow to quickly compare the wild-type reaction products to those generated by the mutants and identify novel PKs formed by using XCMS analysis of LC-MS and LC-MS/MS data. Based on this approach, we were able to generate 15 previously unreported PK molecules by exploring the substrate promiscuity of the wild-type enzyme and all mutants using unnatural substrates. These structures were specific to STSs and cannot be formed by their closely related CHS-like counterparts.
Pub.: 07 Jun '15, Pinned: 28 Aug '17
Abstract: A major goal of ecotoxicology is the prediction of adverse outcomes for populations from sensitive and early physiological responses. A snapshot of the physiological state of an organism can be provided by metabolic fingerprints. However, to inform chemical risk assessment, multivariate metabolic fingerprints need to be converted to readable end points suitable for effect estimation and comparison. The concentration- and time-dependent responsiveness of metabolic fingerprints to the PS-II inhibitor isoproturon was investigated by use of a Myriophyllum spicatum bioassay. Hydrophilic and lipophilic leaf extracts were analyzed with gas chromatography-mass spectrometry (GC-MS) and preprocessed with XCMS. Metabolic changes were aggregated in the quantitative metabolic effect level index (MELI), allowing effect estimation from Hill-based concentration-response models. Hereby, the most sensitive response on the concentration scale was revealed by the hydrophilic MELI, followed by photosynthetic efficiency and, 1 order of magnitude higher, by the lipophilic MELI and shoot length change. In the hydrophilic MELI, 50% change compares to 30% inhibition of photosynthetic efficiency and 10% inhibition of dry weight change, indicating effect development on different response levels. In conclusion, aggregated metabolic fingerprints provide quantitative estimates and span a broad response spectrum, potentially valuable for establishing adverse outcome pathways of chemicals in environmental risk assessment.
Pub.: 29 May '15, Pinned: 28 Aug '17
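The Hill-based concentration-response models mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common two-parameter Hill form (response rising from 0 to a maximum `top`); the function names, parameter names, and example values are hypothetical and are not taken from the paper.

```python
def hill_response(conc, ec50, hill, top=1.0):
    """Response at concentration `conc` under a Hill model.

    Assumed form: top * c^h / (EC50^h + c^h), with zero response at c <= 0.
    """
    if conc <= 0:
        return 0.0
    return top * conc ** hill / (ec50 ** hill + conc ** hill)


def effect_concentration(fraction, ec50, hill):
    """Concentration producing `fraction` (0 < fraction < 1) of the maximum response.

    Obtained by inverting the Hill equation above.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return ec50 * (fraction / (1.0 - fraction)) ** (1.0 / hill)


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    ec50, hill = 2.0, 1.5
    for x in (0.1, 0.3, 0.5):
        print(f"EC{int(x * 100)} = {effect_concentration(x, ec50, hill):.3f}")
```

Inverting the fitted curve in this way is what allows different end points (for example a 10%, 30%, or 50% effect) to be placed on a shared concentration scale.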
Abstract: Tandem mass spectral library search (MS/MS) is the fastest way to correctly annotate MS/MS spectra from screening small molecules in fields such as environmental analysis, drug screening, lipid analysis, and metabolomics. The confidence in MS/MS-based annotation of chemical structures is impacted by instrumental settings and requirements, data acquisition modes including data-dependent and data-independent methods, library scoring algorithms, as well as post-curation steps. We critically discuss parameters that influence search results, such as mass accuracy, precursor ion isolation width, intensity thresholds, centroiding algorithms, and acquisition speed. A range of publicly and commercially available MS/MS databases such as NIST, MassBank, MoNA, LipidBlast, Wiley MSforID, and METLIN are surveyed. In addition, software tools including NIST MS Search, MS-DIAL, Mass Frontier, SmileMS, Mass++, and XCMS(2) to perform fast MS/MS search are discussed. MS/MS scoring algorithms and challenges during compound annotation are reviewed. Advanced methods such as the in silico generation of tandem mass spectra using quantum chemistry and machine learning methods are covered. Community efforts for curation and sharing of tandem mass spectra that will allow for faster distribution of scientific discoveries are discussed.
Pub.: 25 Apr '17, Pinned: 28 Aug '17
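To make the idea of an MS/MS library scoring algorithm concrete, here is a minimal sketch of a cosine (normalized dot-product) score between two centroided spectra. It is a generic illustration, not the scoring used by any of the tools or libraries named in the abstract above; the greedy peak matching, the `mz_tol` default, and the function name are assumptions.

```python
import math


def cosine_score(query, reference, mz_tol=0.01):
    """Cosine similarity between two spectra given as lists of (m/z, intensity).

    Each query peak is greedily paired with the closest unused reference peak
    within `mz_tol` (an assumed absolute tolerance in m/z units).
    """
    used = set()
    dot = 0.0
    for q_mz, q_int in query:
        best_j, best_delta = None, mz_tol
        for j, (r_mz, _) in enumerate(reference):
            delta = abs(q_mz - r_mz)
            if j not in used and delta <= best_delta:
                best_j, best_delta = j, delta
        if best_j is not None:
            used.add(best_j)
            dot += q_int * reference[best_j][1]

    norm_q = math.sqrt(sum(i * i for _, i in query))
    norm_r = math.sqrt(sum(i * i for _, i in reference))
    if norm_q == 0.0 or norm_r == 0.0:
        return 0.0
    return dot / (norm_q * norm_r)


if __name__ == "__main__":
    # Made-up spectra, for illustration only.
    query = [(100.00, 50.0), (150.05, 100.0), (200.10, 20.0)]
    library_entry = [(100.00, 45.0), (150.05, 100.0), (250.00, 10.0)]
    print(round(cosine_score(query, library_entry), 3))
```

The tolerance illustrates why mass accuracy matters for search results: widening `mz_tol` pairs more peaks and raises scores, at the cost of more spurious matches.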
Abstract: XCMS and MZmine 2 are two widely used software packages for preprocessing untargeted LC/MS metabolomics data. Both construct extracted ion chromatograms (EICs) and detect peaks from the EICs, the first two steps in the data preprocessing workflow. While both packages have performed admirably in peak picking, they also detect a problematic number of false positive EIC peaks and can also fail to detect real EIC peaks. The former and latter translate downstream into spurious and missing compounds, and present significant limitations with most existing software packages that preprocess untargeted mass spectrometry metabolomics data. We seek to understand the specific reasons why XCMS and MZmine 2 find the false positive EIC peaks that they do, and in what ways they fail to detect real compounds. We investigate differences of EIC construction methods in XCMS and MZmine 2 and find several problems in the XCMS centWave peak detection algorithm which we show are partly responsible for the false positive and false negative compound identifications. In addition, we find a problem with MZmine 2's use of centWave. We hope that a detailed understanding of the XCMS and MZmine 2 algorithms will allow users to work with them more effectively, and will also help with future algorithmic development.
Pub.: 29 Jul '17, Pinned: 28 Aug '17