
The Beauty of Proteomics [Invited]

Cover art by Julie Newdoll for MCP April issue.





The human proteome project: Current state and future direction [Invited]

After successful completion of the Human Genome Project (HGP), HUPO has officially launched a global Human Proteome Project (HPP), which is designed to map the entire human protein set. Given that about 30% of the 20,300 protein gene products remain undisclosed, a systematic global effort is necessary to achieve this goal with respect to protein abundance, distribution, subcellular localization, interaction with other biomolecules, and functions at specific time points. As a general experimental strategy, HPP groups employ three working pillars: mass spectrometry, antibody capture, and bioinformatics tools and knowledge bases. The HPP participants will take advantage of the output and cross-analyses from the ongoing HUPO initiatives and a chromosome-based protein mapping strategy, termed C-HPP, in which many national teams are currently engaged. In addition, numerous biologically-driven projects will be stimulated and facilitated by the HPP. Timely planning with proper governance of the HPP will deliver a protein parts list, reagents and tools for protein studies and analyses, and a stronger basis for personalized medicine. HUPO urges each national research funding agency and the scientific community at large to identify their preferred pathways to participate in aspects of this highly promising project in an HPP consortium of funders and investigators.





Fourier transform mass spectrometry [Invited]

This article provides an introduction to Fourier transform-based mass spectrometry (FTMS). The key performance characteristics of FTMS, mass accuracy and resolution, are presented in view of how they affect the interpretation of measurements in proteomic applications. The theory and principles of operation of two types of mass analyzer, Fourier transform ion cyclotron resonance and Orbitrap, are described. Major benefits as well as limitations of FTMS technology are discussed in the context of practical sample analysis, and illustrated with examples included as figures in this text and in the accompanying slide set. Comparisons highlighting the performance differences between the two mass analyzers are made where deemed useful in helping users choose the most appropriate technology for their application. Recent developments of these high-performing mass spectrometers are mentioned to provide a future outlook.
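
The two performance characteristics named in this abstract have standard definitions that are easy to compute. The following is a minimal sketch, not taken from the article: it assumes mass accuracy is reported as a relative error in parts per million and resolving power as m/z divided by the peak width at half maximum (FWHM). The function names are hypothetical.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million (ppm)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def resolving_power(mz: float, fwhm: float) -> float:
    """Resolving power, assuming the FWHM definition: m/z over peak width."""
    return mz / fwhm


if __name__ == "__main__":
    # Illustrative values only, not measurements from the article.
    print(ppm_error(500.0005, 500.0))
    print(resolving_power(400.0, 0.004))
```

Under these definitions, a peak measured 0.0005 m/z units away from a theoretical value of 500 is off by about 1 ppm, and a peak at m/z 400 that is 0.004 wide at half maximum corresponds to a resolving power of about 100,000.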





Principles of electrospray ionization [Biophysical Methods]

Electrospray ionization is today the most widely used ionization technique in chemical and biochemical analysis. Interfaced with a mass spectrometer, it allows the molecular composition of liquid samples to be investigated. A large variety of chemical substances can be ionized with electrospray. There is no limitation in mass, which enables even the investigation of large non-covalent protein complexes. Its high ionization efficiency has profoundly changed biomolecular sciences because proteins can be identified and quantified from trace amounts in a high-throughput fashion. This review article focuses mainly on the exploration of the underlying ionization mechanism. Some ionization characteristics related to this mechanism are discussed. Typical spectra of peptides, proteins and non-covalent complexes are shown, and the quantitative character of spectra is highlighted. Finally, the possibilities and limitations in measuring the association constant of bivalent non-covalent complexes are described.





The ProteoRed MIAPE web toolkit: A user-friendly framework to connect and share proteomics standards [Technology]

The development of the HUPO-PSI's (Proteomics Standards Initiative) standard data formats and MIAPE (Minimum Information About a Proteomics Experiment) guidelines should improve proteomics data sharing within the scientific community. Proteomics journals have encouraged the use of these standards and guidelines to improve the quality of experimental reporting and ease the evaluation and publication of manuscripts. However, there is an evident lack of bioinformatics tools specifically designed to create and edit standard file formats and reports, or embed them within proteomics workflows. In this article, we describe a new web-based software suite (the ProteoRed MIAPE web toolkit) that performs several complementary roles related to proteomic data standards. Firstly, it can verify that reports fulfill the minimum information requirements of the corresponding MIAPE modules, highlighting inconsistencies or missing information. Secondly, the toolkit can convert several XML-based data standards directly into human-readable MIAPE reports stored within the ProteoRed MIAPE repository. Finally, it can also perform the reverse operation, allowing users to export MIAPE reports into XML files for computational processing, data sharing or public database submission. The toolkit is thus the first application capable of automatically linking the PSI's MIAPE modules with the corresponding XML data exchange standards, enabling bidirectional conversions. This toolkit is freely available at http://www.proteored.org/MIAPE/.





Bayesian Proteoform Modeling Improves Protein Quantification of Global Proteomic Measurements [Technology]

As the capability of mass spectrometry-based proteomics has matured, tens of thousands of peptides can be measured simultaneously, which has the benefit of offering a systems view of protein expression. However, a major challenge is that with an increase in throughput, estimating protein quantities from the native measured peptides has become a computational task. A limitation of existing computationally-driven protein quantification methods is that most ignore protein variation, such as alternate splicing of the RNA transcript and post-translational modifications or other possible proteoforms, which will affect a significant fraction of the proteome. The consequence of this assumption is that statistical inference at the protein level, and consequently downstream analyses, such as network and pathway modeling, have only limited power for biomarker discovery. Here, we describe a Bayesian model (BP-Quant) that uses statistically derived peptide signatures to identify peptides that are outside the dominant pattern, or the existence of multiple over-expressed patterns, to improve relative protein abundance estimates. It is a research-driven approach that utilizes the objectives of the experiment, defined in the context of a standard statistical hypothesis, to identify a set of peptides exhibiting similar statistical behavior relating to a protein. This approach infers that changes in relative protein abundance can be used as a surrogate for changes in function, without necessarily taking into account the effect of differential post-translational modifications, processing, or splicing in altering protein function. We verify the approach using a dilution study from mouse plasma samples and demonstrate that BP-Quant achieves accuracy similar to that of the current state-of-the-art methods at proteoform identification, with significantly better specificity. BP-Quant is available as MATLAB® and R packages at https://github.com/PNNL-Comp-Mass-Spec/BP-Quant.





The Proteomics of Networks and Pathways: A Movie is Worth a Thousand Pictures [Editorial]






Quantitative profiling of protein tyrosine kinases in human cancer cell lines by multiplexed parallel reaction monitoring assays [Technology]

Protein tyrosine kinases (PTKs) play key roles in cellular signal transduction, cell cycle regulation, cell division, and cell differentiation. Dysregulation of PTK-activated pathways, often by receptor overexpression, gene amplification, or genetic mutation, is a causal factor underlying numerous cancers. In this study, we have developed a parallel reaction monitoring (PRM)-based assay for quantitative profiling of 83 PTKs. The assay detects 308 proteotypic peptides from 54 receptor tyrosine kinases and 29 nonreceptor tyrosine kinases in a single run. Quantitative comparisons were based on the labeled reference peptide method. We implemented the assay in four cell models: 1) a comparison of proliferating versus epidermal growth factor (EGF)-stimulated A431 cells, 2) a comparison of SW480Null (mutant APC) and SW480APC (APC restored) colon tumor cell lines, 3) a comparison of 10 colorectal cancer cell lines with different genomic abnormalities, and 4) a comparison of lung cancer cell lines with either susceptibility (11-18) or acquired resistance (11-18R) to the epidermal growth factor receptor tyrosine kinase inhibitor erlotinib. We observed distinct PTK expression changes that were induced by stimuli, genomic features or drug resistance, which were consistent with previous reports. However, most of the measured expression differences were novel observations. For example, acquired resistance to erlotinib in the 11-18 cell model was associated not only with previously reported upregulation of MET, but also with upregulation of FLK2 and downregulation of LYN and PTK7. Immunoblot analyses and shotgun proteomics data were highly consistent with PRM data. Multiplexed PRM assays provide a targeted, systems-level profiling approach to evaluate cancer-related proteotypes and adaptations. Data are available through ProteomeXchange under accession PXD002706.





WITHDRAWN: Quantitative mass spectrometry analysis of PD-L1 protein expression, N-glycosylation and expression stoichiometry with PD-1 and PD-L2 in human melanoma [Research]

This article has been withdrawn by the authors. We discovered an error after this manuscript was published as a Paper in Press. Specifically, we learned that the structures of glycans presented for the PD-L1 peptide were drawn and labeled incorrectly. We wish to withdraw this article and submit a corrected version for review.





Translating Divergent Environmental Stresses into a Common Proteome Response through Hik33 in a Model Cyanobacterium [Research]

The histidine kinase Hik33 plays important roles in mediating the cyanobacterial response to divergent types of abiotic stresses, including cold, salt, high light (HL), and osmotic stresses. However, how these functions are regulated by Hik33 remains to be addressed. Using a hik33-deficient strain (hik33) of Synechocystis sp. PCC 6803 (Synechocystis) and quantitative proteomics, we found that Hik33 depletion induces differential protein expression highly similar to that induced by divergent types of stresses. This typically includes downregulation of proteins in photosynthesis and carbon assimilation that are necessary for cell propagation, and upregulation of heat shock proteins, chaperones, and proteases that are important for cell survival. This observation indicates that depletion of Hik33 alone mimics divergent types of abiotic stresses, and that Hik33 could be important for preventing an abnormal stress response under normal conditions. Moreover, we found that the majority of proteins of plasmid origin were significantly upregulated in hik33, though their biological significance remains to be addressed. Together, the systematically characterized Hik33-regulated cyanobacterial proteome, which is largely involved in stress responses, builds the molecular basis for Hik33 as a general regulator of stress responses.





WITHDRAWN: Heralds of parallel MS: Data-independent acquisition surpassing sequential identification of data dependent acquisition in proteomics [Research]

This article has been withdrawn by the authors. This article did not comply with the editorial guidelines of MCP. Specifically, single peptide based protein identifications of 9-19% were included in the analysis and discussed in the results and conclusions. We wish to withdraw this article and resubmit a clarified, corrected manuscript for review.





Recent Advances in Analytical Approaches for Glycan and Glycopeptide Quantitation [Review]

Growing implications of glycosylation in physiological occurrences and human disease have prompted intensive focus on revealing glycomic perturbations through absolute and relative quantification. Empowered by seminal methodologies and increasing capacity for detection, identification, and characterization, the past decade has provided a significant increase in the number of suitable strategies for glycan and glycopeptide quantification. Mass spectrometry-based strategies for glycomic quantitation have grown to include metabolic incorporation of stable isotopes, deposition of mass difference and mass defect isotopic labels, and isobaric chemical labeling, providing researchers with ample tools for accurate and robust quantitation. Beyond this, workflows have been designed to harness instrument capability for label-free quantification and numerous software packages have been developed to facilitate reliable spectrum scoring. In this review, we present and highlight the most recent advances in chemical labeling and associated techniques for glycan and glycopeptide quantification.





Recent advances in software tools for more generic and precise intact glycopeptide analysis [Review]

Intact glycopeptide identification has long been known as a key and challenging barrier to a comprehensive and accurate understanding of the role of glycosylation in an organism. Intact glycopeptide analysis is a blossoming field that has received increasing attention in recent years. Mass spectrometry (MS)-based strategies and related software tools are major drivers that have greatly facilitated the analysis of intact glycopeptides, particularly intact N-glycopeptides. This manuscript provides a systematic review of the intact glycopeptide identification process using mass spectrometry data generated in shotgun proteomic experiments, which typically focus on N-glycopeptide analysis. Particular attention is paid to the software tools developed in the last decade for the interpretation and quality control of glycopeptide spectra acquired using different MS strategies. The review also provides information about the characteristics and applications of these software tools, discusses their advantages and disadvantages, and concludes with a discussion of outstanding tools.





Meta-heterogeneity: evaluating and describing the diversity in glycosylation between sites on the same glycoprotein [Review]

Mass spectrometry-based glycoproteomics has gone through some incredible developments over the last few years. Technological advances in glycopeptide enrichment, fragmentation methods, and data analysis workflows have enabled the transition of glycoproteomics from a niche application, mainly focused on the characterization of isolated glycoproteins, to a mature technology capable of profiling thousands of intact glycopeptides at once. In addition to numerous biological discoveries catalyzed by the technology, we are also observing an increase in studies focusing on global protein glycosylation and the relationship between multiple glycosylation sites on the same protein. It has become apparent that just describing protein glycosylation in terms of micro- and macro-heterogeneity, respectively the variation and occupancy of glycans at a given site, is not sufficient to describe the observed interactions between sites. In this perspective we propose a new term, meta-heterogeneity, to describe a higher level of glycan regulation: the variation in glycosylation across multiple sites of a given protein. We provide literature examples of extensive meta-heterogeneity on relevant proteins such as antibodies, erythropoietin, myeloperoxidase and a number of serum and plasma proteins. Furthermore, we postulate on the possible biological reasons and causes behind the intriguing meta-heterogeneity observed in glycoproteins.





Calculating glycoprotein similarities from mass spectrometric data [Review]

Complex protein glycosylation occurs through biosynthetic steps in the secretory pathway that create macro- and microheterogeneity of structure and function.  Required for all life forms, glycosylation diversifies and adapts protein interactions with binding partners that underpin interactions at cell surfaces and pericellular and extracellular environments. Because these biological effects arise from heterogeneity of structure and function, it is necessary to measure their changes as part of the quest to understand nature.  Quite often, however, the assumption behind proteomics that post-translational modifications are discrete additions that can be modeled using the genome as a template does not apply to protein glycosylation.  Rather, it is necessary to quantify the glycosylation distribution at each glycosite and to aggregate this information into a population of mature glycoproteins that exist in a given biological system.  To date, mass spectrometric methods for assigning singly glycosylated peptides are well-established.  But it is necessary to quantify glycosylation heterogeneity accurately in order to gauge the alterations that occur during biological processes.  The task is to quantify the glycosylated peptide forms as accurately as possible and then apply appropriate bioinformatics algorithms to the calculation of micro- and macro-similarities.  In this review, we summarize current approaches for protein quantification as they apply to this glycoprotein similarity problem.





Peak Filtering, Peak Annotation, and Wildcard Search for Glycoproteomics [Research]

Glycopeptides in peptide or digested protein samples pose a number of analytical and bioinformatics challenges beyond those posed by unmodified peptides or peptides with smaller posttranslational modifications. Exact structural elucidation of glycans is generally beyond the capability of a single mass spectrometry experiment, so a reasonable level of identification for tandem mass spectrometry, adopted by several glycopeptide software tools, is that of peptide sequence and glycan composition, meaning the number of monosaccharides of each distinct mass, for example HexNAc(2)Hex(5) rather than man5. Even at this level, however, glycopeptide analysis poses challenges: finding glycopeptide spectra when they are a tiny fraction of the total spectra; assigning spectra with unanticipated glycans that are not in the initial glycan database; and finding, scoring, and labeling diagnostic peaks in tandem mass spectra. Here we discuss recent improvements to Byonic, a glycoproteomics search program, that address these three issues. Byonic now supports filtering spectra by m/z peaks, so that the user can limit attention to spectra with diagnostic peaks, for example, at least two out of three of 204.087 for HexNAc, 274.092 for NeuAc (with water loss), and 366.139 for HexNAc-Hex, all within a set mass tolerance, for example, ± 0.01 Daltons. Also new is glycan "wildcard" search, which allows an unspecified mass within a user-set mass range to be applied to N- or O-linked glycans and enables assignment of spectra with unanticipated glycans. Finally, the next release of Byonic supports user-specified peak annotations from user-defined posttranslational modifications. We demonstrate the utility of these new software features by finding previously unrecognized glycopeptides in publicly available data, including glycosylated neuropeptides from rat brain.
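
The diagnostic-peak filter described in this abstract can be illustrated with a short sketch. This is not Byonic's implementation or API: the function and constant names below are hypothetical, and the only values taken from the abstract are the three diagnostic m/z values, the "at least two out of three" rule, and the ± 0.01 Da tolerance.

```python
from typing import Dict, Iterable

# Diagnostic oxonium-ion m/z values quoted in the abstract.
DIAGNOSTIC_MZ: Dict[str, float] = {
    "HexNAc": 204.087,
    "NeuAc (water loss)": 274.092,
    "HexNAc-Hex": 366.139,
}


def count_diagnostic_hits(
    peaks_mz: Iterable[float],
    targets: Dict[str, float] = DIAGNOSTIC_MZ,
    tol_da: float = 0.01,
) -> int:
    """Count how many target m/z values are matched by at least one peak."""
    peaks = list(peaks_mz)  # allow generators; we scan the peaks once per target
    return sum(
        any(abs(mz - target) <= tol_da for mz in peaks)
        for target in targets.values()
    )


def passes_filter(
    peaks_mz: Iterable[float], min_hits: int = 2, tol_da: float = 0.01
) -> bool:
    """Keep a spectrum if it shows at least `min_hits` diagnostic peaks."""
    return count_diagnostic_hits(peaks_mz, tol_da=tol_da) >= min_hits


if __name__ == "__main__":
    # Hypothetical peak lists, for illustration only.
    print(passes_filter([204.090, 366.135, 500.2]))  # two diagnostic peaks matched
    print(passes_filter([204.087, 300.0]))           # only one matched
```

Counting matched targets rather than matched peaks means that several peaks falling near the same diagnostic value still count as a single hit, which matches the "two out of three" wording.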





Developments in Mass Spectrometry for Glycosaminoglycan Analysis: A Review [Review]

This review covers recent developments in glycosaminoglycan (GAG) analysis via mass spectrometry (MS). GAGs participate in a variety of biological functions, including cellular communication, wound healing, and anticoagulation, and are important targets for structural characterization. GAGs exhibit a diverse range of structural features due to the variety of O- and N-sulfation modifications and uronic acid C-5 epimerization that can occur, making their analysis challenging. Mass spectrometry approaches to the structure assignment of GAGs have been widely investigated, and new methodologies remain the subject of development. Improvements in sample preparation, tandem MS techniques (MS/MS), on-line separations and automated analysis software have advanced the field of GAG analysis. These recent developments have led to remarkable improvements in the precision and time efficiency of the structural characterization of GAGs.





Methods for Enrichment and Assignment of N-Acetylglucosamine Modification Sites [Review]

O-GlcNAcylation, the addition of a single N-acetylglucosamine residue to serine and threonine residues of cytoplasmic, nuclear, or mitochondrial proteins, is a widespread regulatory post-translational modification. It is involved in response to nutritional status and stress and its dysregulation is associated with diseases ranging from Alzheimer’s to diabetes.  While the modification was first detected over thirty-five years ago, research into the function of O-GlcNAcylation has accelerated dramatically in the last ten years due to the development of new enrichment and mass spectrometry techniques that facilitate its analysis.  This article summarizes methods for O-GlcNAc enrichment, key mass spectrometry instrumentation advancements, particularly those that allow modification site localization, and software tools that allow analysis of data from O-GlcNAc modified peptides.





A Pragmatic Guide to Enrichment Strategies for Mass Spectrometry-based Glycoproteomics [Review]

Glycosylation is a prevalent, yet heterogeneous modification with a broad range of implications in molecular biology. This heterogeneity precludes enrichment strategies that can be universally beneficial for all glycan classes. Thus, choice of enrichment strategy has profound implications on experimental outcomes. Here we review common enrichment strategies used in modern mass spectrometry (MS)-based glycoproteomic experiments, including lectins and other affinity chromatographies, hydrophilic interaction chromatography (HILIC) and its derivatives, porous graphitic carbon (PGC), reversible and irreversible chemical coupling strategies, and chemical biology tools that often leverage bioorthogonal handles. Interest in glycoproteomics continues to surge as MS instrumentation and software improve, so this review aims to help equip researchers with necessary information to choose appropriate enrichment strategies that best complement these efforts.





Quantitative data independent acquisition glycoproteomics of sparkling wine [Research]

Sparkling wine is an alcoholic beverage enjoyed around the world. The sensory properties of sparkling wine depend on a complex interplay between the chemical and biochemical components in the final product. Glycoproteins have been linked to positive and negative qualities in sparkling wine, but the glycosylation profiles of sparkling wine have not been previously investigated in detail. We analysed the glyco/proteome of sparkling wines using protein- and glycopeptide-centric approaches. We developed an automated workflow that created ion libraries to analyse Sequential Window Acquisition of all THeoretical mass spectra (SWATH) Data Independent Acquisition (DIA) mass spectrometry data based on glycopeptides identified by Byonic. We applied our workflow to three pairs of experimental sparkling wines to assess the effects of aging on lees and of different yeast strains used in the Liqueur de Tirage for secondary fermentation. We found that aging a cuvée on lees for 24 months compared to 8 months led to a dramatic decrease in overall protein abundance and an enrichment in large glycans at specific sites in some proteins. Secondary fermentation of a Riesling wine with Saccharomyces cerevisiae yeast strain Siha4 produced more yeast proteins and glycoproteins than with S. cerevisiae yeast strain DV10. The abundance and glycosylation profiles of grape glycoproteins were also different between grape varieties. This work represents the first in-depth study into protein- and peptide-specific glycosylation in sparkling wines and describes a quantitative glycoproteomic SWATH/DIA workflow that is broadly applicable to other sample types.





Integrated glycoproteomics identifies a role of N-glycosylation and galectin-1 on myogenesis and muscle development [Research]

Many cell surface and secreted proteins are modified by the covalent addition of glycans that play an important role in the development of multicellular organisms. These glycan modifications enable communication between cells and the extracellular matrix via interactions with specific glycan-binding lectins and the regulation of receptor-mediated signaling. Aberrant protein glycosylation has been associated with the development of several muscular diseases, suggesting essential glycan- and lectin-mediated functions in myogenesis and muscle development, but our molecular understanding of the precise glycans, catalytic enzymes and lectins involved remains incomplete. Here, we quantified dynamic remodeling of the membrane-associated proteome during a time-course of myogenesis in cell culture. We observed widespread changes in the abundance of several important lectins and enzymes facilitating glycan biosynthesis. Glycomics-based quantification of released N-linked glycans confirmed remodeling of the glycome consistent with the regulation of glycosyltransferases and glycosidases responsible for their formation, including a previously unknown di-galactose-to-sialic acid switch, supporting a functional role of these glycoepitopes in myogenesis. Furthermore, dynamic quantitative glycoproteomic analysis with multiplexed stable isotope labelling and analysis of enriched glycopeptides with multiple fragmentation approaches identified glycoproteins modified by these regulated glycans, including several integrins and growth factor receptors. Myogenesis was also associated with the regulation of several lectins, most notably the up-regulation of galectin-1 (LGALS1). CRISPR/Cas9-mediated deletion of Lgals1 inhibited differentiation and myotube formation, suggesting an early functional role of galectin-1 in the myogenic program.
Importantly, similar changes in N-glycosylation and the up-regulation of galectin-1 during postnatal skeletal muscle development were observed in mice. Treatment of new-born mice with recombinant adeno-associated viruses to overexpress galectin-1 in the musculature resulted in enhanced muscle mass. Our data form a valuable resource to further understand the glycobiology of myogenesis and will aid the development of intervention strategies to promote healthy muscle development or regeneration.





CIITA-transduced glioblastoma cells uncover a rich repertoire of clinically relevant tumor-associated HLA-II antigens [Research]

CD4+ T cell responses are crucial for inducing and maintaining effective anti-cancer immunity, and the identification of human leukocyte antigen class II (HLA-II) cancer-specific epitopes is key to the development of potent cancer immunotherapies. In many tumor types, and especially in glioblastoma (GBM), HLA-II complexes are hardly ever naturally expressed. Hence, little is known about immunogenic HLA-II epitopes in GBM. With stable expression of the class II major histocompatibility complex transactivator (CIITA) coupled to a detailed and sensitive mass spectrometry-based immunopeptidomics analysis, we here uncovered a remarkable breadth of the HLA-ligandome in HROG02, HROG17 and RA GBM cell lines. The effect of CIITA expression on the induction of the HLA-II presentation machinery was striking in each of the three cell lines, and it was significantly greater than that of interferon gamma (IFNγ) treatment. In total, we identified 16,123 unique HLA-I peptides and 32,690 unique HLA-II peptides. In order to define the identified peptides as true HLA ligands, we carefully characterized their association with the different HLA allotypes. In addition, we identified 138 and 279 HLA-I and HLA-II ligands, respectively, most of which are novel in GBM, derived from known GBM-associated tumor antigens that have been used as source proteins for a variety of GBM vaccines. Our data further indicate that CIITA-expressing GBM cells acquired an antigen-presenting cell-like phenotype, as we found that they directly present external proteins as HLA-II ligands. Not only are CIITA-expressing GBM cells attractive models for antigen discovery endeavors, but such engineered cells also have great therapeutic potential through massive presentation of a diverse antigenic repertoire.





Glycomics, Glycoproteomics and Glycogenomics: an Inter-Taxa Evolutionary Perspective [Review]

Glycosylation is a highly diverse set of co- and post-translational modifications of proteins. For mammalian glycoproteins, glycosylation is often site-, tissue- and species-specific, and diversified by microheterogeneity. Multitudinous biochemical, cellular, physiological and organismic effects of their glycans have been revealed, either intrinsic to the carrier proteins or mediated by endogenous reader proteins with carbohydrate recognition domains. Furthermore, glycans frequently form the first line of access by or defense from foreign invaders, and new roles for nucleocytoplasmic glycosylation are blossoming. We now know enough to conclude that the same general principles apply in invertebrate animals and unicellular eukaryotes, different branches of which spawned the plants or fungi and animals. The two major driving forces for exploring the glycomes of invertebrates and protists are (i) to understand the biochemical basis of glycan-driven biology in these organisms, especially of pathogens, and (ii) to uncover the evolutionary relationships between glycans, their biosynthetic enzyme genes, and biological functions for new glycobiological insights. With an emphasis on emerging areas of protist glycobiology, here we offer an overview of glycan diversity and evolution, to promote future access to this treasure trove of glycobiological processes.





Chromatin proteomics to study epigenetics - challenges and opportunities [Review]

Regulation of gene expression is essential for the functioning of all eukaryotic organisms. Understanding gene expression regulation requires determining which proteins interact with regulatory elements in chromatin. Mass spectrometry-based analysis of chromatin has emerged as a powerful tool to identify proteins associated with gene regulation, as it allows studying protein function and protein complex formation in their in vivo chromatin-bound context. Total chromatin isolated from cells can be directly analysed using mass spectrometry or further fractionated into transcriptionally active and inactive chromatin prior to MS-based analysis. Newly formed chromatin that is assembled during DNA replication can also be specifically isolated and analysed. Furthermore, capturing specific chromatin domains facilitates the identification of previously unknown transcription factors interacting with these domains. Finally, in recent years, advances have been made towards identifying proteins that interact with a single genomic locus of interest. In this review, we highlight the power of chromatin proteomics approaches and how these provide complementary alternatives compared to conventional affinity purification methods. Furthermore, we discuss the biochemical challenges that should be addressed to consolidate and expand the role of chromatin proteomics as a key technology in the context of gene expression regulation and epigenetics research in health and disease.





Site-specific N-glycosylation Characterization of Recombinant SARS-CoV-2 Spike Proteins [Research]

The glycoprotein spike (S) on the surface of SARS-CoV-2 is a determinant for viral invasion and host immune response. Herein, we characterized the site-specific N-glycosylation of S protein at the level of intact glycopeptides. All 22 potential N-glycosites were identified in the S-protein protomer and were found to be preserved among the 753 SARS-CoV-2 genome sequences. The glycosites exhibited glycoform heterogeneity as expected for a human cell-expressed protein subunit. We identified masses that correspond to 157 N-glycans, primarily of the complex type. In contrast, the insect cell-expressed S protein contained 38 N-glycans, all of the high-mannose type. Our results revealed that the glycan types were largely determined by the differential processing of N-glycans between human and insect cells, regardless of the glycosites' location. Moreover, the N-glycan compositions were conserved among different sizes of subunits. Our study indicates that S protein N-glycosylation occurs regularly at each site, although the occupying N-glycans were diverse and heterogeneous. This N-glycosylation landscape and the differential N-glycan patterns among distinct host cells are expected to shed light on the infection mechanism and present a positive view for the development of vaccines and targeted drugs.




i

Systematic Proteome and Lysine Succinylome Analysis Reveals the Enhanced Cell Migration by Hyposuccinylation in Esophageal Squamous Cell Cancer [Research]

Esophageal squamous cell cancer (ESCC) is an aggressive malignancy with poor therapeutic outcomes. However, the alterations in proteins and post-translational modifications (PTMs) leading to the pathogenesis of ESCC remain unclear. Here, we provide a comprehensive characterization of the proteome, phosphorylome, lysine acetylome and succinylome for ESCC and matched control cells using a quantitative proteomic approach. We identify abnormal protein and PTM pathways, including significantly downregulated lysine succinylation sites in cancer cells. Focusing on hyposuccinylation, we reveal that this altered PTM is enriched on enzymes of metabolic pathways inextricably linked with cancer metabolism. Importantly, ESCC malignant behaviors such as cell migration are inhibited once the level of succinylation is restored in vitro or in vivo. This effect was further verified by mutations that disrupt succinylation sites in candidate proteins. In addition, we found that succinylation has a negative regulatory effect on histone methylation to promote cancer migration. Finally, hyposuccinylation is confirmed in primary ESCC specimens. Together, our findings demonstrate that lysine succinylation may alter ESCC metabolism and migration, providing new insights into the functional significance of PTMs in cancer biology.




i

N-glycomic signature of stage II colorectal cancer and its association with the tumor microenvironment [Research]

The choice of adjuvant chemotherapy in stage II colorectal cancer (CRC) is controversial, as many patients are cured by surgery alone and it is difficult to identify patients at high risk of recurrence of the disease. There is a need for better stratification of this group of patients. Mass spectrometry imaging could identify patients at risk. We report here the N-glycosylation signatures of the different cell populations in a group of stage II CRC tissue samples. The cancer cells, compared to normal epithelial cells, have increased levels of sialylation and high-mannose glycans, as well as decreased levels of fucosylation and highly branched N-glycans. At the interface between the cancer and its microenvironment, the cancer N-glycosylation signature appears to spread into the surrounding stroma at the invasive front of the tumor. This finding was more pronounced in patients with a worse outcome within this sample group.




i

Transcriptome and secretome analysis of intra-mammalian life-stages of the emerging helminth pathogen, Calicophoron daubneyi, reveals adaptation to a unique host environment [Research]

Paramphistomosis, caused by the rumen fluke, Calicophoron daubneyi, is a parasitic infection of ruminant livestock which has seen a rapid rise in prevalence throughout Western Europe in recent years. Following ingestion of metacercariae (parasite cysts) by the mammalian host, newly-excysted juveniles (NEJs) emerge and invade the duodenal submucosa, which causes significant pathology in heavy infections. The immature larvae then migrate upwards, along the gastrointestinal tract, and enter the rumen where they mature and begin to produce eggs. Despite their emergence, and sporadic outbreaks of acute disease, we know little about the molecular mechanisms used by C. daubneyi to establish infection, acquire nutrients and avoid the host immune response. Here, transcriptome analysis of four intra-mammalian life-cycle stages, integrated with secretome analysis of the NEJ and adult parasites (responsible for acute and chronic disease respectively), revealed how the expression and secretion of selected families of virulence factors and immunomodulators are regulated in accordance with fluke development and migration. Our data show that whilst a family of cathepsins B with varying S2 sub-site residues (indicating distinct substrate specificities) is differentially secreted by NEJs and adult flukes, cathepsins L and F are secreted in low abundance by NEJs only. We found that C. daubneyi has an expanded family of aspartic peptidases, which is up-regulated in adult worms, although they are underrepresented in the secretome. The most abundant proteins in adult fluke secretions were helminth defence molecules (HDMs) that likely establish an immune environment permissive to fluke survival and/or neutralise pathogen-associated molecular patterns (PAMPs) such as bacterial lipopolysaccharide in the microbiome-rich rumen. The distinct collection of molecules secreted by C. daubneyi allowed the development of the first coproantigen-based ELISA for paramphistomosis which, importantly, did not recognise antigens from other helminths commonly found as co-infections with rumen fluke.




i

Antibody binding epitope Mapping (AbMap) of hundred antibodies in a single run [Research]

Antibodies play essential roles in both diagnostics and therapeutics. Epitope mapping is essential to understand how an antibody works and to protect intellectual property. Given the millions of antibodies for which epitope information is lacking, there is a need for high-throughput epitope mapping. To address this, we developed a strategy, Antibody binding epitope Mapping (AbMap), by combining a phage-displayed peptide library with next-generation sequencing. Using AbMap, profiles of the peptides bound by 202 antibodies were determined in a single test, and linear epitopes were identified for >50% of the antibodies. Using spike protein (S1 and S2)-enriched antibodies from the convalescent serum of one COVID-19 patient as the input, both linear and conformational epitopes of spike protein-specific antibodies were identified. We defined the peptide-binding profile of an antibody as the Binding Capacity (BiC). Conceptually, the BiC could serve as a systematic and functional descriptor of any antibody. Requiring at least one order of magnitude less time and money to map linear epitopes than traditional technologies, AbMap allows for high-throughput epitope mapping and creates many possibilities.




i

Thermal proteome profiling in zebrafish reveals effects of napabucasin on retinoic acid metabolism [Research]

Thermal proteome profiling (TPP) allows for the unbiased detection of drug-target protein engagement in vivo. Traditionally, one cell type is used for TPP studies, with the risk of missing important differentially expressed target proteins. The use of whole organisms would circumvent this problem. Zebrafish embryos are amenable to such an approach. Here, we used TPP on whole zebrafish embryo lysate to identify protein targets of napabucasin, a compound that may affect Signal transducer and activator of transcription 3 (Stat3) signaling through an ill-understood mechanism. In zebrafish embryos, napabucasin induced developmental defects consistent with inhibition of Stat3 signaling. TPP showed no distinct shift in Stat3 upon napabucasin treatment, but effects were detected on the oxidoreductase, Pora, which might explain effects on Stat3 signaling. Interestingly, thermal stability of several aldehyde dehydrogenases (Aldhs) was affected. Moreover, napabucasin activated ALDH enzymatic activity in vitro. Aldhs have crucial roles in retinoic acid metabolism, and we functionally validated napabucasin-mediated activation of the retinoic acid pathway in zebrafish in vivo. We conclude that TPP in whole zebrafish embryo lysate is feasible and facilitates direct correlation of in vivo effects of small molecule drugs with their protein targets.




i

Blockade of High-Fat Diet Proteomic Phenotypes using Exercise as Prevention or Treatment [Technological Innovation and Resources]

The increasing consumption of high-fat foods combined with a lack of exercise is a major contributor to the burden of obesity in humans. Aerobic exercise such as running is known to provide metabolic benefits, but how the over-consumption of a high fat diet (HFD) and exercise interact is not well characterized at the molecular level. Here, we examined the plasma proteome in mice for the effects of aerobic exercise as both a treatment and a preventative regime for animals on either HFD or a healthy control diet. This analysis detected large changes in the plasma proteome induced by the HFD, such as increased abundance of SERPINA7 and ALDOB and decreased abundance of SERPINA1E and CFD (adipsin). Some of these changes were significantly reverted using exercise as a preventative measure, but not as a treatment regime. To determine whether the intensity or duration of exercise influenced the outcome, we compared high-intensity interval training (HIIT) and endurance running. Endurance running slightly out-performed HIIT exercise, but overall, both provided similar reversion in abundance of plasma proteins modulated by the high-fat diet including SERPINA7, APOE, SERPINA1E, and CFD. Finally, we compared the changes induced by over-consumption of HFD to previous data from mice fed an isocaloric high saturated fat (SFA) or polyunsaturated fat (PUFA) diet. This identified several common changes including increased APOC2 and APOE, but also highlighted changes specific for either over-consumption of HFD (ALDOB, SERPINA7, CFD), SFA-based diets (SERPINA1E), or PUFA-based diets (Haptoglobin - Hp). Together, these data highlight the importance of early intervention with exercise to revert HFD-induced phenotypes and suggest some of the molecular mechanisms leading to the changes in the plasma proteome generated by high fat diet consumption. Web-based interactive visualizations are provided for this dataset (larancelab.com/hfd-exercise), which give insight into diet and exercise phenotypic interactions on the plasma proteome.




i

The complexity and dynamics of the tissue glycoproteome associated with prostate cancer progression [Research]

The complexity and dynamics of the immensely heterogeneous glycoproteome of the prostate cancer (PCa) tumour micro-environment remain incompletely mapped, a knowledge gap that impedes our molecular-level understanding of the disease. To address this gap, we have used sensitive glycomics and glycoproteomics to map the protein-, cell- and tumour grade-specific N- and O-glycosylation in surgically-removed PCa tissues spanning five histological grades (n = 10/grade) and tissues from patients with benign prostatic hyperplasia (n = 5). Quantitative glycomics revealed PCa grade-specific alterations of the oligomannosidic-, paucimannosidic- and branched sialylated complex-type N-glycans, and dynamic remodelling of the sialylated core 1- and core 2-type O-glycome. Deep quantitative glycoproteomics identified ~7,400 unique N-glycopeptides from 500 N-glycoproteins and ~500 unique O-glycopeptides from nearly 200 O-glycoproteins. With reference to a recent Tissue and Blood Atlas, our data indicate that paucimannosidic glycans of the PCa tissues arise mainly from immune cell-derived glycoproteins. Further, the grade-specific PCa glycosylation arises primarily from dynamics in the cellular makeup of the PCa tumour microenvironment across grades involving increased oligomannosylation of prostate-derived glycoproteins and decreased bisecting GlcNAcylation of N-glycans carried by the extracellular matrix proteins. In addition, elevated expression of several oligosaccharyltransferase subunits and enhanced N-glycoprotein site occupancy were associated with PCa progression. Finally, correlations between the protein-specific glycosylation and PCa progression were observed, including increased site-specific core 2-type O-glycosylation of collagen VI. In conclusion, integrated glycomics and glycoproteomics have enabled new insight into the complexity and dynamics of the tissue glycoproteome associated with PCa progression, generating an important resource to explore the underpinning disease mechanisms.




i

Isolation of acetylated and unmodified protein N-terminal peptides by strong cation exchange chromatographic separation of TrypN-digested peptides [Technological Innovation and Resources]

We developed a simple and rapid method to enrich protein N-terminal peptides, in which the protease TrypN is first employed to generate protein N-terminal peptides without Lys or Arg and internal peptides with two positive charges at their N-termini, and then the N-terminal peptides with or without N-acetylation are separated from the internal peptides by strong cation exchange chromatography according to a retention model based on the charge/orientation of peptides. This approach was applied to 20 μg of human HEK293T cell lysate proteins to profile the N-terminal proteome. On average, 1,550 acetylated and 200 unmodified protein N-terminal peptides were successfully identified in a single LC/MS/MS run with less than 3% contamination with internal peptides, even when we accepted only canonical protein N-termini registered in the Swiss-Prot database. Since this method involves only two steps, protein digestion and chromatographic separation, without the need for tedious chemical reactions, it should be useful for comprehensive profiling of protein N-termini, including proteoforms with neo-N-termini.




i

Protein modification characteristics of the malaria parasite Plasmodium falciparum and the infected erythrocytes [Research]

Malaria elimination still depends on the development of novel tools that rely on a deep understanding of parasite biology. Proteins of all living cells undergo a myriad of posttranslational modifications (PTMs) that are critical to multifarious life processes. An extensive proteome-wide dissection revealed a fine PTM map of most proteins in both Plasmodium falciparum, the causative agent of severe malaria, and the infected red blood cells. More than two-thirds of proteins of the parasite and its host cell underwent extensive and dynamic modification throughout the erythrocytic developmental stage. PTMs critically modulate the virulence factors involved in the host-parasite interaction and pathogenesis. Furthermore, P. falciparum stabilized the supporting proteins of erythrocyte origin by selective de-modification. Collectively, our multi-omic analyses, apart from furthering a deep understanding of the systems biology of P. falciparum and malaria pathogenesis, provide a valuable resource for mining new antimalarial targets.




i

On the robustness of graph-based clustering to random network alterations [Research]

Biological functions emerge from complex and dynamic networks of protein-protein interactions. Because these protein-protein interaction networks, or interactomes, represent pairwise connections within a hierarchically organized system, it is often useful to identify higher-order associations embedded within them, such as multi-member protein complexes. Graph-based clustering techniques are widely used to accomplish this goal, and dozens of field-specific and general clustering algorithms exist. However, interactomes can be prone to errors, especially when inferred from high-throughput biochemical assays. Therefore, robustness to network-level noise is an important criterion for any clustering algorithm that aims to generate robust, reproducible clusters. Here, we tested the robustness of a range of graph-based clustering algorithms in the presence of noise, including algorithms common across domains and those specific to protein networks. Strikingly, we found that all of the clustering algorithms tested here markedly amplified noise within the underlying protein interaction network. Randomly rewiring only 1% of network edges yielded more than a 50% change in clustering results, indicating that clustering markedly amplified network-level noise. Moreover, we found the impact of network noise on individual clusters was not uniform: some clusters were consistently robust to injected noise while others were not. To assist in assessing this, we developed the clust.perturb R package and Shiny web application to measure the reproducibility of clusters by randomly perturbing the network. We show that clust.perturb results are predictive of real-world cluster stability: poorly reproducible clusters as identified by clust.perturb are significantly less likely to be reclustered across experiments. We conclude that graph-based clustering amplifies noise in protein interaction networks, but quantifying the robustness of a cluster to network noise can separate stable protein complexes from spurious associations.
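
The perturbation idea described in this abstract can be illustrated with a minimal Python sketch. It is not the clust.perturb implementation: the function names, the use of connected components as a stand-in clustering algorithm, and the choice of a best-match Jaccard index as the reproducibility score are all assumptions made for illustration.

```python
import random


def connected_components(nodes, edges):
    """Stand-in clustering (assumption): each connected component is one cluster."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, clusters = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        clusters.append(frozenset(comp))
    return clusters


def rewire(nodes, edges, fraction, rng):
    """Replace a fraction of edges with random node pairs that are not yet edges.

    Assumes the graph is sparse enough that a free node pair always exists.
    """
    edge_set = {frozenset(e) for e in edges}
    n_swap = round(fraction * len(edge_set))
    for old in rng.sample(sorted(edge_set, key=sorted), n_swap):
        edge_set.discard(old)
        while True:
            new = frozenset(rng.sample(nodes, 2))
            if new not in edge_set and new != old:
                edge_set.add(new)
                break
    return [tuple(sorted(e)) for e in edge_set]


def jaccard(a, b):
    """Jaccard index of two node sets."""
    return len(a & b) / len(a | b)


def reproducibility(nodes, edges, cluster_fn, fraction=0.01, n_iter=100, seed=0):
    """Mean best-match Jaccard of each original cluster over noisy re-clusterings."""
    rng = random.Random(seed)
    original = cluster_fn(nodes, edges)
    totals = {c: 0.0 for c in original}
    for _ in range(n_iter):
        noisy = cluster_fn(nodes, rewire(nodes, edges, fraction, rng))
        for c in original:
            totals[c] += max(jaccard(c, d) for d in noisy)
    return {c: t / n_iter for c, t in totals.items()}


if __name__ == "__main__":
    # Toy network (hypothetical): two triangles, i.e. two obvious "complexes".
    nodes = list(range(6))
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
    scores = reproducibility(nodes, edges, connected_components, fraction=0.2)
    for cluster, score in scores.items():
        print(sorted(cluster), round(score, 2))
```

A score near 1 means the cluster reappears almost unchanged after rewiring, while a low score flags a cluster that depends on a few specific edges. Any function with the same `(nodes, edges)` signature can be passed as `cluster_fn` in place of the connected-components stand-in.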




i

Peptidomics-driven strategy reveals peptides and predicted proteases associated with oral cancer prognosis [Research]

Protease activity has been associated with pathological processes that can lead to cancer development and progression. However, understanding the pathological imbalance in proteolysis is challenging since changes can occur simultaneously at the protease, inhibitor and substrate levels. Here, we present a pipeline that combines peptidomics, proteomics and peptidase predictions for studying proteolytic events in the saliva of seventy-nine patients and their association with oral squamous cell carcinoma (OSCC) prognosis. Our findings revealed differences in the saliva peptidome of patients with (pN+) or without (pN0) lymph node metastasis and delivered a panel of ten endogenous peptides correlated with poor prognostic factors plus five molecules able to classify pN0 and pN+ patients (ROC-AUC>0.85). In addition, endo- and exopeptidases putatively implicated in the processing of differential peptides were investigated using cancer tissue gene expression data from public repositories, reinforcing their association with poorer survival rates and prognosis in oral cancer. The dynamics of OSCC-related proteolysis were further explored via the proteomic profiling of saliva. This revealed that peptidase/endopeptidase inhibitors exhibited reduced levels in the saliva of pN+ patients, as confirmed by SRM-MS, whilst minor changes were detected in the level of saliva proteases. Taken together, our results indicate that proteolytic activity is accentuated in the saliva of OSCC patients with lymph node metastasis and that this is modulated, at least in part, by reduced levels of salivary peptidase inhibitors. Therefore, this integrated pipeline provided better comprehension and discovery of molecular features with implications for oral cancer metastasis prognosis.




i

Proteomic identification of Coxiella burnetii effector proteins targeted to the host cell mitochondria during infection [Research]

Modulation of the host cell is integral to the survival and replication of microbial pathogens. Several intracellular bacterial pathogens deliver bacterial proteins, termed ‘effector proteins’ into the host cell during infection by sophisticated protein translocation systems, which manipulate cellular processes and functions. The functional contribution of individual effectors is poorly characterised, particularly in intracellular bacterial pathogens with large effector protein repertoires. Technical caveats have limited the capacity to study these proteins during a native infection, with many effector proteins having only been demonstrated to be translocated during over-expression of tagged versions. Here we developed a novel strategy to examine effector proteins in the context of infection. We coupled a broad, unbiased proteomics-based screen with organelle purification to study the host-pathogen interactions occurring between the host cell mitochondrion and the Gram-negative, Q fever pathogen Coxiella burnetii. We identify 4 novel mitochondrially-targeted C. burnetii effector proteins, renamed Mitochondrial Coxiella effector protein (Mce) B to E. Examination of the subcellular localisation of ectopically expressed proteins confirmed their mitochondrial localisation, demonstrating the robustness of our approach. Subsequent biochemical analysis and affinity enrichment proteomics of one of these effector proteins, MceC, revealed the protein localises to the inner membrane and can interact with components of the mitochondrial quality control machinery. Our study adapts high-sensitivity proteomics to study intracellular host-pathogen interactions, providing a robust strategy to examine the sub-cellular localisation of effector proteins during native infection. This approach could be applied to a range of pathogens and host cell compartments to provide a rich map of effector dynamics throughout infection.




i

Accelerating the field of epigenetic histone modification through mass spectrometry-based approaches [Review]

Histone post-translational modifications (PTMs) are one of the main mechanisms of epigenetic regulation. Dysregulation of histone PTMs leads to many human diseases, such as cancer. Due to its high throughput, accuracy, and flexibility, mass spectrometry (MS) has emerged as a powerful tool in the epigenetic histone modification field, allowing the comprehensive and unbiased analysis of histone PTMs and chromatin-associated factors. Coupled with various techniques from molecular biology, biochemistry, chemical biology and biophysics, MS has been employed to characterize distinct aspects of histone PTMs in the epigenetic regulation of chromatin functions. In this review, we describe advancements in the field of MS that have facilitated the analysis of histone PTMs and chromatin biology.




i

Quantitative proteomics reveal neuron projection development genes ARF4, KIF5B and RAB8A associated with Hirschsprung disease [Research]

Hirschsprung disease (HSCR) is a heterogeneous neurocristopathy characterized by the absence of the enteric ganglia along a variable length of the intestine. Genetic defects play a major role in the pathogenesis of HSCR, yet family studies of pathogenic variants in all the known genes (loci) demonstrate only incomplete penetrance and variable expressivity for unknown reasons. Here, we applied large-scale, quantitative proteomics to human colon tissues from 21 patients using the iTRAQ method followed by bioinformatics analysis. Selected findings were confirmed by parallel reaction monitoring (PRM) verification. Finally, differentially expressed proteins of interest were confirmed by western blot (WB). A total of 5341 proteins in human colon tissues were identified. Among them, 664 proteins with >1.2-fold difference were identified in 6 groups: groups A1 and A2 pooled protein from the ganglionic and aganglionic colon of male, long-segment HSCR patients (L-HSCR, n=7); groups B1 and B2 pooled protein from the ganglionic and aganglionic colon of male, short-segment HSCR patients (S-HSCR, n=7); and groups C1 and C2 pooled protein from the ganglionic and aganglionic colon of female, S-HSCR patients (n=7). Based on these analyses, 49 proteins from 5 pathways were selected for PRM verification, including ribosome, endocytosis, spliceosome, oxidative phosphorylation and cell adhesion. The downregulation of three neuron projection development genes, ARF4, KIF5B and RAB8A, in the aganglionic part of the colon was verified in 15 paired colon samples using WB. The findings of this study will shed new light on the pathogenesis of HSCR and facilitate the development of therapeutic targets.




i

Thyroglobulin interactome profiling defines altered proteostasis topology associated with thyroid dyshormonogenesis [Research]

Thyroglobulin (Tg) is a secreted iodoglycoprotein serving as the precursor for T3 and T4 hormones. Many characterized Tg gene mutations produce secretion-defective variants resulting in congenital hypothyroidism (CH). Tg processing and secretion are controlled by extensive interactions with chaperone, trafficking, and degradation factors comprising the secretory proteostasis network. While dependencies on individual proteostasis network components are known, the integration of proteostasis pathways mediating Tg protein quality control and the molecular basis of mutant Tg misprocessing remain poorly understood. We employ a multiplexed quantitative affinity purification–mass spectrometry approach to define the Tg proteostasis interactome and changes between WT and several CH-variants. Mutant Tg processing is associated with common imbalances in proteostasis engagement including increased chaperoning, oxidative folding, and engagement by targeting factors for ER-associated degradation (ERAD). Furthermore, we reveal mutation-specific changes in engagement with N-glycosylation components, suggesting distinct requirements for one Tg variant on dual engagement of both oligosaccharyltransferase complex isoforms for degradation. Modulating dysregulated proteostasis components and pathways may serve as a therapeutic strategy to restore Tg secretion and thyroid hormone biosynthesis.




i

Proteome analysis reveals a significant host-specific response in Rhizobium leguminosarum bv viciae endosymbiotic cells [Research]

The Rhizobium-legume symbiosis is a beneficial interaction in which the bacterium converts atmospheric nitrogen into ammonia and delivers it to the plant in exchange for carbon compounds. This symbiosis implies the adaptation of bacteria to live inside host plant cells. In this work we apply RP-LC-MS/MS and iTRAQ techniques to study the proteomic profile of endosymbiotic cells (bacteroids) induced by Rhizobium leguminosarum bv viciae strain UPM791 in legume nodules. Nitrogenase subunits, tricarboxylic acid cycle enzymes, and stress response proteins are amongst the most abundant from over one thousand rhizobial proteins identified in pea (Pisum sativum) bacteroids. Comparative analysis of bacteroids induced in pea and in lentil (Lens culinaris) nodules revealed the existence of a significant host-specific differential response affecting dozens of bacterial proteins, including stress-related proteins, transcriptional regulators, and proteins involved in the carbon and nitrogen metabolisms. A mutant affected in one of these proteins, homologous to a GntR-like transcriptional regulator, showed a symbiotic performance significantly impaired in symbiosis with pea, but not with lentil plants. Analysis of the proteomes of bacteroids isolated from both hosts also revealed the presence of different sets of plant-derived nodule-specific cysteine rich (NCR) peptides, indicating that the endosymbiotic bacteria find a host-specific cocktail of chemical stressors inside the nodule. By studying variations of the bacterial response to different plant cell environments we will be able to identify specific limitations imposed by the host that might give us clues for the improvement of rhizobial performance.




i

Imaging Mass Spectrometry and Lectin Analysis of N-linked Glycans in Carbohydrate Antigen Defined Pancreatic Cancer Tissues [Research]

The early detection of pancreatic ductal adenocarcinoma (PDAC) is a complex clinical obstacle yet is key to improving the overall likelihood of patient survival. Current and prospective carbohydrate biomarkers CA19-9 and sTRA are sufficient for surveilling disease progression yet are not approved for delineating PDAC from other abdominal cancers and non-cancerous pancreatic pathologies. To further understand these glycan epitopes, an imaging mass spectrometry (IMS) approach was utilized to assess the N-glycome of the human pancreas and pancreatic cancer in a cohort of PDAC patients represented by tissue microarrays and whole tissue sections. Orthogonally, these same tissues were characterized by multi-round immunofluorescence which defined expression of CA19-9 and sTRA as well as other lectins towards carbohydrate epitopes with the potential to improve PDAC diagnosis. These analyses revealed distinct differences not only in N-glycan spatial localization across both healthy and diseased tissues but importantly between different biomarker-categorized tissue samples. Unique sulfated bi-antennary N-glycans were detected specifically in normal pancreatic islets. N-glycans from CA19-9 expressing tissues tended to be bi-, tri- and tetra-antennary structures with both core and terminal fucose residues and bisecting N-acetylglucosamines. These N-glycans were detected at lower abundance in sTRA-expressing tumor tissues, which favored tri- and tetra-antennary structures with polylactosamine extensions. Increased sialylation of N-glycans was detected in all tumor tissues. A candidate new biomarker derived from IMS was further explored by fluorescence staining with selected lectins on the same tissues. The lectins confirmed the expression of the epitopes in cancer cells and revealed different tumor-associated staining patterns between glycans with bisecting GlcNAc and those with terminal GlcNAc. Thus, the combination of lectin-IHC and IMS techniques produces more complete information for tumor classification than the individual analyses alone. These findings potentiate the development of early assessment technologies to rapidly and specifically identify PDAC in the clinic that may directly impact patient outcomes.




i

Proteogenomic characterization of the pathogenic fungus Aspergillus flavus reveals novel genes involved in aflatoxin production [Research]

Aspergillus flavus (A. flavus), a pathogenic fungus, can produce carcinogenic and toxic aflatoxins that are a serious agricultural and medical threat worldwide. Attempts to decipher the aflatoxin biosynthetic pathway have been hampered by the lack of a high-quality genome annotation for A. flavus. To address this gap, we performed a comprehensive proteogenomic analysis using high-accuracy mass spectrometry data for this pathogen. The resulting high-quality dataset confirmed the translation of 8,724 previously-predicted genes, and identified 732 novel proteins, 269 splice variants, 447 single amino acid variants, and 188 revised genes. A subset of novel proteins was experimentally validated by RT-PCR and synthetic peptides. Further functional annotation suggested that a number of the identified novel proteins may play roles in aflatoxin biosynthesis and stress responses in A. flavus. This comprehensive strategy also identified a wide range of post-translational modifications (PTMs), including 3,461 modification sites from 1,765 proteins. Functional analysis suggested the involvement of these modified proteins in the regulation of cellular metabolic and aflatoxin biosynthetic pathways. Together, we provide a high-quality annotation of the A. flavus genome and reveal novel insights into the mechanisms of aflatoxin production and pathogenicity in this pathogen.




i

Proteome Turnover in the Spotlight: Approaches, Applications & Perspectives [Review]

In all cells, proteins are continuously synthesized and degraded in order to maintain protein homeostasis and modify gene expression levels in response to stimuli. Collectively, the processes of protein synthesis and degradation are referred to as protein turnover. At steady state, protein turnover is constant to maintain protein homeostasis, but in dynamic responses, proteins change their rates of synthesis and degradation in order to adjust their proteomes to internal or external stimuli. Thus, probing the kinetics and dynamics of protein turnover lends insight into how cells regulate essential processes such as growth, differentiation, and stress response. Here we outline historical and current approaches to measuring the kinetics of protein turnover on a proteome-wide scale in both steady-state and dynamic systems, with an emphasis on metabolic tracing using stable-isotope-labeled amino acids. We highlight important considerations for designing proteome turnover experiments, key biological findings regarding the conserved principles of proteome turnover regulation, and future perspectives for both technological and biological investigation.




i

Prediction and validation of mouse meiosis-essential genes based on spermatogenesis proteome dynamics [Research]

The molecular mechanism associated with mammalian meiosis has yet to be fully explored, and one of the main reasons is that some meiosis-essential genes are still unknown. Gene expression during spermatogenesis has been profiled in previous studies, yet few studies have aimed to find new functional genes. Since there is a huge gap between the number of genes that can be quantified and the number of genes that can be characterized by phenotype screening in one assay, an efficient method to rank quantified genes according to phenotypic relevance is of great importance. We proposed to rank genes by the probability of their function in mammalian meiosis based on global protein abundance using machine learning. Here, nine types of germ cells covering consecutive substages of meiotic prophase I were isolated, and the corresponding proteomes were quantified by high-resolution mass spectrometry. By combining meiotic labels annotated from the MGI mouse knockout database and the spermatogenesis proteomics dataset, a supervised machine learning package, FuncProFinder, was developed to rank meiosis-essential candidates. Of the candidates whose functions were unannotated, four of ten genes with the top prediction scores, Zcwpw1, Tesmin, 1700102P08Rik and Kctd19, were validated as meiosis-essential genes by knockout mouse models. Therefore, mammalian meiosis-essential genes could be efficiently predicted based on the protein abundance dataset, which provides a paradigm for other functional gene mining from a related abundance dataset.





Proteomic analyses identify differentially expressed proteins and pathways between low-risk and high-risk subtypes of early-stage lung adenocarcinoma and their prognostic impacts [Research]

The histopathological subtype of lung adenocarcinoma (LUAD) is closely associated with prognosis. Micropapillary or solid predominant LUAD tends to relapse after surgery at an early stage, whereas the lepidic pattern shows a favorable outcome. However, the molecular mechanism underlying this phenomenon remains unknown. Here, we recruited 31 lepidic predominant LUADs (LR: low-risk subtype group) and 28 micropapillary or solid predominant LUADs (HR: high-risk subtype group). Tissues from these cases were obtained, and label-free quantitative proteomic and bioinformatic analyses were performed. Additionally, the prognostic impact of targeted proteins was validated using The Cancer Genome Atlas databases (n=492) and tissue microarrays composed of early-stage LUADs (n=228). A total of 192 differentially expressed proteins were identified between tumor tissues of LR and HR, and three clusters were identified via hierarchical clustering, excluding eight proteins. Cluster 1 (65 proteins) showed a sequential decrease in expression from normal tissues to tumor tissues of LR and then to HR and was predominantly enriched in pathways such as tyrosine metabolism and ECM-receptor interaction; increased matched mRNA expression of 18 proteins from this cluster predicted favorable prognosis. Cluster 2 (70 proteins) demonstrated a sequential increase in expression from normal tissues to tumor tissues of LR and then to HR and was mainly enriched in pathways such as extracellular organization, DNA replication and cell cycle; high matched mRNA expression of 25 proteins indicated poor prognosis. Cluster 3 (49 proteins) showed high expression only in LR, with high matched mRNA expression of 20 proteins in this cluster indicating favorable prognosis. Furthermore, high expression of ERO1A and FEN1 at the protein level predicted poor prognosis in early-stage LUAD, supporting the mRNA results. In conclusion, we discovered key differentially expressed proteins and pathways between low-risk and high-risk subtypes of early-stage LUAD. Some of these proteins could serve as potential biomarkers in prognostic evaluation.





A proteomic approach to understand the clinical significance of acute myeloid leukemia-derived extracellular vesicles reflecting essential characteristics of leukemia [Research]

Extracellular vesicle (EV) proteins from acute myeloid leukemia (AML) cell lines were analyzed using mass spectrometry. The analyses identified 2450 proteins, including 461 differentially expressed proteins (290 upregulated and 171 downregulated). CD53 and CD47 were upregulated and were selected as candidate biomarkers. The association between survival of patients with AML and the expression levels of CD53 and CD47 at diagnosis was analyzed using mRNA expression data from The Cancer Genome Atlas database. Patients with higher expression levels showed significantly poorer survival than those with lower expression levels. Enzyme-linked immunosorbent assays measured CD53 and CD47 levels in EVs from the bone marrow of patients with AML at diagnosis and at the time of complete remission after induction chemotherapy; patients with downregulated CD53 and CD47 expression appeared to relapse less frequently. Network model analysis of EV proteins revealed several upregulated kinases, including LYN, CSNK2A1, SYK, CSK, and PTK2B. The potential cytotoxicity of several clinically applicable drugs that inhibit these kinases was tested in AML cell lines. The drugs lowered the viability of AML cells. The collective data suggest that AML-derived EVs could reflect essential leukemia biology.





PTM-Shepherd: analysis and summarization of post-translational and chemical modifications from open search results [Technological Innovation and Resources]

Open searching has proven to be an effective strategy for identifying both known and unknown modifications in shotgun proteomics experiments. Rather than being limited to a small set of user-specified modifications, open searches identify peptides with any mass shift that may correspond to a single modification or a combination of several modifications. Here we present PTM-Shepherd, a bioinformatics tool that automates characterization of PTM profiles detected in open searches based on attributes such as amino acid localization, fragmentation spectra similarity, retention time shifts, and relative modification rates. PTM-Shepherd can also perform multi-experiment comparisons for studying changes in modification profiles, e.g. in data generated in different laboratories or under different conditions. We demonstrate how PTM-Shepherd improves the analysis of data from formalin-fixed paraffin-embedded samples, detects extreme underalkylation of cysteine in some datasets, discovers an artefactual modification introduced during peptide synthesis, and uncovers site-specific biases in sample preparation artifacts in a multi-center proteomics profiling study.
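To make the notion of a mass shift concrete, the following minimal sketch bins the difference between observed and theoretical peptide masses and reports the most frequent shifts. It is a toy illustration under assumed inputs (pairs of masses in Da), not PTM-Shepherd's actual algorithm or interface; the function name `summarize_mass_shifts` and the parameters `bin_width` and `min_count` are hypothetical.

```python
from collections import defaultdict


def summarize_mass_shifts(psms, bin_width=0.02, min_count=2):
    """Group peptide-spectrum matches by mass shift.

    psms: iterable of (observed_mass, theoretical_mass) pairs in Da.
    Returns a list of (mean_shift, count) for bins holding at least
    min_count matches, sorted by descending count.
    """
    bins = defaultdict(list)
    for observed, theoretical in psms:
        delta = observed - theoretical  # mass shift from the open search
        bins[round(delta / bin_width)].append(delta)
    peaks = [
        (sum(deltas) / len(deltas), len(deltas))
        for deltas in bins.values()
        if len(deltas) >= min_count
    ]
    return sorted(peaks, key=lambda peak: -peak[1])


if __name__ == "__main__":
    # Hypothetical matches: three near +15.995 Da (consistent with
    # oxidation), two unmodified, and one isolated shift.
    example = [
        (1015.9949, 1000.0),
        (1215.9960, 1200.0),
        (915.9940, 900.0),
        (1500.0003, 1500.0),
        (1299.9996, 1300.0),
        (1079.9663, 1000.0),
    ]
    for shift, count in summarize_mass_shifts(example):
        print(f"{shift:+.4f} Da: {count} matches")
```

A real tool would need more than fixed-width binning, for example peak detection that tolerates bin-edge effects, plus the localization and retention-time attributes described above.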





The Mechanism of NEDD8 Activation of CUL5 Ubiquitin E3 Ligases [Research]

Cullin RING E3 Ligases (CRLs) ubiquitylate hundreds of important cellular substrates. Here we have assembled and purified the Ankyrin repeat and SOCS Box protein 9 CUL5 RBX2 Ligase (ASB9-CRL) in vitro and show how it ubiquitylates one of its substrates, CKB. CRLs occasionally collaborate with RING-between-RING E3 ligases (RBRLs) and indeed, mass spectrometry analysis showed that CKB is specifically ubiquitylated by the ASB9-CRL-ARIH2-UBE2L3 complex. Addition of other E2s such as UBE2R1 or UBE2D2 contributes to polyubiquitylation but does not alter the sites of CKB ubiquitylation. Hydrogen-deuterium exchange mass spectrometry (HDX-MS) analysis revealed that CUL5 neddylation allosterically exposes its ARIH2 binding site, promoting high-affinity binding, and it also sequesters the NEDD8 E2 (UBE2F) binding site on RBX2. Once ARIH2 is bound, its helices near the Ariadne domain active site are exposed, presumably relieving its autoinhibition. These results allow us to propose a model of how neddylation activates ASB-CRLs to ubiquitylate their substrates.





The peptide vaccine of the future [Review]

Peptide-based anti-cancer vaccination has been shown to induce cancer-specific immune responses in multiple studies across various cancer entities. However, clinical responses have so far been limited to individual patients, and broad clinical applicability has not been achieved. Therefore, further efforts are required to improve peptide vaccination in order to integrate this low-side-effect therapy into the clinical routine of cancer therapy. To design clinically effective peptide vaccines in the future, several issues have to be addressed and optimized, including antigen target selection as well as the choice of optimal adjuvants and vaccination schedules. Furthermore, the combination of peptide-based vaccines with other immunotherapies and molecular targeted therapies, as well as the development of predictive biomarkers, could further improve efficacy. In this review, current approaches in the development of peptide-based vaccines and critical implications for optimal vaccine design are discussed.