Nat Commun. 2019 Dec 12;10(1):5679.
doi: 10.1038/s41467-019-13528-0.

Pan-cancer molecular subtypes revealed by mass-spectrometry-based proteomic characterization of more than 500 human cancers

Fengju Chen et al.

Abstract

Mass-spectrometry-based proteomic profiling of human cancers has the potential for pan-cancer analyses to identify molecular subtypes and associated pathway features that might otherwise be missed using transcriptomics. Here, we classify 532 cancers, representing six tissue-based types (breast, colon, ovarian, renal, uterine), into ten proteome-based, pan-cancer subtypes that cut across tumor lineages. The proteome-based subtypes are observable in the external cancer proteomic datasets surveyed. Gene signatures of oncogenic or metabolic pathways can further distinguish between the subtypes. Two distinct subtypes both involve the immune system, one associated with the adaptive immune response and T-cell activation, and the other associated with the humoral immune response. Two additional subtypes each involve the tumor stroma, one of these including the collagen VI interacting network. Three additional proteome-based subtypes, respectively involving proteins related to Golgi apparatus, hemoglobin complex, and endoplasmic reticulum, were not reflected in previous transcriptomics analyses. A data portal is available at the UALCAN website.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Proteomic and transcriptomic human tumor datasets and associated gene features used in this study.
a We used the CPTAC Confirmatory/Discovery dataset, of 532 cancer cases (cases not represented in TCGA) for proteome-based, pan-cancer subtype discovery (see Fig. 2). For the following datasets of TCGA cases, we classified cases according to the CPTAC Confirmatory/Discovery-based subtypes (see Fig. 3): CPTAC-TCGA dataset (TCGA cases for which mass-spectrometry-based proteomic profiling by CPTAC was carried out), TCGA RPPA dataset (TCGA cases profiled for a focused panel of proteins by RPPA platform), and TCGA pan32 mRNA dataset (TCGA cases with RNA-sequencing data). For TCGA datasets, Venn diagram represents shared cases. b Among CPTAC Confirmatory/Discovery, TCGA RPPA, and TCGA pan32 mRNA datasets, numbers of shared gene features (protein or mRNA levels). CPTAC proteomic and TCGA transcriptomic data, as provided by their respective public data portals, were processed at the gene level, rather than at the protein isoform or mRNA transcript levels. See also Supplementary Data S1 and Supplementary Fig. 1.
Fig. 2. De novo pan-cancer molecular subtypes as defined by mass-spectrometry-based proteomics.
a By ConsensusClusterPlus of 532 proteomic profiles in the CPTAC Confirmatory/Discovery cohort, 10 proteome-based subtypes (k1 through k10) were identified (columns). For these same cases, pan-cancer class assignments (c1 through c10), based on the previous pan32 mRNA-based discovery, were also made (rows, mapping the previous pan32 mRNA classifier to CPTAC protein expression patterns). Significances of overlap between the two sets of classifications are represented. P-values by one-sided Fisher’s exact test. b For CPTAC Confirmatory/Discovery cohort, differential expression patterns (values normalized within each tissue-based cancer type; SD, standard deviation from the median) for a set of 1000 proteins (top heat map) and for a set of 500 phospho-proteins (bottom heat map) found to best distinguish between the 10 proteome-based subtypes (see the “Methods” section, top 100 over-expressed proteins for each subtype). Proteins highlighted by name have GO annotation “cell surface receptor signaling pathway” and DrugBank association (lists provide examples of differentially expressed proteins, but these would not necessarily have more importance than the other proteins in the heat map; full lists are provided in Supplementary Data 2 and 3). c For the TCGA pan32 cohort (n = 10,224 cases), we made CPTAC-based pan-cancer subtype assignments (columns, mapping the CPTAC protein expression patterns to TCGA mRNA patterns). Significances of overlap between the CPTAC-based subtypes (columns, k1 through k10) and the previous pan32 mRNA-based pan-cancer class assignments (rows, c1 through c10) are represented. P-values by one-sided Fisher’s exact test. d For each cancer type represented in CPTAC Confirmatory/Discovery cohort, distributions by proteome-based subtype. e For the top over-expressed proteins associated with each subtype (from part b, top panel), represented categories by GO were assessed, with selected enriched categories represented here. P-values by one-sided Fisher’s exact test. See also Supplementary Figs. 2–4 and Supplementary Data S2, S3, and S4.
Fig. 3. Observation of CPTAC pan-cancer proteome-based subtypes in additional multi-cancer protein expression profiling datasets.
a The 364 TCGA cases with mass-spectrometry-based proteomic data from CPTAC were classified according to proteome-based pan-cancer subtype as originally defined using CPTAC Confirmatory/Discovery cohort. Expression patterns for the top set of 757 proteins distinguishing between the 10 subtypes (from Fig. 2b, based on available data) are shown for both CPTAC Confirmatory/Discovery and CPTAC-TCGA proteomic datasets (values normalized within each tissue-based cancer type; SD, standard deviation from the median). Gene patterns in the CPTAC-TCGA sample profiles sharing similarity with a subtype-specific signature pattern are highlighted. b The 7694 TCGA cases with reverse-phase protein array (RPPA) data were classified according to proteome-based pan-cancer subtype. Expression patterns for a top set of 99 proteins distinguishing between the 10 subtypes (see the section “Methods”, based on available data) are shown for both CPTAC Confirmatory/Discovery and TCGA RPPA proteomic datasets. Gene patterns in the RPPA sample profiles sharing similarity with a subtype-specific signature pattern are highlighted. Proteins highlighted by name were individually significantly associated with the given subtype (P < 0.001, t-test) in TCGA RPPA dataset. c Significances of overlap between the proteome-based subtype assignments made for the CPTAC-TCGA dataset (columns) and the proteome-based subtype assignments for the TCGA RPPA dataset (rows), based on the 345 cases represented in both datasets. P-values by one-sided Fisher’s exact test. d Significances of overlap between the proteome-based subtype assignments made for the TCGA RPPA dataset (columns) and the subtype assignments for the transcriptome profiles in TCGA pan32 cohort (rows, mapping the CPTAC protein expression patterns to TCGA mRNA patterns), based on the 7206 cases represented in both datasets. P-values by one-sided Fisher’s exact test. Patient-level subtyping and cancer type information for all datasets represented are provided in Supplementary Data 1. See also Supplementary Figs. 5–7.
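The overlap significances in the captions above are reported as one-sided Fisher's exact tests. As a rough illustration of what such a test computes, the sketch below sums the upper tail of the hypergeometric distribution for the number of cases shared by one subtype and one class. The function names and the ten toy labels are hypothetical and are not taken from the study, which does not state which software performed the test.

```python
from math import comb


def fisher_one_sided(n_both, n_a, n_b, n_total):
    """Probability of an overlap of at least n_both under the hypergeometric null.

    n_a and n_b are the sizes of the two groups; n_total is the number of cases.
    """
    upper = min(n_a, n_b)
    total_ways = comb(n_total, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_both, upper + 1)
    )
    return tail / total_ways


def overlap_pvalue(labels_a, labels_b, class_a, class_b):
    """One-sided P-value for the overlap of class_a (first labeling) with class_b (second)."""
    n_total = len(labels_a)
    n_a = sum(1 for x in labels_a if x == class_a)
    n_b = sum(1 for y in labels_b if y == class_b)
    n_both = sum(
        1 for x, y in zip(labels_a, labels_b) if x == class_a and y == class_b
    )
    return fisher_one_sided(n_both, n_a, n_b, n_total)


# Toy labels for ten hypothetical cases (not data from the study).
protein_subtype = ["k1", "k1", "k1", "k1", "k1", "k2", "k2", "k2", "k2", "k2"]
mrna_class = ["c1", "c1", "c1", "c1", "c2", "c1", "c2", "c2", "c2", "c2"]
p = overlap_pvalue(protein_subtype, mrna_class, "k1", "c1")
print(f"P = {p:.4f}")
```

Repeating the call for every subtype and class pair would fill a matrix like the ones the captions describe. Exact integer arithmetic via `math.comb` keeps the sketch free of third-party dependencies.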
Fig. 4. Proteome-based subtype-specific differences involving metabolic pathways.
a For CPTAC Confirmatory/Discovery proteomic dataset and for TCGA pan32 mRNA dataset, pathway-associated gene signatures (using values normalized within each cancer type; SD, standard deviation from the median). For each dataset, purple-cyan heat maps denote t-statistics for comparing the given subtype versus the other tumors (bright purple/cyan, highly significant; black, not significant; shades close to black, borderline significant). Selected pathways surveyed by signatures included several related to metabolism (FA fatty acid; GNG gluconeogenesis; TCA tricarboxylic acid; OX-PHOS oxidative phosphorylation). b Pathway diagram representing core metabolic pathways, with differential expression patterns represented (using values normalized within cancer type), comparing tumors in pan-cancer subtypes k1, k5, or k7 with the rest of the tumors. For each protein represented, the top portion represents results from differential protein analysis (CPTAC Confirmatory/Discovery proteomic dataset) and the bottom portion represents results from differential mRNA analysis (TCGA pan32 mRNA dataset). Red denotes significantly higher expression in k1/k6/k8 and blue denotes significantly lower expression. See also Supplementary Fig. 8.
Fig. 5. Immune system-related differences underscore k2 and k3 proteome-based subtypes.
a For a set of 162 immune-related proteins (FDR < 5% for either k2 or k3 subtypes and association with one of the indicated GO annotation categories), heat maps of differential protein expression patterns (expression values normalized within cancer type; SD, standard deviation from the median), across CPTAC Confirmatory/Discovery proteomic profiles, ordered by subtype. Purple-cyan heat map denotes t-statistics for comparing the given subtype versus the other tumors (bright purple/cyan, highly significant; black, not significant; shades close to black, borderline significant). Proteins found specifically expressed in immune-related tissues (according to Human Protein Atlas, or HPA, www.proteinatlas.org) are indicated. b Heat maps of gene expression-based signatures of immune cell infiltrates, across CPTAC Confirmatory/Discovery proteomic profiles, ordered by subtype (expression values normalized within cancer type; SD, standard deviation from the median). Purple-cyan heat map denotes t-statistics for comparing the given subtype versus the other tumors. APM1/APM2, antigen presentation on MHC class I/class II, respectively; DC dendritic cells; iDC immature DCs; aDC activated DCs; NK cells natural killer cells; Tcm cells T central memory cells; Tem cells T effector memory cells. c Diagram of immune cell types and associated protein markers. Red denotes significantly higher expression in k2 or k3 subtypes as indicated, and blue denotes significantly lower expression. FDR false discovery rate. d Similar to part c, but for complement activation pathway. See also Supplementary Fig. 9.
Fig. 6. Tumor stroma-related differences underscore k6 and k7 proteome-based subtypes.
a For a set of 606 extracellular matrix-related proteins (FDR < 5% for either k6 or k7 subtypes and association with one of the indicated GO annotation categories), heat maps of differential protein expression patterns (expression values normalized within cancer type; SD, standard deviation from the median), across CPTAC Confirmatory/Discovery proteomic profiles, ordered by subtype. Purple-cyan heat map denotes t-statistics for comparing the given subtype versus the other tumors (bright purple/cyan, highly significant; black, not significant; shades close to black, borderline significant). Selected proteins of interest are listed by name. b Protein–protein interaction networks involving the top proteins over-expressed in k6 tumors (top network, using cutoff of FDR < 1E−6) and the top proteins over-expressed in k7 tumors (bottom network, using cutoff of FDR < 1E−14). Nodes represent proteins that were found over-expressed in either k6 or k7 subtypes as indicated. Red node coloring denotes significantly higher expression in k6 or k7 subtypes as indicated, and blue coloring denotes significantly lower expression. A line between two nodes signifies that the corresponding protein products of the genes can physically interact (according to the literature, from Entrez gene interactions database). Colored edges (other than gray) denote a common GO term annotation shared by both of the connected proteins. c Diagram of collagen VI interactions and associated proteins. Red denotes significantly higher expression in k6 or k7 subtypes as indicated, and blue denotes significantly lower expression. FDR false discovery rate. See also Supplementary Fig. 10 and Supplementary Data S5.
Fig. 7. Overview of k8, k9, and k10 proteome-based subtypes.
a For a set of 59 Golgi apparatus-related proteins elevated in the k8 subtype (FDR < 5%), heat maps of differential protein expression patterns (expression values normalized within cancer type; SD, standard deviation from the median), across CPTAC Confirmatory/Discovery proteomic profiles ordered by subtype. Listed proteins are the subset highest in k8 over all other subtypes. b For a set of nine hemoglobin complex proteins elevated in the k9 subtype (FDR < 1%), heat maps of differential protein expression patterns, across CPTAC Confirmatory/Discovery proteomic profiles ordered by subtype. c For a set of 154 endoplasmic reticulum-related proteins elevated in the k10 subtype (FDR < 1%), heat maps of differential protein expression patterns, across CPTAC Confirmatory/Discovery proteomic profiles ordered by subtype. Listed proteins are those among the top 50 most over-expressed for the k10 subtype. d Diagram of steroid biosynthesis pathway and associated proteins (from KEGG database). Red denotes significantly higher expression in the k10 subtype. e Protein–protein interaction networks involving the top proteins over-expressed in k8 tumors (top network), the top proteins over-expressed in k9 tumors (middle network), and the top proteins over-expressed in k10 tumors (bottom network). Nodes represent proteins that were found over-expressed in the given subtype. Nodes are colored according to patterns of differential expression in additional cohorts (left, protein data from CPTAC-TCGA cohort; right, mRNA data from TCGA pan32 cohort). A line between two nodes signifies that the corresponding protein products of the genes can physically interact (according to the literature, from Entrez gene interactions database). Colored edges (other than gray) denote a common GO term annotation shared by both of the connected proteins. FDR false discovery rate.
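Several captions above describe expression values normalized within each cancer type, expressed in SDs from the median, and t-statistics comparing one subtype against the other tumors. The sketch below shows one plausible reading of those two steps. It assumes the sample standard deviation as the scaling factor and a pooled-variance two-sample t-statistic. Neither choice is confirmed by the captions, and the function names and toy values are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, median, stdev, variance


def normalize_within_type(values, cancer_types):
    """Express each value as SDs from the median of its own cancer type.

    Assumes every cancer type has at least two values and a nonzero SD.
    """
    groups = defaultdict(list)
    for value, ctype in zip(values, cancer_types):
        groups[ctype].append(value)
    stats = {ctype: (median(vs), stdev(vs)) for ctype, vs in groups.items()}
    return [
        (value - stats[ctype][0]) / stats[ctype][1]
        for value, ctype in zip(values, cancer_types)
    ]


def t_statistic(group, rest):
    """Pooled-variance two-sample t-statistic (an assumed variant of the t-test)."""
    n1, n2 = len(group), len(rest)
    pooled = ((n1 - 1) * variance(group) + (n2 - 1) * variance(rest)) / (n1 + n2 - 2)
    return (mean(group) - mean(rest)) / sqrt(pooled * (1 / n1 + 1 / n2))


# Toy values for one protein across six hypothetical cases (not study data).
values = [1, 2, 3, 10, 20, 30]
cancer_types = ["breast", "breast", "breast", "colon", "colon", "colon"]
normalized = normalize_within_type(values, cancer_types)

# Compare the normalized values of one hypothetical subtype against the rest.
subtype = ["k1", "k2", "k1", "k2", "k1", "k2"]
in_k1 = [v for v, s in zip(normalized, subtype) if s == "k1"]
others = [v for v, s in zip(normalized, subtype) if s != "k1"]
t_k1 = t_statistic(in_k1, others)
```

Normalizing within cancer type before comparing subtypes removes tissue-of-origin differences, so a subtype that spans several cancer types is not simply recovering tissue-specific expression.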
