WEB-based GEne SeT AnaLysis Toolkit

Translating gene lists into biological insights...


Introduction

WebGestalt (WEB-based GEne SeT AnaLysis Toolkit) is a web tool for functional enrichment analysis. According to Google Analytics, it has on average 26,000 unique users per year from 144 countries and territories. According to Google Scholar, the WebGestalt 2005, WebGestalt 2013 and WebGestalt 2017 papers have been cited in more than 2,500 scientific papers.

WebGestalt 2019 significantly improved the output report, with emphasis on user-friendly interfaces whose results translate directly into publication-ready figures. The R package WebGestaltR has been updated to work with the new version and provides an interface for integrating WebGestalt into other pipelines or running batch jobs locally. We also support loading data from third-party websites or services through an API to perform enrichment analysis. WebGestalt supports three well-established and complementary methods for enrichment analysis: Over-Representation Analysis (ORA), Gene Set Enrichment Analysis (GSEA), and Network Topology-based Analysis (NTA).
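To make the idea behind ORA concrete, the following is a minimal sketch of the standard hypergeometric over-representation test for a single functional category. It is not WebGestalt's or WebGestaltR's implementation; the function names, parameter names, and the example counts are hypothetical and chosen only for illustration.

```python
from math import comb


def ora_p_value(n_background, n_category, n_input, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: genes in the reference (background) set
    n_category:   background genes annotated to the category
    n_input:      genes in the user's list (all assumed to be in the background)
    n_overlap:    input genes annotated to the category
    All names are hypothetical; this is an illustrative sketch only.
    """
    max_overlap = min(n_category, n_input)
    total = comb(n_background, n_input)
    # Sum the probability of every overlap at least as large as the one observed.
    tail = sum(
        comb(n_category, k) * comb(n_background - n_category, n_input - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


def enrichment_ratio(n_background, n_category, n_input, n_overlap):
    """Observed overlap divided by the overlap expected by chance."""
    expected = n_input * n_category / n_background
    return n_overlap / expected


if __name__ == "__main__":
    # Made-up counts: 20,000 background genes, 100 in the category,
    # 200 genes in the input list, 5 of which fall in the category.
    print(enrichment_ratio(20000, 100, 200, 5))
    print(ora_p_value(20000, 100, 200, 5))
```

In practice this test is repeated for every functional category, and the resulting p-values are adjusted for multiple testing before categories are reported.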

Data source

Data sources for WebGestalt 2019 were updated on 01/14/2019. This version supports 12 organisms, 354 gene identifiers from various databases and technology platforms, and 321,251 functional categories from public databases and computational analyses. Experimental data from organisms or with gene identifiers not covered by the WebGestalt database can also be analyzed in WebGestalt. We recently added phosphosite data for kinase target enrichment analysis. Information in this version was collected from the following resources:

  • ID mapping
  • Affymetrix, Agilent, Illumina, ABI SOLiD (Accessed in November 2018), NCBI Gene (Accessed on 01/14/2019), Biomart (Ensembl 94, Accessed in November 2018), dbSNP (Version 151, 03/22/2018).

  • Functional categories
    • Gene Ontology (Daily build accessed on 01/14/2019.)

    • Pathway
    • KEGG (Release 88.2, 11/01/2018), WikiPathways (Release 02/10/2020), Reactome (Version 66, September 2018), PANTHER (v3.6.1, 01/22/2018)

    • Network
    • Hierarchical mRNA co-expression modules: The modules are computationally derived from the RNA-Seq data sets across 33 TCGA (The Cancer Genome Atlas, Release 01/28/2016) and 6 CPTAC (Clinical Proteomic Tumor Analysis Consortium) cancer types. Based on the method described in our recently published paper (Proteome profiling outperforms transcriptome profiling for co-expression based gene function prediction), we first constructed the consensus co-expression network for each cancer type and then used NetSAM to identify the hierarchical co-expression modules.

      Hierarchical protein interaction modules: The modules are computationally derived from the protein-protein interaction networks downloaded from BioGRID (Build 3.5.167, December 2018) using NetSAM.

      microRNA target: MSigDB (MSigDB database v6.2, July 2018)

      Transcription factor target: MSigDB (MSigDB database v6.2, July 2018)

      Mammalian protein complexes: CORUM (Release 3.0, 09/03/2018)

    • Phenotype
    • Human Phenotype Ontology (Monthly build 201810), Mammalian Phenotype Ontology (Accessed on 11/14/2018)

    • Disease
    • DisGeNET (Version 5.0, 05/28/2017)
      GLAD4U: Disease terms were downloaded from PharmGKB (Accessed in November 2018). Genes associated with individual disease terms were inferred using GLAD4U.

    • Drug
    • DrugBank (Version 5.1.1, 07/03/2018)
      GLAD4U: Drug terms were downloaded from PharmGKB (Accessed in January 2017). Genes associated with individual drug terms were inferred using GLAD4U.

    • Chromosomal location (NCBI, Accessed on 12/20/2018)

    • Phosphosite
    • Kinase-specific phosphorylation sites are from RegPhos 2.0

      PTM signatures are from PTMsigDB v1.8.1

News

  • Network modules of 3 CPTAC3 cancer cohorts are added. (06/02/2021)
  • Parameter p is available for GSEA. (12/09/2019)
  • GSEA enrichment plot is now available in SVG. (09/26/2019)
  • WebGestalt 2019 manuscript is online. Please consider citing the latest publication. (05/22/2019)
  • Input validation of ID mapping is added. (04/24/2019)
  • WebGestalt now supports combining multiple databases for ORA. GSEA is supported in the WebGestaltR package. (04/22/2019)
  • WebGestalt 2019 is now online. (01/17/2019)