4.2. Side menu and main table

4.2.1. Components in side menu

The side menu to the left of the main report body provides an interface for users to control and adjust parameters in CGAR, thus facilitating interactive analysis.

The Tab navigator at the top reflects the current tab in the main report and provides an alternative way to navigate between tabs. Selecting a different tab name from the dropdown list changes the current tab in the main report.

Below the Tab navigator, the dropdown list under the heading Report for: shows the current sample by default. It contains the same list of analysis-ready samples as in Upload sample. Users can select a different sample in this dropdown list and (optionally) adjust the parameters below it to generate a new report for that sample, which makes it possible to move between samples continuously.

Some of the contents of the side menu change depending on the current tab in the main screen. For example, if the ClinVar tab is focused in the main screen, the side menu shows controls specific to that tab, such as the review status of variants in ClinVar. If the Pharmacogenomics tab becomes the current tab, the side menu changes to show the controls for that tab instead. The controls in the side menu specific to each tab are described in the subsections for the tab.

Controls in the side menu that are common to all tabs are as follows:

  • Zygosity: The genotype of variants. Can be Any (default, includes variants of all genotypes), Hom (homozygous variants), or Het (heterozygous variants).
  • Ancestry: The population group from which allele frequency values are taken.
  • Allele frequency: The upper threshold of allele frequency for variants to be included in the report. Can be All (no restriction on variant allele frequency), <0.5%, <1%, <3%, or <5% (the maximum allowed variant allele frequency).

The combination of Ancestry and Allele frequency determines how variants are filtered by variant allele frequency, as follows (the default combination of values varies by tab):

  1. Any for Ancestry
    1. All for Allele frequency: No filtering on variant allele frequency.
    2. Other values for Allele frequency: Use only variants whose maximum allele frequency across all populations is less than or equal to the specified threshold value.
  2. Other values for Ancestry
    1. All for Allele frequency: No filtering on variant allele frequency.
    2. Other values for Allele frequency: Use only variants whose allele frequency in the specified population is less than or equal to the specified threshold value.
  • Calculated variant consequences: The consequence of variants on genes or transcripts, calculated using the Variant Effect Predictor (VEP). Can be Any (no restriction on variant consequences) or one or more specific consequences. The full list of possible consequences from VEP is available at this link. CGAR allows variant filtering on individual consequences with high, moderate, or low predicted impact, or on any combination of them. The default values also vary by tab.
  • Allele frequency based on: Specifies how ancestry is used to determine the allele frequency of variants. Can be Max (default, the same as All in Ancestry), Predicted by genotype (sets Ancestry to the majority group in the predicted ancestry composition), or Self-identified (available only if an ancestry was specified at upload; sets Ancestry to that value).

Note

This option will be merged into Ancestry in a future release.
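
The filtering rules for Zygosity, Ancestry, and Allele frequency described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not CGAR's actual implementation: the function names, the population codes used as dictionary keys, the use of None to represent All, and the choice to treat a missing frequency as 0.0 are all assumptions made for this example.

```python
from typing import Dict, Optional

# Assumed population codes, mirroring the five gnomAD groups named in this guide.
POPULATIONS = ("AFR", "AMR", "EAS", "EUR", "SAS")


def passes_zygosity_filter(zygosity: str, selected: str = "Any") -> bool:
    """Any keeps every variant; Hom or Het keeps only the matching genotype."""
    return selected == "Any" or zygosity == selected


def passes_frequency_filter(
    pop_freqs: Dict[str, float],
    ancestry: str = "Any",
    threshold: Optional[float] = None,
) -> bool:
    """Apply the Ancestry / Allele frequency combination to one variant.

    pop_freqs maps a population code to an allele frequency (as a fraction).
    threshold=None stands for All (no filtering); 0.01 stands for <1%.
    A population missing from pop_freqs is treated as frequency 0.0 (assumption).
    """
    if threshold is None:
        return True  # All: no filtering on variant allele frequency
    if ancestry == "Any":
        # Compare the maximum frequency across all populations.
        value = max(pop_freqs.get(pop, 0.0) for pop in POPULATIONS)
    else:
        # Compare the frequency in the selected population only.
        value = pop_freqs.get(ancestry, 0.0)
    return value <= threshold


# Hypothetical variant: common in AFR, rare in EUR.
variant = {"AFR": 0.02, "EUR": 0.004}
keep_any = passes_frequency_filter(variant, "Any", 0.01)  # False: max is 0.02
keep_eur = passes_frequency_filter(variant, "EUR", 0.01)  # True: 0.004 <= 0.01
```

The example shows why the same variant can appear or disappear from a report when only Ancestry changes: with Any, the highest frequency in any population decides, whereas a specific ancestry looks at that population alone.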

4.2.2. Main report body

The main tables in the report organize variants into multiple tabs, each corresponding to a specific analytic purpose.

_images/main_table.png

Below the name (identifier) of the current sample (Miller in this example) and the links that open the ancestry prediction, the tabs corresponding to the report sections accessible to the current user are listed.

The small gray labels under the tag Your query: show the values used to generate the current report (tab). They remind users of the current settings and help them change settings for subsequent analysis. In the above example, Zygosity is set to Any, Pathogenicity (specific to the ClinVar tab) is set to both Pathogenic and Likely pathogenic, All is used for Ancestry, All is used for Allele frequency, and Any is used for Calculated variant consequences (Variant impact in the label).

The buttons Copy, CSV, Excel, PDF, and Print provide various options to save or export the variants in the current tab.

Using the Search box on the top right, users can quickly search for variants. Any text or value in the table can be searched here.

Columns common to tables in all tabs are as follows:

  • Gene symbol: The official symbol for the gene assigned by the HUGO Gene Nomenclature Committee (HGNC).
  • HGVS nomenclature: The variant representation as recommended by the Human Genome Variation Society (HGVS). The latest recommendation can be found in this link.
  • Zygosity: Zygosity of the variant.
  • Variant impact: The Sequence Ontology (SO) terms describing the calculated consequence of the variant by VEP.
  • Max allele frequency: The allele frequency of the variant allele from the Genome Aggregation Database (gnomAD), release 2.0.2. The value in this column is the allele frequency from gnomAD exomes and is the maximum value among 5 population groups in gnomAD (AFR: African, EAS: East Asian, SAS: South Asian, AMR: Latino, EUR: non-Finnish European (NFE in gnomAD)). The population group with the maximum value is shown in parentheses. If Ancestry is specified, the allele frequency of the specified population is shown instead (under the column heading Allele frequency).
  • Coverage metric: The percentage of gnomAD exomes with a minimum read depth of 20x at the variant’s locus. The coverage values are graded by 4 colors: green (99% or more of exomes with 20x), blue (90% or more), brown (50% or more), and red (less than 50%).
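
The color grading of the Coverage metric column amounts to a simple threshold lookup. The following Python sketch restates the four grades listed above; the function name and the use of a percentage in the range 0 to 100 as input are assumptions for illustration, and the code is not taken from CGAR itself.

```python
def coverage_color(pct_exomes_20x: float) -> str:
    """Map the percentage of gnomAD exomes with >= 20x depth to a color grade."""
    if pct_exomes_20x >= 99:
        return "green"
    if pct_exomes_20x >= 90:
        return "blue"
    if pct_exomes_20x >= 50:
        return "brown"
    return "red"


# Hypothetical coverage values for four variants.
grades = [coverage_color(p) for p in (99.5, 93.0, 50.0, 12.3)]
# grades == ["green", "blue", "brown", "red"]
```

A red or brown grade is a hint that a low or missing allele frequency may reflect poor sequencing coverage at the locus rather than true rarity of the variant.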

The green plus sign to the left of the Gene symbol for each variant opens a hidden row containing links to more details on the variant and to external sources.

_images/child_row.png

The links in the hidden row are:

  • Detailed view: Opens a separate window to show various detailed information about the variant.
  • gnomAD: Links to a variant page in gnomAD, showing detailed allele frequencies and coverages.
  • Marrvel: Links to Marrvel, a web application to prioritize human variants for rare diseases. It features ortholog search across model organisms including alignment of protein domains in ortholog proteins.
  • Varsome: Links to VarSome, a community-based application for variant interpretation. It provides a variety of genetic and clinically relevant information for the variant.
  • Beacon: Links to the GA4GH Beacon Network, a search engine for genetic variants across various institutes and organizations.
  • WEScover: Links to WEScover to investigate the breadth of coverage of a gene over exomes in the 1000 Genomes Project. In contrast to Coverage metric, which provides a locus-specific value, it provides a gene-centric value.
  • (Restricted) Orphanet: Opens a new window with lists of phenotypes associated with the gene.
  • VarSite: Links to the residue report for the variant in VarSite. VarSite shows potential effects of the variant on the protein 3D structure.

The hidden row also shows the variant allele frequency in gnomAD exomes for each of the 5 population groups (AFR, AMR, EAS, EUR (NFE), and SAS).
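
The relationship between the per-population frequencies in the hidden row and the single value in the Max allele frequency column can be sketched as follows. This is a minimal illustration assuming the frequencies are available as a dictionary keyed by population code; the function names and the display format are hypothetical and do not describe CGAR's internal code.

```python
from typing import Dict, Optional, Tuple


def max_allele_frequency(pop_freqs: Dict[str, float]) -> Optional[Tuple[str, float]]:
    """Return (population, frequency) for the highest per-population frequency.

    Returns None when no frequency is available. On ties, the population
    that appears first in the dictionary is kept.
    """
    if not pop_freqs:
        return None
    pop = max(pop_freqs, key=lambda p: pop_freqs[p])
    return pop, pop_freqs[pop]


def format_max_af(pop_freqs: Dict[str, float]) -> str:
    """Render the value with its population group in parentheses (assumed format)."""
    result = max_allele_frequency(pop_freqs)
    if result is None:
        return "NA"
    pop, freq = result
    return f"{freq:.4g} ({pop})"


# Hypothetical per-population frequencies for one variant.
freqs = {"AFR": 0.0123, "AMR": 0.0021, "EAS": 0.0, "EUR": 0.0008, "SAS": 0.0015}
label = format_max_af(freqs)  # "0.0123 (AFR)"
```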

Besides the above common columns, the main table may contain additional columns depending on the current tab. The additional columns specific to each tab will be explained in subsections for the tab.

4.2.3. Variant details

Each row in the main table shows only the essential information and the most severe calculated consequence. However, the calculated consequence of a variant can change depending on the transcript used for prediction. In addition, more information on the variant, such as predicted pathogenicity scores, allele frequencies in multiple population-scale datasets, or protein families or domains affected by the variant, can be very useful for interpreting the variant. The Detailed view link in the main table opens a new window containing the following information.

_images/detailed_view.png
  • Calculated variant consequences: variant consequences predicted with VEP and pathogenicity scores calculated by multiple methods, organized by gene or transcript at the variant’s location.
    • Includes a score for the gene’s tolerance to loss-of-function variants (ExAC pLI score).
    • For each transcript, the representation of the variant for cDNA or protein is shown.
    • Pathogenicity scores calculated by Condel, SIFT, CADD, FATHMM, MutationAssessor (http://mutationassessor.org/), MutationTaster, and PROVEAN.
  • Allele frequencies: variant allele frequencies from 3 population-scale datasets (gnomAD, the 1000 Genomes Project, and the NHLBI Exome Sequencing Project).
    • Allele frequencies are calculated for each population group.
  • Additional information for variants in splice sites: prediction scores for splice-altering effects of variants in splice sites.
  • Scores for sequence conservation at variant site: scores for sequence conservation from multiple sequence alignments of various species. Scores from phastCons, phyloP, and GERP are provided.
  • Protein families or domains overlapping with variant site: lists identifiers of proteins or protein domains that overlap with the variant’s position.
  • Publications: list of PubMed identifiers for publications that cite the variant.
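
As noted at the start of this subsection, the main table shows only the most severe consequence, while the Detailed view lists a consequence per transcript. The sketch below illustrates one way such a summary could be derived, by ranking the VEP impact categories (HIGH, MODERATE, LOW, MODIFIER). This is an assumption for illustration: the guide does not describe how CGAR selects the most severe consequence, and the transcript names and record layout here are hypothetical.

```python
from typing import Dict, List

# VEP impact categories, ranked from most to least severe.
IMPACT_RANK = {"HIGH": 3, "MODERATE": 2, "LOW": 1, "MODIFIER": 0}


def most_severe(annotations: List[Dict[str, str]]) -> Dict[str, str]:
    """Return the per-transcript annotation with the highest-ranked impact.

    Ties keep the first annotation in input order; unknown impacts rank lowest.
    """
    return max(annotations, key=lambda a: IMPACT_RANK.get(a["impact"], -1))


# Hypothetical per-transcript annotations for a single variant.
annotations = [
    {"transcript": "transcript_A", "consequence": "intron_variant", "impact": "MODIFIER"},
    {"transcript": "transcript_B", "consequence": "missense_variant", "impact": "MODERATE"},
    {"transcript": "transcript_C", "consequence": "stop_gained", "impact": "HIGH"},
]
summary = most_severe(annotations)  # the stop_gained annotation on transcript_C
```

The example makes the motivation for the Detailed view concrete: the same variant is intronic in one transcript and introduces a stop codon in another, so the single value in the main table hides information that may matter for interpretation.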