Data Analysis Related Questions


How can I see the differences between the alleles in the Genome Browser?

You can easily compare the sequences of alleles in the Genome Browser using the “Masked reference” mode. This can be toggled using the “Toggle reference masked” function in the right click context menu or by pressing Ctrl + D (keep the cursor over the alignment while you do this).

Can I see the exact number of supporting reads at a position?

Yes. Base statistics can be displayed by pressing F2 or by choosing “Show base statistics at cursor” in the right click context menu.

I see multiple alleles or allele pairs called for a locus, how does that happen?

Multiple alleles or allele pairs are called when they have roughly the same number of supporting reads (SG algorithm) or the same number of mismatches (CG algorithm). This means that the software could not clearly decide which allele is present in the original biological sample. You can inspect the alignment in the Genome Browser and decide on the result or the necessary course of action.
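
For the CG case, the tie rule can be illustrated with a minimal Python sketch. It assumes that mismatch count is the only criterion, which simplifies whatever the software actually does; the function name, the allele labels and the counts are hypothetical. The SG case is not modelled, because the tolerance applied to supporting read counts is not specified here.

```python
def best_candidates(mismatches_by_allele):
    """Return every allele tied for the fewest mismatches, sorted by name."""
    if not mismatches_by_allele:
        return []
    fewest = min(mismatches_by_allele.values())
    return sorted(
        allele
        for allele, count in mismatches_by_allele.items()
        if count == fewest
    )


# Hypothetical allele labels and mismatch counts, for illustration only.
counts = {"allele_X": 0, "allele_Y": 0, "allele_Z": 3}

calls = best_candidates(counts)
is_ambiguous = len(calls) > 1  # more than one allele shares the minimum

print(calls, is_ambiguous)
```

Here two candidates share the lowest mismatch count, so both would be reported and the call would need to be reviewed in the Genome Browser.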

I see two alleles called ambiguously (analysed with CG algorithm) but one of them has no mismatches while the other does. Why is the one with no mismatches not selected as best?

If the reference sequence of the allele without mismatches is not fully defined while the other one is, we use a method called “fair compare”. Because we don’t know whether the partially defined allele would have any mismatches if it were fully defined, we don’t discard the allele with the mismatch but show it as an ambiguous result.

For a novel allele I see two reference tracks in the Genome Browser. What are these?

The “Novel ref” sequence is the generated novel sequence while the “Rel ref” is the sequence of the selected base allele from the reference database. The novel allele is assumed to be derived from the base allele.

QC Issues

The overall quality control light is not green, what does that mean?

The QC traffic light system is meant to indicate the quality of the input data. The overall QC traffic light is calculated as the value of the worst QC measure of the given locus. The meaning of the different colours is the following:

  • (green) – PASSED: the locus passed all QC tests
  • (yellow/green) – INFO: one or more QC tests produced lower than average results
  • (yellow) – INSPECT: one or more QC tests produced concerning results, manual inspection of the results is needed
  • (red/yellow) – INVESTIGATE: one or more QC tests showed low result quality, manual inspection and possibly reanalysis is needed
  • (red) – FAILED: one or more QC tests showed very low result quality, manual inspection is needed to determine the cause, and the locus or sample likely needs to be re-sequenced or re-typed by alternative methods

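
The “worst measure decides” rule for the overall light can be sketched in a few lines of Python. This is an illustrative sketch only, not the software's implementation: the `QCStatus` enum, the `overall_qc` function and the measure names are hypothetical, and “NA” metrics are not modelled.

```python
from enum import IntEnum


class QCStatus(IntEnum):
    # Ordered from best to worst so that max() picks the worst status.
    PASSED = 0       # green
    INFO = 1         # yellow/green
    INSPECT = 2      # yellow
    INVESTIGATE = 3  # red/yellow
    FAILED = 4       # red


def overall_qc(measure_statuses):
    """Return the worst status among the QC measures of one locus."""
    statuses = list(measure_statuses)
    if not statuses:
        raise ValueError("no QC measures supplied")
    return max(statuses)


# Hypothetical measure names and values, for illustration only.
locus_measures = {
    "read_count": QCStatus.PASSED,
    "read_quality": QCStatus.INFO,
    "coverage_depth": QCStatus.INSPECT,
}

overall = overall_qc(locus_measures.values())
print(overall.name)  # INSPECT
```

A single yellow measure is enough to turn the overall light yellow, which is why a non-green overall light should be followed up by looking at the individual measures.
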
What should I do when a QC measure gets a warning or fails?

The first thing to do when you see a QC metric warning or failure is to gather more information about the sample (read length, read quality, coverage, etc.). Having more information makes it easier to diagnose the problem and make an informed decision about the required steps (e.g. full resequencing, resequencing from a specific step). Certain advanced analysis settings (e.g. processing more reads) can compensate for minor quality issues.

Insufficient or low quality data

The CG genotyping algorithm was unable to build a consensus sequence, and therefore could not assign an allele to the given locus, because of the low quality of the input data. Some of the QC metrics are still calculated; check them to gain more information about the problem.

One of the QC metrics is “NA”, what does that mean?

There are a few different reasons why this can happen.

  • The result you opened was analysed with an older software version which didn’t have all the QC metrics that are currently present in the software
  • The CG genotyping algorithm was unable to build a consensus sequence, so the QC measures that are derived from the consensus could not be calculated

I think the best matching genotype is not correct. Can I change it?

You can add a best matching genotype by navigating to the HLA Typing sample result screen and selecting “Add genotype” from the right click context menu. Browse the displayed P and G groups to locate the allele pair candidate that you wish to add. Please note that a result that is already present cannot be added again.

If you are in the Genome Browser, you can add alleles by selecting “Add custom allele(s) to 1st/2nd chromosome” from the right click context menu. Note that alleles added this way are not permanently attached to the sample. To convert the alleles that you added into a custom genotype, select “Convert custom allele pair to result genotype” from the right click context menu.

I realized that the alleles that I added manually to the result are not correct. Can I discard them?

Manually added allele pair candidates can be removed by selecting them and right clicking to open the context menu. Select the “Delete user added genotype” option to remove the allele pair candidate.

How can I get information (sequence, read ID, CIGAR, etc.) about a specific read that I see in the Genome Browser?

You can see detailed information about a specific read by selecting it (left click) in the Genome Browser. The information will be visible in the bottom info bar. By selecting “Copy to clipboard” or “Copy sequence to clipboard” you can copy the desired information to the clipboard.
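
Once a CIGAR string has been copied, it can be decoded outside the software. The Python sketch below assumes that the displayed CIGAR follows the SAM format convention (a count followed by one of the operations M, I, D, N, S, H, P, = or X); the function names and the example string are hypothetical and are not taken from the software.

```python
import re

_CIGAR_FULL = re.compile(r"(\d+[MIDNSHP=X])+")
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

# Operations that advance along the reference or the read (SAM convention).
_CONSUMES_REFERENCE = set("MDN=X")
_CONSUMES_QUERY = set("MIS=X")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    if cigar == "*":  # SAM placeholder for "no CIGAR available"
        return []
    if not _CIGAR_FULL.fullmatch(cigar):
        raise ValueError(f"not a valid CIGAR string: {cigar!r}")
    return [(int(n), op) for n, op in _CIGAR_OP.findall(cigar)]


def reference_span(cigar):
    """Number of reference bases covered by the alignment."""
    return sum(n for n, op in parse_cigar(cigar) if op in _CONSUMES_REFERENCE)


def query_length(cigar):
    """Number of read bases described by the CIGAR (soft clips included)."""
    return sum(n for n, op in parse_cigar(cigar) if op in _CONSUMES_QUERY)


# Hypothetical CIGAR string, for illustration only.
example = "5S90M2I48M1D5M"
print(parse_cigar(example))
print(reference_span(example), query_length(example))  # 144 150
```

In this example the read is 150 bases long, starts with 5 soft-clipped bases, and contains one 2-base insertion and one 1-base deletion relative to the reference.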