Flashcard Fridays – Scitable

Scitable is an online learning tool and knowledge base run by the Nature Publishing Group. It mostly covers genetics and cell biology. The available material is very diverse: an extensive and ever-growing list of topics is discussed at different levels, so a high-school student and a researcher with a PhD can both find useful and understandable information on the site.


Workflow Wednesdays – Reference sequence – Part 2. Subsampling and statistics

Extracting regions from a fasta file

If you are doing targeted sequencing, it’s usually a good idea to use a relatively large reference sequence (e.g. a whole chromosome) to avoid problems caused by mismapping. Even so, aligning against a “subset” or “subsample” of the reference sequence can be very useful for saving computing time, investigating alignment problems, or other reasons. To get a specific region from a FASTA file, you can use the “getfasta” function of the Bedtools suite. This function needs a BED file containing the coordinates of the required region(s). As this is just a test run, let’s select the first gene from the NCBI record of the reference genome and create a BED file by hand! The first gene is “thrL”, which lies between positions 190 and 255.
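To make the coordinate handling concrete, here is a minimal Python sketch of what the BED line for thrL could look like and what extracting that region amounts to. It rests on several assumptions: the positions 190 and 255 are treated as 1-based and inclusive (the usual convention in NCBI records), while BED uses a 0-based start and an exclusive end; the sequence name `ref_chr`, the toy sequence, and all function names are hypothetical placeholders and not part of Bedtools. In practice the extraction would be done with Bedtools itself, for example `bedtools getfasta -fi reference.fasta -bed region.bed -fo region.fasta` (file names are placeholders too).

```python
def gene_to_bed_line(chrom, first_base, last_base, name):
    """Convert 1-based inclusive coordinates to a BED line.

    BED stores a 0-based start and an end that is not included,
    so only the start needs to be shifted by one.
    """
    return f"{chrom}\t{first_base - 1}\t{last_base}\t{name}"


def read_fasta(text):
    """Parse FASTA text into a dict mapping sequence ID to sequence."""
    chunks = {}
    seq_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The ID is the first word after '>'
            seq_id = line[1:].split()[0]
            chunks[seq_id] = []
        elif seq_id is not None:
            chunks[seq_id].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def extract_region(records, bed_line):
    """Return the subsequence described by the first three BED columns."""
    chrom, start, end = bed_line.split("\t")[:3]
    return records[chrom][int(start):int(end)]


# Toy reference: 400 bases under a hypothetical sequence name.
toy_fasta = ">ref_chr toy reference\n" + "ACGT" * 100 + "\n"

bed_line = gene_to_bed_line("ref_chr", 190, 255, "thrL")
region = extract_region(read_fasta(toy_fasta), bed_line)

print(bed_line)
print(len(region))
```

Note that the first column of a real BED file has to match the sequence name in the FASTA header exactly, otherwise the region cannot be found.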


Illumina Applies CE Mark to MiSeqDx™ Cystic Fibrosis System: Omixon Target is Featured!

We have amazing news today: the first CE mark for next-generation sequencing-based molecular diagnostics was just announced. Illumina applied the CE mark to the MiSeqDx™ Cystic Fibrosis System, and Omixon Target is used for analyzing the CFTR gene in prenatal labs!

An excerpt from the announcement relevant to Omixon Target:

Cystic fibrosis is a life-threatening, inherited single-gene disorder that affects more than 70,000 people worldwide. Caused by mutations in the CFTR gene, the disease has a wide clinical presentation depending on which CFTR gene mutations are present. Most CFTR mutations are rare, and their distribution and frequency vary among different world populations.


Bioinformatics for Beginners – File Formats Part 3. – Alignments

The most commonly used file formats for sequence-based alignments are SAM and BAM. These files can contain information about mapped and unmapped reads, the contigs of the reference sequence that was used, and more.

SAM

You can find the SAM format specification here and the article about the SAM format and SAMtools here.

The SAM (Sequence Alignment/Map) format is a generic format for storing large nucleotide sequence alignments.
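As a rough illustration of how this information is laid out, the following Python sketch reads a tiny, made-up SAM text. It relies on three points from the SAM specification: header lines start with `@`, the `@SQ` lines list the reference contigs (`SN` for name, `LN` for length), and each alignment line has 11 mandatory tab-separated fields, with bit `0x4` of FLAG marking an unmapped read. The toy records and the function name `parse_sam` are hypothetical; for real data a dedicated tool such as SAMtools would be the better choice.

```python
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]


def parse_sam(text):
    """Return (contigs, reads) parsed from SAM-formatted text.

    contigs maps reference names to lengths (from @SQ header lines);
    reads is a list of dicts holding the 11 mandatory fields plus a
    'mapped' flag derived from FLAG bit 0x4.
    """
    contigs = {}
    reads = []
    for line in text.splitlines():
        if not line:
            continue
        if line.startswith("@"):
            parts = line.split("\t")
            if parts[0] == "@SQ":
                # Header tags look like SN:name and LN:length
                tags = dict(part.split(":", 1) for part in parts[1:])
                contigs[tags["SN"]] = int(tags["LN"])
            continue
        columns = line.split("\t")
        record = dict(zip(SAM_FIELDS, columns[:11]))
        record["FLAG"] = int(record["FLAG"])
        record["POS"] = int(record["POS"])
        # Bit 0x4 set means the read is unmapped
        record["mapped"] = not (record["FLAG"] & 0x4)
        reads.append(record)
    return contigs, reads


# Made-up example: one mapped read and one unmapped read.
toy_sam = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:ref_chr\tLN:400\n"
    "read1\t0\tref_chr\t190\t60\t8M\t*\t0\t0\tCGTACGTA\tIIIIIIII\n"
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGT\tIIIIIIII\n"
)

contigs, reads = parse_sam(toy_sam)
for read in reads:
    print(read["QNAME"], "mapped" if read["mapped"] else "unmapped")
```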
