Large-Scale Multiple Sequence Alignment Visualization through Gradient Vector Flow Analysis

IEEE Symposium on Biological Data Visualization, pages 9-16, October 2013
Download the publication: NR13a.pdf [2.2 MB]
Multiple sequence alignment (MSA) is an essential initial step in studying molecular phylogeny and in identifying genomic rearrangements. Recent advances in sequencing techniques have led to a tremendous increase in the number of sequences to be analyzed. As a result, greater demand is being placed on visualization techniques, which have the potential to reveal the underlying information in large-scale MSAs. In this work, we present a novel visualization technique for conveying the patterns in large-scale MSAs. By applying gradient vector flow analysis to the MSA data, we extract and visually emphasize conservations and other patterns that are relevant during the MSA exploration process. In contrast to the traditional visual representation of MSAs, which relies on color-coded tables, the proposed visual metaphor provides an overview of large MSAs and highlights global patterns, outliers, and data distributions. We motivate and describe the proposed algorithm and demonstrate its application to large-scale MSAs.

BibTeX reference

@inproceedings{NR13a,
  author       = {Nguyen, Khoa Tan and Ropinski, Timo},
  title        = {{Large-Scale Multiple Sequence Alignment Visualization through Gradient Vector Flow Analysis}},
  booktitle    = {IEEE Symposium on Biological Data Visualization},
  pages        = {9--16},
  year         = {2013},
  editor       = {Jos Roerdink and Jessie Kennedy},
  note         = {accepted}
}
