Thursday, September 26, 2013

DNA Sequence Analysis

from http://cdwscience.blogspot.com/

Genome Visualization Tools:

  • UCSC Genome Browser
    • popular, free genomic visualization tool for a wide variety of organisms
    • also serves as a database for genomic sequences and features
  • Integrative Genomics Viewer (IGV)
    • very efficient tool for visualizing almost any type of genomic data
    • open-source
  • GBrowse - open-source genome browser

Sequence Alignment:

  • BLAST - search for similar DNA sequences in GenBank
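  • ClustalW - multiple sequence alignment (e.g., of homologous sequences from several species)
  • T-Coffee - multiple sequence alignment (e.g., of homologous sequences from several species)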
  • Mauve - multi-species alignment and visualization tool to detect segments of conserved sequence

General DNA-Seq Tools:

  • samtools
    • popular, free tool to extract data from SAM / BAM alignment files
    • Picard - Java-based toolkit with functionality similar to samtools
    • see short read aligners necessary for upstream analysis
  • Galaxy
    • open-source, cloud-based suite of popular sequence analysis tools (including deep sequencing analysis)
  • GATK
    • toolkit for analysis of next-generation sequencing data
    • previously open-source; the current version is free for academic use but requires a paid license for commercial use
  • CLC Bio Genomics Workbench
    • commercial software covering a wide variety of applications such as sequence alignment, SNP/DIP detection, de novo assembly, etc.
    • CLC Bio Genomics Workbench also has the functionality of CLC Bio Main Workbench for standard sequencing analysis (cloning, primer design, etc.)
      • both are commercial programs that require a purchased license
  • Nexus Copy Number
    • commercial software for analysis of copy number alterations
    • works for a variety of microarray platforms as well as for deep sequencing analysis
  • SeqAnswers Software List
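
To give a rough idea of the kind of data that samtools extracts, here is a minimal Python sketch that parses the mandatory columns of one SAM alignment line. The example record and the names SamRecord, parse_sam_line, and is_unmapped are made up for illustration; for real data you would use samtools itself (or a dedicated library) rather than this code.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    qname: str  # read name
    flag: int   # bitwise flag
    rname: str  # reference sequence name
    pos: int    # 1-based leftmost mapping position
    mapq: int   # mapping quality
    cigar: str  # alignment description (e.g. "8M")
    seq: str    # read sequence


def parse_sam_line(line: str) -> SamRecord:
    """Parse the mandatory columns of one SAM alignment line (not a header line)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line has at least 11 tab-separated fields")
    return SamRecord(
        qname=fields[0],
        flag=int(fields[1]),
        rname=fields[2],
        pos=int(fields[3]),
        mapq=int(fields[4]),
        cigar=fields[5],
        seq=fields[9],
    )


def is_unmapped(record: SamRecord) -> bool:
    # Flag bit 0x4 marks a read that could not be mapped
    return bool(record.flag & 0x4)


# Hypothetical record, for illustration only
example = "read1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII"
rec = parse_sam_line(example)
print(rec.rname, rec.pos, is_unmapped(rec))
```

Header lines (which start with "@") and the optional tag columns are ignored here to keep the sketch short.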

Transcription Factor Motif Analysis:

  • TRANSFAC
    • database of transcription factor motifs
    • a subscription is required to access the most recent annotations, but older versions are freely available
    • A plug-in is available within CLC Bio (a commercial program for genomics analysis)
  • JASPAR
    • free database of transcription factor motif sequences
  • TFsitescan
    • free tool to search for transcription factor motifs
  • MEME Suite
    • tools for ab initio motif finding
  • rVista / VISTA Suite
    • tool for searching motifs conserved across closely related organisms
  • TESS
    • transcription factor search system
    • unfortunately, this tool now has to be run locally
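
For readers who have not seen how a motif search works, the sketch below scores every window of a sequence against a small log-odds matrix built from base counts. It is only an illustration of the general idea: the count matrix, the pseudocount, the uniform background, and the threshold are all made-up assumptions, and none of the tools above is claimed to work exactly this way.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict per position);
# the numbers are invented and do not come from TRANSFAC or JASPAR
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 1, "C": 0, "G": 0, "T": 9},
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert base counts to log2(observed frequency / background frequency)."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append(
            {base: math.log2((column[base] + pseudocount) / total / background)
             for base in "ACGT"}
        )
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, window, score) for each window scoring at or above threshold."""
    sequence = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits


pwm = log_odds_matrix(COUNTS)
hits = scan("TTACGTGGACGTAA", pwm, threshold=4.0)
for start, window, score in hits:
    print(start, window, score)
```

A real search would also scan the reverse-complement strand and pick the threshold more carefully.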

Mutation Analysis:

  • VarScan
    • open-source variant calling tool
    • see short read aligners necessary for upstream analysis
    • usually also requires something like samtools to create input file
  • SeattleSNPs Genome Variation Server
    • tool to filter candidate variants (based upon frequency, predicted function, etc.)
  • ANNOVAR (pronounced Anno-Var)
    • tool to filter candidate variants (based upon frequency, predicted function, etc.)
    • wANNOVAR is the web-based interface
  • GWAS Catalog
    • NHGRI database of SNP-based phenotypic / disease associations
  • Promethease
    • open-source tool for personalized genomic analysis
    • it is technically free to use, but you can pay $5 to get your report more quickly
    • uses annotations from SNPedia
  • Interpretome
    • Genome interpretation tool similar to Promethease
    • In my opinion, it has a nicer interface. However, it currently only works with raw data from 23andMe and Lumigenix.
  • SNPedia
    • crowdsourced annotation of SNP associations
    • includes some publicly available genomes
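
As a small illustration of what "filtering candidate variants based upon frequency and predicted function" can look like, here is a Python sketch. The Variant fields, the effect labels, the 1% frequency cutoff, and the example records are all hypothetical; they are not the input format or defaults of ANNOVAR, the Genome Variation Server, or any other tool listed above.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    population_freq: float  # alternate-allele frequency in a reference population
    effect: str             # predicted functional class


# Hypothetical set of functional classes to keep
DAMAGING_EFFECTS = frozenset({"missense", "nonsense", "frameshift", "splice_site"})


def filter_candidates(variants, max_freq=0.01, effects=DAMAGING_EFFECTS):
    """Keep variants that are rare and fall in one of the chosen effect classes."""
    return [
        v for v in variants
        if v.population_freq <= max_freq and v.effect in effects
    ]


# Made-up records, for illustration only
variants = [
    Variant("chr1", 1000, "A", "G", 0.25, "missense"),     # too common
    Variant("chr2", 2000, "C", "T", 0.001, "synonymous"),  # rare, but not in the kept classes
    Variant("chr3", 3000, "G", "A", 0.002, "nonsense"),    # rare and in the kept classes
]
candidates = filter_candidates(variants)
for v in candidates:
    print(v.chrom, v.pos, v.ref, v.alt, v.effect)
```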

ChIP-Seq Tools:


de novo Assembly Algorithms:


Other Tools:

  • Primer3 - PCR primer design
  • RepeatMasker - identifies repetitive elements within a DNA sequence
  • Webcutter - detects restriction enzyme sites in a DNA sequence
  • Translate - translates a nucleotide (DNA / RNA) sequence into a protein sequence
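
The last two tasks in this list are simple enough to sketch in a few lines of Python. The code below translates a nucleotide sequence in a single reading frame using the standard genetic code and finds every occurrence of a recognition site (the EcoRI site GAATTC is used as the example). The function names translate and find_sites are my own, and this is not how Translate or Webcutter are implemented; it is just a minimal illustration of the underlying idea.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence, frame=0):
    """Translate DNA or RNA in one forward frame; '*' marks a stop codon."""
    seq = sequence.upper().replace("U", "T")
    protein = []
    for i in range(frame, len(seq) - 2, 3):
        # Codons with ambiguous bases (e.g. N) are reported as X
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


def find_sites(sequence, site):
    """Return 0-based start positions of every (possibly overlapping) match."""
    seq, site = sequence.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


print(translate("ATGGCCTAA"))
print(find_sites("AAGAATTCGGGAATTCTT", "GAATTC"))
```

A fuller version would translate all six reading frames and handle recognition sites that contain ambiguity codes.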
