Pairwise sequence comparison
The following exercises will illustrate several aspects of pairwise sequence comparison:
- Using some important web resources for sequence analysis software, presented in the preceding practicals.
- Understanding the usage of web-based sequence analysis tools.
- Using pairwise sequence comparison tools to illustrate some theoretical aspects studied during the course.
Reminder: some links for the recovery of sequences
[ Easily fetch sequences ][ Search SWISS-PROT ][ NCBI's Entrez ]
Pairwise alignment with LALIGN
Compare the sequences OPRM_RAT and SSR1_HUMAN (these are the SWISS-PROT IDs) with LALIGN using default parameters.
The sequences can be fetched here (choose the "FASTA" format) using the SWISS-PROT IDs.
Don't hesitate to look at the complete SWISS-PROT entries (OPRM_RAT and SSR1_HUMAN) in order to get more information about these two proteins!
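If you want to handle the downloaded FASTA files outside the web tools, the format is easy to read by hand. The following is a minimal sketch, not part of any of the tools used here: the function name parse_fasta and the toy records are assumptions made for illustration, and the sequences shown are invented, not the real OPRM_RAT or SSR1_HUMAN proteins.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping record ID to sequence."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # The ID is the first whitespace-delimited token after ">".
            header = line[1:].split()
            current_id = header[0] if header else ""
            records[current_id] = []
        elif current_id is not None:
            # Sequence lines may be wrapped; collect them and join later.
            records[current_id].append(line)
    return {rid: "".join(parts) for rid, parts in records.items()}


# Toy input with made-up sequences (hypothetical, for illustration only).
example = """>seqA hypothetical example
MKTAYIAK
QRQISFVK
>seqB another hypothetical example
MKTAHIAK
"""

sequences = parse_fasta(example)
for name, seq in sequences.items():
    print(name, len(seq), seq)
```

Each header line starts with ">" and the sequence may span several lines, which is why the lines are joined per record.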
Try to answer the following questions:
- Is this a local or global alignment?
- Switch between local and global alignment. Try to understand the differences. Why are there several alignments displayed when performing the local alignment?
- What does "% identity" mean? How is it computed?
- What do the symbols ":" and "." stand for?
- When two residues are different, there can be either a "." or a blank. Try to understand the difference. Which parameters influence this result?
- Try to modify the gap penalties and examine more closely how these parameters influence the occurrence and the length of gaps ("-").
- Try to modify the scoring matrix used (e.g. BLOSUM35 and BLOSUM80) and examine more closely how the choice of matrix influences the scores and the alignments.
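To experiment with the ideas behind these questions away from the web form, the sketch below computes the best global (Needleman-Wunsch) and local (Smith-Waterman) alignment scores for two short strings, and a percent identity for an already aligned pair. It is a simplified illustration under stated assumptions, not what LALIGN does internally: it uses a plain match/mismatch score instead of a BLOSUM matrix, a single linear gap penalty instead of separate gap-open and gap-extension penalties, and it reports only the best score rather than several alignments. The function names, the toy DNA-like strings and the choice of denominator for percent identity are all assumptions; tools differ in which length they divide by.

```python
def alignment_score(a, b, match=2, mismatch=-1, gap=-2, local=False):
    """Best alignment score with a linear gap penalty.

    local=False: global alignment (Needleman-Wunsch).
    local=True:  local alignment (Smith-Waterman).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # A global alignment must pay for leading gaps.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            cell = max(diag, up, left)
            if local:
                # A local alignment may restart anywhere, so scores never drop below 0.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


def percent_identity(aligned_a, aligned_b):
    """Identical columns divided by the total number of aligned columns (gaps included)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * identical / len(aligned_a)


# Toy sequences sharing only the block "ACGT" (hypothetical example data).
seq_a = "CCCCACGT"
seq_b = "ACGTGGGG"

local_score = alignment_score(seq_a, seq_b, local=True)
global_score = alignment_score(seq_a, seq_b, local=False)
print("local :", local_score)
print("global:", global_score)

# Effect of the gap penalty on the global score.
for g in (-1, -2, -5):
    print("gap", g, "->", alignment_score(seq_a, seq_b, gap=g))

print("identity:", percent_identity("ACG-T", "ACGAT"))
```

The local score only rewards the shared block, while the global score must also account for the unrelated flanking residues, which is why the two modes can give very different pictures of the same pair.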
Dotplot using Dotlet
Compare the same sequences (OPRM_RAT and SSR1_HUMAN) using Dotlet (if you are working on a Mac, Dotlet may not work).
The sequences can be fetched here (choose the "FASTA" format).
Start with a look at the Dotlet documentation.
- Load the two sequences into Dotlet and compute the dotplot.
- What does the intensity (gray level) of a pixel mean?
- Try changing the grayscale borders. Where would be an optimal position for the upper and lower limits of the grayscale?
- What do the diagonal lines represent?
- Try to identify corresponding aligned regions in the dotplot and the alignments found by LALIGN.
- Try to modify the noise by changing the window size, the threshold, or both.
- Try comparing each sequence against itself.
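The role of the window size and the threshold can also be explored with a few lines of code. The sketch below is a deliberately simplified dotplot, not Dotlet's actual method: it counts identical residues in each window and draws a dot when the count reaches the threshold, whereas Dotlet scores windows with a substitution matrix and displays the result as gray levels. The function names dotplot and render and the toy sequence are assumptions made for illustration.

```python
def dotplot(seq_a, seq_b, window=3, threshold=3):
    """Return a matrix of booleans.

    Cell (i, j) is True when the windows starting at position i in seq_a and
    position j in seq_b share at least `threshold` identical residues.
    """
    rows = len(seq_a) - window + 1
    cols = len(seq_b) - window + 1
    plot = []
    for i in range(rows):
        row = []
        for j in range(cols):
            matches = sum(1 for k in range(window) if seq_a[i + k] == seq_b[j + k])
            row.append(matches >= threshold)
        plot.append(row)
    return plot


def render(plot):
    """Draw the matrix as text: '*' for a dot, '.' for background."""
    return "\n".join("".join("*" if cell else "." for cell in row) for row in plot)


# Toy sequence with an internal repeat, compared against itself (hypothetical data).
seq = "ACDEACDE"
plot = dotplot(seq, seq, window=3, threshold=3)
print(render(plot))

# A looser threshold lets more partial matches through, i.e. more noise.
print()
print(render(dotplot(seq, seq, window=3, threshold=1)))
```

In the self-comparison the main diagonal is always filled, and the internal repeat shows up as shorter diagonals parallel to it. Lowering the threshold or shrinking the window adds scattered dots, which is the noise the exercise asks you to control.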
Dotlet examples and method comparison
The Dotlet "learn by example" pages show different typical sequence analysis problems.
- Take a close look at the Dotlet examples.
- Try to understand the dotplots.
Supplementary exercise
For those who can't get enough: get some more practice by comparing some of the pairs of sequences below.
- CO9_HUMAN - PERF_HUMAN
- FRA_DROME - GCN4_YEAST
- HBB_HUMAN - LGB1_PEA
- YOR6_ADEG1 - CD4_HUMAN