Suppose you wanted to create an exercise for your students based on the Riginos and McDonald (2003) paper. In my opinion it is a rich paper, full of concepts that connect to high school biology. Of course "sex" is interesting to this age group; just mentioning the word "sperm" will get them paying attention. And to biologists, the mystery of how a sperm finds and binds to just the right egg cell out there in the vast ocean is a lifetime study. It will also be interesting because you can tell your students that you have worked on amplifying mussel DNA coding for portions of this recognition system.

 

A bioinformatics activity you could conduct would be to take the amino acid sequence published in the paper and use BLAST to search for similar sequences, so that you can study which portions of the protein are conserved and which are variable.
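Once you have two aligned sequences in hand, marking the conserved and variable positions is simple bookkeeping. Here is a minimal Python sketch of the idea, assuming the two sequences are already aligned and the same length. The function name `conservation_line` is my own invention, the first sequence is the start of the practice fragment given at the end of this page, and the second is a made-up variant used only for illustration, not real data.

```python
def conservation_line(seq_a, seq_b):
    """Return '|' where aligned residues match and '.' where they differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return "".join("|" if a == b else "." for a, b in zip(seq_a, seq_b))


# First 20 residues of the practice fragment from this page.
query = "VHLTPVEKSAVTALWGKVNV"
# Hypothetical variant, invented only to show what a mismatch looks like.
subject = "VHLTPEEKSAVSALWGKVNV"

marks = conservation_line(query, subject)
print(query)
print(marks)
print(subject)
```

Positions marked with a dot are the variable ones; everything marked with a bar is conserved between the two sequences.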

 

  • Open Notepad [Start ---> Programs ---> Accessories ---> Notepad]
  • Begin to enter the sequence (from the paper) using the FASTA format we discussed yesterday. If you are really squeamish about typing, I have entered the sequence here so that you can copy and paste it. Always use Courier or another non-proportional font when working with sequences.

  • Save this sequence to your "sequences" folder as M_gallo_M7frag.pep
  • I save FASTA format sequences as ".nuc" for nucleotide and ".pep" for protein sequence.
  • Copy these two lines of text (the header line and the sequence line) and paste them into the "protein-protein" BLAST search window.
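Typing a sequence by hand invites typos, so it can help to check the text before you paste it into BLAST. The sketch below is one way to do that in Python, assuming a FASTA text with a single record. The function names and the set of allowed letters (the 20 standard amino acids plus X for unknown and * for stop) are my own choices, and the example record is the practice fragment from the end of this page.

```python
# Assumption: 20 standard amino acids, plus X (unknown) and * (stop).
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY" + "X*")


def parse_fasta(text):
    """Split a single-record FASTA text into (header, sequence)."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA text must begin with a '>' header line")
    header = lines[0][1:]
    sequence = "".join(lines[1:]).upper()
    return header, sequence


def bad_residues(sequence):
    """List (position, character) pairs, 1-based, for unexpected characters."""
    return [(i, ch) for i, ch in enumerate(sequence, start=1)
            if ch not in VALID_RESIDUES]


record = """>Test sequence to copy and paste into Blast protein-protein
VHLTPVEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLST"""

header, sequence = parse_fasta(record)
print(header)
print(len(sequence), "residues")
print("problems:", bad_residues(sequence))
```

An empty list of problems means every character is a letter BLAST will accept as an amino acid; anything else points you to the position of the typo.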

 

In this case I suggest limiting the search to animals...

The search will take half a minute or so to complete.

When the results page opens, scroll down past the graphic overview and the list of hits until you get to the first alignment.

Notice that the sequence fragment you submitted (Query) is numbered 1-50 and that it is an exact match to the Subject sequence, which is numbered 39-88. The Subject is probably the full-length protein that the fragment came from.
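The Query 1-50 / Subject 39-88 numbering is just an offset: residue 1 of the fragment sits at residue 39 of the longer sequence. Here is a small Python sketch of that bookkeeping, assuming an exact match with no gaps. `locate_fragment` is a name I made up, and the "full-length" sequence is a placeholder built by padding the practice fragment from this page with X's. It is not the real protein.

```python
def locate_fragment(fragment, full_length):
    """Return 1-based (start, end) of an exact match, or None if absent."""
    start0 = full_length.find(fragment)  # 0-based index, -1 if not found
    if start0 == -1:
        return None
    return start0 + 1, start0 + len(fragment)


# Practice fragment from this page (50 residues).
fragment = "VHLTPVEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLST"
# Placeholder "whole protein": 38 unknown residues, the fragment, 10 more.
full_length = "X" * 38 + fragment + "X" * 10

print(locate_fragment(fragment, full_length))  # (39, 88)
```

BLAST does far more than an exact string search, of course, but for a perfect match like this one the coordinates it reports follow the same arithmetic.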

Now let's repeat the search using the whole protein instead of the fragment.

Check the little box next to >gi|286056|dbj|BAA03551.1 and click on "Get selected sequences."

There are several ways that we commonly use to display and save sequences.

This time we will select "Display FASTA" and "Send to File". When you click "Send to File" you will be asked where to save the file. Be sure to change the name to M_edul_lysinM7.pep

Eventually you will develop your own system for naming sequence files. I use Genus_Species_protein.pep for FASTA format protein sequences.
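If you want to keep your file names consistent, the convention can be written down as a tiny helper. This is only a sketch in Python, assuming you supply the genus and species abbreviations yourself; `sequence_filename` and its parameters are hypothetical names of my own, not part of any tool.

```python
def sequence_filename(genus, species, protein, kind="protein"):
    """Build a name like Genus_Species_protein.pep (or .nuc for nucleotide)."""
    extensions = {"protein": ".pep", "nucleotide": ".nuc"}
    if kind not in extensions:
        raise ValueError("kind must be 'protein' or 'nucleotide'")
    return f"{genus}_{species}_{protein}{extensions[kind]}"


print(sequence_filename("M", "edul", "lysinM7"))  # M_edul_lysinM7.pep
print(sequence_filename("M", "gallo", "M7frag"))  # M_gallo_M7frag.pep
```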
Now use this full-length protein to search the database again. Even if you don't get substantially different results this time, you certainly will for other proteins!

If you would like to build your skills, try this protein fragment from the literature:

>Test sequence to copy and paste into Blast protein-protein
VHLTPVEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLST