Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
cs-312:project-5 [2015/02/21 18:17]
ringger [Objectives]
cs-312:project-5 [2015/03/04 16:58] (current)
ringger
Line 25: Line 25:
 ==Provided Framework== ==Provided Framework==
  
-You are provided with some scaffolding code to help you get started on the project. ​ We provide a Visual Studio project with a GUI containing a 10x10 matrix, with a row and a column for each of 10 organisms. ​ The organism at row is the same organism for column ​n.  Note that this matrix is '''​*not*'''​ the dynamic programming table; it is simply a place to store and display the final minimum alignment cost result from the alignment of the gene sequences for each pair of organisms. ​ Thus, cell (i, j) in this table will contain the minimum cost alignment score of the genome for organism i with the genome for organism j.+You are provided with some scaffolding code to help you get started on the project. ​ We provide a Visual Studio project with a GUI containing a 10x10 matrix, with a row and a column for each of 10 organisms. ​ The organism at row $X$ is the same organism for column ​$X$.  Note that this matrix is '''​*not*'''​ the dynamic programming table; it is simply a place to store and display the final minimum alignment cost result from the alignment of the gene sequences for each pair of organisms. ​ Thus, cell $(i, j)in this table will contain the minimum cost alignment score of the genome for organism ​$iwith the genome for organism ​$j$.
  
 When you press the “Process” button on the toolbar, the matrix fills with numbers, one number for each pair, as shown in the figure below. ​ You will generate these numbers by aligning '''​exacty the first 5000 characters'''​ (bases) in each sequence pair.  ''​Currently,​ the numbers are computed by a quick placeholder computation (the difference in sequence lengths) that is not particularly useful; these are the numbers shown in the figure below. ''​ Your job will be to replace that number with the output from a sequence alignment function that employs dynamic programming. ​ You will fill the matrix with the pair-wise scores. ​ When you press the “Process” button, the event handler for the button-down event should call the <​tt>​PairWiseAlign.Align()</​tt>​ method which you implement. When you press the “Process” button on the toolbar, the matrix fills with numbers, one number for each pair, as shown in the figure below. ​ You will generate these numbers by aligning '''​exacty the first 5000 characters'''​ (bases) in each sequence pair.  ''​Currently,​ the numbers are computed by a quick placeholder computation (the difference in sequence lengths) that is not particularly useful; these are the numbers shown in the figure below. ''​ Your job will be to replace that number with the output from a sequence alignment function that employs dynamic programming. ​ You will fill the matrix with the pair-wise scores. ​ When you press the “Process” button, the event handler for the button-down event should call the <​tt>​PairWiseAlign.Align()</​tt>​ method which you implement.
Line 41: Line 41:
  
 Be sure that you are using the latest .NET framework: Bring up the properties of your project (can use the key-chord Alt+F7). For the property "​Target framework:",​ specify .NET Framework 4 (or higher). Be sure that you are using the latest .NET framework: Bring up the properties of your project (can use the key-chord Alt+F7). For the property "​Target framework:",​ specify .NET Framework 4 (or higher).
 +
 +Summary:
 +* For each pair of genes, you will apply the DP algorithm to compute their alignment cost.
 +* You will summarize the alignment costs for all pairs in the results table (the 10 x 10 table) in the GUI.  The results table is just a way of presenting the results. ​
  
 == Required: Whiteboard Experience (TM) == == Required: Whiteboard Experience (TM) ==
Line 91: Line 95:
 # Align the first 100 characters (bases) of sequence #3 (row 3) and #10 (column 10) (assuming the first sequence in the table is numbered as #1); extract and display the alignment in your report using a fixed-width font.  For example, if you were aligning "​AGCTCATGCT"​ and "​ACTGCATCA",​ the side-by-side alignment should be presented in the manner shown here: # Align the first 100 characters (bases) of sequence #3 (row 3) and #10 (column 10) (assuming the first sequence in the table is numbered as #1); extract and display the alignment in your report using a fixed-width font.  For example, if you were aligning "​AGCTCATGCT"​ and "​ACTGCATCA",​ the side-by-side alignment should be presented in the manner shown here:
 <pre> <pre>
-AGCT-CATGCT +AGCTCATGCT 
-A-CTGCAT-CA+ACTGCAT-CA
 </​pre>​ </​pre>​
  
Line 174: Line 178:
 #Experiment with alternative operator (substitution,​ match, indel) costs and discuss the impact of changing their values. #Experiment with alternative operator (substitution,​ match, indel) costs and discuss the impact of changing their values.
 #Try aligning longer (sub-)sequences. ​ Conduct an empirical analysis. ​ Discuss impact on the alignment score matrix. #Try aligning longer (sub-)sequences. ​ Conduct an empirical analysis. ​ Discuss impact on the alignment score matrix.
-#​Re-implement your alignment algorithm in top-down fashion using recursion and a memory function. ​ How does this algorithm compare to the implementation ​using a table?+#​Re-implement your alignment algorithm in top-down fashion using recursion and a memory function. ​ How does this algorithm compare to the iterative, bottom-up ​implementation?​
 #Try a shortest-path algorithm like Dijkstra’s to solve this problem. #Try a shortest-path algorithm like Dijkstra’s to solve this problem.
 #Get ahead of the game, learn about the A* shortest-path algorithm, and implement it with a tight admissible heuristic (we will get to this topic later in the course, but you might enjoy trying it in the context of the alignment problem). #Get ahead of the game, learn about the A* shortest-path algorithm, and implement it with a tight admissible heuristic (we will get to this topic later in the course, but you might enjoy trying it in the context of the alignment problem).
cs-312/project-5.1424542641.txt.gz · Last modified: 2015/02/21 18:17 by ringger
Back to top
CC Attribution-Share Alike 4.0 International
chimeric.de = chi`s home Valid CSS Driven by DokuWiki do yourself a favour and use a real browser - get firefox!! Recent changes RSS feed Valid XHTML 1.0