Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
Last revision Both sides next revision
cs-312:project-5 [2015/02/21 11:17]
ringger [Objectives]
cs-312:project-5 [2015/02/21 11:23]
ringger [Further Exploration]
Line 25: Line 25:
 ==Provided Framework== ==Provided Framework==
  
-You are provided with some scaffolding code to help you get started on the project. ​ We provide a Visual Studio project with a GUI containing a 10x10 matrix, with a row and a column for each of 10 organisms. ​ The organism at row is the same organism for column ​n.  Note that this matrix is '''​*not*'''​ the dynamic programming table; it is simply a place to store and display the final minimum alignment cost result from the alignment of the gene sequences for each pair of organisms. ​ Thus, cell (i, j) in this table will contain the minimum cost alignment score of the genome for organism i with the genome for organism j.+You are provided with some scaffolding code to help you get started on the project. ​ We provide a Visual Studio project with a GUI containing a 10x10 matrix, with a row and a column for each of 10 organisms. ​ The organism at row $X$ is the same organism for column ​$X$.  Note that this matrix is '''​*not*'''​ the dynamic programming table; it is simply a place to store and display the final minimum alignment cost result from the alignment of the gene sequences for each pair of organisms. ​ Thus, cell $(i, j)in this table will contain the minimum cost alignment score of the genome for organism ​$iwith the genome for organism ​$j$.
  
 When you press the “Process” button on the toolbar, the matrix fills with numbers, one number for each pair, as shown in the figure below. ​ You will generate these numbers by aligning '''​exacty the first 5000 characters'''​ (bases) in each sequence pair.  ''​Currently,​ the numbers are computed by a quick placeholder computation (the difference in sequence lengths) that is not particularly useful; these are the numbers shown in the figure below. ''​ Your job will be to replace that number with the output from a sequence alignment function that employs dynamic programming. ​ You will fill the matrix with the pair-wise scores. ​ When you press the “Process” button, the event handler for the button-down event should call the <​tt>​PairWiseAlign.Align()</​tt>​ method which you implement. When you press the “Process” button on the toolbar, the matrix fills with numbers, one number for each pair, as shown in the figure below. ​ You will generate these numbers by aligning '''​exacty the first 5000 characters'''​ (bases) in each sequence pair.  ''​Currently,​ the numbers are computed by a quick placeholder computation (the difference in sequence lengths) that is not particularly useful; these are the numbers shown in the figure below. ''​ Your job will be to replace that number with the output from a sequence alignment function that employs dynamic programming. ​ You will fill the matrix with the pair-wise scores. ​ When you press the “Process” button, the event handler for the button-down event should call the <​tt>​PairWiseAlign.Align()</​tt>​ method which you implement.
Line 41: Line 41:
  
 Be sure that you are using the latest .NET framework: Bring up the properties of your project (can use the key-chord Alt+F7). For the property "​Target framework:",​ specify .NET Framework 4 (or higher). Be sure that you are using the latest .NET framework: Bring up the properties of your project (can use the key-chord Alt+F7). For the property "​Target framework:",​ specify .NET Framework 4 (or higher).
 +
 +Summary:
 +* For each pair of genes, you will apply the DP algorithm to compute their alignment cost.
 +* You will summarize the alignment costs for all pairs in the results table (the 10 x 10 table) in the GUI.  The results table is just a way of presenting the results. ​
  
 == Required: Whiteboard Experience (TM) == == Required: Whiteboard Experience (TM) ==
Line 174: Line 178:
 #Experiment with alternative operator (substitution,​ match, indel) costs and discuss the impact of changing their values. #Experiment with alternative operator (substitution,​ match, indel) costs and discuss the impact of changing their values.
 #Try aligning longer (sub-)sequences. ​ Conduct an empirical analysis. ​ Discuss impact on the alignment score matrix. #Try aligning longer (sub-)sequences. ​ Conduct an empirical analysis. ​ Discuss impact on the alignment score matrix.
-#​Re-implement your alignment algorithm in top-down fashion using recursion and a memory function. ​ How does this algorithm compare to the implementation ​using a table?+#​Re-implement your alignment algorithm in top-down fashion using recursion and a memory function. ​ How does this algorithm compare to the iterative, bottom-up ​implementation?​
 #Try a shortest-path algorithm like Dijkstra’s to solve this problem. #Try a shortest-path algorithm like Dijkstra’s to solve this problem.
 #Get ahead of the game, learn about the A* shortest-path algorithm, and implement it with a tight admissible heuristic (we will get to this topic later in the course, but you might enjoy trying it in the context of the alignment problem). #Get ahead of the game, learn about the A* shortest-path algorithm, and implement it with a tight admissible heuristic (we will get to this topic later in the course, but you might enjoy trying it in the context of the alignment problem).
cs-312/project-5.txt · Last modified: 2015/03/04 09:58 by ringger
Back to top
CC Attribution-Share Alike 4.0 International
chimeric.de = chi`s home Valid CSS Driven by DokuWiki do yourself a favour and use a real browser - get firefox!! Recent changes RSS feed Valid XHTML 1.0