##### Differences

This shows you the differences between two versions of the page.

 cs-312:project-5 [2015/02/18 22:31]ringger cs-312:project-5 [2015/02/21 11:23]ringger [Further Exploration] Both sides previous revision Previous revision 2015/03/04 09:58 ringger 2015/02/21 11:23 ringger [Further Exploration] 2015/02/21 11:21 ringger [Provided Framework] 2015/02/21 11:20 ringger [Provided Framework] 2015/02/21 11:17 ringger [Objectives] 2015/02/18 22:31 ringger 2015/02/18 22:30 ringger [Distribution] 2015/02/18 22:29 ringger [Distribution] 2015/02/17 10:45 cs312ta [Appendix: Coronaviruses] 2015/02/17 10:42 cs312ta [Appendix: Coronaviruses] 2015/02/17 10:41 cs312ta [Provided Framework] 2014/12/31 16:20 ringger 2014/12/31 15:10 ringger created Next revision Previous revision 2015/03/04 09:58 ringger 2015/02/21 11:23 ringger [Further Exploration] 2015/02/21 11:21 ringger [Provided Framework] 2015/02/21 11:20 ringger [Provided Framework] 2015/02/21 11:17 ringger [Objectives] 2015/02/18 22:31 ringger 2015/02/18 22:30 ringger [Distribution] 2015/02/18 22:29 ringger [Distribution] 2015/02/17 10:45 cs312ta [Appendix: Coronaviruses] 2015/02/17 10:42 cs312ta [Appendix: Coronaviruses] 2015/02/17 10:41 cs312ta [Provided Framework] 2014/12/31 16:20 ringger 2014/12/31 15:10 ringger created Last revision Both sides next revision Line 5: Line 5: ==Objectives== ==Objectives== - *Design and implement a Dynamic Programming algorithm that has applications to gene sequence alignment. + *Design and implement a Dynamic Programming algorithm that solves the gene sequence alignment ​problem. *Think carefully about the use of memory in your implementation. *Think carefully about the use of memory in your implementation. *Solve a non-trivial computational genomics problem. *Solve a non-trivial computational genomics problem. Line 25: Line 25: ==Provided Framework== ==Provided Framework== - You are provided with some scaffolding code to help you get started on the project. ​ We provide a Visual Studio project with a GUI containing a 10x10 matrix, with a row and a column for each of 10 organisms. ​ The organism at row n is the same organism for column ​n.  Note that this matrix is '''​*not*'''​ the dynamic programming table; it is simply a place to store and display the final minimum alignment cost result from the alignment of the gene sequences for each pair of organisms. ​ Thus, cell (i, j) in this table will contain the minimum cost alignment score of the genome for organism i with the genome for organism j. + You are provided with some scaffolding code to help you get started on the project. ​ We provide a Visual Studio project with a GUI containing a 10x10 matrix, with a row and a column for each of 10 organisms. ​ The organism at row $X$ is the same organism for column ​$X$.  Note that this matrix is '''​*not*'''​ the dynamic programming table; it is simply a place to store and display the final minimum alignment cost result from the alignment of the gene sequences for each pair of organisms. ​ Thus, cell $(i, j)$ in this table will contain the minimum cost alignment score of the genome for organism ​$i$ with the genome for organism ​$j$. When you press the “Process” button on the toolbar, the matrix fills with numbers, one number for each pair, as shown in the figure below. ​ You will generate these numbers by aligning '''​exacty the first 5000 characters'''​ (bases) in each sequence pair.  ''​Currently,​ the numbers are computed by a quick placeholder computation (the difference in sequence lengths) that is not particularly useful; these are the numbers shown in the figure below. ''​ Your job will be to replace that number with the output from a sequence alignment function that employs dynamic programming. ​ You will fill the matrix with the pair-wise scores. ​ When you press the “Process” button, the event handler for the button-down event should call the <​tt>​PairWiseAlign.Align()​ method which you implement. When you press the “Process” button on the toolbar, the matrix fills with numbers, one number for each pair, as shown in the figure below. ​ You will generate these numbers by aligning '''​exacty the first 5000 characters'''​ (bases) in each sequence pair.  ''​Currently,​ the numbers are computed by a quick placeholder computation (the difference in sequence lengths) that is not particularly useful; these are the numbers shown in the figure below. ''​ Your job will be to replace that number with the output from a sequence alignment function that employs dynamic programming. ​ You will fill the matrix with the pair-wise scores. ​ When you press the “Process” button, the event handler for the button-down event should call the <​tt>​PairWiseAlign.Align()​ method which you implement. Line 41: Line 41: Be sure that you are using the latest .NET framework: Bring up the properties of your project (can use the key-chord Alt+F7). For the property "​Target framework:",​ specify .NET Framework 4 (or higher). Be sure that you are using the latest .NET framework: Bring up the properties of your project (can use the key-chord Alt+F7). For the property "​Target framework:",​ specify .NET Framework 4 (or higher). + + Summary: + * For each pair of genes, you will apply the DP algorithm to compute their alignment cost. + * You will summarize the alignment costs for all pairs in the results table (the 10 x 10 table) in the GUI.  The results table is just a way of presenting the results. ​ == Required: Whiteboard Experience (TM) == == Required: Whiteboard Experience (TM) == Line 174: Line 178: #Experiment with alternative operator (substitution,​ match, indel) costs and discuss the impact of changing their values. #Experiment with alternative operator (substitution,​ match, indel) costs and discuss the impact of changing their values. #Try aligning longer (sub-)sequences. ​ Conduct an empirical analysis. ​ Discuss impact on the alignment score matrix. #Try aligning longer (sub-)sequences. ​ Conduct an empirical analysis. ​ Discuss impact on the alignment score matrix. - #​Re-implement your alignment algorithm in top-down fashion using recursion and a memory function. ​ How does this algorithm compare to the implementation ​using a table? + #​Re-implement your alignment algorithm in top-down fashion using recursion and a memory function. ​ How does this algorithm compare to the iterative, bottom-up ​implementation?​ #Try a shortest-path algorithm like Dijkstra’s to solve this problem. #Try a shortest-path algorithm like Dijkstra’s to solve this problem. #Get ahead of the game, learn about the A* shortest-path algorithm, and implement it with a tight admissible heuristic (we will get to this topic later in the course, but you might enjoy trying it in the context of the alignment problem). #Get ahead of the game, learn about the A* shortest-path algorithm, and implement it with a tight admissible heuristic (we will get to this topic later in the course, but you might enjoy trying it in the context of the alignment problem).