##### Differences

This shows you the differences between two versions of the page.

 cs-312:project-5 [2014/12/31 22:10]ringger created cs-312:project-5 [2015/03/04 16:58] (current)ringger 2015/03/04 16:58 ringger 2015/02/21 18:23 ringger [Further Exploration] 2015/02/21 18:21 ringger [Provided Framework] 2015/02/21 18:20 ringger [Provided Framework] 2015/02/21 18:17 ringger [Objectives] 2015/02/19 05:31 ringger 2015/02/19 05:30 ringger [Distribution] 2015/02/19 05:29 ringger [Distribution] 2015/02/17 17:45 cs312ta [Appendix: Coronaviruses] 2015/02/17 17:42 cs312ta [Appendix: Coronaviruses] 2015/02/17 17:41 cs312ta [Provided Framework] 2014/12/31 23:20 ringger 2014/12/31 22:10 ringger created Next revision Previous revision 2015/03/04 16:58 ringger 2015/02/21 18:23 ringger [Further Exploration] 2015/02/21 18:21 ringger [Provided Framework] 2015/02/21 18:20 ringger [Provided Framework] 2015/02/21 18:17 ringger [Objectives] 2015/02/19 05:31 ringger 2015/02/19 05:30 ringger [Distribution] 2015/02/19 05:29 ringger [Distribution] 2015/02/17 17:45 cs312ta [Appendix: Coronaviruses] 2015/02/17 17:42 cs312ta [Appendix: Coronaviruses] 2015/02/17 17:41 cs312ta [Provided Framework] 2014/12/31 23:20 ringger 2014/12/31 22:10 ringger created Line 5: Line 5: ==Objectives== ==Objectives== - *Design and implement a Dynamic Programming algorithm that has applications to gene sequence alignment. + *Design and implement a Dynamic Programming algorithm that solves the gene sequence alignment ​problem. *Think carefully about the use of memory in your implementation. *Think carefully about the use of memory in your implementation. *Solve a non-trivial computational genomics problem. *Solve a non-trivial computational genomics problem. Line 14: Line 14: Code: http://​faculty.cs.byu.edu/​~ringger/​cs312/​projects/​Project5-GeneSequenceAlignment-VS2008.zip Code: http://​faculty.cs.byu.edu/​~ringger/​cs312/​projects/​Project5-GeneSequenceAlignment-VS2008.zip - The data is distributed with the code as an .mdb database file.  However, if you want to look at plain text, see the following: http://​faculty.cs.byu.edu/​~ringger/​cs312/​projects/​GeneData.zip + The data is distributed with the code as an .mdb database file.  However, if you want to look at plain text, see the following ​raw data: + + http://​faculty.cs.byu.edu/​~ringger/​cs312/​projects/​GeneData.zip ==Background== ==Background== Line 23: Line 25: ==Provided Framework== ==Provided Framework== - You are provided with some scaffolding code to help you get started on the project. ​ We provide a Visual Studio project with a GUI containing a 10x10 matrix, with a row and a column for each of 10 organisms. ​ The organism at row n is the same organism for column ​n.  Note that this matrix is '''​*not*'''​ the dynamic programming table; it is simply a place to store and display the final minimum alignment cost result from the alignment of the gene sequences for each pair of organisms. ​ Thus, cell (i, j) in this table will contain the minimum cost alignment score of the genome for organism i with the genome for organism j. + You are provided with some scaffolding code to help you get started on the project. ​ We provide a Visual Studio project with a GUI containing a 10x10 matrix, with a row and a column for each of 10 organisms. ​ The organism at row $X$ is the same organism for column ​$X$.  Note that this matrix is '''​*not*'''​ the dynamic programming table; it is simply a place to store and display the final minimum alignment cost result from the alignment of the gene sequences for each pair of organisms. ​ Thus, cell $(i, j)$ in this table will contain the minimum cost alignment score of the genome for organism ​$i$ with the genome for organism ​$j$. - When you press the “Process” button on the toolbar, the matrix fills with numbers, one number for each pair, as shown in the figure below. ​ You will generate these numbers by aligning '''​exacty the first 5000 characters'''​ (bases) in each sequence pair.  ''​Currently,​ the numbers are computed by a quick placeholder computation (the difference in sequence lengths) that is not particularly useful; these are the numbers shown in the figure below. ''​ Your job will be to replace that number with the output from a sequence alignment function that employs dynamic programming. ​ You will fill the matrix with the pair-wise scores. ​ When you press the “Process” button, the event handler for the button-down event should call the ​PairWiseAlign.Align() method which you implement. + When you press the “Process” button on the toolbar, the matrix fills with numbers, one number for each pair, as shown in the figure below. ​ You will generate these numbers by aligning '''​exacty the first 5000 characters'''​ (bases) in each sequence pair.  ''​Currently,​ the numbers are computed by a quick placeholder computation (the difference in sequence lengths) that is not particularly useful; these are the numbers shown in the figure below. ''​ Your job will be to replace that number with the output from a sequence alignment function that employs dynamic programming. ​ You will fill the matrix with the pair-wise scores. ​ When you press the “Process” button, the event handler for the button-down event should call the ​PairWiseAlign.Align() method which you implement. - [[Image:geneSequenceAlignment1.png|500px|center]] + [[media:cs-312:​project-5:​genesequencealignment1.png?500px]] Each gene sequence consists of a string of letters and is stored in the given database file.  The scaffolding code loads the data from the database. ​ For example, the record for the “sars9” genome contains the following sequence (prefix shown here): Each gene sequence consists of a string of letters and is stored in the given database file.  The scaffolding code loads the data from the database. ​ For example, the record for the “sars9” genome contains the following sequence (prefix shown here): Line 39: Line 41: Be sure that you are using the latest .NET framework: Bring up the properties of your project (can use the key-chord Alt+F7). For the property "​Target framework:",​ specify .NET Framework 4 (or higher). Be sure that you are using the latest .NET framework: Bring up the properties of your project (can use the key-chord Alt+F7). For the property "​Target framework:",​ specify .NET Framework 4 (or higher). + + Summary: + * For each pair of genes, you will apply the DP algorithm to compute their alignment cost. + * You will summarize the alignment costs for all pairs in the results table (the 10 x 10 table) in the GUI.  The results table is just a way of presenting the results. ​ == Required: Whiteboard Experience (TM) == == Required: Whiteboard Experience (TM) == Line 66: Line 72: #* The second algorithm should align the sequences and compute the alignment score (like the first algorithm) but do so in such a way that the actual character-by-character alignment can also be extracted. ​ Call this the “extraction algorithm”. #* The second algorithm should align the sequences and compute the alignment score (like the first algorithm) but do so in such a way that the actual character-by-character alignment can also be extracted. ​ Call this the “extraction algorithm”. #* Each algorithm should use dynamic programming:​ in particular, use the Needleman/​Wunsch algorithm we used in class. ​ Find the best alignment by minimizing the total cost. #* Each algorithm should use dynamic programming:​ in particular, use the Needleman/​Wunsch algorithm we used in class. ​ Find the best alignment by minimizing the total cost. - #* The requirement is that your scoring algorithm run in at most <​math>​O(n^2) ​time and <​math>​O(n) ​space; this will require a modification to the algorithm as given in order to be more sparing in the use of space. ​ On the other hand, your extraction algorithm can use <​math>​O(n^2) ​in time and space, as shown in class and practiced in the homework. + #* The requirement is that your scoring algorithm run in at most $O(n^2)$ time and $O(n)$ space; this will require a modification to the algorithm as given in order to be more sparing in the use of space. ​ On the other hand, your extraction algorithm can use $O(n^2)$ in time and space, as shown in class and practiced in the homework. #* Your scoring and extraction algorithms should use the following operator costs to compute the distance between two sequences: #* Your scoring and extraction algorithms should use the following operator costs to compute the distance between two sequences: #** Substitutions,​ which are single character mis-matches,​ are penalized 1 unit #** Substitutions,​ which are single character mis-matches,​ are penalized 1 unit Line 72: Line 78: #** Matches are rewarded -3 units #** Matches are rewarded -3 units #* Make your pseudo-code easy to follow, but keep it within one page.  If needed for clarity, document your algorithm with legible comments in important places in the pseudo-code. ​ Clear pseudo-code provides evidence of your comprehension. ​ The correctness of your approach should be easy for the reader (e.g., TA) to verify. ​ If the TA cannot easily determine the code's correctness,​ then fewer points will be awarded. #* Make your pseudo-code easy to follow, but keep it within one page.  If needed for clarity, document your algorithm with legible comments in important places in the pseudo-code. ​ Clear pseudo-code provides evidence of your comprehension. ​ The correctness of your approach should be easy for the reader (e.g., TA) to verify. ​ If the TA cannot easily determine the code's correctness,​ then fewer points will be awarded. - # Implement the scoring algorithm. ​ When scoring, only align the first 5000 characters (bases) of each sequence. ​ Implement your algorithm in the ​PairWiseAlign.Align() method. ​ That method should return the alignment score. + # Implement the scoring algorithm. ​ When scoring, only align the first 5000 characters (bases) of each sequence. ​ Implement your algorithm in the ​PairWiseAlign.Align() method. ​ That method should return the alignment score. #* Document your algorithm with legible comments in important places in the code that make it obvious to the reader what your solution is doing. ​ This documentation also provides evidence of your comprehension. ​ With the aid of adequate documentation,​ the correctness of your approach should be easy for the reader (e.g., TA) to verify. ​ If the TA cannot easily determine the code's correctness,​ then fewer points will be awarded. #* Document your algorithm with legible comments in important places in the code that make it obvious to the reader what your solution is doing. ​ This documentation also provides evidence of your comprehension. ​ With the aid of adequate documentation,​ the correctness of your approach should be easy for the reader (e.g., TA) to verify. ​ If the TA cannot easily determine the code's correctness,​ then fewer points will be awarded. - # Record the pair-wise scores computed by the scoring algorithm in the given 10x10 score matrix such that position (i, j) in the display shows the optimal distance from <​math>​sequence_i ​to <​math>​sequence_j ​for the first 5000 characters. ​ Since the matrix is symmetric, you should only fill the upper part of the matrix (above the diagonal). ​ Correctness of your algorithm will be checked using this matrix. ​ (There is also no need to fill the diagonal, since the alignment of a sequence with itself consists only of matches.) + # Record the pair-wise scores computed by the scoring algorithm in the given 10x10 score matrix such that position (i, j) in the display shows the optimal distance from $sequence_i$ to $sequence_j$ for the first 5000 characters. ​ Since the matrix is symmetric, you should only fill the upper part of the matrix (above the diagonal). ​ Correctness of your algorithm will be checked using this matrix. ​ (There is also no need to fill the diagonal, since the alignment of a sequence with itself consists only of matches.) # As an aid to debugging, the first two sequences in the database are the strings: “polynomial” and “exponential”. ​ The string “polynomial” is the sequence for the first row (and column) of the score matrix, and “exponential” comes second. ​ While these strings aren’t biologically valid DNA sequences, they are the strings used in an example in the textbook. # As an aid to debugging, the first two sequences in the database are the strings: “polynomial” and “exponential”. ​ The string “polynomial” is the sequence for the first row (and column) of the score matrix, and “exponential” comes second. ​ While these strings aren’t biologically valid DNA sequences, they are the strings used in an example in the textbook. #* To help you verify the correctness of your algorithm, with our operator costs the optimal alignment of these two strings should be '''​-1'''​ (your code should compute that result for the cell at row 1 and column 2 in the table). #* To help you verify the correctness of your algorithm, with our operator costs the optimal alignment of these two strings should be '''​-1'''​ (your code should compute that result for the cell at row 1 and column 2 in the table). Line 80: Line 86: #* As a further aid for verifying the correctness of your algorithm, here is another value that should appear in your table: ​ (counting from 1) at row 3, column 4, the result should be '''​-14972'''​. ​ That is the result of aligning the first 5000 bases each of the genomes for “Bovine coronavirus isolate BCoV-LUN” and “Bovine coronavirus isolate BCoV-ENT”. #* As a further aid for verifying the correctness of your algorithm, here is another value that should appear in your table: ​ (counting from 1) at row 3, column 4, the result should be '''​-14972'''​. ​ That is the result of aligning the first 5000 bases each of the genomes for “Bovine coronavirus isolate BCoV-LUN” and “Bovine coronavirus isolate BCoV-ENT”. # Implement the extraction algorithm. ​ Your function should produce a character-by-character alignment of its two sequence arguments. ​ (This function should not be called to fill the score matrix.) # Implement the extraction algorithm. ​ Your function should produce a character-by-character alignment of its two sequence arguments. ​ (This function should not be called to fill the score matrix.) - #* Add a text field to your GUI for displaying an extracted alignment. ​ (Right click on ​MainForm.cs in the "​Solution Explorer",​ and choose "View Designer"​ to view the GUI in designer mode and gain access to the toolbox of visual controls.) + #* Add a text field to your GUI for displaying an extracted alignment. ​ (Right click on ​MainForm.cs in the "​Solution Explorer",​ and choose "View Designer"​ to view the GUI in designer mode and gain access to the toolbox of visual controls.) #* Add a click event to the GUI table cells. ​ The method handling this event should use your extraction algorithm to calculate the optimal alignment for the strings corresponding to that cell and should display the alignment in the text field. ​ Thus, the field will be updated whenever you click on a cell in the score table. #* Add a click event to the GUI table cells. ​ The method handling this event should use your extraction algorithm to calculate the optimal alignment for the strings corresponding to that cell and should display the alignment in the text field. ​ Thus, the field will be updated whenever you click on a cell in the score table. - #* To add the click event, select the ​DataGridView (the gray box in the middle of the main form), and select the Events tab (the lightning bolt) on the Properties list.  Add a handler for ​CellMouseClick by double clicking on its row. + #* To add the click event, select the ​DataGridView (the gray box in the middle of the main form), and select the Events tab (the lightning bolt) on the Properties list.  Add a handler for ​CellMouseClick by double clicking on its row. #* Display the alignment in a manner that easily reveals the matches, substitutions,​ and insertions/​deletions (indels), such as is shown in the short example below. ​ Gaps should be shown with a '​-'​ (hyphen). #* Display the alignment in a manner that easily reveals the matches, substitutions,​ and insertions/​deletions (indels), such as is shown in the short example below. ​ Gaps should be shown with a '​-'​ (hyphen). #* Extract and display the alignment of the first 100 characters (bases) (no more, no less) of each of the selected sequences in the pair. (The alignment itself will most likely include gaps and will therefore be longer than 100 characters.) #* Extract and display the alignment of the first 100 characters (bases) (no more, no less) of each of the selected sequences in the pair. (The alignment itself will most likely include gaps and will therefore be longer than 100 characters.) Line 89: Line 95: # Align the first 100 characters (bases) of sequence #3 (row 3) and #10 (column 10) (assuming the first sequence in the table is numbered as #1); extract and display the alignment in your report using a fixed-width font.  For example, if you were aligning "​AGCTCATGCT"​ and "​ACTGCATCA",​ the side-by-side alignment should be presented in the manner shown here: # Align the first 100 characters (bases) of sequence #3 (row 3) and #10 (column 10) (assuming the first sequence in the table is numbered as #1); extract and display the alignment in your report using a fixed-width font.  For example, if you were aligning "​AGCTCATGCT"​ and "​ACTGCATCA",​ the side-by-side alignment should be presented in the manner shown here:

-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  AGCT-CATGCT                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     +                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      AGCTCATGCT
-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  A-CTGCAT-CA                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     +                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      ACTGCAT-CA
​
==Performance Requirement==                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            ==Performance Requirement==
-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  You must fill the 10x10 score matrix with your scoring algorithm in less than 30 seconds using <​math>​O(n^2) ​time and <​math>​O(n) ​space. ​ Remember: Since the matrix is symmetric, you should only fill the upper part of the matrix (above the diagonal)!                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 +                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      You must fill the 10x10 score matrix with your scoring algorithm in less than 30 seconds using $O(n^2)$ time and $O(n)$ space. ​ Remember: Since the matrix is symmetric, you should only fill the upper part of the matrix (above the diagonal)!
If you have implemented the algorithms correctly but are struggling to meet the time requirement,​ it is worth noting that your program will run more quickly if you:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  If you have implemented the algorithms correctly but are struggling to meet the time requirement,​ it is worth noting that your program will run more quickly if you:
Line 117:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Line 123:
Note that you may need to reset the Platform Target if you switch from Debug to Release or vice versa.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 Note that you may need to reset the Platform Target if you switch from Debug to Release or vice versa.
-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  Setting the Platform Target to "​x86"​ eliminates the ​InvalidOperationException with the message "The '​Microsoft.Jet.OLEDB.4.0'​ provider is not registered on the local machine."​                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             +                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Setting the Platform Target to "​x86"​ eliminates the ​InvalidOperationException with the message "The '​Microsoft.Jet.OLEDB.4.0'​ provider is not registered on the local machine."​
==== x64 ====                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          ==== x64 ====
Line 158:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Line 164:
# Analyze your pseudo-code for both space and time complexity (''​worst-case''​). ​ Give reasonable justifications for your analysis.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  # Analyze your pseudo-code for both space and time complexity (''​worst-case''​). ​ Give reasonable justifications for your analysis.
-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  #* [5 points] Your analysis should show that your scoring algorithm is at most <​math>​O(n^2) ​time and at most <​math>​O(n) ​space                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             +                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      #* [5 points] Your analysis should show that your scoring algorithm is at most $O(n^2)$ time and at most $O(n)$ space
-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  #* [5 points] Your analysis should show that your extraction algorithm is at most <​math>​O(n^2) ​space and time.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       +                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      #* [5 points] Your analysis should show that your extraction algorithm is at most $O(n^2)$ space and time.
# [5 points] Write a paragraph that explains how your alignment extraction algorithm works. ​                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          # [5 points] Write a paragraph that explains how your alignment extraction algorithm works. ​
# Include a "​results"​ section showing:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               # Include a "​results"​ section showing:
Line 172:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Line 178:
#Experiment with alternative operator (substitution,​ match, indel) costs and discuss the impact of changing their values.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             #Experiment with alternative operator (substitution,​ match, indel) costs and discuss the impact of changing their values.
#Try aligning longer (sub-)sequences. ​ Conduct an empirical analysis. ​ Discuss impact on the alignment score matrix.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 #Try aligning longer (sub-)sequences. ​ Conduct an empirical analysis. ​ Discuss impact on the alignment score matrix.
-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  #​Re-implement your alignment algorithm in top-down fashion using recursion and a memory function. ​ How does this algorithm compare to the implementation ​using a table?                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      +                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      #​Re-implement your alignment algorithm in top-down fashion using recursion and a memory function. ​ How does this algorithm compare to the iterative, bottom-up ​implementation?​
#Try a shortest-path algorithm like Dijkstra’s to solve this problem.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  #Try a shortest-path algorithm like Dijkstra’s to solve this problem.
#Get ahead of the game, learn about the A* shortest-path algorithm, and implement it with a tight admissible heuristic (we will get to this topic later in the course, but you might enjoy trying it in the context of the alignment problem).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         #Get ahead of the game, learn about the A* shortest-path algorithm, and implement it with a tight admissible heuristic (we will get to this topic later in the course, but you might enjoy trying it in the context of the alignment problem).
Line 180:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Line 186:
'''​Coronaviruses:'''​ Coronaviruses were first isolated from chickens in 1937.  After the discovery of Rhinoviruses in the 1950’s, approximately 50% of colds still could not be ascribed to known agents. ​ In 1965, Tyrell and Bynoe used cultures of human ciliated embryonal trachea to propagate the first human coronavirus (HCoV) in vitro. ​ There are now approximately 15 species in this family, which infect not only man but cattle, pigs, rodents, cats, dogs, and birds (some are serious veterinary pathogens, especially in chickens). ​                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             '''​Coronaviruses:'''​ Coronaviruses were first isolated from chickens in 1937.  After the discovery of Rhinoviruses in the 1950’s, approximately 50% of colds still could not be ascribed to known agents. ​ In 1965, Tyrell and Bynoe used cultures of human ciliated embryonal trachea to propagate the first human coronavirus (HCoV) in vitro. ​ There are now approximately 15 species in this family, which infect not only man but cattle, pigs, rodents, cats, dogs, and birds (some are serious veterinary pathogens, especially in chickens). ​
-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  ​                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               +
-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  [[Image:GeneSequencer2.png|300px|thumb|Coronavirus]]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            +                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      [[media:cs-312:​project-5:​genesequencer2.png?300]]
-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  [[Image:GeneSequencer3.png|300px|thumb|Coronavirus]]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            +                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      [[media:​cs-312:​project-5:genesequencer3.png?300]]
Coronavirus particles are irregularly-shaped,​ ~60-220nm in diameter, with an outer envelope bearing distinctive,​ ‘club-shaped’ peplomers (~20nm long x 10nm at wide distal end).  This ‘crown-like’ appearance (Latin, corona) gives the family its name.  The center of the particle appears amorphous in negatively shaped stained EM preps, the nucleocapsid being in a loosely wound rather disordered state. ​                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  Coronavirus particles are irregularly-shaped,​ ~60-220nm in diameter, with an outer envelope bearing distinctive,​ ‘club-shaped’ peplomers (~20nm long x 10nm at wide distal end).  This ‘crown-like’ appearance (Latin, corona) gives the family its name.  The center of the particle appears amorphous in negatively shaped stained EM preps, the nucleocapsid being in a loosely wound rather disordered state. ​
Line 192:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Line 198:
'''​SARS:'''​ SARS is a type of viral pneumonia, with symptoms including fever, a dry cough, dyspnea (shortness of breath), headache, and hypoxaemia (low blood oxygen concentration). ​ Typical laboratory findings include lymphopaenia (reduced lymphocyte numbers) and mildly elevated aminotransferase levels (indicating liver damage). ​ Death may result from progressive respiratory failure due to alveolar damage. ​ The typical clinical course of SARS involves an improvement in symptoms during the first week of infection, followed by a worsening during the second week.  Studies indicate that this worsening may be related to a patient’s immune responses rather than uncontrolled viral replication.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           '''​SARS:'''​ SARS is a type of viral pneumonia, with symptoms including fever, a dry cough, dyspnea (shortness of breath), headache, and hypoxaemia (low blood oxygen concentration). ​ Typical laboratory findings include lymphopaenia (reduced lymphocyte numbers) and mildly elevated aminotransferase levels (indicating liver damage). ​ Death may result from progressive respiratory failure due to alveolar damage. ​ The typical clinical course of SARS involves an improvement in symptoms during the first week of infection, followed by a worsening during the second week.  Studies indicate that this worsening may be related to a patient’s immune responses rather than uncontrolled viral replication.
-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  [[Image:GeneSequencer4.png|300px|thumb|SARS virus]]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             +
+                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      [[media:cs-312:​project-5:​genesequencer4.png?300]]
The outbreak is believed to have originated in February 2003 in the Guangdong province of China, where 300 people became ill, and at least five died.  After initial reports that a paramyxovirus was responsible,​ the true cause appears to be a novel coronavirus with some unusual properties. ​ For one thing, SARS virus can be grown in Vero cells (a fibroblast cell line isolated in 1962 from a primate)—a novel property for HCoV’s, most of which cannot be cultivated. ​ In these cells, virus infection results in a cytopathic effect, and budding of coronavirus-like particles from the endoplasmic reticulum within infected cells. ​                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                The outbreak is believed to have originated in February 2003 in the Guangdong province of China, where 300 people became ill, and at least five died.  After initial reports that a paramyxovirus was responsible,​ the true cause appears to be a novel coronavirus with some unusual properties. ​ For one thing, SARS virus can be grown in Vero cells (a fibroblast cell line isolated in 1962 from a primate)—a novel property for HCoV’s, most of which cannot be cultivated. ​ In these cells, virus infection results in a cytopathic effect, and budding of coronavirus-like particles from the endoplasmic reticulum within infected cells. ​
Line 208:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Line 215:
The availability of the sequence data and functional dissection of the SARS-CoV genome is the first step towards developing diagnostic tests, antiviral agents, and vaccines.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          The availability of the sequence data and functional dissection of the SARS-CoV genome is the first step towards developing diagnostic tests, antiviral agents, and vaccines.
-