cs-142:gene-finding-via-tata-box-search [2015/05/12 18:40] (current)
cs142ta created
Line 1: Line 1:
 +=Gene Finding via TATA box search=
 +==Problem==
  
 +* Within a long region of genomic sequence, genes are also characterised by having the sequence “TATA” somewhere near the beginning of the gene.
 +* Write a program to prompt the user for a string of DNA bases (ACTG).
 +* Search the string for the substring “TATA”
 +* If found, report the 0-based position of the first match. Otherwise report not found.
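The steps above can be sketched with the standard library's `std::string::find`, which returns the 0-based index of the first match or `std::string::npos` when there is none. This is a minimal sketch rather than the course's intended solution: the helper name `find_tata` and the `-1` "not found" sentinel are assumptions made for illustration.

```cpp
#include <string>

// Hypothetical helper (not part of the original page): returns the 0-based
// position of the first "TATA" in dna_seq, or -1 if there is no match.
long find_tata(const std::string& dna_seq)
{
    const std::string::size_type pos = dna_seq.find("TATA");
    if (pos == std::string::npos)
    {
        return -1; // assumed sentinel meaning "not found"
    }
    return static_cast<long>(pos);
}
```

A caller would read the sequence with `cin >> dna_seq`, call `find_tata(dna_seq)`, and print either the returned position or "Not found" when the result is `-1`.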
 +
 +A contextually-related problem is [[Gene Finding via GC content]].
 +
 +==Solution==
 +<code cpp>
 +/*
 +Test Case 1:
 +Input: TATA (example of a string that is exactly what we're looking for)
 +Expected Output: 0
 +Actual Output: 0 (was 1 before adjusting i in the cout statement)
 +
 +Test Case 2:
 +Input: GATATA (example of a string that contains what we're looking for)
 +Expected Output: 2
 +Actual Output: 2 (was 3 before adjusting i in the cout statement)
 +
 +Test Case 3:
 +Input: A (example of a string that doesn't contain the subsequence TATA)
 +Expected Output: Not found
 +Actual Output: Not found (was 2 before adding the if around the last cout statement)
 +
 +Other example Test Case inputs: GTATAG (TATA is in the middle), TAT (almost TATA), TATAGCTATA (TATA appears twice), etc.
 +*/
 +
 +#include <cstdlib>
 +#include <iostream>
 +#include <string>
 +
 +using namespace std;
 +
 +int main()
 +{
 +    // Inputs: DNA sequence
 +    // Outputs: First 0-based position of "TATA" found in input or "Not found"
 +
 +    // Define subsequence to find
 +    string subseq_to_find = "TATA";
 +
 +    // Prompt user for sequence
 +    cout << "DNA seq please: ";
 +    string dna_seq;
 +    cin >> dna_seq;
 +
 +    // Use a variable that will keep track of whether or not we have found TATA yet
 +    bool found = false;
 +
 +    // Current position; same unsigned type as length() to avoid a signed/unsigned comparison
 +    string::size_type i = 0;
 +
 +    // As long as we haven't found TATA and as long as we haven't checked every position in the input sequence
 +    while (!found && i < dna_seq.length())
 +    {
 +        // check if the substring starting at the current position is the same as the subsequence we're looking for
 +        if (subseq_to_find == dna_seq.substr(i, subseq_to_find.length()))
 +        {
 +            // if it is, then we say we've found it
 +            found = true; // how exciting!
 +        }
 +        // be sure to increment the current position for the next time through the loop
 +        // (this also runs on the iteration that found the match)
 +        i++;
 +    }
 +
 +    if (found)
 +    {
 +        // i was incremented once after the match, so the match starts at i - 1
 +        cout << subseq_to_find << " was found at position " << i - 1 << endl;
 +    }
 +    else
 +    {
 +        cout << "Not found" << endl;
 +    }
 +
 +    // Windows-only: keeps the console window open until a key is pressed
 +    system("pause");
 +    return 0;
 +}
 +</code>
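The loop in the solution tries every start position, including the last few where a four-character match can no longer fit; `substr` simply returns a shorter string there, so the result is still correct. As a hedged variation, the same manual scan can be wrapped in a function that stops at the last position where the subsequence fits, which also makes the test cases from the comment easy to check. The name `first_match` and the `-1` sentinel are hypothetical, not part of the original page.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: manual scan that returns the 0-based index of the
// first occurrence of subseq in dna_seq, or -1 if it does not occur.
int first_match(const std::string& dna_seq, const std::string& subseq)
{
    // A match cannot fit if the pattern is longer than the sequence;
    // checking first also avoids unsigned underflow in the subtraction below.
    if (subseq.length() > dna_seq.length())
    {
        return -1;
    }

    // Last start position at which the whole subsequence still fits
    const std::size_t last_start = dna_seq.length() - subseq.length();

    for (std::size_t i = 0; i <= last_start; ++i)
    {
        // compare(pos, len, str) returns 0 when that window equals str
        if (dna_seq.compare(i, subseq.length(), subseq) == 0)
        {
            return static_cast<int>(i);
        }
    }
    return -1; // assumed sentinel meaning "not found"
}
```

Returning from inside the loop removes the need for the `found` flag and the `i - 1` adjustment, because the index is reported before it is incremented.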
cs-142/gene-finding-via-tata-box-search.txt · Last modified: 2015/05/12 18:40 by cs142ta
CC Attribution-Share Alike 4.0 International