# Homework Assignment #17

## Objective

To employ dynamic programming to solve the gene sequence alignment problem.

## Exercises

### Question 1

Find the optimal alignment using dynamic programming (by hand only) of $x=$ ATGCC and $y=$ TACGCA . Use the same cost function we used in the lecture, namely: $c_{indel} = 5$; $c_{sub} = 1$; and $c_{match} = -3$.

(a) Show your DP table.

(b) Circle the optimal alignment cost.

(c ) Extract an optimal alignment from the table in the form of a sequence of edit operators. Break ties among optimal alternatives by preferring a backpointer leading to the left over a packpointer leading diagonally over a packpointer leading downward. Write the edit operators (not their coordinates) in forward order from start $(0,0)$ to finish $(m,n)$. Use the following notation:

to represent a match of an A with an A, for example, use **Match(A)**

to represent an indel involving a gap on the $x$ side and an A on the $y$ side, use **Indel(-, A)**

to represent a substitution of an A on the $x$ side for C on the $y$ side, use **Sub(A, C)**

(d) Translate the optimal alignment from part (c ) into the two-row, side-by-side form, where each letter in the first sequence (or a “-” for a gap) is adjacent to a letter (or a “-”) in the second sequence.

### Question 2

Now find the optimal alignment using dynamic programming (by hand *or in a spreadsheet*) of the same sequences, ATGCC and TACGCA . Use a new cost function, namely: $c_{indel} = 1$; $c_{sub} = 2$; and $c_{match} = 0$.

(a) Show your DP table.

(b) Circle the optimal alignment cost.

(c ) Extract an optimal alignment from the table in the form of a sequence of edit operators by following the same instructions specified for question 1, part (c ).

(d) Translate the optimal alignment from part (c ) into the two-row, side-by-side form by following the same instructions specified for question 1, part (d).

Back to top