To create GridArrows and heatmap plots for a Needleman-Wunsch alignment in Python, follow these steps:

  1. Install the necessary libraries: You need NumPy and Matplotlib. The example below also uses Biopython for the BLOSUM62 substitution matrix.

  2. Generate the alignment matrix: Build a matrix in which each cell holds the best score for aligning the corresponding prefixes of the two sequences. The Needleman-Wunsch algorithm fills this matrix cell by cell.

  3. Create the GridArrows plot: After generating the matrix, draw it with Matplotlib and annotate each cell with a marker for the move (diagonal, left, or up) that produced its score. Together these markers show the direction of the alignment.

  4. Create the heatmap plot: You can also visualize the alignment matrix as a heatmap, which maps the values in the matrix onto a color scale. Which colors stand for low and high values depends on the colormap: with 'coolwarm', used in the example below, low scores appear blue and high scores appear red.
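
For reference, the recurrence behind step 2 can be written as follows. This is the standard textbook form of Needleman-Wunsch with a linear gap penalty. The symbols F (score matrix), s (substitution score), d (gap penalty, a positive number), and a_i, b_j (the i-th and j-th characters of the two sequences) are notation introduced here for illustration.

```latex
F(i,0) = -d\,i, \qquad F(0,j) = -d\,j

F(i,j) = \max
\begin{cases}
F(i-1,\,j-1) + s(a_i, b_j) & \text{(align } a_i \text{ with } b_j\text{)} \\
F(i-1,\,j) - d             & \text{(gap in the second sequence)} \\
F(i,\,j-1) - d             & \text{(gap in the first sequence)}
\end{cases}
```

The three cases correspond to the diagonal, up, and left moves drawn in the GridArrows plot.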

Here is an example implementation of the above steps in Python:

import numpy as np
import matplotlib.pyplot as plt
from Bio.Align import substitution_matrices

# Load BLOSUM62 (Bio.SubsMat.MatrixInfo is no longer available in newer Biopython releases)
blosum62 = substitution_matrices.load("BLOSUM62")
GAP = -5  # linear gap penalty

# Generate the alignment matrix using the Needleman-Wunsch algorithm
seq1 = "AGGTACG"
seq2 = "TACGTAG"
n, m = len(seq1), len(seq2)
alignment_matrix = np.zeros((n + 1, m + 1))
# Global alignment: the first row and column accumulate gap penalties
alignment_matrix[:, 0] = GAP * np.arange(n + 1)
alignment_matrix[0, :] = GAP * np.arange(m + 1)
for i in range(1, n + 1):
    for j in range(1, m + 1):
        match_score = blosum62[seq1[i-1], seq2[j-1]]
        alignment_matrix[i, j] = max(alignment_matrix[i-1, j-1] + match_score,
                                     alignment_matrix[i, j-1] + GAP,
                                     alignment_matrix[i-1, j] + GAP)

# Create GridArrow plot: mark the move that produced each cell's score
fig, ax = plt.subplots()
ax.matshow(alignment_matrix, cmap='Blues')
for i in range(1, n + 1):
    for j in range(1, m + 1):
        score = alignment_matrix[i, j]
        if score == alignment_matrix[i-1, j-1] + blosum62[seq1[i-1], seq2[j-1]]:
            arrow = '\\'  # diagonal: seq1[i-1] aligned with seq2[j-1]
        elif score == alignment_matrix[i, j-1] + GAP:
            arrow = '-'   # from the left: gap in seq1
        else:
            arrow = '|'   # from above: gap in seq2
        # matshow centers cell (i, j) at x=j, y=i
        ax.text(j, i, arrow, ha='center', va='center', color='red', fontsize=20)

# Create heatmap plot
fig, ax = plt.subplots()
im = ax.imshow(alignment_matrix, cmap='coolwarm')
# Row and column 0 are the gap-only boundary, labeled '-' ahead of the sequence letters
ax.set_xticks(np.arange(m + 1))
ax.set_yticks(np.arange(n + 1))
ax.set_xticklabels(['-'] + list(seq2))
ax.set_yticklabels(['-'] + list(seq1))
cbar = ax.figure.colorbar(im, ax=ax)
cbar.ax.set_ylabel('Score', rotation=-90, va="bottom")
plt.show()

The alignment_matrix variable stores the scores obtained when aligning the sequences seq1 and seq2 with the Needleman-Wunsch algorithm. The GridArrow plot overlays a marker on each cell showing which neighboring cell its score came from ('\' for diagonal, '-' for left, '|' for above), and the heatmap plot represents the values in the matrix using a color scale. Both figures are displayed when plt.show() is called.
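
The markers in the GridArrow plot are the same moves a traceback would follow to recover the actual alignment. The following is a minimal sketch of that idea, not part of the example above: the function name needleman_wunsch and the simple match/mismatch/gap values are assumptions chosen to keep the block free of external dependencies.

```python
def needleman_wunsch(seq1, seq2, match=1, mismatch=-1, gap=-1):
    """Fill the score matrix and trace back one optimal global alignment.

    The scoring values are illustrative assumptions, not BLOSUM62.
    """
    n, m = len(seq1), len(seq2)
    # First row and column hold accumulated gap penalties.
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = gap * i
    for j in range(1, m + 1):
        F[0][j] = gap * j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,
                          F[i - 1][j] + gap,
                          F[i][j - 1] + gap)

    # Walk back from the bottom-right cell, following the move that
    # produced each score (diagonal first, then up, then left).
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s:
            top.append(seq1[i - 1])
            bottom.append(seq2[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            top.append(seq1[i - 1])
            bottom.append('-')
            i -= 1
        else:
            top.append('-')
            bottom.append(seq2[j - 1])
            j -= 1
    return F[n][m], ''.join(reversed(top)), ''.join(reversed(bottom))


score, aligned1, aligned2 = needleman_wunsch("AGGTACG", "TACGTAG")
print(score)
print(aligned1)
print(aligned2)
```

When several moves tie, this sketch prefers the diagonal, so it returns only one of possibly many optimal alignments.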

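Note that the example above uses the BLOSUM62 substitution matrix to calculate match scores. BLOSUM62 is designed for protein sequences; the example sequences work with it only because A, C, G, and T are also valid amino-acid codes. For nucleotide sequences, a simple match/mismatch scheme is usually more appropriate. You can replace BLOSUM62 with any other substitution matrix or scoring scheme of your choice.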
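
As a rough illustration of swapping the scoring scheme, the sketch below defines two hypothetical scoring functions for nucleotide sequences. The function names and the numeric values are assumptions made for this example, not part of any library.

```python
PURINES = {"A", "G"}  # the remaining bases, C and T, are pyrimidines


def simple_score(a, b, match=1, mismatch=-1):
    """Hypothetical match/mismatch score for two characters."""
    return match if a == b else mismatch


def transition_transversion_score(a, b, match=1, transition=-1, transversion=-2):
    """Hypothetical scheme that penalizes transversions more than transitions.

    A transition swaps bases within the same class (A<->G or C<->T);
    a transversion swaps a purine for a pyrimidine or vice versa.
    """
    if a == b:
        return match
    same_class = (a in PURINES) == (b in PURINES)
    return transition if same_class else transversion
```

To use one of these, compute match_score inside the fill loop as simple_score(seq1[i-1], seq2[j-1]) instead of looking it up in BLOSUM62, and apply the same change where the arrows are chosen.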