Tuesday, August 25, 2020

Different types of scoring matrices

We know how to calculate a scoring matrix. There are different types of scoring matrices, based on how frequently certain amino acids occur and replace one another within biological sequences, and there are multiple strategies for computing these frequencies. In this post we are going to discuss the PAM scoring matrix.

Alignments are scored well when we use an empirical scoring method, that is, a scoring matrix based on experimentally observed amino acid substitution frequencies. There are several such matrices, but the main ones are the PAM and BLOSUM matrices.

So first we will discuss the PAM scoring matrix.

PAM - Point Accepted Mutation matrix.

A point accepted mutation is the substitution of one amino acid by another that is accepted by natural selection, so that the function of the protein stays conserved.

Note: There are cases where the substitution of one amino acid by another changes the protein, but PAM only counts those mutations (substitutions) where the overall function of the protein is conserved, i.e. stays the same.

PAM Unit:

One PAM unit is the amount of evolutionary time in which, on average, 1% of the amino acids in a sequence undergo accepted mutations. The matrix for this distance is PAM1.

Since sequences are long and contain many residues, a change of 1% per PAM unit does not mean that a distance of 100 PAM produces 100% variation in the sequence. The same site can change more than once, and when one site mutates several times, other sites of the protein stay unchanged.

To understand this, consider a graph in which the sequence difference is plotted against the PAM distance.


Figure 1:

Here a PAM distance of 100 doesn't mean that 100 percent of the sequence has changed, because a site that has already mutated can mutate again, so several mutations accumulate at the same site.

The experimental data show that at a PAM distance of 100 only about 55 to 60 percent of the sites in the protein actually differ. For 85% sequence difference the PAM distance rises above 300.
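The effect of repeated hits on the same site can be seen in a small simulation. This is only a sketch under a simplifying assumption that is not in the original data: every amino acid is equally likely to be replaced by any of the other 19, and each site has a 1% chance of mutating per PAM step. The function name `observed_difference` and its parameters are made up for this illustration. Because the model is uniform, the numbers it gives will be close to, but not the same as, the empirical values quoted above.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def observed_difference(pam_distance, length=2000, seed=0):
    """Fraction of sites that differ from the start after `pam_distance` steps.

    Assumption (toy model): each site mutates with probability 0.01 per step,
    and the new residue is picked uniformly from the other 19 amino acids.
    """
    rng = random.Random(seed)
    original = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    current = list(original)
    for _ in range(pam_distance):
        for i in range(length):
            if rng.random() < 0.01:
                # A site may be hit again, and may even mutate back.
                others = [a for a in AMINO_ACIDS if a != current[i]]
                current[i] = rng.choice(others)
    differing = sum(1 for a, b in zip(original, current) if a != b)
    return differing / length


if __name__ == "__main__":
    for n in (1, 50, 100, 300):
        print(n, observed_difference(n))
```

Even in this toy model the observed difference at 100 PAM stays well below 100%, and the curve flattens as the PAM distance grows.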












Figure 2:

In Figure 2 the value of k is 20 (i.e. the 20 amino acids). You can calculate PAM 1 by using this calculation.




If you want to find PAM 2, just square PAM 1, i.e. multiply the matrix by itself.

Similarly, if you want to find PAM 'n', raise PAM 1 to the n-th power (matrix-multiply 'n' copies of PAM 1 together).
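The repeated multiplication can be sketched in a few lines of plain Python. The PAM 1 matrix used here is a hypothetical toy matrix (0.99 on the diagonal, the remaining 0.01 spread evenly over the other 19 amino acids), not the real Dayhoff values; `toy_pam1`, `mat_mul` and `pam_n` are names invented for this example.

```python
def toy_pam1(k=20, rate=0.01):
    """Hypothetical PAM 1: each residue stays with prob. 1 - rate,
    otherwise changes to one of the other k - 1 residues uniformly."""
    off = rate / (k - 1)
    return [[1.0 - rate if i == j else off for j in range(k)] for i in range(k)]


def mat_mul(a, b):
    """Ordinary matrix product of two square matrices (lists of rows)."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def pam_n(pam1, n):
    """PAM n as the n-th matrix power of PAM 1."""
    size = len(pam1)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, pam1)
    return result


if __name__ == "__main__":
    pam2 = pam_n(toy_pam1(), 2)
    print(pam2[0][0])  # probability that a residue is unchanged after 2 PAM
```

Note that this is a matrix product, not an element-by-element multiplication, so every row of PAM n still sums to 1.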



If you multiply PAM 1 by itself 250 times (raise it to the 250th power), you get the PAM 250 matrix, which is the basis of the PAM 250 substitution matrix shown here.
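The step from the probabilities in a PAM n matrix to the integer scores of a substitution matrix is usually a log-odds conversion: each probability is divided by the background frequency of the target amino acid, and 10 times the base-10 logarithm of that ratio is rounded to an integer. The sketch below assumes the convention that `prob[i][j]` is the probability of residue i being replaced by residue j; the function name `log_odds_scores` and the 2 x 2 example values are made up for illustration and are not real PAM data.

```python
import math


def log_odds_scores(prob, freqs):
    """Convert a mutation probability matrix into integer log-odds scores.

    prob[i][j]: probability that residue i is replaced by residue j
                (assumed convention for this sketch).
    freqs[j]:   background frequency of residue j.
    """
    size = len(prob)
    return [
        [round(10 * math.log10(prob[i][j] / freqs[j])) for j in range(size)]
        for i in range(size)
    ]


if __name__ == "__main__":
    # Hypothetical two-letter alphabet, only to show the arithmetic.
    toy_prob = [[0.9, 0.1], [0.1, 0.9]]
    toy_freqs = [0.5, 0.5]
    print(log_odds_scores(toy_prob, toy_freqs))  # → [[3, -7], [-7, 3]]
```

Positive scores mark substitutions that occur more often than expected by chance, negative scores mark those that occur less often.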













In conclusion, PAM matrices are scoring matrices that are used for comparing sequences. We can compute PAM 1, PAM 2 and so on, up to PAM 250 or more, and PAM 120 is often considered a good general-purpose scoring matrix.



