Mutations of SARS-CoV-2 spike protein at a glance™

Polypeptide \ Residue # 5 18 19 20 26 74 88 95 138 142 156 157 158 190 417 439 452 478 484 501 520 572 614 632 655 681 797 950 1027 1176 1259 1263 LZ complexity Palindromes (#,longest)
YP_009724390 Wuhan RefSeq and also Guam QJS54490 May 13, 2020 L L T T P N D T D G E F R R K N L T E N A T D T H P F D T V D P 452 112, ADTTDA,
QIC53204 Sweden February 7, 2020 L L T T P N D T D G E F R R K N L T E N A T D T H P C D T V D P 452 111, ADTTDA,
QKT21230 Hawaii March 12, 2020 and also Mink MT396266, 2020 and also Kenya, December 2, 2021 L L T T P N D T D G E F R R K N L T E N A T G T H P F D T V D P 452 112, ADTTDA,
QJU70329 Alaska March 24, 2020 L L T T P N A T D G E F R R K N L T E N A T G T H P F D T V D P 451 112, ADTTDA,
QJU70425 Alaska March 25, 2020 L L T T P N D T D G E F R R K N L T E N A T G T H P F D T V D L 452 112, ADTTDA,
QJU70545 Alaska April 26, 2020 L L T T P N D T D G E F R R K N L T E N S T G T H P F D T V D P 452 112, ADTTDA,
QIZ15981 Nevada March 9, 2020 F L T T P N D T D G E F R R K N L T E N A T D T H P F D T V D P 453 112, ADTTDA,
QLM06051 Illinois July 7, 2020 L L T T P N D T D G E F R R K K L T E N A T G T H P F D T V D P 454 112, ADTTDA,
QLF80256 Brazil March 20, 2020 L L T T P N D T D G E F R R K N L T E N A T G T H P F D T F D P 452 112, ADTTDA,
QJA41641 Brazil March 18, 2020 L L T T P K D T D G E F R R K N L T E N A T D T H P F D T V D P 452 112, ADTTDA,
Ohio released July 24, 2021 L F T N S N D T Y G E F R S T N L T K Y A T G T Y P F D I F D P 452 112, ADTTDA,
Bahrain December 5, 2021 L L R T P N D I D G _ _ G R K N R K K N A I G S H R F N T V Y P 451 114, QNVVNQ,
USA December 5, 2021 L L R T P N D T D D _ _ G R K N R K E N A T G T H R F N T V D P 450 118, ADTTDA, QNVVNQ,
Distance matrix using the metric: the number of 3-letter subwords that occur in one sequence but not in the other.
0 5 5 10 9 9 5 14 7 2 50 61 41
5 0 10 15 14 14 10 19 12 7 55 66 46
5 10 0 5 4 4 10 9 2 7 45 56 36
10 15 5 0 9 9 15 14 7 12 50 61 41
9 14 4 9 0 8 14 13 6 11 49 60 40
9 14 4 9 8 0 14 13 6 11 49 60 40
5 10 10 15 14 14 0 19 12 7 55 66 46
14 19 9 14 13 13 19 0 11 16 54 65 45
7 12 2 7 6 6 12 11 0 9 43 58 38
2 7 7 12 11 11 7 16 9 0 52 63 43
50 55 45 50 49 49 55 54 43 52 0 85 75
61 66 56 61 60 60 66 65 58 63 85 0 30
41 46 36 41 40 40 46 45 38 43 75 30 0
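One way to read the metric used for the distance matrix is as the size of the symmetric difference between the sets of 3-letter subwords of two sequences. The sketch below takes that reading as an assumption (the page does not say whether repeated subwords are counted once or several times). The function names and the toy fragments are hypothetical and are not real spike sequence.

```python
def subwords(seq, k=3):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def subword_distance(a, b, k=3):
    """Count k-letter subwords that occur in exactly one of the two sequences."""
    return len(subwords(a, k) ^ subwords(b, k))


# Toy fragments (hypothetical) that differ by a single substitution, D -> N.
a = "MKTLDPAG"
b = "MKTLNPAG"

print(subword_distance(a, b))  # 3 subwords lost plus 3 gained
print(subword_distance(a, a))  # identical sequences are at distance 0
```

Under this reading the distance is symmetric and is zero for identical sequences, which agrees with the symmetric matrix and its zero diagonal.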

Some notes

"The Delta Variant is defined by T19R, (G142D), 156del, 157del, R158G, L452R, T478K, D614G, P681R, D950N mutations in the spike protein." This information was used above at the positions marked "_", since the sequence alone does not show that the substitution is R158G rather than, say, E156G.

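The alignment ambiguity mentioned in the Delta note can be illustrated with a small sketch. It uses only the three reference residues that the table lists at positions 156 to 158 (E, F, R). The helper apply_changes and its small notation parser are hypothetical, written for this illustration. The alternative description is written E156G here because the reference residue at position 156 in the table is E.

```python
import re

# Reference residues at positions 156-158, taken from the table above.
REFERENCE = {156: "E", 157: "F", 158: "R"}


def apply_changes(reference, changes):
    """Apply changes such as 'R158G' or '156del' and return the resulting residues."""
    seq = dict(reference)
    for change in changes:
        m = re.fullmatch(r"([A-Z]?)(\d+)(del|[A-Z])", change)
        if m is None:
            raise ValueError(f"unrecognised change: {change}")
        wild_type, pos, new = m.group(1), int(m.group(2)), m.group(3)
        if wild_type and reference[pos] != wild_type:
            raise ValueError(f"{change}: reference has {reference[pos]} at {pos}")
        if new == "del":
            seq.pop(pos)      # deletion removes the residue
        else:
            seq[pos] = new    # substitution replaces the residue
    return "".join(seq[p] for p in sorted(seq))


as_quoted = apply_changes(REFERENCE, ["156del", "157del", "R158G"])
alternative = apply_changes(REFERENCE, ["E156G", "157del", "158del"])
print(as_quoted, alternative, as_quoted == alternative)
```

Both descriptions turn the three residues E, F, R into a single G, so the resulting sequence is the same and an external definition is needed to choose between them.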
The phylogenetic tree above was generated by hierarchical clustering with Ward's agglomeration method, using the hclust package.
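As a rough illustration of Ward clustering on a distance matrix, here is a minimal pure-Python sketch. It is not the hclust implementation. It applies the Lance-Williams update for Ward's method to squared distances, which is an assumption, since the page does not say whether squared or unsquared distances were used. The 5-by-5 input is the sub-matrix for rows 1, 2, 3, 12, and 13 of the distance matrix above, and the function name ward_merges is hypothetical.

```python
import math


def ward_merges(dist):
    """Agglomerate with Ward's criterion; return (id_a, id_b, height, size) per merge."""
    n = len(dist)
    size = {i: 1 for i in range(n)}
    # Squared distances between active clusters, keyed as (smaller id, larger id).
    d2 = {(i, j): float(dist[i][j]) ** 2 for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(size) > 1:
        (i, j), dij = min(d2.items(), key=lambda item: item[1])
        ni, nj = size[i], size[j]
        updated = {}
        for k, nk in size.items():
            if k in (i, j):
                continue
            dik = d2[(min(i, k), max(i, k))]
            djk = d2[(min(j, k), max(j, k))]
            # Lance-Williams update for Ward's method on squared distances.
            updated[(k, next_id)] = (
                (ni + nk) * dik + (nj + nk) * djk - nk * dij
            ) / (ni + nj + nk)
        d2 = {pair: v for pair, v in d2.items() if i not in pair and j not in pair}
        d2.update(updated)
        del size[i], size[j]
        size[next_id] = ni + nj
        merges.append((i, j, math.sqrt(dij), ni + nj))
        next_id += 1
    return merges


labels = ["row 1", "row 2", "row 3", "row 12", "row 13"]
dist = [
    [0, 5, 5, 61, 41],
    [5, 0, 10, 66, 46],
    [5, 10, 0, 56, 36],
    [61, 66, 56, 0, 30],
    [41, 46, 36, 30, 0],
]
merges = ward_merges(dist)
for a, b, height, members in merges:
    print(a, b, round(height, 2), members)
```

Ids 0 to 4 refer to the entries of labels, and ids from 5 upward refer to clusters created by earlier merges. On this sub-matrix, rows 12 and 13 merge with each other before joining the cluster formed by rows 1 to 3.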

The D614G mutation is receiving attention for possibly making COVID-19 more contagious.

"The N439K change added a chemical bond between the ACE2 and the spike protein."

"The L5F mutation (signal peptide mutation, and it is difficult to anticipate how they might impact the virus. Variation in the signal peptide of other viruses, for example HIV Env, can impact posttranslational modifications in the endoplasmic reticulum, including folding, expression levels, and glycosylation) is intriguing because of its recurrence in many lineages throughout the SARS-CoV-2 phylogenetic tree, and in many different countries throughout the world."

But there are also D88A, A520S, and P1263L (not part of the main structure).

The University of Maryland CoV3D Mutation Viewer has more info on some of the sequences.

The University of Melbourne has info on particular mutations.

University of Warsaw KnotProt

According to Sørensen et al., ADTTDA is a non-human-like (NHL) sequence in this context (as is IPIGAG), but that may be a coincidence. The sequence with only 111 palindromes (QIC53204, Sweden) is missing FGGF, which became CGGF.
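The palindrome observation can be checked with a short sketch. How the page counts palindromes (minimum length, distinct substrings versus occurrences) is not stated, so the min_len=4 threshold and the function name palindromes are assumptions. The fragments are toy strings built around the motifs named in the text, not real spike sequence.

```python
def palindromes(seq, min_len=4):
    """Return the distinct palindromic substrings of seq with at least min_len letters."""
    found = set()
    n = len(seq)
    # Expand around each of the 2n-1 centers (letters and gaps between letters).
    for center in range(2 * n - 1):
        lo = center // 2
        hi = lo + center % 2
        while lo >= 0 and hi < n and seq[lo] == seq[hi]:
            if hi - lo + 1 >= min_len:
                found.add(seq[lo:hi + 1])
            lo -= 1
            hi += 1
    return found


# Hypothetical flanking letters around the motifs mentioned in the text.
with_adttda = palindromes("XADTTDAY")
with_fggf = palindromes("QFGGFK")
with_cggf = palindromes("QCGGFK")

print(max(with_adttda, key=len))  # longest palindrome in the first toy fragment
print(with_fggf, with_cggf)
```

In this toy setting, replacing the first F of FGGF with C leaves no palindrome of four or more letters, which is the kind of loss that would lower a palindrome count by one.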

Credit to: Stephen D. Shank, Steven Weaver, Sergei L. Kosakovsky Pond. phylotree.js - a JavaScript library for application development and interactive data visualization in phylogenetics. BMC Bioinformatics, 2018, Volume 19, Number 1, Page 1