CoV2D Browser™

Parikh vectors
5KVU_1 2PWV_1 7HVI_1 Letter Amino acid
56 24 9 D Aspartic acid
19 10 6 H Histidine
66 27 12 L Leucine
44 14 2 K Lysine
28 17 2 M Methionine
24 20 3 F Phenylalanine
38 32 9 R Arginine
6 5 6 C Cysteine
53 23 7 E Glutamic acid
50 33 14 G Glycine
41 30 14 S Serine
6 4 3 W Tryptophan
50 18 14 V Valine
73 38 10 A Alanine
25 18 5 Q Glutamine
37 17 4 P Proline
23 8 8 Y Tyrosine
24 8 5 N Asparagine
40 18 6 T Threonine
42 22 5 I Isoleucine
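
The Parikh vector of a sequence records how often each letter occurs in it. A minimal Python sketch of that count follows; the helper name `parikh_vector` and the constant `AMINO_ACIDS` are illustrative assumptions, not the code behind this page.

```python
from collections import Counter

# The 20 standard amino-acid letters, in the row order of the table above.
AMINO_ACIDS = "DHLKMFRCEGSWVAQPYNTI"

def parikh_vector(sequence, alphabet=AMINO_ACIDS):
    """Return {letter: count} for every letter of `alphabet` (hypothetical helper)."""
    counts = Counter(sequence)
    return {letter: counts.get(letter, 0) for letter in alphabet}

# First ten residues of 5KVU_1, used only as a small illustration.
vec = parikh_vector("MSAEQPTIIY")
print(vec["I"], vec["M"], vec["W"])  # 2 1 0
```

Summing the values of the returned dictionary gives the sequence length whenever the sequence uses only the 20 standard letters.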

>5KVU_1|Chains A, B, C, D|Isocitrate dehydrogenase|Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv) (83332)
>2PWV_1|Chain A|Queuine tRNA-ribosyltransferase|Zymomonas mobilis (542)
>7HVI_1|Chain A|Protease 2A|Coxsackievirus A16 (31704)
Protein code \(c\) LZ-complexity \(\mathrm{LZ}(w)\) Length \(n=|w|\) \(\frac{\mathrm{LZ}(w)}{n /\log_{20} n}\) \(p_w(1)\) \(p_w(2)\) \(p_w(3)\) Sequence \(w=f(c)\)
5KVU , Knot 284 745 0.84 40 291 679
MSAEQPTIIYTLTDEAPLLATYAFLPIVRAFAEPAGIKIEASDISVAARILAEFPDYLTEEQRVPDNLAELGRLTQLPDTNIIKLPNISASVPQLVAAIKELQDKGYAVPDYPADPKTDQEKAIKERYARCLGSAVNPVLRQGNSDRRAPKAVKEYARKHPHSMGEWSMASRTHVAHMRHGDFYAGEKSMTLDRARNVRMELLAKSGKTIVLKPEVPLDDGDVIDSMFMSKKALCDFYEEQMQDAFETGVMFSLHVKATMMKVSHPIVFGHAVRIFYKDAFAKHQELFDDLGVNVNNGLSDLYSKIESLPASQRDEIIEDLHRCHEHRPELAMVDSARGISNFHSPSDVIVDASMPAMIRAGGKMYGADGKLKDTKAVNPESTFSRIYQEIINFCKTNGQFDPTTMGTVPNVGLMAQQAEEYGSHDKTFEIPEDGVANIVDVATGEVLLTENVEAGDIWRMCIVKDAPIRDWVKLAVTRARISGMPVLFWLDPYRPHENELIKKVKTYLKDHDTEGLDIQIMSQVRSMRYTCERLVRGLDTIAATGNILRDYLTDLFPILELGTSAKMLSVVPLMAGGGMYETGAGGSAPKHVKQLVEENHLRWDSLGEFLALGAGFEDIGIKTGNERAKLLGKTLDAAIGKLLDNDKSPSRKTGELDNRGSQFYLAMYWAQELAAQTDDQQLAEHFASLADVLTKNEDVIVRELTEVQGEPVDIGGYYAPDSDMTTAVMRPSKTFNAALEAVQG
2PWV , Knot 160 386 0.82 40 215 372
MVEATAQETDRPRFSFSIAAREGKARTGTIEMKRGVIRTPAFMPVGTAATVKALKPETVRATGADIILGNTYHLMLRPGAERIAKLGGLHSFMGWDRPILTDSGGYQVMSLSSLTKQSEEGVTFKSHLDGSRHMLSPERSIEIQHLLGSDIVMAFDECTPYPATPSRAASSMERSMRWAKRSRDAFDSRKEQAENAALFGIQQGSVFENLRQQSADALAEIGFDGYAVGGLAVGQGQDEMFRVLDFSVPMLPDDKPHYLMGVGKPDDIVGAVERGIDMFDCVLPTRSGRNGQAFTWDGPINIRNARFSEDLKPLDSECHCAVCQKWSRAYIHHLIRAGEILGAMLMTEHNIAFYQQLMQKIRDSISEGRFSQFAQDFRARYFARNS
7HVI , Knot 70 144 0.80 40 112 139
SGAIYVGNYRVVNRHLATHNDWANLVWEDSSRDLLVSSTTAQGCDTIARCDCQTGVYYCSSRRKHYPVSFSKPSLIFVEASEYYPARYQSHLMLAVGHSEPGDCGGILRCQHGVVGIVSTGGNGLVGFADVRDLLWLDEEAMEQ
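
The page does not state which Lempel-Ziv variant produces the LZ-complexity column. As a sketch only, the following Python assumes the LZ76 exhaustive-history parsing (each new component is the shortest word that has not yet occurred starting at an earlier position) and the normalization \(\mathrm{LZ}(w)/(n/\log_{20} n)\) from the table header. The function names are hypothetical.

```python
import math

def lz76_complexity(w):
    """Number of components in the LZ76 exhaustive-history parsing of w (assumed variant)."""
    i, count, n = 0, 0, len(w)
    while i < n:
        length = 1
        # Extend the component while it still occurs starting at an earlier position.
        while i + length <= n and w[i:i + length] in w[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count

def normalized_lz(lz, n, alphabet_size=20):
    """LZ(w) divided by n / log_{alphabet_size}(n), as in the table header."""
    return lz / (n / math.log(n, alphabet_size))

print(lz76_complexity("0001101001000101"))  # 6, parsed as 0|001|10|100|1000|101
print(round(normalized_lz(284, 745), 2))    # 0.84, using the 5KVU values from the table
```

Whether `lz76_complexity` reproduces the tabulated values 284, 160 and 70 depends on the page using this same variant, which is not confirmed here.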

Let \(P_w(k)\) be the set of distinct subwords (intervals) of length \(k\) in a word \(w\), and let \(p_w(k)=|P_w(k)|\) be its cardinality. Let \(f(c)\) be the sequence in FASTA with 4-symbol Protein Data Bank code \(c\).
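
The sets \(P_w\) and their cardinalities \(p_w\) can be computed with a sliding window. The following Python is a minimal sketch under that reading of the definition; the names `subwords` and `subword_complexity` are illustrative.

```python
def subwords(w, k):
    """Set of distinct contiguous subwords of length k in w."""
    return {w[i:i + k] for i in range(len(w) - k + 1)}

def subword_complexity(w, k):
    """Number of distinct subwords of length k in w."""
    return len(subwords(w, k))

print(sorted(subwords("ABAB", 2)))    # ['AB', 'BA']
print(subword_complexity("ABAB", 3))  # 2, namely ABA and BAB
```

A word of length \(n\) has at most \(n-k+1\) distinct subwords of length \(k\), which bounds the last columns of the table above.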

\(|P_{f(5KVU_1)}(2) \setminus P_{f(2PWV_1)}(2)|=114\), \(|P_{f(2PWV_1)}(2) \setminus P_{f(5KVU_1)}(2)|=38\). Let \( Z_k(x,y)=|P_x(k)\setminus P_y(k)|+|P_y(k)\setminus P_x(k)| \) be the size of the symmetric difference of \(P_x(k)\) and \(P_y(k)\), that is, an LZ76-style (set of subwords) Jaccard distance numerator for \(P(k)\).

Hydrophobic-polar version of Sequence 1:
1010010110010001111100111111011101111010100101110111011001000001100110110100110001101101010110111110010001011100110100000011000010011011011100100000110110001000100110101100001101001010110001010010010101110010011101011100101100111000110010000100110011110101010110100111110110110001110000110011101001100100010011100000110010000000101111001011001001001110101111101110101101010000110100010010001101000010101001101101111100100010000010110011101101101011100010110110101100111001101110010101111111101001000011001000100000011010110010010000001101100111010110001001111101100101101111111111000111101100100110000101001101111111100111001000101110010111101100000100001010001001011101100111000000110011011011000001110010010101101110011000100111010001011101101
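
The hydrophobic-polar string replaces each residue by 1 (hydrophobic) or 0 (polar). The page does not say which residues it treats as hydrophobic, so the set in this Python sketch is an assumption inferred from the first 50 residues of Sequence 1; how the page classifies residues that do not occur there (for example C and W) is not known.

```python
# Assumed hydrophobic residues, inferred from the start of the string above.
# Residues outside this set are encoded as polar (0), which is also an assumption.
HYDROPHOBIC = set("AFGILMPV")

def hp_encode(sequence, hydrophobic=HYDROPHOBIC):
    """Map each residue to '1' if assumed hydrophobic, else '0' (hypothetical helper)."""
    return "".join("1" if residue in hydrophobic else "0" for residue in sequence)

# First 20 residues of 5KVU_1.
print(hp_encode("MSAEQPTIIYTLTDEAPLLA"))  # 10100101100100011111
```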
Pair \(Z_2\) Length of longest common substring
5KVU_1,2PWV_1 152 4
5KVU_1,7HVI_1 221 4
2PWV_1,7HVI_1 185 4
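
The \(Z_2\) column is the sum of the two set differences; for the first pair, \(114+38=152\). A minimal Python sketch of \(Z_k\) follows, with illustrative function names that are not taken from the page.

```python
def subwords(w, k):
    """Set of distinct contiguous subwords of length k in w."""
    return {w[i:i + k] for i in range(len(w) - k + 1)}

def z_distance(x, y, k):
    """|P_x(k) \\ P_y(k)| + |P_y(k) \\ P_x(k)|, the size of the symmetric difference."""
    px, py = subwords(x, k), subwords(y, k)
    return len(px - py) + len(py - px)

# Toy example: P_x(2) = {AB, BA}, P_y(2) = {AB, BB, BA}, so only BB differs.
print(z_distance("ABAB", "ABBA", 2))  # 1
```

Dividing by the size of the union of the two subword sets would turn this numerator into a Jaccard distance.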

Newick tree

 
(7HVI_1:10.17,(5KVU_1:76,2PWV_1:76):33.17);

Let \(d\) be the Otu–Sayood distance \(d\).
Let \(d_1\) be the Otu–Sayood distance \(d_1\). (This makes the 4TYN sequence AAAAAA a close match...)
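
As a sketch only, the following Python assumes the unnormalized Otu-Sayood definitions, with the maximum of \(c(xy)-c(x)\) and \(c(yx)-c(y)\) as the first distance and their sum as the second, where \(c\) is an LZ complexity and \(xy\) is concatenation. The complexity function is an assumed LZ76 parsing, and all names are hypothetical.

```python
def lz76_complexity(w):
    """Number of components in the LZ76 exhaustive-history parsing of w (assumed variant)."""
    i, count, n = 0, 0, len(w)
    while i < n:
        length = 1
        # Extend the component while it still occurs starting at an earlier position.
        while i + length <= n and w[i:i + length] in w[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count

def otu_sayood(x, y, c=lz76_complexity):
    """Return (d, d1): the max and the sum of c(xy)-c(x) and c(yx)-c(y)."""
    a = c(x + y) - c(x)
    b = c(y + x) - c(y)
    return max(a, b), a + b

print(otu_sayood("AAAA", "CCCC"))  # (1, 2)
print(otu_sayood("AAAA", "AAAA"))  # (0, 0)
```

Both terms are small when one sequence is highly repetitive, since appending it adds few new components.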
Roughly speaking, the expected distance is \((0.85)(0.8)\left(\frac{1131}{\log_{20} 1131}-\frac{386}{\log_{20}386}\right)\approx 195\), where \(1131=745+386\) is the length of the two sequences concatenated.
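
The arithmetic in this estimate can be checked directly. The Python below is a small sketch; the function name and parameter names are illustrative, and the two factors 0.85 and 0.8 are simply taken from the formula above.

```python
import math

def expected_distance(n_concat, n_single, alphabet_size=20, factors=(0.85, 0.8)):
    """factors[0] * factors[1] * (n_concat/log(n_concat) - n_single/log(n_single))."""
    def scale(n):
        return n / math.log(n, alphabet_size)
    return factors[0] * factors[1] * (scale(n_concat) - scale(n_single))

# 745 and 386 are the lengths of 5KVU_1 and 2PWV_1.
print(int(expected_distance(745 + 386, 386)))  # 195
```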
Status Protein1 Protein2 d d1/2
Query variables 5KVU_1 2PWV_1 247 186

In notation analogous to [Theorem 16, Kjos-Hanssen, Niraula and Yoon (2022)],
\[ \delta= \alpha \min + (1-\alpha) \max= \begin{cases} d &\alpha=0,\\ d_1/2 &\alpha=1/2 \end{cases} \]
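
The interpolation can be checked against the query row above. With \(d=247\) as the maximum and \(d_1/2=186\) as the average, the minimum is \(2\cdot 186-247=125\). The Python sketch below uses a hypothetical function `delta` to confirm both cases.

```python
def delta(alpha, a, b):
    """alpha * min(a, b) + (1 - alpha) * max(a, b)."""
    return alpha * min(a, b) + (1 - alpha) * max(a, b)

# The two terms implied by d = 247 and d_1/2 = 186.
low, high = 125, 247
print(delta(0, low, high), delta(0.5, low, high))  # 247 186.0
```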