CoV2D Browser™

Parikh vectors
5KIR_1 9JDG_1 7NRY_1 Letter Amino acid
23 19 14 D Aspartic acid
18 6 9 H Histidine
31 44 23 I Isoleucine
46 70 29 L Leucine
6 20 5 W Tryptophan
32 49 19 V Valine
25 34 15 A Alanine
36 53 16 G Glycine
14 18 13 M Methionine
24 19 18 R Arginine
28 17 12 N Asparagine
32 27 25 K Lysine
29 42 19 S Serine
27 34 14 Y Tyrosine
12 19 7 C Cysteine
30 13 15 Q Glutamine
33 26 27 E Glutamic acid
38 43 10 F Phenylalanine
38 35 17 P Proline
29 32 20 T Threonine
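
The Parikh vector of a sequence simply counts how often each letter occurs. A minimal sketch of that count follows; the names `AMINO_ACIDS` and `parikh_vector` are illustrative assumptions and are not taken from the CoV2D code base.

```python
from collections import Counter

# The 20 standard one-letter amino acid codes (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def parikh_vector(sequence: str) -> dict[str, int]:
    """Return the number of occurrences of each amino acid letter."""
    counts = Counter(sequence)
    return {letter: counts.get(letter, 0) for letter in AMINO_ACIDS}

# Small demonstration on the first nine residues of the 5KIR_1 sequence.
vector = parikh_vector("NPCCSHPCQ")
print({letter: count for letter, count in vector.items() if count})
```

The entries of a Parikh vector sum to the sequence length; for instance, the 5KIR_1 column above sums to 551.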

>5KIR_1|Chains A, B|Prostaglandin G/H synthase 2|Homo sapiens (9606)
>9JDG_1|Chain A|Sodium- and chloride-dependent taurine transporter|Homo sapiens (9606)
>7NRY_1|Chain A[auth X]|MAP kinase-activated protein kinase 2|Homo sapiens (9606)
Protein code \(c\) LZ-complexity \(\mathrm{LZ}(w)\) Length \(n=|w|\) \(\frac{\mathrm{LZ}(w)}{n /\log_{20} n}\) \(p_w(1)\) \(p_w(2)\) \(p_w(3)\) Sequence \(w=f(c)\)
5KIR , Knot 224 551 0.85 40 279 527
NPCCSHPCQNRGVCMSVGFDQYKCDCTRTGFYGENCSTPEFLTRIKLFLKPTPNTVHYILTHFKGFWNVVNNIPFLRNAIMSYVLTSRSHLIDSPPTYNADYGYKSWEAFSNLSYYTRALPPVPDDCPTPLGVKGKKQLPDSNEIVEKLLLRRKFIPDPQGSNMMFAFFAQHFTHQFFKTDHKRGPAFTNGLGHGVDLNHIYGETLARQRKLRLFKDGKMKYQIIDGEMYPPTVKDTQAEMIYPPQVPEHLRFAVGQEVFGLVPGLMMYATIWLREHNRVCDVLKQEHPEWGDEQLFQTSRLILIGETIKIVIEDYVNHLSGYHFKLKFDPELLFNKQFQYQNRIAAEFNTLYHWHPLLPDTFQIHDQKYNYQQFIYNNSILLEHGITQFVESFTRQIAGRVAGGRNVPPAVQKVSQASIDQSRQMKYQSFNEYRKRFMLKPYESFEELTGEKEMSAELEALYGDIDAVELYPALLVEKPRPDAIFGETMVEVGAPFSLKGLMGNVICSPAYWKPSTFGGEVGFQIINTASIQSLICNNVKGCPFTSFSVP
9JDG , Knot 245 620 0.84 40 280 577
MATKEKLQCLKDFHKDILKPSPGKSPGTRPEDEAEGKPPQREKWSSKIDFVLSVAGGFVGLGNVWRFPYLCYKNGGGAFLIPYFIFLFGSGLPVFFLEIIIGQYTSEGGITCWEKICPLFSGIGYASVVIVSLLNVYYIVILAWATYYLFQSFQKELPWAHCNHSWNTPHCMEDTMRKNKSVWITISSTNFTSPVIEFWERNVLSLSPGIDHPGSLKWDLALCLLLVWLVCFFCIWKGVRSTGKVVYFTATFPFAMLLVLLVRGLTLPGAGAGIKFYLYPDITRLEDPQVWIDAGTQIFFSYAICLGAMTSLGSYNKYKYNSYRDCMLLGCLNSGTSFVSGFAIFSILGFMAQEQGVDIADVAESGPGLAFIAYPKAVTMMPLPTFWSILFFIMLLLLGLDSQFVEVEGQITSLVDLYPSFLRKGYRREIFIAFVCSISYLLGLTMVTEGGMYVFQLFDYYAASGVCLLWVAFFECFVIAWIYGGDNLYDGIEDMIGYRPGPWMKYSWAVITPVLCVGCFIFSLVKYVPLTYNKTYVYPNWAIGLGWSLALSSMLCVPLVIVIRLCQTEGPFLVRVKYLLTPREPNRWAVEREGATPYNSRTVMNGALVKPTHIIVETMM
7NRY , Knot 147 327 0.86 40 217 317
QFHVKSGLQIKKNAIIDDYKVTSQVLGLGINGKVLQIFNKRTQEKFALKMLQDCPKARREVELHWRASQCPHIVRIVDVYENLYAGRKCLLIVMECLDGGELFSRIQDRGDQAFTEREASEIMKSIGEAIQYLHSINIAHRDVKPENLLYTSKRPNAILKLTDFGFAKETTSHNSLTTPCYTPYYVAPEVLGPEKYDKSCDMWSLGVIMYILLCGYPPFYSNHGLAISPGMKTRIRMGQYEFPNPEWSEVSEEVKMLIRNLLKTEPTQRMTITEFMNHPWIMQSTKVPQTPLHTSRVLKEDKERWEDVKEEMTSALATMRVDYEQIK
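
The LZ-complexity column and the normalized ratio \(\frac{\mathrm{LZ}(w)}{n/\log_{20} n}\) can be illustrated with a short sketch. It assumes the standard Lempel-Ziv (1976) exhaustive-history parsing; whether the browser uses exactly this variant is an assumption, and the function names `lz76_complexity` and `normalized_lz` are hypothetical.

```python
import math

def lz76_complexity(w: str) -> int:
    """Number of phrases in the LZ76 exhaustive-history parsing of w."""
    n = len(w)
    i = 0          # start of the current phrase
    phrases = 0
    while i < n:
        length = 1
        # Extend the phrase while it still occurs starting at an earlier
        # position (overlap with the phrase itself is allowed).
        while i + length <= n and w.find(w[i:i + length], 0, i + length - 1) != -1:
            length += 1
        phrases += 1
        i += length
    return phrases

def normalized_lz(lz: int, n: int, alphabet_size: int = 20) -> float:
    """LZ complexity divided by n / log_a(n) for an alphabet of size a."""
    return lz / (n / math.log(n, alphabet_size))

# Ratios for the three table rows, using the tabulated LZ(w) and n.
for code, lz, n in [("5KIR", 224, 551), ("9JDG", 245, 620), ("7NRY", 147, 327)]:
    print(code, round(normalized_lz(lz, n), 4))
```

The computed ratios are about 0.857, 0.848 and 0.869, so the tabulated 0.85, 0.84 and 0.86 appear to be truncated rather than rounded.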

Let \(P_w(k)\) be the set of distinct subwords (intervals) of length \(k\) in a word \(w\), and let \(p_w(k)=|P_w(k)|\) be its cardinality. Let \(f(c)\) be the FASTA sequence with 4-symbol Protein Data Bank code \(c\).

\(|P_{f(5KIR_1)}(2) \setminus P_{f(9JDG_1)}(2)|=69\), \(|P_{f(9JDG_1)}(2) \setminus P_{f(5KIR_1)}(2)|=70\).

Let \( Z_k(x,y)=|P_x(k)\setminus P_y(k)|+|P_y(k)\setminus P_x(k)| \) be the size of the symmetric difference of \(P_x(k)\) and \(P_y(k)\), that is, an LZ76-style (set of subwords) Jaccard distance numerator for \(P(k)\).

Hydrophobic-polar version of Sequence 1:
01000010000110101110000000000110100000101100101110101001001100101110110011110011100110000011001100010010001011001000001111110001011110100011000011001110001110101001111111001000110000001111001110110100101001100001011001010001101010110100001011011011001011110011111111110101110000010011000010110001100001111100101110001001010010101010111000100000111010010010111100101000000000110000111001100110010001110111100111110010010100000100001000000111010001001010001010101101010110101111100101011110011011111010111101100110101001110111011001010011000101011001011
Pair \(Z_2\) Length of longest common subword
5KIR_1,9JDG_1 139 4
5KIR_1,7NRY_1 176 3
9JDG_1,7NRY_1 159 3
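
The \(Z_k\) column is the size of a symmetric difference of subword sets, so it can be sketched in a few lines. The function names below are hypothetical, and the demonstration uses toy strings rather than the full protein sequences.

```python
def subwords(w: str, k: int) -> set[str]:
    """Set of distinct contiguous subwords of length k in w."""
    return {w[i:i + k] for i in range(len(w) - k + 1)}

def z_distance(x: str, y: str, k: int) -> int:
    """Z_k(x, y): subwords of length k found in exactly one of x and y."""
    px, py = subwords(x, k), subwords(y, k)
    # |P_x \ P_y| + |P_y \ P_x| equals the size of the symmetric difference.
    return len(px - py) + len(py - px)

# Toy example: "bb" is the only 2-letter subword not shared by both words.
print(z_distance("abab", "abba", 2))
```

For the pair 5KIR_1, 9JDG_1 the two one-sided differences reported above are 69 and 70, which add up to the tabulated \(Z_2=139\).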

Newick tree

 
(
	7NRY_1:88.12,
	(
		5KIR_1:69.5,9JDG_1:69.5
	):18.62
);

Let \(d\) and \(d_1\) denote the Otu--Sayood distances of the same names. (This makes the 4TYN sequence AAAAAA a close match...)
Roughly speaking, an expected distance is \((0.85)(0.8)(\frac{1171 }{\log_{20} 1171}-\frac{551}{\log_{20}551})\approx 159.\)
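
The arithmetic behind this estimate can be checked with a short sketch. The factors 0.85 and 0.8 are taken from the text as given, 1171 is the combined length 551 + 620 of the two sequences, and the helper name `lz_scale` is an assumption.

```python
import math

def lz_scale(n: int, alphabet_size: int = 20) -> float:
    """The normalizing quantity n / log_a(n) for alphabet size a."""
    return n / math.log(n, alphabet_size)

n1, n2 = 551, 620  # lengths of the 5KIR_1 and 9JDG_1 sequences
estimate = 0.85 * 0.8 * (lz_scale(n1 + n2) - lz_scale(n1))
print(round(estimate, 2))
```

The result is slightly below 160, in line with the value 159 quoted in the text if that value is truncated.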
Status Protein1 Protein2 d d1/2
Query variables 5KIR_1 9JDG_1 203 193.5

In notation analogous to [Theorem 16, Kjos-Hanssen, Niraula and Yoon (2022)],
\[ \delta= \alpha \mathrm{min} + (1-\alpha) \mathrm{max}= \begin{cases} d &\alpha=0,\\ d_1/2 &\alpha=1/2 \end{cases} \]
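
The interpolation can be sketched as follows, assuming that min and max are taken over the two complexity increments obtained by appending each sequence to the other (so that \(d\) is their maximum and \(d_1\) their sum). The function name `delta` and the inferred smaller increment 184 are assumptions; the latter is derived from the tabulated values as \(2(193.5)-203\).

```python
def delta(a: float, b: float, alpha: float) -> float:
    """Interpolate between min(a, b) and max(a, b) with weight alpha on the min."""
    return alpha * min(a, b) + (1 - alpha) * max(a, b)

# Increments consistent with the table row for 5KIR_1 and 9JDG_1:
# the larger one is d = 203; the smaller one, 184, is inferred (assumption).
larger, smaller = 203, 184
print(delta(larger, smaller, 0))    # alpha = 0 recovers d
print(delta(larger, smaller, 0.5))  # alpha = 1/2 recovers d1 / 2
```

At \(\alpha=0\) only the maximum contributes, while at \(\alpha=1/2\) the result is the average of the two increments.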