CoV2D BrowserTM

Parikh vectors
4JEU_1 5BYM_1 8EWB_1 Letter Amino acid
16 33 11 N Asparagine
21 24 9 Q Glutamine
45 44 15 I Isoleucine
60 52 19 L Leucine
37 23 17 T Threonine
29 8 15 R Arginine
7 9 1 C Cysteine
44 15 27 E Glutamic acid
52 30 7 K Lysine
16 3 5 H Histidine
5 0 5 W Tryptophan
31 24 22 V Valine
13 26 9 F Phenylalanine
28 9 16 P Proline
44 28 11 S Serine
23 14 6 Y Tyrosine
37 16 31 A Alanine
41 21 13 D Aspartic acid
23 17 11 G Glycine
18 4 2 M Methionine
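
As a minimal sketch of how such a table could be produced: the Parikh vector of a sequence is its vector of letter counts. The function name `parikh_vector` and the alphabetical letter ordering are assumptions made for illustration, not the browser's own code.

```python
from collections import Counter

# The 20 standard amino-acid one-letter codes.
# The ordering is an arbitrary (alphabetical) choice for this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def parikh_vector(sequence: str) -> dict[str, int]:
    """Count how often each amino-acid letter occurs in `sequence`."""
    counts = Counter(sequence)
    # Letters that never occur are reported with count 0,
    # like the W entry for 5BYM_1 in the table.
    return {letter: counts.get(letter, 0) for letter in AMINO_ACIDS}
```

The entries of each column sum to the length of the corresponding sequence.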

>4JEU_1|Chain A|Syntaxin-binding protein 1|Rattus norvegicus (10116)
>5BYM_1|Chain A|Suppressor protein MPT5|Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (559292)
>8EWB_1|Chain A[auth BA]|40S ribosomal protein S0-A|Saccharomyces cerevisiae (4932)
Protein code \(c\) LZ-complexity \(\mathrm{LZ}(w)\) Length \(n=|w|\) \(\frac{\mathrm{LZ}(w)}{n /\log_{20} n}\) \(p_w(1)\) \(p_w(2)\) \(p_w(3)\) Sequence \(w=f(c)\)
4JEU , Knot 237 590 0.85 40 264 555
PIGLKAVVGEKIMHDVIKKVKKKGEWKVLVVDQLSMRMLSSCCKMTDIMTEGITIVEDINKRREPLPSLEAVYLITPSEKSVHSLISDFKDPPTAKYRAAHVFFTDSCPDALFNELVKSRAAKVIKTLTEINIAFLPYESQVYSLDSADSFQSFYSPHKAQMKNPILERLAEQIATLCATLKEYPAVRYRGEYKDNALLAQLIQDKLDAYKADDPTMGEGPDKARSQLLILDRGFDPSSPVLHELTFQAMSYDLLPIENDVYKYETSGIGEARVKEVLLDEDDDLWIALRHKHIAEVSQEVTRSLKDFSSSKRMNTGEKTTMRDLSQMLKKMPQYQKELSKYSTHLHLAEDCMKHYQGTVDKLCRVEQDLAMGTDAEGEKIKDPMRAIVPILLDANVSTYDKIRIILLYIFLKNGITEENLNKLIQHAQIPPEDSEIITNMAHLGVPIVTDSTLRRRSKPERKERISEQTYQLSRWTPIIKDIMEDTIEDKLDTKHYPYISTRSSASFSTTAVSARYGHWHKNKAPGEYRSGPRLIIFILGGVSLNEMRCAYEVTQANGKWEVLIGSTHILTPQKLLDTLKKLNKTDEEI
5BYM , Knot 165 400 0.82 38 192 373
SMVEISALPLRDLDYIKLATDQFGCRFLQKKLETPSESNMVRDLMYEQIKPFFLDLILDPFGNYLVQKLCDYLTAEQKTLLIQTIYPNVFQISINQYGTRSLQKIIDTVDNEVQIDLIIKGFSQEFTSIEQVVTLINDLNGNHVIQKCIFKFSPSKFGFIIDAIVEQNNIITISTHKHGCCVLQKLLSVCTLQQIFKISVKIVQFLPGLINDQFGNYIIQFLLDIKELDFYLLAELFNRLSNELCQLSCLKFSSNVVEKFIKKLFRIITGFIVNNNGGASQRTAVASDDVINASMNILLTTIDIFTVNLNVLIRDNFGNYALQTLLDVKNYSPLLAYNKNSNAIGQNSSSTLNYGNFCNDFSLKIGNLIVLTKELLPSIKTTSYAKKIKLKVKAYAEATG
8EWB , Knot 113 252 0.82 40 162 238
MSLPATFDLTPEDAQLLLAANTHLGARNVQVHQEPYVFNARPDGVHVINVGKTWEKLVLAARIIAAIPNPEDVVAISSRTFGQRAVLKFAAHTGATPIAGRFTPGSFTNYITRSFKEPRLVIVTDPRSDAQAIKEASYVNIPVIALTDLDSPSEFVDVAIPCNNRGKHSIGLIWYLLAREVLRLRGALVDRTQPWSIMPDLYFYRDPEEVEQQVAEEATTEEAGEEEAKEEVTEEQAEATEWAEENADNVEW
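
The LZ-complexity and normalized-ratio columns can be illustrated with a short sketch. It assumes that \(\mathrm{LZ}(w)\) counts the phrases of the Lempel-Ziv (1976) exhaustive-history parsing; the page does not state which variant it uses, so the function below need not reproduce the tabulated values exactly. The names `lz76_complexity` and `normalized_lz` are hypothetical.

```python
import math


def lz76_complexity(w: str) -> int:
    """Number of phrases in the LZ76 exhaustive-history parsing of w."""
    i, phrases, n = 0, 0, len(w)
    while i < n:
        length = 1
        # Extend the current phrase while it can still be copied from text
        # starting at an earlier position (the copy may overlap the phrase).
        while i + length <= n and w[i:i + length] in w[:i + length - 1]:
            length += 1
        phrases += 1
        i += length
    return phrases


def normalized_lz(lz: int, n: int, alphabet_size: int = 20) -> float:
    """Return LZ(w) / (n / log_k n), with k = 20 for amino acids."""
    return lz / (n / math.log(n, alphabet_size))
```

For instance, `normalized_lz(165, 400)` evaluates to 0.825, which agrees with the 0.82 listed for 5BYM up to the rounding or truncation used in the table.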

Let \(P_w(k)\) be the set of distinct subwords (intervals) of length \(k\) in a word \(w\). Let \(p_w(k)\) be the cardinality of \(P_w(k)\). Let \(f(c)\) be the FASTA sequence with 4-symbol Protein Data Bank code \(c\).
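
A minimal sketch of these two definitions, assuming that "subword" means a contiguous block of a fixed length; the names `subwords` and `subword_complexity` are illustrative only.

```python
def subwords(w: str, k: int) -> set[str]:
    """Set of distinct length-k subwords (contiguous blocks) of w."""
    return {w[i:i + k] for i in range(len(w) - k + 1)}


def subword_complexity(w: str, k: int) -> int:
    """Cardinality of the set of distinct length-k subwords of w."""
    return len(subwords(w, k))
```

A word of length \(n\) has at most \(n-k+1\) distinct subwords of length \(k\), so for example `subword_complexity("ABAB", 2)` is 2 rather than 3 because `AB` repeats.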

\(|P_{f(4JEU_1)}(2) \setminus P_{f(5BYM_1)}(2)|=120\), \(|P_{f(5BYM_1)}(2) \setminus P_{f(4JEU_1)}(2)|=48\). Let \( Z_k(x,y)=|P_x(k)\setminus P_y(k)|+|P_y(k)\setminus P_x(k)| \) be an LZ76-style (set of subwords) Jaccard distance numerator for \(P(k)\).

Hydrophobic-polar version of Sequence 1:
11110111100110011001000101011110010101100000100110011011001000001110101101101000010011001001101000110111000010111001100011011001001011111000010010010010010010010100111001100110101010001110001000001111011000101001001011011001000111100110100111001010110001111000100000011101010011100000111110000110100010001001000001001000010010011001100000100000010110001000010100100100011110010100100110111111101010000010111101110011000010011001011100001100110111111000010000010000010000001001011100110001000100000101000001010001101001010000111000011011111111101001001001001010101111000110100110010010000001
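
The 0/1 string above is consistent with writing 1 for a hydrophobic residue and 0 for a polar one. A minimal sketch follows; the hydrophobic set A, F, G, I, L, M, P, V, W is an assumption inferred by comparing the start of the 4JEU_1 sequence with the start of the string, not a definition stated on the page.

```python
# Assumed hydrophobic residues (inferred from the displayed string, see above).
HYDROPHOBIC = set("AFGILMPVW")


def hp_encode(sequence: str) -> str:
    """Map each residue to '1' (hydrophobic) or '0' (polar)."""
    return "".join("1" if residue in HYDROPHOBIC else "0" for residue in sequence)
```

Under this assumption, `hp_encode("PIGLKAVVGEKIMHDVIKK")` gives `1111011110011001100`, the first 19 symbols of the string.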
Pair \(Z_2\) Length of longest common substring
4JEU_1,5BYM_1 168 4
4JEU_1,8EWB_1 176 4
5BYM_1,8EWB_1 168 4
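
The \(Z_k\) column is the size of the symmetric difference of two subword sets. A minimal sketch, with hypothetical helper names `subword_set` and `z_distance`:

```python
def subword_set(w: str, k: int) -> set[str]:
    """Set of distinct length-k subwords (contiguous blocks) of w."""
    return {w[i:i + k] for i in range(len(w) - k + 1)}


def z_distance(x: str, y: str, k: int) -> int:
    """Z_k(x, y): subwords of length k occurring in exactly one of x and y."""
    px, py = subword_set(x, k), subword_set(y, k)
    # Size of the symmetric difference, written as the sum of the two
    # one-sided differences to mirror the definition.
    return len(px - py) + len(py - px)
```

For the pair 4JEU_1, 5BYM_1 the two one-sided differences are 120 and 48, which sum to the tabulated \(Z_2 = 168\).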

Newick tree

 
(8EWB_1:86.68,(4JEU_1:84,5BYM_1:84):2.68);

Let d be the Otu--Sayood distance d.
Let d1 be the Otu--Sayood distance d1. (This makes the 4TYN sequence AAAAAA a close match...)
Roughly speaking, an expected distance is \((0.85)(0.8)(\frac{990 }{\log_{20} 990}-\frac{400}{\log_{20}400})\approx 156\), where \(990=590+400\) is the combined length of the two sequences.
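
The arithmetic in this estimate can be checked with a few lines. The constants 0.85 and 0.8 are taken from the text as given; the function names are illustrative.

```python
import math


def lz_scale(n: int, alphabet_size: int = 20) -> float:
    """The normalizing quantity n / log_k n used for LZ-complexity."""
    return n / math.log(n, alphabet_size)


def expected_distance(n_concat: int, n_single: int,
                      c1: float = 0.85, c2: float = 0.8) -> float:
    """Evaluate c1 * c2 * (n_concat / log n_concat - n_single / log n_single)."""
    return c1 * c2 * (lz_scale(n_concat) - lz_scale(n_single))
```

Here `lz_scale(400)` is 200 because \(\log_{20} 400 = 2\), and `expected_distance(990, 400)` rounds to 156.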
Status Protein1 Protein2 d d1/2
Query variables 4JEU_1 5BYM_1 201 163
Was not able to put for d
Was not able to put for d1

In notation analogous to [Theorem 16, Kjos-Hanssen, Niraula and Yoon (2022)],
\[ \delta= \alpha \mathrm{min} + (1-\alpha) \mathrm{max}= \begin{cases} d &\alpha=0,\\ d_1/2 &\alpha=1/2 \end{cases} \]
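
A minimal sketch of this interpolation. It assumes that min and max range over the two one-sided complexity increments \(c(xy)-c(x)\) and \(c(yx)-c(y)\), with \(d\) their maximum and \(d_1\) their sum, as in Otu and Sayood's definitions; the page does not spell this out. The function names and the generic complexity arguments `c_x`, `c_y`, `c_xy`, `c_yx` are hypothetical.

```python
def one_sided_increments(c_x: int, c_y: int, c_xy: int, c_yx: int) -> tuple[int, int]:
    """Return (c(xy) - c(x), c(yx) - c(y)) from precomputed complexities."""
    return c_xy - c_x, c_yx - c_y


def delta(alpha: float, a: float, b: float) -> float:
    """alpha * min(a, b) + (1 - alpha) * max(a, b)."""
    return alpha * min(a, b) + (1 - alpha) * max(a, b)


def otu_sayood_d(a: float, b: float) -> float:
    """Assumed definition: d is the larger increment (alpha = 0)."""
    return delta(0.0, a, b)


def otu_sayood_d1(a: float, b: float) -> float:
    """Assumed definition: d1 is the sum of the increments, so d1/2 is alpha = 1/2."""
    return a + b
```

Under these assumptions, the tabulated values \(d=201\) and \(d_1/2=163\) for 4JEU_1 and 5BYM_1 would correspond to increments 201 and \(2\cdot 163-201=125\).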