CoV2D Browser™

Parikh vectors
6CAJ_1 3KBD_1 7OTT_1 Letter Amino acid
49 2 65 A Alanine
44 0 22 D Aspartic acid
46 3 30 G Glycine
74 0 36 L Leucine
32 0 31 P Proline
9 0 5 W Tryptophan
62 0 41 V Valine
31 0 14 N Asparagine
35 0 13 Q Glutamine
33 0 23 I Isoleucine
31 0 36 K Lysine
38 0 16 R Arginine
26 0 11 F Phenylalanine
17 0 5 Y Tyrosine
13 6 1 C Cysteine
60 0 38 E Glutamic acid
18 0 5 H Histidine
15 0 8 M Methionine
58 0 26 S Serine
30 5 33 T Threonine
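
The Parikh vector of a sequence simply counts how often each letter occurs. A minimal sketch, assuming the letter order of the table above; the function name `parikh_vector` and the constant `ALPHABET` are illustrative and not part of the CoV2D code:

```python
from collections import Counter

# Letter order copied from the rows of the Parikh vector table.
ALPHABET = "ADGLPWVNQIKRFYCEHMST"

def parikh_vector(w: str, alphabet: str = ALPHABET) -> list[int]:
    """Return the number of occurrences of each alphabet letter in w."""
    counts = Counter(w)
    return [counts[a] for a in alphabet]

# The 3KBD_1 sequence listed on this page.
dna = "CTGCTCACTTTCCAGG"
vector = parikh_vector(dna)
print(dict(zip(ALPHABET, vector)))
```

For the 3KBD_1 sequence this yields A = 2, G = 3, C = 6, T = 5 and zero elsewhere, in agreement with the middle column of the table.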

>6CAJ_1|Chains A, F[auth B]|Translation initiation factor eIF-2B subunit epsilon|Homo sapiens (9606)
>3KBD_1|Chain A|DNA (5'-D(*CP*TP*GP*CP*TP*CP*AP*CP*TP*TP*TP*CP*CP*AP*GP*G)-3')|
>7OTT_1|Chain A|Acetyltransferase component of pyruvate dehydrogenase complex|Chaetomium thermophilum (strain DSM 1495 / CBS 144.50 / IMI 039719) (759272)
Protein code \(c\) LZ-complexity \(\mathrm{LZ}(w)\) Length \(n=|w|\) \(\frac{\mathrm{LZ}(w)}{n /\log_{20} n}\) \(p_w(1)\) \(p_w(2)\) \(p_w(3)\) Sequence \(w=f(c)\)
6CAJ , Knot 274 721 0.83 40 292 640
MAAPVVAPPGVVVSRANKRSGAGPGGSGGGGARGAEEEPPPPLQAVLVADSFDRRFFPISKDQPRVLLPLANVALIDYTLEFLTATGVQETFVFCCWKAAQIKEHLLKSKWCRPTSLNVVRIITSELYRSLGDVLRDVDAKALVRSDFLLVYGDVISNINITRALEEHRLRRKLEKNVSVMTMIFKESSPSHPTRCHEDNVVVAVDSTTNRVLHFQKTQGLRRFAFPLSLFQGSSDGVEVRYDLLDCHISICSPQVAQLFTDNFDYQTRDDFVRGLLVNEEILGNQIHMHVTAKEYGARVSNLHMYSAVCADVIRRWVYPLTPEANFTDSTTQSCTHSRHNIYRGPEVSLGHGSILEENVLLGSGTVIGSNCFITNSVIGPGCHIGDNVVLDQTYLWQGVRVAAGAQIHQSLLCDNAEVKERVTLKPRSVLTSQVVVGPNITLPEGSVISLHPPDAEEDEDDGEFSDDSGADQEKDKVKMKGYNPAEVGAAGKGYLWKAAGMNMEEEEELQQNLWGLKINMEEESESESEQSMDSEEPDSRGGSPQMDDIKVFQNEVLGTLQRGKEENISCDNLVLEINSLKYAYNVSLKEVMQVLSHVVLEFPLQQMDSPLDSSRYCALLLPLLKAWSPVFRNYIKRAADHLEALAAIEDFFLEHEALGISMAKVLMAFYQLEILAEETILSWFSQRDTTDKGQQLRKNQQLQRFIQWLKEAEEESSEDD
3KBD , Knot 9 16 0.52 8 10 14
CTGCTCACTTTCCAGG
7OTT , Knot 182 459 0.81 40 206 409
MLAQVLRRQALQHVRLARAAAPSLTRWYASYPPHTIVKMPALSPTMTSGNIGAWQKKPGDAITPGEVLVEIETDKAQMDFEFQEEGVLAKILKETGEKDVAVGSPIAVLVEEGTDINAFQNFTLEDAGGDAAAPAAPAKEELAKAETAPTPASTSAPEPEETTSTGKLEPALDREPNVSFAAKKLAHELDVPLKALKGTGPGGKITEEDVKKAASAPAAAAAAPGAAYQDIPISNMRKTIATRLKESVSENPHFFVTSELSVSKLLKLRQALNSSAEGRYKLSVNDFLIKAIAVACKRVPAVNSSWRDGVIRQFDTVDVSVAVATPTGLITPIVKGVEAKGLETISATVKELAKKARDGKLKPEDYQGGTISISNMGMNPAVERFTAIINPPQAAILAVGTTKKVAVPVENEDGTTGVEWDDQIVVTASFDHKVVDGAVGAEWMRELKKVVENPLELLL
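
The LZ-complexity column and the normalized ratio can be sketched as follows. This assumes that \(\mathrm{LZ}(w)\) is the number of components in the LZ76 exhaustive-history parse; that assumption reproduces the tabulated values 9 and 0.52 for 3KBD, but it has not been checked here against the longer sequences. The function names are illustrative.

```python
import math

def lz76(w: str) -> int:
    """Count components of an LZ76-style parse of w.

    Each component is the shortest block starting at position i that does
    not occur starting at an earlier position (overlap allowed); a final
    block that was already seen still counts as one component.
    """
    i, components, n = 0, 0, len(w)
    while i < n:
        length = 1
        # Grow the block while it still occurs within w[0 : i + length - 1],
        # i.e. while it has an occurrence starting before position i.
        while i + length <= n and w.find(w[i:i + length], 0, i + length - 1) != -1:
            length += 1
        components += 1
        i += length
    return components

def normalized_lz(w: str, alphabet_size: int = 20) -> float:
    """LZ(w) divided by n / log_20(n), as in the table header."""
    n = len(w)
    return lz76(w) / (n / math.log(n, alphabet_size))

dna = "CTGCTCACTTTCCAGG"  # the 3KBD sequence
print(lz76(dna), round(normalized_lz(dna), 2))
```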

Let \(P_w(k)\) be the set of distinct subwords (intervals) of length \(k\) in a word \(w\), and let \(p_w(k)\) be the cardinality of \(P_w(k)\). Let \(f(c)\) be the FASTA sequence with 4-symbol Protein Data Bank code \(c\).
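
The subword sets and their cardinalities can be computed directly from this definition. A minimal sketch, with the illustrative names `subwords` and `subword_count`:

```python
def subwords(w: str, k: int) -> set[str]:
    """Set of distinct contiguous subwords of length k in w."""
    return {w[i:i + k] for i in range(len(w) - k + 1)}

def subword_count(w: str, k: int) -> int:
    """Cardinality of the set of length-k subwords of w."""
    return len(subwords(w, k))

dna = "CTGCTCACTTTCCAGG"  # the 3KBD sequence
print(subword_count(dna, 2), subword_count(dna, 3))
```

For the 3KBD sequence this gives 10 subwords of length 2 and 14 of length 3, matching the corresponding table entries.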

\(|P_{f(6CAJ_1)}(2) \setminus P_{f(3KBD_1)}(2)|=284\), \(|P_{f(3KBD_1)}(2) \setminus P_{f(6CAJ_1)}(2)|=2\). Let \( Z_k(x,y)=|P_x(k)\setminus P_y(k)|+|P_y(k)\setminus P_x(k)| \), the size of the symmetric difference of \(P_x(k)\) and \(P_y(k)\); this is an LZ76-style (set of subwords) numerator of a Jaccard distance for \(P(k)\).

Hydrophobic-polar version of Sequence 1:
1111111111111001000011111101111101100011111011111001000111100001011111101111000101101011000111001011010001100010010010110110001000110110010101110001111010110010100110000100010001011011100001001000000011111000000110100001100111110110100011010001100010100101101100010000000110111100011100101010100011010010100110101100110110101010000000000000010011010110101100011110101110001100011111001100111000011011011111010001100010100010101001100011111010110101101011010000001010000110000001010100110111110101101111010000010001111010100000000000100001000110101001011000111010010000100001110100100100101001101100111011100100110000001111111011011100010011001011111001110001111011011111001011100011011000000001001000001001101100100000000
Pair \(Z_2\) Length of longest common subword
6CAJ_1,3KBD_1 286 2
6CAJ_1,7OTT_1 152 4
3KBD_1,7OTT_1 206 3
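
Both columns of the pair table can be sketched in a few lines. The second function assumes that the last column counts contiguous common subwords; the small values 2, 4 and 3 suggest this rather than a general (non-contiguous) subsequence. The function names and the short string `toy` are hypothetical and serve only as an illustration.

```python
def subwords(w: str, k: int) -> set[str]:
    """Set of distinct contiguous subwords of length k in w."""
    return {w[i:i + k] for i in range(len(w) - k + 1)}

def z_distance(x: str, y: str, k: int) -> int:
    """Z_k(x, y): size of the symmetric difference of the k-subword sets."""
    px, py = subwords(x, k), subwords(y, k)
    return len(px - py) + len(py - px)

def longest_common_subword(x: str, y: str) -> int:
    """Length of the longest contiguous block occurring in both x and y."""
    best = 0
    prev = [0] * (len(y) + 1)
    for a in x:
        cur = [0] * (len(y) + 1)
        for j, b in enumerate(y, start=1):
            if a == b:
                # Extend the common block ending just before these positions.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

dna = "CTGCTCACTTTCCAGG"  # the 3KBD sequence
toy = "ACGT"              # hypothetical second sequence
print(z_distance(dna, toy, 2), longest_common_subword(dna, toy))
```

Here the only shared 2-subword is AC, so \(Z_2\) is 9 + 2 = 11 and the longest common subword has length 2.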

Newick tree

 
[
	3KBD_1:13.04,
	[
		6CAJ_1:76,7OTT_1:76
	]:61.04
]

Let \(d\) and \(d_1\) be the Otu--Sayood distances \(d\) and \(d_1\), respectively. (This makes the 4TYN sequence AAAAAA a close match...)
Roughly speaking, an expected distance is \((0.85)(0.8)(\frac{737 }{\log_{20} 737}-\frac{16}{\log_{20}16})\approx 215.\)
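
The arithmetic in this estimate can be checked numerically. In the sketch below the factors 0.85 and 0.8 are taken from the formula as given; reading 737 as the concatenated length 721 + 16 is an assumption, and the helper name `lz_scale` is illustrative.

```python
import math

def lz_scale(n: int, alphabet_size: int = 20) -> float:
    """n / log_20(n), the normalizing quantity used for LZ-complexity."""
    return n / math.log(n, alphabet_size)

n_protein, n_dna = 721, 16  # lengths of the 6CAJ and 3KBD sequences
expected = 0.85 * 0.8 * (lz_scale(n_protein + n_dna) - lz_scale(n_dna))
print(expected)
```

The result lies between 215 and 216, consistent with the value 215 quoted above.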
Status Protein1 Protein2 d d1/2
Query variables 6CAJ_1 3KBD_1 270 138

In notation analogous to [Theorem 16, Kjos-Hanssen, Niraula and Yoon (2022)],
\[ \delta = \alpha \min + (1-\alpha) \max = \begin{cases} d & \alpha=0,\\ d_1/2 & \alpha=1/2 \end{cases} \]
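
The interpolation can be illustrated with the query row above (\(d=270\), \(d_1/2=138\)). The sketch assumes that \(\max\) and \(\min\) are the larger and smaller of two one-sided quantities, so that \(d=\max\) and \(d_1=\min+\max\); under that assumption the smaller quantity is inferred as \(2\cdot 138-270=6\). This inference and the function name `delta` are illustrative and do not come from the page itself.

```python
def delta(alpha: float, lo: float, hi: float) -> float:
    """Convex combination alpha * min + (1 - alpha) * max."""
    return alpha * lo + (1 - alpha) * hi

hi = 270             # d from the query row (the max term)
lo = 2 * 138 - hi    # inferred min term, assuming d1 = min + max

print(delta(0, lo, hi))    # alpha = 0 recovers d
print(delta(0.5, lo, hi))  # alpha = 1/2 recovers d1 / 2
```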