glgA Resolved · high auto-curated

H37Rv Rv1212c · MTBC0 mtbc0_001300 · 387 aa · 1362940–1364103 (-) · RefSeq NP_215728.1

Annotation: from legacy to revised

Legacy (H37Rv / Mycobrowser): capsular glucan synthase
MTBC0 PGAP re-annotation: glycogen synthase
Revised (this work): Glycogen synthase. Pfam: Glyco_transf_4 (PF13439.13), Glyco_trans_4_4 (PF13579.13), Glyco_transf_5 (PF08323.18), GT4-conflict (PF20706.4), Glycos_transf_1 (PF00534.27), Glyco_trans_1_4 (PF13692.13), Glyco_trans_1_2 (PF13524.13).

Auto-curated: this verdict and function were generated by rules from PGAP + Pfam + Foldseek and have not been hand-reviewed.

Curated reference (UniProt)

UniProt P9WMZ1 SwissProt · reviewed · Evidence at protein level
UniProt name: Alpha-maltose-1-phosphate synthase
EC (curated): EC 2.4.1.342
Curated function: Involved in the biosynthesis of the maltose-1-phosphate (M1P) building block required for alpha-glucan production by the key enzyme GlgE. Catalyzes the formation of an alpha-1,4 linkage between glucose from ADP-glucose and glucose 1-phosphate (G1P) to yield maltose-1-phosphate (M1P). Also able to catalyze the elongation of the non-reducing ends of glycogen, maltodextrin and maltoheptaose using ADP-glucose as sugar donor, however the rate of the reaction appears to be too low to be physiologically relevant. GlgM is also able to use UDP-glucose as sugar donor with G1P, however, it is less effici.

Functional vocabulary (eggNOG-mapper, orthology transfer)

COG category: G (Carbohydrate transport and metabolism)
Preferred name: glgA
eggNOG description: Glycogen synthase
Orthologous group: COG0297
EC number: EC 2.4.1.342
KEGG orthology: K16148
KEGG pathways: map00500, map01100
CAZy family: GT4
Gene Ontology (28): GO:0000271, GO:0003674, GO:0003824, GO:0005975, GO:0005976, GO:0006073, GO:0008150, GO:0008152, GO:0009058, GO:0009059, GO:0009250, GO:0009987 +16 more

Orthology-based transfer (eggNOG 5.0.2, diamond). EC/KO/GO/CAZy are computed annotations, not manual curation; cross-check against the primary literature before treating a specific reaction as established.

Conservation & selection (intra-MTBC, 145 209 strains)

pN/pS: 2.134 · diversifying/relaxed
Polymorphic sites (≥ 0.1% of strains): 2 synonymous, 12 missense, 0 nonsense, 0 frameshift

pN/pS is computed from segregating SNPs (singletons removed), normalised by the number of possible sites. A low pN/pS indicates purifying selection (a strong signal that a "hypothetical" is a real, constrained gene). A high pN/pS is ambiguous: both relaxed constraint and positive selection (drug resistance, antigenic variation) inflate it; for example, rpoB/katG/pncA score high here because of resistance, not loss of function. A clonal disruption (one allele over a clade) suggests lineage pseudogenisation; a convergent one (many independent alleles) is typical of resistance loss-of-function.
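The normalisation described above can be sketched as follows. This is a minimal illustration, not the pipeline used for this record: the function name, its arguments, and the example counts are hypothetical, and the numbers of possible nonsynonymous and synonymous sites for this gene are not given on this page, so the reported 2.134 cannot be reproduced from the polymorphic-site counts alone.

```python
def pn_ps(n_nonsyn_snps, n_syn_snps, nonsyn_sites, syn_sites):
    """Ratio of per-site nonsynonymous to per-site synonymous polymorphism.

    n_nonsyn_snps / n_syn_snps: counts of segregating SNPs after filtering
    (for example, singletons removed).
    nonsyn_sites / syn_sites: numbers of possible sites of each class in the
    coding sequence (assumed to be computed elsewhere).
    Returns None when the ratio is undefined (no synonymous SNPs or no sites).
    """
    if n_syn_snps == 0 or nonsyn_sites == 0 or syn_sites == 0:
        return None
    pn = n_nonsyn_snps / nonsyn_sites  # nonsynonymous SNPs per possible site
    ps = n_syn_snps / syn_sites        # synonymous SNPs per possible site
    return pn / ps


# Hypothetical counts, chosen only to show the arithmetic.
example = pn_ps(n_nonsyn_snps=6, n_syn_snps=3, nonsyn_sites=900, syn_sites=300)
print(example)
```

Because nonsynonymous sites outnumber synonymous sites in most coding sequences, the raw missense-to-synonymous SNP count ratio overstates pN/pS; dividing each count by its number of possible sites corrects for that.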

Domains (Pfam, hmmscan --cut_ga)

Pfam             Accession   i-Evalue  Residues  Description
Glyco_transf_4   PF13439.13  4.6e-27   15–176    Glycosyltransferase Family 4
Glyco_trans_4_4  PF13579.13  2.3e-13   16–172    Glycosyl transferase 4-like domain
Glyco_transf_5   PF08323.18  3.6e-05   61–156    Starch synthase catalytic domain
GT4-conflict     PF20706.4   6.6e-13   126–376   Family 4 Glycosyltransferase in conflict systems
Glycos_transf_1  PF00534.27  3.3e-29   193–364   Glycosyl transferases group 1
Glyco_trans_1_4  PF13692.13  4.6e-25   200–347   Glycosyl transferases group 1
Glyco_trans_1_2  PF13524.13  2.7e-08   270–380   Glycosyl transferase-like

Functional interaction network (STRING v12, guilt-by-association)

Closest characterised functional partner: glgC (glucose-1-phosphate adenylyltransferase), high confidence from genomic context alone (score 987 excluding text-mining).

Partner       Product                                                   Score  No text-mining  Channels (≥400)
Rv1213 glgC   glucose-1-phosphate adenylyltransferase                   999    987             ctx neighborhood:765 cooccurence:465 database:900 textmining:965
Rv1327c glgE  alpha-1,4-glucan:maltose-1-phosphate maltosyltransferase  998    970             ctx cooccurence:662 database:900 textmining:959
Rv1328 glgP   glycogen phosphorylase                                    991    946             database:900 textmining:857
Rv3068c pgmA  phosphoglucomutase PgmA                                   943    940             database:900
Rv0127 mak    maltokinase                                               994    928             database:900 textmining:933
Rv1781c malQ  4-alpha-glucanotransferase                                968    914             database:900 textmining:649
Rv1326c glgB  1,4-alpha-glucan branching protein                        995    910             coexpression:410 database:777 textmining:954
Rv0993 galU   UTP--glucose-1-phosphate uridylyltransferase              992    903             database:900 textmining:922
Rv1562c treZ  malto-oligosyltrehalose trehalohydrolase                  987    834             coexpression:407 database:572 textmining:925
Rv2529 hyp    hypothetical protein                                      673    661             database:516
Rv0322 udgA   UDP-glucose 6-dehydrogenase UdgA                          556    528             coexpression:443
Rv3809c glf   UDP-galactopyranose mutase                                512    487             coexpression:470
Rv3465 rmlC   dTDP-4-dehydrorhamnose 3,5-epimerase                      503    473             coexpression:455
Rv2627c hyp   hypothetical protein                                      481    456             experimental:443
Rv2307c hyp   hypothetical protein                                      479    455             experimental:443

STRING combines evidence channels (neighborhood, fusion, cooccurrence, coexpression, experimental, database, text-mining) into a 0–1000 score. The ctx badge marks edges carried by the genomic-context channels (conserved neighborhood, fusion, phylogenetic co-occurrence), which are independent of orthology and structure and the strongest signal for an unknown gene. The no text-mining column recomputes the score from data alone, so a link that does not depend on the literature is visible. Association is a function hypothesis, not proof: corroborate with the operon context and the primary literature before assigning a function.
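How channel scores might combine into the two score columns can be sketched as below. This assumes the prior-corrected noisy-OR scheme that STRING has described for its combined score, with a prior of 0.041; neither the formula nor the prior is stated on this page, and the function name is hypothetical. The sketch also skips STRING's homology correction, and it only sees channels listed in the table (those scoring at least 400), so its results can differ slightly from the listed scores.

```python
PRIOR = 0.041  # assumed prior probability of a random link (not given on this page)


def combine_string_scores(channel_scores, prior=PRIOR):
    """Combine per-channel scores (0-1000) into one combined score (0-1000).

    Each channel is rescaled to remove the prior, the channels are merged
    as a noisy-OR (one minus the product of the complements), and the
    prior is added back once at the end.
    """
    no_evidence = 1.0
    for score in channel_scores.values():
        s = score / 1000.0
        s = max(s - prior, 0.0) / (1.0 - prior)  # remove the prior from this channel
        no_evidence *= 1.0 - s
    combined = (1.0 - no_evidence) * (1.0 - prior) + prior
    return int(combined * 1000)


# Channels listed for the glgC partner row in the table above.
glgc = {"neighborhood": 765, "cooccurence": 465, "database": 900, "textmining": 965}

with_textmining = combine_string_scores(glgc)
without_textmining = combine_string_scores(
    {name: score for name, score in glgc.items() if name != "textmining"}
)
print(with_textmining, without_textmining)
```

Dropping the text-mining entry before combining mirrors what the no text-mining column reports: a score that rests on the remaining channels only.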

Evidence

  • Legacy H37Rv annotation: capsular glucan synthase
  • MTBC0 PGAP product: glycogen synthase
  • Pfam (hmmscan --cut_ga): Glyco_transf_4 PF13439.13 (E=5e-27), Glyco_trans_4_4 PF13579.13 (E=2e-13), Glyco_transf_5 PF08323.18 (E=4e-05), GT4-conflict PF20706.4 (E=7e-13), Glycos_transf_1 PF00534.27 (E=3e-29), Glyco_trans_1_4 PF13692.13 (E=5e-25), Glyco_trans_1_2 PF13524.13 (E=3e-08)
  • (auto-curated by rules from PGAP + Pfam + Foldseek; not hand-reviewed)

Sources

  • Ancestral sequence & coordinates: Harrison LB et al. (2024), An imputed ancestral reference genome for the MTBC, doi:10.1101/2023.09.07.556366
  • Product annotation: NCBI PGAP on MTBC0; legacy from H37Rv NC_000962.3 (RefSeq NP_215728.1)
  • Domains: Pfam-A via hmmscan --cut_ga — Glyco_transf_4 (PF13439.13), Glyco_trans_4_4 (PF13579.13), Glyco_transf_5 (PF08323.18), GT4-conflict (PF20706.4), Glycos_transf_1 (PF00534.27), Glyco_trans_1_4 (PF13692.13), Glyco_trans_1_2 (PF13524.13)
  • Sequence-level signal: ESM Atlas (EvolutionaryScale × BioHub) — exploratory
  • Controlled vocabulary: eggNOG-mapper 2.1.12 (Cantalapiedra et al. 2021, doi:10.1093/molbev/msab293), eggNOG 5.0 DB (Huerta-Cepas et al. 2019) — OG COG0297
  • Curated reference: UniProt P9WMZ1 (SwissProt, reviewed; Evidence at protein level)
  • Intra-MTBC selection: pN/pS and disruption from SPDI variants of 145 209 MTBC strains (this work, local collection vs H37Rv NC_000962.3)
  • Interaction network: STRING v12.0 (Szklarczyk et al. 2023, doi:10.1093/nar/gkac1000), taxon 83332, CC-BY 4.0 — 35 functional partner(s); context anchor glgC
  • Primary literature: none located yet; annotation rests on the domain/homology sources above.

Ancestral MTBC0 protein sequence

>mtbc0_001300|Rv1212c|glgA
MRVAMLTREYPPEVYGGAGVHVTELVAYLRRLCAVDVHCMGAPRPGAFAYRPDPRLGSANAALSTLSADLVMANAASAATVVHSHTWYTALAGHLAAILYDIPHVLTAHSLEPLRPWKKEQLGGGYQVSTWVEQTAVLAANAVIAVSSAMRNDMLRVYPSLDPNLVHVIRNGIDTETWYPAGPARTGSVLAELGVDPNRPMAVFVGRITRQKGVVHLVTAAHRFRSDVQLVLCAGAADTPEVADEVRVAVAELARNRTGVFWIQDRLTIGQLREILSAATVFVCPSVYEPLGIVNLEAMACATAVVASDVGGIPEVVADGITGSLVHYDADDATGYQARLAEAVNALVADPATAERYGHAGRQRCIQEFSWAYIAEQTLDIYRKVCA