Rv3632 · Family assigned · medium

H37Rv Rv3632 · MTBC0 mtbc0_003849 · 114 aa · 4094940–4095284 (+) · RefSeq NP_218149.1

Annotation: from legacy to revised

Legacy (H37Rv / Mycobrowser): membrane protein
MTBC0 PGAP re-annotation: DUF2304 domain-containing protein
Revised (this work): Small membrane protein that potentiates galactosamine (GalN) modification of arabinogalactan. RefSeq leaves its function unassigned. Rv3632 is co-transcribed with ppgS (Rv3631, polyprenyl-phospho-N-acetyl-galactosaminyl synthase) and increases PpgS catalytic activity 40- to 50-fold, supporting GalN modification of arabinogalactan (Skovierova 2010). Peptides from this cell-envelope protein also bind host cells, consistent with a role in adhesion (Sanchez-Barinas 2019).

Curated reference (UniProt)

UniProt: I6YGT7 · TrEMBL · unreviewed · Predicted
UniProt name: Possible conserved membrane protein

Functional vocabulary (eggNOG-mapper, orthology transfer)

COG category: S (Function unknown)
eggNOG description: Uncharacterized conserved protein (DUF2304)
Orthologous group: COG2456
KEGG orthology: K09153

Orthology-based transfer (eggNOG 5.0.2, diamond). EC/KO/GO/CAZy are computed annotations, not manual curation; cross-check against the primary literature before treating a specific reaction as established.

Conservation & selection (intra-MTBC, 145 209 strains)

pN/pS: 0.688 · relaxed/neutral
Polymorphic sites (≥ 0.1% of strains): 1 synonymous, 2 missense, 0 nonsense, 0 frameshift

pN/pS is computed from segregating SNPs (singletons removed) and normalised by the number of possible sites. A low pN/pS indicates purifying selection (a strong signal that a "hypothetical" is a real, constrained gene). A high pN/pS is ambiguous: either relaxed constraint or positive selection (drug resistance, antigenic variation) can inflate it; for example, rpoB, katG and pncA score high here because of resistance, not loss of function. A clonal disruption (one allele over a clade) suggests lineage pseudogenisation; a convergent one (many independent alleles) is typical of resistance loss-of-function.
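The normalisation described above can be sketched as follows. This is a minimal illustration, not the pipeline used for this page: the site-counting rule (a simplified Nei-Gojobori count in which changes to a stop codon are treated as nonsynonymous), the function names `possible_sites` and `pn_ps`, and the toy coding sequence are all assumptions. The page does not include the Rv3632 nucleotide sequence or the SNP counts after singleton removal, so the reported value of 0.688 cannot be reproduced from this snippet.

```python
from itertools import product

# Standard genetic code in NCBI "TCAG" order (same codon assignments as table 11).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def possible_sites(cds):
    """Return (nonsynonymous_sites, synonymous_sites) for an in-frame CDS.

    Each codon position contributes the fraction of its three possible
    single-base changes that keep the amino acid. Changes to a stop codon
    are counted as nonsynonymous (a simplifying assumption).
    """
    syn = 0.0
    n_codons = 0
    usable = len(cds) - len(cds) % 3
    for i in range(0, usable, 3):
        codon = cds[i:i + 3].upper()
        aa = CODON_TABLE[codon]
        if aa == "*":
            continue  # stop codons carry no sites
        n_codons += 1
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == aa:
                    syn += 1 / 3
    return 3 * n_codons - syn, syn


def pn_ps(n_nonsyn_snps, n_syn_snps, nonsyn_sites, syn_sites):
    """pN/pS from SNP counts and possible sites; None when pS is zero."""
    if n_syn_snps == 0 or syn_sites == 0:
        return None
    pn = n_nonsyn_snps / nonsyn_sites
    ps = n_syn_snps / syn_sites
    return pn / ps


# Toy CDS (Met-Gly), not the Rv3632 sequence.
n_sites, s_sites = possible_sites("ATGGGG")
ratio = pn_ps(2, 1, n_sites, s_sites)
print(n_sites, s_sites, ratio)
```

Returning `None` when no synonymous SNP is observed keeps an undefined ratio from being mistaken for a very high one.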

Domains (Pfam, hmmscan --cut_ga)

Pfam     Accession   i-Evalue  Residues  Description
DUF2304  PF10066.15  9.1e-31   3–107     Uncharacterized conserved protein (DUF2304)

Functional interaction network (STRING v12, guilt-by-association)

Closest characterised functional partner: Rv3631 (transferase), high confidence from genomic context alone (score 971 excluding text-mining).

Partner        Product                       Score  No text-mining  Channels (≥400)
Rv3631         transferase                   996    971             ctx neighborhood:881 cooccurence:760 textmining:870
Rv3630         integral membrane protein     987    937             ctx neighborhood:818 cooccurence:664 textmining:806
Rv3779         transmembrane protein         893    787             ctx cooccurence:768 textmining:519
Rv1510 hyp     hypothetical protein          726    723             ctx cooccurence:656
Rv3629c        integral membrane protein     952    651             ctx neighborhood:646 textmining:870
Rv3633 hyp     hypothetical protein          931    496             ctx neighborhood:493 textmining:870
Rv0517         acyltransferase               474    475             ctx cooccurence:454
Rv0048c        membrane protein              474    475             ctx cooccurence:470
Rv3634c galE1  UDP-glucose 4-epimerase       868    358             textmining:804
Rv3784         dTDP-glucose 4,6-dehydratase  496    148             textmining:433
Rv3789         GtrA family protein           624    99              textmining:600
Rv3468c        dTDP-glucose 4,6-dehydratase  662    71              textmining:652
Rv0230c php    phosphotriesterase            520    47              textmining:517
Rv1077 cbs     cystathionine beta-synthase   434    47              textmining:431
Rv0007         membrane protein              626    42              textmining:626

STRING combines evidence channels (neighborhood, fusion, cooccurrence, coexpression, experimental, database, text-mining) into a 0–1000 score. The ctx badge marks edges carried by the genomic-context channels (conserved neighborhood, fusion, phylogenetic co-occurrence), which are independent of orthology and structure and the strongest signal for an unknown gene. The no text-mining column recomputes the score from data alone, so a link that does not depend on the literature is visible. Association is a function hypothesis, not proof: corroborate with the operon context and the primary literature before assigning a function.
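How per-channel scores might combine into the two score columns can be sketched as below. The page does not state the formula, so this is an assumption: it follows the scheme commonly described for STRING, in which a fixed prior (0.041) is removed from each channel, the channels are combined as independent evidence, and the prior is added back. The homology correction is omitted, the helper names are hypothetical, and only channels listed at ≥400 are available here, so results may differ from the displayed scores by a point or two.

```python
PRIOR = 0.041  # assumed STRING prior; not stated on this page


def remove_prior(score, prior=PRIOR):
    """Rescale a 0-1000 channel score to 0-1 with the prior taken out."""
    s = max(score / 1000.0, prior)
    return (s - prior) / (1.0 - prior)


def combine_channels(channels, exclude=(), prior=PRIOR):
    """Combine channel scores as independent evidence (noisy-OR).

    channels: dict mapping channel name to its 0-1000 score.
    exclude:  channel names to leave out, e.g. ("textmining",).
    """
    p_no_evidence = 1.0
    for name, score in channels.items():
        if name in exclude:
            continue
        p_no_evidence *= 1.0 - remove_prior(score, prior)
    combined = (1.0 - p_no_evidence) * (1.0 - prior) + prior
    return round(combined * 1000)


# Channel scores for the Rv3631 edge, as listed in the table above.
rv3631 = {"neighborhood": 881, "cooccurence": 760, "textmining": 870}

with_text = combine_channels(rv3631)
without_text = combine_channels(rv3631, exclude=("textmining",))
print(with_text, without_text)
```

For the Rv3631 edge this should land close to the displayed 996 and 971; a large drop between the two values marks an edge that rests mainly on literature co-mention.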

Evidence

  • Co-transcribed with ppgS; boosts PpgS activity 40-50x for arabinogalactan GalN modification (Skovierova 2010, PMID 21030587)
  • Cell-envelope protein; host-cell binding peptides (Sanchez-Barinas 2019, PMID 31111070)
  • Curated from the literature screen (project 'Still unknown gene function', 2026-06-09)

Sources

  • Ancestral sequence & coordinates: Harrison LB et al. (2024), An imputed ancestral reference genome for the MTBC, doi:10.1101/2023.09.07.556366
  • Product annotation: NCBI PGAP on MTBC0; legacy from H37Rv NC_000962.3 (RefSeq NP_218149.1)
  • Domains: Pfam-A via hmmscan --cut_ga — DUF2304 (PF10066.15)
  • Sequence-level signal: ESM Atlas (EvolutionaryScale × BioHub) — exploratory
  • Controlled vocabulary: eggNOG-mapper 2.1.12 (Cantalapiedra et al. 2021, doi:10.1093/molbev/msab293), eggNOG 5.0 DB (Huerta-Cepas et al. 2019) — OG COG2456
  • Curated reference: UniProt I6YGT7 (TrEMBL, unreviewed; Predicted)
  • Intra-MTBC selection: pN/pS and disruption from SPDI variants of 145 209 MTBC strains (this work, local collection vs H37Rv NC_000962.3)
  • Interaction network: STRING v12.0 (Szklarczyk et al. 2023, doi:10.1093/nar/gkac1000), taxon 83332, CC-BY 4.0 — 15 functional partner(s); context anchor Rv3631
  • Primary literature: Skovierova H, Larrouy-Maumus G, Pham H, Belanova M, Barilone N, Dasgupta A, Mikusova K, Gicquel B, Gilleron M, Brennan PJ, Puzo G, Nigou J, Jackson M (2010). Biosynthetic origin of the galactosamine substituent of arabinogalactan in Mycobacterium tuberculosis. J Biol Chem 285(53):41348-55. doi:10.1074/jbc.M110.188110 PMID:21030587

Ancestral MTBC0 protein sequence

>mtbc0_003849|Rv3632|
MNWIQVLLIASIIGLLFYLLRSRRSARSRAWVKVGYVLFVLAGIYAVLRPDDTTVVANWFGVRRGTDLMLYALVMAFSFTTLSTYMRFKDLELRYARIARALALEGAQAPEQCR
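
A quick consistency check between the sequence above and the header line (114 aa, coordinates 4094940–4095284) can be sketched as follows. It assumes the coordinates are 1-based, inclusive, and include the stop codon; the function name `cds_length_consistent` is hypothetical.

```python
# Ancestral MTBC0 protein sequence copied from the FASTA record above.
PROTEIN = (
    "MNWIQVLLIASIIGLLFYLLRSRRSARSRAWVKVGYVLFVLAGIYAVLRPDDTTVVANWF"
    "GVRRGTDLMLYALVMAFSFTTLSTYMRFKDLELRYARIARALALEGAQAPEQCR"
)


def cds_length_consistent(start, end, protein):
    """True if the span is a whole number of codons equal to the
    protein length plus one stop codon (1-based inclusive coordinates)."""
    nt_len = end - start + 1
    return nt_len % 3 == 0 and nt_len // 3 == len(protein) + 1


print(len(PROTEIN))
print(cds_length_consistent(4094940, 4095284, PROTEIN))
```

Under those assumptions the 345-nt span corresponds to 115 codons, that is, 114 residues plus the stop codon.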