Rv2949c · Family assigned · medium · auto-curated

H37Rv Rv2949c · MTBC0 mtbc0_003131 · 199 aa · 3321146–3321745 (-) · RefSeq NP_217465.1

Annotation: from legacy to revised

Legacy (H37Rv / Mycobrowser): chorismate pyruvate-lyase
MTBC0 PGAP re-annotation: chorismate pyruvate-lyase family protein
Revised (this work): Chorismate pyruvate-lyase family protein. Pfam: Rv2949c-like (PF01947.22).

Auto-curated: this verdict and function were generated by rules from PGAP + Pfam + Foldseek and have not been hand-reviewed.

Curated reference (UniProt)

UniProt P9WIC5 · SwissProt · reviewed · Evidence at protein level
UniProt name: Chorismate pyruvate-lyase
EC (curated): EC 4.1.3.40
Curated function: Removes the pyruvyl group from chorismate to provide 4-hydroxybenzoate (4HB). Involved in the synthesis of glycosylated p-hydroxybenzoic acid methyl esters (p-HBADs) and phenolic glycolipids (PGL) that play important roles in the pathogenesis of mycobacterial infections.

Functional vocabulary (eggNOG-mapper, orthology transfer)

COG category: H (Coenzyme transport and metabolism)
eggNOG description: Responsible for the direct conversion of chorismate to p-hydroxybenzoate, the substrate used in the production of glycosylated p-hydroxybenzoic acid methyl esters and structurally related phenolphthiocerol glycolipids. In M. tuberculosis, this is the sole enzymatic source of p-hydroxybenzoic acid.
Orthologous group: COG3161
EC number: EC 4.1.3.40
KEGG orthology: K03181
KEGG pathways: map00130, map01100, map01110
KEGG modules: M00117
Gene Ontology (34): GO:0003674, GO:0003824, GO:0006082, GO:0008150, GO:0008152, GO:0008813, GO:0009058, GO:0009273, GO:0009987, GO:0016043, GO:0016053, GO:0016829 +22 more

Orthology-based transfer (eggNOG 5.0.2, diamond). EC/KO/GO/CAZy are computed annotations, not manual curation; cross-check against the primary literature before treating a specific reaction as established.

Conservation & selection (intra-MTBC, 145 209 strains)

pN/pS: 0.855 · relaxed/neutral
Polymorphic sites (≥ 0.1% of strains): 3 synonymous, 7 missense, 1 nonsense, 0 frameshift
Disruption: 1 distinct premature-stop/frameshift site(s); most common in 0.20% of strains (290) · clonal

pN/pS from segregating SNPs (singletons removed) normalised by possible sites. Low pN/pS = purifying selection (a strong signal that a "hypothetical" is a real, constrained gene). A high pN/pS is ambiguous: relaxed constraint or positive selection (drug resistance, antigenic variation) inflate it; e.g. rpoB/katG/pncA score high here for resistance, not loss of function. A clonal disruption (one allele over a clade) suggests lineage pseudogenisation; a convergent one (many independent alleles) is typical of resistance loss-of-function.
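The normalisation described above can be sketched in a few lines of Python. This is an illustration only, not the pipeline behind this page: the function name `pn_ps` is hypothetical, the counts in the usage example are invented, and the numbers of possible nonsynonymous and synonymous sites are assumed to come from a separate codon-by-codon tally.

```python
def pn_ps(n_nonsyn_snps: int, n_syn_snps: int,
          nonsyn_sites: float, syn_sites: float) -> float:
    """Ratio of nonsynonymous to synonymous SNP densities.

    Each SNP count is divided by the number of possible sites of that
    class, so a gene is not penalised simply for having more
    nonsynonymous than synonymous positions.
    """
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    if n_syn_snps == 0:
        # The ratio is undefined without synonymous variation.
        return float("nan")
    pn = n_nonsyn_snps / nonsyn_sites  # nonsynonymous SNPs per possible site
    ps = n_syn_snps / syn_sites        # synonymous SNPs per possible site
    return pn / ps


# Invented counts for illustration; these are NOT the values behind this page.
ratio = pn_ps(n_nonsyn_snps=12, n_syn_snps=5,
              nonsyn_sites=450.0, syn_sites=150.0)
print(round(ratio, 3))
```

A ratio well below 1 would point to purifying selection; a ratio near or above 1 stays ambiguous for the reasons given above.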

Domains (Pfam, hmmscan --cut_ga)

Pfam          Accession   i-Evalue  Residues  Description
Rv2949c-like  PF01947.22  4.8e-12   24–173    Chorismate pyruvate-lyase Rv2949c-like

Functional interaction network (STRING v12, guilt-by-association)

Closest characterised functional partner: fadD22 (p-hydroxybenzoyl--AMP ligase), high confidence from genomic context alone (score 981 excluding text-mining).

Partner         Product                                   Score  No text-mining  Channels (≥400)
Rv2948c fadD22  p-hydroxybenzoyl--AMP ligase              987    981             ctx neighborhood:739 coexpression:839 database:500
Rv2950c fadD29  long-chain-fatty-acid--AMP ligase FadD29  983    976             ctx neighborhood:835 coexpression:860
Rv2947c pks15   polyketide synthase                       949    939             ctx neighborhood:746 coexpression:770
Rv3215 entC     isochorismate synthase                    910    901             database:900
Rv0948c         chorismate mutase                         821    810             database:800
Rv1885c         chorismate mutase                         820    809             database:800
Rv1609 trpE     anthranilate synthase component I         818    808             database:800
Rv2540c aroF    chorismate synthase                       808    801             database:800
Rv1600 hisC1    histidinol-phosphate aminotransferase     800    801             database:800
Rv3772 hisC2    histidinol-phosphate aminotransferase     800    800             database:800
Rv2231c cobC    aminotransferase                          800    800             database:800
Rv2946c pks1    polyketide synthase                       816    778             ctx neighborhood:767
Rv0355c PPE8    PPE family protein PPE8                   767    767             ctx cooccurrence:766
Rv3347c PPE55   PPE family protein PPE55                  765    766             ctx cooccurrence:764
Rv3343c PPE54   PPE family protein PPE54                  764    765             ctx cooccurrence:757

STRING combines evidence channels (neighborhood, fusion, cooccurrence, coexpression, experimental, database, text-mining) into a 0–1000 score. The ctx badge marks edges carried by the genomic-context channels (conserved neighborhood, fusion, phylogenetic co-occurrence), which are independent of orthology and structure and the strongest signal for an unknown gene. The no text-mining column recomputes the score from data alone, so a link that does not depend on the literature is visible. Association is a function hypothesis, not proof: corroborate with the operon context and the primary literature before assigning a function.
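How several channel scores merge into a single 0–1000 value can be sketched as below. This is a simplified Python illustration under stated assumptions, not STRING's own code: it assumes the commonly described scheme in which a prior of 0.041 is removed from each channel, the channels are combined as independent evidence, and the prior is added back once. The function name `combine_channels` is hypothetical, and the additional corrections STRING applies to some channels are left out.

```python
from math import prod

PRIOR = 0.041  # assumed prior probability of a random association


def combine_channels(scores_0_1000, prior=PRIOR):
    """Combine per-channel scores (0-1000) into one score (0-1000).

    Sketch only: strip the prior from each channel, combine the
    channels as independent evidence, then restore the prior once.
    """
    without_prior = []
    for score in scores_0_1000:
        p = max(score / 1000.0, prior)  # never drop below the prior
        without_prior.append((p - prior) / (1.0 - prior))
    # Probability that at least one channel supports the association.
    combined = 1.0 - prod(1.0 - p for p in without_prior)
    return round((combined * (1.0 - prior) + prior) * 1000)


# Channels listed for fadD22 in the table (only those scoring >= 400 are shown).
fadd22_estimate = combine_channels([739, 839, 500])
print(fadd22_estimate)
```

Because channels scoring below 400 are not listed in the table, an estimate computed this way need not reproduce the tabulated values exactly.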

Evidence

  • Legacy H37Rv annotation: chorismate pyruvate-lyase
  • MTBC0 PGAP product: chorismate pyruvate-lyase family protein
  • Pfam (hmmscan --cut_ga): Rv2949c-like PF01947.22 (E=5e-12)
  • (auto-curated by rules from PGAP + Pfam + Foldseek; not hand-reviewed)

Sources

  • Ancestral sequence & coordinates: Harrison LB et al. (2024), An imputed ancestral reference genome for the MTBC, doi:10.1101/2023.09.07.556366
  • Product annotation: NCBI PGAP on MTBC0; legacy from H37Rv NC_000962.3 (RefSeq NP_217465.1)
  • Domains: Pfam-A via hmmscan --cut_ga — Rv2949c-like (PF01947.22)
  • Sequence-level signal: ESM Atlas (EvolutionaryScale × BioHub) — exploratory
  • Controlled vocabulary: eggNOG-mapper 2.1.12 (Cantalapiedra et al. 2021, doi:10.1093/molbev/msab293), eggNOG 5.0 DB (Huerta-Cepas et al. 2019) — OG COG3161
  • Curated reference: UniProt P9WIC5 (SwissProt, reviewed; Evidence at protein level)
  • Intra-MTBC selection: pN/pS and disruption from SPDI variants of 145 209 MTBC strains (this work, local collection vs H37Rv NC_000962.3)
  • Interaction network: STRING v12.0 (Szklarczyk et al. 2023, doi:10.1093/nar/gkac1000), taxon 83332, CC-BY 4.0 — 117 functional partner(s); context anchor fadD22
  • Primary literature: none located yet; annotation rests on the domain/homology sources above.

Ancestral MTBC0 protein sequence

>mtbc0_003131|Rv2949c|
MTECFLSDQEIRKLNRDLRILIAANGTLTRVLNIVADDEVIVQIVKQRIHDVSPKLSEFEQLGQVGVGRVLQRYIILKGRNSEHLFVAAESLIAIDRLPAAIITRLTQTNDPLGEVMAASHIETFKEEAKVWVGDLPGWLALHGYQNSRKRAVARRYRVISGGQPIMVVTEHFLRSVFRDAPHEEPDRWQFSNAITLAR
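
A quick consistency check between the coordinates given at the top of this record and the sequence above can be written as a short Python sketch. The helper names `expected_protein_length` and `read_single_fasta` are hypothetical, and the check assumes that the coordinate span is inclusive and contains the stop codon.

```python
def expected_protein_length(start: int, end: int) -> int:
    """Protein length implied by an inclusive coding span that ends in a stop codon."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError("coding span is not a multiple of three")
    return span // 3 - 1  # the stop codon encodes no residue


def read_single_fasta(text: str):
    """Return (header, sequence) from a one-record FASTA string."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    return lines[0][1:], "".join(lines[1:])


record = """>mtbc0_003131|Rv2949c|
MTECFLSDQEIRKLNRDLRILIAANGTLTRVLNIVADDEVIVQIVKQRIHDVSPKLSEFEQLGQVGVGRVLQRYIILKGRNSEHLFVAAESLIAIDRLPAAIITRLTQTNDPLGEVMAASHIETFKEEAKVWVGDLPGWLALHGYQNSRKRAVARRYRVISGGQPIMVVTEHFLRSVFRDAPHEEPDRWQFSNAITLAR
"""

header, seq = read_single_fasta(record)
implied = expected_protein_length(3321146, 3321745)
print(header, len(seq), implied, len(seq) == implied)
```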