katG · Resolved · high · auto-curated

H37Rv Rv1908c · MTBC0 mtbc0_002023 · 740 aa · 2171999–2174221 (-) · RefSeq NP_216424.1

Annotation: from legacy to revised

Legacy (H37Rv / Mycobrowser): catalase-peroxidase
MTBC0 PGAP re-annotation: catalase/peroxidase HPI
Revised (this work): Catalase/peroxidase HPI. Pfam: peroxidase (PF00141.29).

Auto-curated: this verdict and function were generated by rules from PGAP + Pfam + Foldseek and have not been hand-reviewed.

Curated reference (UniProt)

UniProt: P9WIE5 · SwissProt · reviewed · Evidence at protein level
UniProt name: Catalase-peroxidase
EC (curated): EC 1.11.1.21
Curated function: Bifunctional enzyme with both catalase and broad-spectrum peroxidase activity, oxidizing various electron donors including NADP(H). Protects M. tuberculosis against toxic reactive oxygen species (ROS), including hydrogen peroxide as well as organic peroxides, and thus contributes to its survival within host macrophages by countering the phagocyte oxidative burst. Also displays efficient peroxynitritase activity, which may help the bacterium to persist in macrophages. Might be involved in DNA repair: partly complements recA-deficient E. coli cells exposed to UV radiation, mitomycin C or ...

Functional vocabulary (eggNOG-mapper, orthology transfer)

COG category: P (Inorganic ion transport and metabolism)
Preferred name: katG
eggNOG description: Bifunctional enzyme with both catalase and broad-spectrum peroxidase activity
Orthologous group: COG0376
EC number: EC 1.11.1.21
KEGG orthology: K03782
KEGG pathways: map00360, map00380, map00940, map00983, map01100, map01110
Gene Ontology (130): GO:0000166, GO:0000302, GO:0003674, GO:0003824, GO:0004096, GO:0004601, GO:0005488, GO:0005575, GO:0005576, GO:0005618, GO:0005622, GO:0005623 +118 more

Orthology-based transfer (eggNOG 5.0.2, diamond). EC/KO/GO/CAZy are computed annotations, not manual curation; cross-check against the primary literature before treating a specific reaction as established.

Conservation & selection (intra-MTBC, 145 209 strains)

pN/pS: 0.198 · strong purifying
Polymorphic sites (≥ 0.1% of strains): 8 synonymous, 5 missense, 0 nonsense, 0 frameshift

pN/pS is computed from segregating SNPs (singletons removed), normalised by the number of possible sites. A low pN/pS indicates purifying selection (a strong signal that a "hypothetical" is a real, constrained gene). A high pN/pS is ambiguous: both relaxed constraint and positive selection (drug resistance, antigenic variation) inflate it; e.g. rpoB, katG and pncA can score high because of resistance mutations, not loss of function. A clonal disruption (one allele over a clade) suggests lineage pseudogenisation; a convergent one (many independent alleles) is typical of resistance loss-of-function.
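The normalisation by possible sites can be illustrated with a minimal sketch. It assumes a Nei-Gojobori-style count of synonymous and nonsynonymous sites, treats changes to a stop codon as nonsynonymous, and takes the SNP counts as inputs; the pipeline behind this page may differ in all three respects. The function names `possible_sites` and `pn_ps` are hypothetical.

```python
from itertools import product

# Standard codon assignments in NCBI order (first base varies slowest, bases T, C, A, G).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def possible_sites(cds):
    """Return (nonsynonymous_sites, synonymous_sites) for an in-frame coding sequence.

    Each codon contributes 3 sites; a position counts as synonymous in proportion
    to the fraction of its 3 possible single-base changes that keep the amino acid.
    Stop codons and codons with ambiguous bases are skipped.
    """
    syn_changes = 0
    n_codons = 0
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3].upper()
        aa = CODON_TABLE.get(codon)
        if aa is None or aa == "*":
            continue
        n_codons += 1
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == aa:
                    syn_changes += 1
    syn_sites = syn_changes / 3
    nonsyn_sites = 3 * n_codons - syn_sites
    return nonsyn_sites, syn_sites


def pn_ps(n_nonsyn_snps, n_syn_snps, nonsyn_sites, syn_sites):
    """Ratio of per-site nonsynonymous to per-site synonymous polymorphism.

    Returns None when the synonymous term is zero and the ratio is undefined.
    """
    if n_syn_snps == 0 or syn_sites == 0 or nonsyn_sites == 0:
        return None
    return (n_nonsyn_snps / nonsyn_sites) / (n_syn_snps / syn_sites)
```

The SNP counts passed to `pn_ps` would be all segregating sites after singleton removal, not only the sites above the 0.1% frequency threshold listed above, so the listed site counts alone are not expected to reproduce the reported ratio.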

Domains (Pfam, hmmscan --cut_ga)

Pfam | Accession | i-Evalue | Residues | Description
peroxidase | PF00141.29 | 3.7e-40 | 94–397 | Peroxidase
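A row like the one above can be read from the per-domain table that `hmmscan` writes with `--domtblout`. The sketch below assumes the standard 23-column HMMER3 layout and assumes that the Residues column corresponds to the alignment coordinates (columns 18 and 19) rather than the envelope coordinates; the page does not say which it reports. `DomainHit` and `parse_domtblout_line` are hypothetical names.

```python
from typing import NamedTuple, Optional


class DomainHit(NamedTuple):
    pfam_name: str
    accession: str
    query: str
    i_evalue: float
    ali_from: int
    ali_to: int
    description: str


def parse_domtblout_line(line: str) -> Optional[DomainHit]:
    """Parse one data line of an hmmscan --domtblout file.

    Comment lines (starting with '#'), blank lines and short lines return None.
    The description is the 23rd field and may contain spaces, so the split is
    capped at 22 separators.
    """
    if not line.strip() or line.startswith("#"):
        return None
    fields = line.split(maxsplit=22)
    if len(fields) < 23:
        return None
    return DomainHit(
        pfam_name=fields[0],
        accession=fields[1],
        query=fields[3],
        i_evalue=float(fields[12]),   # independent E-value of this domain
        ali_from=int(fields[17]),     # alignment start on the query protein
        ali_to=int(fields[18]),       # alignment end on the query protein
        description=fields[22].strip(),
    )
```

With the values listed above, the domain spans 304 of 740 residues, so a single Pfam hit covers less than half of the protein.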

Functional interaction network (STRING v12, guilt-by-association)

Closest characterised functional partner: furA (ferric uptake regulation protein FurA), high confidence from genomic context alone (score 837 excluding text-mining).

Partner | Product | Score | No text-mining | Channels (≥400)
Rv1907c hyp | hypothetical protein | 957 | 945 | ctx neighborhood:805 coexpression:731
Rv3566c nat | arylamine N-acetyltransferase | 912 | 901 | database:900
Rv1600 hisC1 | histidinol-phosphate aminotransferase | 910 | 901 | database:900
Rv2231c cobC | aminotransferase | 905 | 900 | database:900
Rv3772 hisC2 | histidinol-phosphate aminotransferase | 905 | 900 | database:900
Rv1909c furA | ferric uptake regulation protein FurA | 979 | 837 | ctx neighborhood:829 textmining:879
Rv3838c pheA | prephenate dehydratase | 801 | 801 | database:800
Rv3846 sodA | superoxide dismutase | 973 | 516 | database:500 textmining:948
Rv0432 sodC | superoxide dismutase | 936 | 503 | database:500 textmining:878
Rv1910c hyp | hypothetical protein | 541 | 471 | ctx neighborhood:471
Rv1906c hyp | hypothetical protein | 508 | 423 | ctx neighborhood:415
Rv1905c aao | D-amino acid oxidase | 418 | 395 |
Rv3914 trxC | thioredoxin TrxC | 578 | 387 |
Rv1471 trxB1 | thioredoxin | 455 | 366 |
Rv1324 | thioredoxin | 415 | 360 |

STRING combines evidence channels (neighborhood, fusion, cooccurrence, coexpression, experimental, database, text-mining) into a 0–1000 score. The ctx badge marks edges carried by the genomic-context channels (conserved neighborhood, fusion, phylogenetic co-occurrence), which are independent of orthology and structure and the strongest signal for an unknown gene. The no text-mining column recomputes the score from data alone, so a link that does not depend on the literature is visible. Association is a function hypothesis, not proof: corroborate with the operon context and the primary literature before assigning a function.
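How channel scores combine into one value can be sketched as follows. This assumes STRING's published scheme: a prior of 0.041 is removed from each channel, the channels are combined as independent evidence, and the prior is added back once. The homology correction that STRING applies to some channels is ignored, and `combine_channels` is a hypothetical name, so this is an approximation and not the code used to build this page.

```python
PRIOR = 0.041  # assumed STRING prior probability of a functional link


def combine_channels(scores, prior=PRIOR):
    """Combine per-channel STRING scores (0-1000) into one score (0-1000).

    Each channel score is rescaled to remove the prior, the rescaled scores are
    combined as 1 - prod(1 - s_i), and the prior is then restored once.
    """
    remaining = 1.0
    for score in scores:
        p = score / 1000
        p_no_prior = max(0.0, (p - prior) / (1 - prior))
        remaining *= 1 - p_no_prior
    combined = 1 - remaining
    combined = combined * (1 - prior) + prior
    return round(combined * 1000)
```

For Rv1907c, `combine_channels([805, 731])` gives 945, the value shown in the no text-mining column. Only channels scoring at least 400 are displayed, so recomputing from the visible channels gives a lower bound wherever weaker channels also contribute.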

Evidence

  • Legacy H37Rv annotation: catalase-peroxidase
  • MTBC0 PGAP product: catalase/peroxidase HPI
  • Pfam (hmmscan --cut_ga): peroxidase PF00141.29 (E=4e-40)
  • (auto-curated by rules from PGAP + Pfam + Foldseek; not hand-reviewed)

Sources

  • Ancestral sequence & coordinates: Harrison LB et al. (2024), An imputed ancestral reference genome for the MTBC, doi:10.1101/2023.09.07.556366
  • Product annotation: NCBI PGAP on MTBC0; legacy from H37Rv NC_000962.3 (RefSeq NP_216424.1)
  • Domains: Pfam-A via hmmscan --cut_ga — peroxidase (PF00141.29)
  • Sequence-level signal: ESM Atlas (EvolutionaryScale × BioHub) — exploratory
  • Controlled vocabulary: eggNOG-mapper 2.1.12 (Cantalapiedra et al. 2021, doi:10.1093/molbev/msab293), eggNOG 5.0 DB (Huerta-Cepas et al. 2019) — OG COG0376
  • Curated reference: UniProt P9WIE5 (SwissProt, reviewed; Evidence at protein level)
  • Intra-MTBC selection: pN/pS and disruption from SPDI variants of 145 209 MTBC strains (this work, local collection vs H37Rv NC_000962.3)
  • Interaction network: STRING v12.0 (Szklarczyk et al. 2023, doi:10.1093/nar/gkac1000), taxon 83332, CC-BY 4.0 — 71 functional partner(s); context anchor furA
  • Primary literature: none located yet; annotation rests on the domain/homology sources above.

Ancestral MTBC0 protein sequence

>mtbc0_002023|Rv1908c|katG
MPEQHPPITETTTGAASNGCPVVGHMKYPVEGGGNQDWWPNRLNLKVLHQNPAVADPMGAAFDYAAEVATIDVDALTRDIEEVMTTSQPWWPADYGHYGPLFIRMAWHAAGTYRIHDGRGGAGGGMQRFAPLNSWPDNASLDKARRLLWPVKKKYGKKLSWADLIVFAGNCALESMGFKTFGFGFGRVDQWEPDEVYWGKEATWLGDERYSGKRDLENPLAAVQMGLIYVNPEGPNGNPDPMAAAVDIRETFRRMAMNDVETAALIVGGHTFGKTHGAGPADLVGPEPEAAPLEQMGLGWKSSYGTGTGKDAITSGIEVVWTNTPTKWDNSFLEILYGYEWELTKSPAGAWQYTAKDGAGAGTIPDPFGGPGRSPTMLATDLSLRVDPIYERITRRWLEHPEELADEFAKAWYKLIHRDMGPVARYLGPLVPKQTLLWQDPVPAVSHDLVGEAEIASLKSQILASGLTVSQLVSTAWAAASSFRGSDKRGGANGGRIRLQPQVGWEVNDPDGDLRKVIRTLEEIQESFNSAAPGNIKVSFADLVVLGGCAAIEKAAKAAGHNITVPFTPGRTDASQEQTDVESFAVLEPKADGFRNYLGKGNPLPAEYMLLDKANLLTLSAPEMTVLVGGLRVLGANYKRLPLGVFTEASESLTNDFFVNLLDMGITWEPSPADDGTYQGKDGSGKVKWTGSRVDLVFGSNSELRALVEVYGADDAQPKFVQDFVAAWDKVMNLDRFDVR