nth · Resolved · high · auto-curated

H37Rv Rv3674c · MTBC0 mtbc0_003893 · 245 aa · 4138987–4139724 (-) · RefSeq NP_218191.2

Annotation: from legacy to revised

Legacy (H37Rv / Mycobrowser): endonuclease III
MTBC0 PGAP re-annotation: endonuclease III
Revised (this work): Endonuclease III. Pfam: HhH-GPD (PF00730.32), HHH (PF00633.30), EndIII_4Fe-2S (PF10576.15).

Auto-curated: this verdict and function were generated by rules from PGAP + Pfam + Foldseek and have not been hand-reviewed.

Curated reference (UniProt)

UniProt P9WQ11 SwissProt · reviewed · Evidence at protein level
UniProt name: Endonuclease III
EC (curated): EC 4.2.99.18
Curated function: DNA repair enzyme that has both DNA N-glycosylase activity and AP-lyase activity. The DNA N-glycosylase activity releases various damaged pyrimidines from DNA by cleaving the N-glycosidic bond, leaving an AP (apurinic/apyrimidinic) site. The AP-lyase activity cleaves the phosphodiester bond 3' to the AP site by a beta-elimination, leaving a 3'-terminal unsaturated sugar and a product with a terminal 5'-phosphate. Has a preference for oxidized pyrimidines, such as thymine glycol (prefers 5S isomers), 5,6-dihydrouracil:G, 5-hydroxyuracil:G, 5-hydroxycytosine:G and urea:A. Cleaves ssDNA containing.

Functional vocabulary (eggNOG-mapper, orthology transfer)

COG category: L (Replication, recombination and repair)
Preferred name: nth
eggNOG description: DNA repair enzyme that has both DNA N-glycosylase activity and AP-lyase activity. The DNA N-glycosylase activity releases various damaged pyrimidines from DNA by cleaving the N-glycosidic bond, leaving an AP (apurinic/apyrimidinic) site. The AP-lyase activity cleaves the phosphodiester bond 3' to the AP site by a beta-elimination, leaving a 3'-terminal unsaturated sugar and a product with a terminal 5'-phosphate.
Orthologous group: COG0177
EC number: EC 4.2.99.18
KEGG orthology: K10773
KEGG pathways: map03410
Gene Ontology (43): GO:0003674, GO:0003676, GO:0003677, GO:0003690, GO:0003824, GO:0005488, GO:0005575, GO:0005618, GO:0005623, GO:0005886, GO:0006139, GO:0006259 +31 more

Orthology-based transfer (eggNOG 5.0.2, diamond). EC/KO/GO/CAZy are computed annotations, not manual curation; cross-check against the primary literature before treating a specific reaction as established.

Conservation & selection (intra-MTBC, 145 209 strains)

pN/pS: n/a
Polymorphic sites (≥ 0.1% of strains): 0 synonymous, 3 missense, 0 nonsense, 0 frameshift

pN/pS is computed from segregating SNPs (singletons removed), normalised by the number of possible sites of each class. A low pN/pS indicates purifying selection (a strong signal that a "hypothetical" is a real, constrained gene). A high pN/pS is ambiguous: either relaxed constraint or positive selection (drug resistance, antigenic variation) can inflate it; for example, rpoB, katG and pncA score high here because of resistance, not loss of function. A clonal disruption (one allele across a clade) suggests lineage pseudogenisation; a convergent one (many independent alleles) is typical of resistance-associated loss of function.
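The normalisation described above can be sketched in a few lines. This is a minimal illustration, not the pipeline used for this record: the function name `pn_ps` and the site counts in the usage note are hypothetical, and the way possible sites are counted (for example, a Nei-Gojobori style count) is an assumption. It also shows one plausible reason a gene with no synonymous polymorphism is reported as n/a: the denominator is zero.

```python
from typing import Optional


def pn_ps(n_nonsyn: int, n_syn: int,
          nonsyn_sites: float, syn_sites: float) -> Optional[float]:
    """Return pN/pS, or None when the ratio is undefined.

    n_nonsyn / n_syn: counts of segregating nonsynonymous / synonymous SNPs
    (after whatever filtering the caller applies, e.g. singleton removal).
    nonsyn_sites / syn_sites: possible sites of each class in the gene
    (how these are counted is an assumption left to the caller).
    """
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    if n_syn == 0:
        # pS is zero, so the ratio cannot be computed; report it as missing.
        return None
    pn = n_nonsyn / nonsyn_sites  # nonsynonymous SNPs per nonsynonymous site
    ps = n_syn / syn_sites        # synonymous SNPs per synonymous site
    return pn / ps
```

With made-up site counts, `pn_ps(6, 4, 600.0, 200.0)` gives 0.5, whereas `pn_ps(3, 0, 560.0, 175.0)` returns `None`, which would be displayed as n/a.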

Domains (Pfam, hmmscan --cut_ga)

Pfam           Accession   i-Evalue  Residues  Description
HhH-GPD        PF00730.32  4.7e-25   45–179    HhH-GPD superfamily base excision DNA repair protein
HHH            PF00633.30  4.1e-09   110–137   Helix-hairpin-helix motif
EndIII_4Fe-2S  PF10576.15  1.8e-04   198–214   Iron-sulfur binding domain of endonuclease III

Functional interaction network (STRING v12, guilt-by-association)

Closest characterised functional partner: Rv3673c (membrane-anchored thioredoxin-like protein), high confidence from genomic context alone (score 903 excluding text-mining).

Partner       Product                                     Score  No text-mining  Channels (≥400)
Rv3673c       membrane-anchored thioredoxin-like protein  903    903             ctx neighborhood:881
Rv0427c xthA  exodeoxyribonuclease III protein XthA       823    799             experimental:403 database:641
Rv3671c marP  serine protease                             797    793             ctx neighborhood:758
Rv3675        membrane protein                            855    779             ctx neighborhood:779
Rv3672c hyp   hypothetical protein                        617    597             ctx neighborhood:591
Rv3676 crp    cAMP receptor protein                       577    577             ctx neighborhood:486
Rv2976c ung   uracil-DNA glycosylase                      613    571             ctx fusion:508
Rv0670 end    endonuclease IV                             768    480             experimental:441 textmining:572
Rv2239c hyp   hypothetical protein                        438    439             ctx cooccurence:428
Rv1613 trpA   tryptophan synthase subunit alpha           426    427             coexpression:420
Rv0944 fpg2   formamidopyrimidine-DNA glycosylase         548    418             coexpression:418
Rv2924c fpg   formamidopyrimidine-DNA glycosylase         915    416             coexpression:416 textmining:861
Rv2464c nei1  DNA glycosylase                             887    416             coexpression:416 textmining:815
Rv3297 nei    endonuclease VIII                           924    414             coexpression:414 textmining:877
Rv2572c aspS  aspartate--tRNA ligase                      404    404

STRING combines evidence channels (neighborhood, fusion, cooccurrence, coexpression, experimental, database, text-mining) into a 0–1000 score. The ctx badge marks edges carried by the genomic-context channels (conserved neighborhood, fusion, phylogenetic co-occurrence), which are independent of orthology and structure and the strongest signal for an unknown gene. The no text-mining column recomputes the score from data alone, so a link that does not depend on the literature is visible. Association is a function hypothesis, not proof: corroborate with the operon context and the primary literature before assigning a function.
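As a rough illustration of how per-channel scores can be merged into a single 0–1000 value, the sketch below uses the probabilistic combination that STRING describes in its documentation: remove a fixed prior from each channel, combine the channels as independent evidence, then add the prior back once. The prior value of 0.041, the rounding, and the function name `combine_channels` are assumptions here, and the sketch leaves out STRING's homology correction and its separate handling of transferred evidence, so it will not reproduce the table exactly.

```python
from typing import Iterable

PRIOR = 0.041  # assumed prior probability of a link; not taken from this record


def combine_channels(channel_scores: Iterable[int], prior: float = PRIOR) -> int:
    """Combine per-channel scores (0-1000) into one 0-1000 score.

    Each channel is treated as independent evidence:
    combined = 1 - product(1 - s_i), with the prior removed from every
    channel first and added back once at the end.
    """
    prob_no_link = 1.0
    for raw in channel_scores:
        s = raw / 1000.0
        # Remove the prior; scores below the prior carry no evidence.
        s = max(s - prior, 0.0) / (1.0 - prior)
        prob_no_link *= 1.0 - s
    combined = 1.0 - prob_no_link
    # Add the prior back exactly once.
    combined = combined * (1.0 - prior) + prior
    return int(round(combined * 1000))
```

A single channel passes through unchanged, so `combine_channels([881])` returns 881. For the xthA row, `combine_channels([403, 641])` gives roughly 777, below the 799 listed without text-mining; under the assumptions above, the gap would be explained by channels scoring under 400, which the table does not display.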

Evidence

  • Legacy H37Rv annotation: endonuclease III
  • MTBC0 PGAP product: endonuclease III
  • Pfam (hmmscan --cut_ga): HhH-GPD PF00730.32 (E=5e-25), HHH PF00633.30 (E=4e-09), EndIII_4Fe-2S PF10576.15 (E=2e-04)
  • (auto-curated by rules from PGAP + Pfam + Foldseek; not hand-reviewed)

Sources

  • Ancestral sequence & coordinates: Harrison LB et al. (2024), An imputed ancestral reference genome for the MTBC, doi:10.1101/2023.09.07.556366
  • Product annotation: NCBI PGAP on MTBC0; legacy from H37Rv NC_000962.3 (RefSeq NP_218191.2)
  • Domains: Pfam-A via hmmscan --cut_ga — HhH-GPD (PF00730.32), HHH (PF00633.30), EndIII_4Fe-2S (PF10576.15)
  • Sequence-level signal: ESM Atlas (EvolutionaryScale × BioHub) — exploratory
  • Controlled vocabulary: eggNOG-mapper 2.1.12 (Cantalapiedra et al. 2021, doi:10.1093/molbev/msab293), eggNOG 5.0 DB (Huerta-Cepas et al. 2019) — OG COG0177
  • Curated reference: UniProt P9WQ11 (SwissProt, reviewed; Evidence at protein level)
  • Intra-MTBC selection: pN/pS and disruption from SPDI variants of 145 209 MTBC strains (this work, local collection vs H37Rv NC_000962.3)
  • Interaction network: STRING v12.0 (Szklarczyk et al. 2023, doi:10.1093/nar/gkac1000), taxon 83332, CC-BY 4.0 — 41 functional partner(s); context anchor Rv3673c
  • Primary literature: none located yet; annotation rests on the domain/homology sources above.

Ancestral MTBC0 protein sequence

>mtbc0_003893|Rv3674c|nth
MPGRWSAETRLALVRRARRMNRALAQAFPHVYCELDFTTPLELAVATILSAQSTDKRVNLTTPALFARYRTARDYAQADRTELESLIRPTGFYRNKAASLIGLGQALVERFGGEVPATMDKLVTLPGVGRKTANVILGNAFGIPGITVDTHFGRLVRRWRWTTAEDPVKVEQAVGELIERKEWTLLSHRVIFHGRRVCHARRPACGVCVLAKDCPSFGLGPTEPLLAAPLVQGPETDHLLALAGL
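A quick consistency check on a record like this one is to compare the protein length implied by the genomic coordinates with the length of the FASTA sequence. The sketch below is only an illustration: the helper names `expected_protein_length` and `read_fasta` are hypothetical, and it assumes inclusive 1-based coordinates, an uninterrupted coding sequence, and a stop codon included in the span.

```python
from typing import Dict


def expected_protein_length(start: int, end: int) -> int:
    """Amino-acid count implied by an inclusive coordinate span.

    Assumes the span covers one uninterrupted CDS including its stop codon.
    """
    span = end - start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError(f"span of {span} nt is not a whole number of codons")
    return span // 3 - 1  # subtract the stop codon


def read_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence}, joining wrapped lines."""
    records: Dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records
```

For the coordinates given at the top of this record, `expected_protein_length(4138987, 4139724)` returns 245, which matches the stated 245 aa. Passing the FASTA block above to `read_fasta` and taking `len()` of the sequence gives the value to compare against.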