ercc3 · Resolved · high · auto-curated

H37Rv Rv0861c · MTBC0 - · 542 aa · 958523–960151 (-) · RefSeq NP_215376.3

Annotation: from legacy to revised

Legacy (H37Rv / Mycobrowser) DNA helicase Ercc3
MTBC0 PGAP re-annotation
Revised (this work) DNA helicase Ercc3. Pfam: Helicase_C_3 (PF13625.12), ResIII (PF04851.22), DEAD (PF00270.36), ERCC3_RAD25_C (PF16203.12), Helicase_C (PF00271.38).

Auto-curated: this verdict and function were generated by rules from PGAP + Pfam + Foldseek and have not been hand-reviewed.

Annotated on the H37Rv protein: this gene has no 1:1 ancestral MTBC0 anchor (PE/PPE, paralogue, IS element, or otherwise unanchored CDS).

Curated reference (UniProt)

UniProt O53873 SwissProt · reviewed · Evidence at protein level
UniProt name DNA 3'-5' helicase XPB
EC (curated) EC 5.6.2.4
Curated function ATP-dependent 3'-5' DNA helicase, unwinds 3'-overhangs, 3'-flaps, and splayed-arm DNA substrates but not 5'-overhangs, 5'-flap substrates, 3-way junctions or Holliday junctions. Not highly efficient in vitro. Requires ATP hydrolysis for helicase activity; the ATPase activity is DNA-dependent and requires a minimum of 4 single-stranded nucleotides (nt) with 6-10 nt providing all necessary interactions for full processive unwinding. The ATPase prefers ATP over CTP or GTP, is almost inactive with TTP. DNA helicase activity requires ATP or dATP and only acts when the 3'-overhang is >20 nt. Capabl.

Functional vocabulary (eggNOG-mapper, orthology transfer)

COG category L Replication, recombination and repair
Preferred name ercc3
eggNOG description Type III restriction enzyme res subunit
Orthologous group COG1061
EC number EC 3.6.4.12
KEGG orthology K10843
KEGG pathways map03022, map03420
KEGG modules M00290

Orthology-based transfer (eggNOG 5.0.2, diamond). EC/KO/GO/CAZy are computed annotations, not manual curation; cross-check against the primary literature before treating a specific reaction as established.

Conservation & selection (intra-MTBC, 145 209 strains)

pN/pS 0.341 · purifying
Polymorphic sites (≥ 0.1% of strains) 4 synonymous, 4 missense, 0 nonsense, 0 frameshift

pN/pS from segregating SNPs (singletons removed) normalised by possible sites. Low pN/pS = purifying selection (a strong signal that a "hypothetical" is a real, constrained gene). A high pN/pS is ambiguous: relaxed constraint or positive selection (drug resistance, antigenic variation) inflate it; e.g. rpoB/katG/pncA score high here for resistance, not loss of function. A clonal disruption (one allele over a clade) suggests lineage pseudogenisation; a convergent one (many independent alleles) is typical of resistance loss-of-function.
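
The normalisation described above can be sketched as follows. The page does not state how possible sites are counted, so this sketch assumes a simple Nei-Gojobori-style count over an uppercase coding sequence; the function names and the toy sequence are illustrative and are not taken from the pipeline.

```python
from itertools import product

# Standard genetic code in TCAG order (assumption: no alternative start/stop handling).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_sites(codon):
    """Expected number of synonymous sites in one codon (0 to 3)."""
    total = 0.0
    for pos in range(3):
        alternatives = [
            codon[:pos] + base + codon[pos + 1:]
            for base in BASES
            if base != codon[pos]
        ]
        same = sum(CODON_TABLE[alt] == CODON_TABLE[codon] for alt in alternatives)
        total += same / 3
    return total


def possible_sites(cds):
    """Return (nonsynonymous_sites, synonymous_sites) for a coding sequence."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    s_sites = sum(synonymous_sites(c) for c in codons)
    n_sites = 3 * len(codons) - s_sites
    return n_sites, s_sites


def pn_ps(n_nonsyn, n_syn, cds):
    """Observed SNP counts normalised by possible sites: (N/Nsites) / (S/Ssites)."""
    n_sites, s_sites = possible_sites(cds)
    if n_syn == 0 or s_sites == 0:
        return float("nan")  # ratio undefined without synonymous variation
    return (n_nonsyn / n_sites) / (n_syn / s_sites)


# Toy example (hypothetical two-codon sequence, not the Rv0861c CDS).
toy_ratio = pn_ps(n_nonsyn=4, n_syn=4, cds="ATGGGT")
```

Because nonsynonymous sites outnumber synonymous ones in most coding sequences, equal raw counts of missense and synonymous SNPs still give a ratio well below 1 after normalisation.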

Domains (Pfam, hmmscan --cut_ga)

Pfam           Accession   i-Evalue  Residues  Description
Helicase_C_3   PF13625.12  6.7e-38   2–127     Helicase conserved C-terminal domain
ResIII         PF04851.22  1.2e-17   174–320   Type III restriction enzyme, res subunit
DEAD           PF00270.36  6.1e-06   177–321   DEAD/DEAH box helicase
ERCC3_RAD25_C  PF16203.12  3.8e-40   346–530   ERCC3/RAD25/XPB C-terminal helicase
Helicase_C     PF00271.38  1.7e-12   389–492   Helicase conserved C-terminal domain
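
Several of these hits cover the same residues (ResIII and DEAD; ERCC3_RAD25_C and Helicase_C). A minimal sketch of one way to collapse overlapping hits into a non-redundant architecture is shown below, keeping the lowest i-Evalue per overlapping region. This is an illustrative post-processing step, not necessarily what the page's rules do (the revised annotation lists all five families); the `Hit` type and function names are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    name: str
    evalue: float
    start: int
    end: int


# Values copied from the domain table for Rv0861c.
HITS = [
    Hit("Helicase_C_3", 6.7e-38, 2, 127),
    Hit("ResIII", 1.2e-17, 174, 320),
    Hit("DEAD", 6.1e-06, 177, 321),
    Hit("ERCC3_RAD25_C", 3.8e-40, 346, 530),
    Hit("Helicase_C", 1.7e-12, 389, 492),
]


def group_overlapping(hits):
    """Group hits whose residue ranges overlap (single pass over start-sorted hits)."""
    groups = []
    for hit in sorted(hits, key=lambda h: h.start):
        if groups and hit.start <= max(h.end for h in groups[-1]):
            groups[-1].append(hit)
        else:
            groups.append([hit])
    return groups


def best_per_region(hits):
    """Keep the hit with the lowest E-value in each overlapping group."""
    return [min(group, key=lambda h: h.evalue) for group in group_overlapping(hits)]


architecture = [hit.name for hit in best_per_region(HITS)]
print(" | ".join(architecture))
```

Under this rule the protein reads as an N-terminal Helicase_C_3 region followed by two helicase-core regions, with ResIII and ERCC3_RAD25_C as the best-scoring representatives.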

Functional interaction network (STRING v12, guilt-by-association)

Partner             Product                                    Score  No text-mining  Channels (≥400)
Rv0862c hyp         hypothetical protein                       931    932             ctx neighborhood:738 cooccurence:743
Rv1329c dinG        ATP-dependent helicase DinG                948    882             experimental:685 database:622 textmining:580
Rv0667 rpoB         DNA-directed RNA polymerase subunit beta   856    853             experimental:578 database:621
Rv3457c rpoA        DNA-directed RNA polymerase subunit alpha  853    853             experimental:627 database:621
Rv1390 rpoZ         DNA-directed RNA polymerase subunit omega  847    848             experimental:613 database:621
Rv1629 polA         DNA polymerase I                           834    768             experimental:430 database:580
Rv2090              5'-3' exonuclease                          765    752             experimental:430 database:580
Rv0668 rpoC         DNA-directed RNA polymerase subunit beta'  703    688             database:620
Rv2529 hyp          hypothetical protein                       705    686             database:610
Rv2101 helZ         helicase HelZ                              843    669             database:615 textmining:548
Rv0863 hyp          hypothetical protein                       662    663             ctx neighborhood:662
Rv1334 mec          [CysO                                      646    646             database:613
Rv3812 PE_PGRS62    PE-PGRS family protein PE_PGRS62           620    621             database:563
Rv2328 PE23         PE family protein PE23                     620    621             database:563
Rv3036c TB22.2 hyp  hypothetical protein                       620    621             database:563

STRING combines evidence channels (neighborhood, fusion, cooccurrence, coexpression, experimental, database, text-mining) into a 0–1000 score. The ctx badge marks edges carried by the genomic-context channels (conserved neighborhood, fusion, phylogenetic co-occurrence), which are independent of orthology and structure and the strongest signal for an unknown gene. The no text-mining column recomputes the score from data alone, so a link that does not depend on the literature is visible. Association is a function hypothesis, not proof: corroborate with the operon context and the primary literature before assigning a function.
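
How per-channel scores combine, and what dropping text-mining does, can be sketched as below. The sketch assumes the probabilistic combination used in STRING's published score-combination script, with a prior of 0.041 removed from each channel and added back once at the end; that prior value and the function name are assumptions here, not something stated on this page. Only channels scoring at least 400 are listed in the table, so rows with additional weaker channels will not be reproduced exactly.

```python
PRIOR = 0.041  # assumed STRING prior for a random pair being linked


def combined_score(channels, exclude=()):
    """Combine per-channel scores (0-1000) into one 0-1000 score.

    channels: mapping of channel name -> score, e.g. {"database": 621}
    exclude:  channel names to leave out, e.g. ("textmining",)
    """
    not_linked = 1.0
    for name, score in channels.items():
        if name in exclude:
            continue
        p = max(score / 1000.0, PRIOR)
        p_no_prior = (p - PRIOR) / (1.0 - PRIOR)  # strip the prior from this channel
        not_linked *= 1.0 - p_no_prior
    total = (1.0 - not_linked) * (1.0 - PRIOR) + PRIOR  # add the prior back once
    return round(total * 1000)


# Channel values copied from the table above.
rpoA = {"experimental": 627, "database": 621}
rv0863 = {"neighborhood": 662}
dinG = {"experimental": 685, "database": 622, "textmining": 580}

print("rpoA combined:", combined_score(rpoA))
print("Rv0863 combined:", combined_score(rv0863))
print("dinG with text-mining:", combined_score(dinG))
print("dinG without text-mining:", combined_score(dinG, exclude=("textmining",)))
```

For dinG the score drops once text-mining is excluded, which is the behaviour the no text-mining column is meant to expose; edges supported only by experimental, database, or genomic-context channels are unaffected.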

Evidence

  • Annotation from H37Rv (no MTBC0 1:1 anchor; H37Rv protein used): DNA helicase Ercc3
  • Pfam (hmmscan --cut_ga): Helicase_C_3 PF13625.12 (E=7e-38), ResIII PF04851.22 (E=1e-17), DEAD PF00270.36 (E=6e-06), ERCC3_RAD25_C PF16203.12 (E=4e-40), Helicase_C PF00271.38 (E=2e-12)
  • (auto-curated by rules from PGAP + Pfam + Foldseek; not hand-reviewed)

Sources

  • Ancestral sequence & coordinates: Harrison LB et al. (2024), An imputed ancestral reference genome for the MTBC, doi:10.1101/2023.09.07.556366
  • Product annotation: NCBI PGAP on MTBC0; legacy from H37Rv NC_000962.3 (RefSeq NP_215376.3)
  • Domains: Pfam-A via hmmscan --cut_ga — Helicase_C_3 (PF13625.12), ResIII (PF04851.22), DEAD (PF00270.36), ERCC3_RAD25_C (PF16203.12), Helicase_C (PF00271.38)
  • Sequence-level signal: ESM Atlas (EvolutionaryScale × BioHub) — exploratory
  • Controlled vocabulary: eggNOG-mapper 2.1.12 (Cantalapiedra et al. 2021, doi:10.1093/molbev/msab293), eggNOG 5.0 DB (Huerta-Cepas et al. 2019) — OG COG1061
  • Curated reference: UniProt O53873 (SwissProt, reviewed; Evidence at protein level)
  • Intra-MTBC selection: pN/pS and disruption from SPDI variants of 145 209 MTBC strains (this work, local collection vs H37Rv NC_000962.3)
  • Interaction network: STRING v12.0 (Szklarczyk et al. 2023, doi:10.1093/nar/gkac1000), taxon 83332, CC-BY 4.0 — 51 functional partner(s)
  • Primary literature: none located yet; annotation rests on the domain/homology sources above.

Protein sequence (H37Rv; no 1:1 ancestral MTBC0 anchor)

>H37Rv|Rv0861c|ercc3
MQSDKTVLLEVDHELAGAARAAIAPFAELERAPEHVHTYRITPLALWNARAAGHDAEQVVDALVSYSRYAVPQPLLVDIVDTMARYGRLQLVKNPAHGLTLVSLDRAVLEEVLRNKKIAPMLGARIDDDTVVVHPSERGRVKQLLLKIGWPAEDLAGYVDGEAHPISLHQEGWQLRDYQRLAADSFWAGGSGVVVLPCGAGKTLVGAAAMAKAGATTLILVTNIVAARQWKRELVARTSLTENEIGEFSGERKEIRPVTISTYQMITRRTKGEYRHLELFDSRDWGLIIYDEVHLLPAPVFRMTADLQSKRRLGLTATLIREDGREGDVFSLIGPKRYDAPWKDIEAQGWIAPAECVEVRVTMTDSERMMYATAEPEERYRICSTVHTKIAVVKSILAKHPDEQTLVIGAYLDQLDELGAELGAPVIQGSTRTSEREALFDAFRRGEVATLVVSKVANFSIDLPEAAVAVQVSGTFGSRQEEAQRLGRILRPKADGGGAIFYSVVARDSLDAEYAAHRQRFLAEQGYGYIIRDADDLLGPAI