rpoA · Family assigned · medium · auto-curated

H37Rv Rv3457c · MTBC0 mtbc0_003675 · 347 aa · 3903348–3904391 (-) · RefSeq NP_217974.1

Annotation: from legacy to revised

Legacy (H37Rv / Mycobrowser): DNA-directed RNA polymerase subunit alpha
MTBC0 PGAP re-annotation: DNA-directed RNA polymerase subunit alpha
Revised (this work): DNA-directed RNA polymerase subunit alpha. Pfam: RNA_pol_L (PF01193.30), RNA_pol_A_bac (PF01000.32), RNA_pol_A_CTD (PF03118.22).

Auto-curated: this verdict and function were generated by rules from PGAP + Pfam + Foldseek and have not been hand-reviewed.

Curated reference (UniProt)

UniProt P9WGZ1 SwissProt · reviewed · Evidence at protein level
UniProt name: DNA-directed RNA polymerase subunit alpha
EC (curated): EC 2.7.7.6
Curated function: DNA-dependent RNA polymerase catalyzes the transcription of DNA into RNA using the four ribonucleoside triphosphates as substrates.

Functional vocabulary (eggNOG-mapper, orthology transfer)

COG category: K Transcription
Preferred name: rpoA
eggNOG description: DNA-dependent RNA polymerase catalyzes the transcription of DNA into RNA using the four ribonucleoside triphosphates as substrates
Orthologous group: COG0202
EC number: EC 2.7.7.6
KEGG orthology: K03040
KEGG pathways: map00230, map00240, map01100, map03020
KEGG modules: M00183
Gene Ontology (54): GO:0003674, GO:0003824, GO:0003899, GO:0005575, GO:0005618, GO:0005622, GO:0005623, GO:0005737, GO:0005829, GO:0005886, GO:0006139, GO:0006351 +42 more

Orthology-based transfer (eggNOG 5.0.2, diamond). EC/KO/GO/CAZy are computed annotations, not manual curation; cross-check against the primary literature before treating a specific reaction as established.

Conservation & selection (intra-MTBC, 145 209 strains)

pN/pS: 0.687 · relaxed/neutral
Polymorphic sites (≥ 0.1% of strains): 3 synonymous, 6 missense, 0 nonsense, 0 frameshift

pN/pS is computed from segregating SNPs (singletons removed) and normalised by the number of possible sites. A low pN/pS indicates purifying selection, a strong signal that a "hypothetical" is a real, constrained gene. A high pN/pS is ambiguous: both relaxed constraint and positive selection (drug resistance, antigenic variation) inflate it; for example, rpoB, katG and pncA score high here because of resistance, not loss of function. A clonal disruption (one allele across a clade) suggests lineage pseudogenisation; a convergent one (many independent alleles) is typical of loss of function that confers resistance.
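The normalisation by possible sites can be sketched as follows. This is a minimal illustration, not the pipeline behind the value reported here: the page does not state how possible sites are counted, so the Nei-Gojobori-style counting, the treatment of stop-gained changes as nonsynonymous, the function names `possible_sites` and `pn_ps`, and the toy coding sequence are all assumptions.

```python
from itertools import product

BASES = "TCAG"
# Standard codon table in TCAG order (assumed; the bacterial table 11
# uses the same codon-to-amino-acid mapping).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def possible_sites(cds):
    """Return (nonsynonymous_sites, synonymous_sites) for a coding sequence.

    Each codon position contributes the fraction of its three possible
    single-base substitutions that are synonymous; the rest count as
    nonsynonymous. Stop codons and codons with ambiguous bases are skipped.
    """
    syn = nonsyn = 0  # counted in thirds of a site, divided by 3 at the end
    cds = cds.upper()
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        aa = CODON_TABLE.get(codon)
        if aa is None or aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == aa:
                    syn += 1
                else:
                    nonsyn += 1
    return nonsyn / 3, syn / 3


def pn_ps(n_nonsyn_snps, n_syn_snps, nonsyn_sites, syn_sites):
    """Ratio of per-site nonsynonymous to per-site synonymous SNP counts.

    Returns None when the synonymous term is zero and the ratio is undefined.
    """
    if n_syn_snps == 0 or syn_sites == 0 or nonsyn_sites == 0:
        return None
    return (n_nonsyn_snps / nonsyn_sites) / (n_syn_snps / syn_sites)


# Hypothetical two-codon sequence (Met, Gly), used only to show the counting.
toy_nonsyn_sites, toy_syn_sites = possible_sites("ATGGGG")
toy_ratio = pn_ps(6, 3, toy_nonsyn_sites, toy_syn_sites)
```

Reproducing the reported 0.687 would require the ancestral coding sequence and the exact SNP set used (segregating, singletons removed), neither of which is given on this page.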

Domains (Pfam, hmmscan --cut_ga)

Pfam           Accession   i-Evalue  Residues  Description
RNA_pol_L      PF01193.30  7.4e-21   24–219    RNA polymerase Rpb3/Rpb11 dimerisation domain
RNA_pol_A_bac  PF01000.32  1.1e-38   54–169    RNA polymerase Rpb3/RpoA insert domain
RNA_pol_A_CTD  PF03118.22  7.4e-26   245–302   Bacterial RNA polymerase, alpha chain C terminal domain
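The residue ranges in the table describe a nested layout: the insert domain falls inside the dimerisation domain, and the C-terminal domain follows after a gap. A short sketch makes that explicit and computes how much of the 347 aa protein the hits cover. The coordinates are taken as listed (1-based, inclusive); the names `Domain`, `nested_pairs` and `covered_residues` are illustrative only.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    name: str
    start: int  # 1-based, inclusive, as listed in the table
    end: int


PROTEIN_LENGTH = 347
DOMAINS = [
    Domain("RNA_pol_L", 24, 219),
    Domain("RNA_pol_A_bac", 54, 169),
    Domain("RNA_pol_A_CTD", 245, 302),
]


def nested_pairs(domains):
    """Return (inner, outer) name pairs where inner lies entirely within outer."""
    return [
        (inner.name, outer.name)
        for outer in domains
        for inner in domains
        if inner is not outer
        and outer.start <= inner.start
        and inner.end <= outer.end
    ]


def covered_residues(domains):
    """Count residues covered by at least one domain (overlaps counted once)."""
    covered = set()
    for d in domains:
        covered.update(range(d.start, d.end + 1))
    return len(covered)


nesting = nested_pairs(DOMAINS)
n_covered = covered_residues(DOMAINS)
coverage = n_covered / PROTEIN_LENGTH
```

With the ranges above, 254 of 347 residues (about 73%) fall inside at least one Pfam hit; the uncovered stretches are the N-terminal 23 residues, the 220–244 linker region and the C-terminal tail.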

Functional interaction network (STRING v12, guilt-by-association)

Closest characterised functional partner: sigA (RNA polymerase sigma factor SigA), high confidence without text-mining (score 1000 excluding text-mining), supported by the experimental channel (999) and by phylogenetic co-occurrence (508).

Partner       Product                                    Score  No text-mining  Channels (≥400)
Rv2703 sigA   RNA polymerase sigma factor SigA           999    1000            ctx cooccurrence:508 experimental:999
Rv0668 rpoC   DNA-directed RNA polymerase subunit beta'  999    1000            ctx cooccurrence:575 coexpression:957 experimental:999 database:844 textmining:970
Rv1390 rpoZ   DNA-directed RNA polymerase subunit omega  999    1000            coexpression:805 experimental:999 database:844 textmining:878
Rv0667 rpoB   DNA-directed RNA polymerase subunit beta   999    1000            ctx cooccurrence:481 coexpression:970 experimental:999 database:844 textmining:973
Rv3460c rpsM  30S ribosomal protein S13                  999    999             ctx neighborhood:578 fusion:673 cooccurrence:499 coexpression:926 experimental:808
Rv0718 rpsH   30S ribosomal protein S8                   999    999             ctx cooccurrence:612 coexpression:967 experimental:895
Rv3458c rpsD  30S ribosomal protein S4                   999    999             ctx neighborhood:750 cooccurrence:509 coexpression:958 experimental:902 textmining:413
Rv0707 rpsC   30S ribosomal protein S3                   999    999             ctx cooccurrence:702 coexpression:967 experimental:912
Rv2050 rbpA   RNA polymerase-binding protein RbpA        999    999             experimental:999
Rv3459c rpsK  30S ribosomal protein S11                  999    999             ctx neighborhood:578 fusion:691 cooccurrence:495 coexpression:891 experimental:895
Rv0721 rpsE   30S ribosomal protein S5                   998    999             ctx cooccurrence:548 coexpression:968 experimental:901
Rv0704 rplB   50S ribosomal protein L2                   998    998             ctx cooccurrence:552 coexpression:975 experimental:850
Rv0705 rpsS   30S ribosomal protein S19                  998    998             ctx cooccurrence:434 coexpression:969 experimental:902
Rv3456c rplQ  50S ribosomal protein L17                  998    998             ctx neighborhood:838 coexpression:969 experimental:474
Rv0700 rpsJ   30S ribosomal protein S10                  998    998             ctx cooccurrence:510 coexpression:958 experimental:910

STRING combines evidence channels (neighborhood, fusion, cooccurrence, coexpression, experimental, database, text-mining) into a 0–1000 score. The ctx badge marks edges carried by the genomic-context channels (conserved neighborhood, fusion, phylogenetic co-occurrence), which are independent of orthology and structure and the strongest signal for an unknown gene. The no text-mining column recomputes the score from data alone, so a link that does not depend on the literature is visible. Association is a function hypothesis, not proof: corroborate with the operon context and the primary literature before assigning a function.
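How several channel scores can merge into one 0–1000 value, and how a score without text-mining can be recomputed, is sketched below. This is a simplified version of the probabilistic combination STRING describes for its subscores, not a reimplementation: the prior of 0.041, the rounding, and the omission of STRING's homology correction for co-occurrence and text-mining are assumptions here, and `combine` is a hypothetical name. The table lists only channels scoring 400 or more, so values recomputed from it are approximate.

```python
PRIOR = 0.041  # assumed prior probability of a random association


def combine(channel_scores, prior=PRIOR):
    """Combine per-channel scores (0-1000) into one 0-1000 score.

    Each channel is treated as independent evidence: the prior is removed
    from every channel, the probabilities that each channel misses the
    association are multiplied, and the prior is added back once.
    """
    p_all_miss = 1.0
    for score in channel_scores:
        p = score / 1000.0
        p_without_prior = max(0.0, (p - prior) / (1.0 - prior))
        p_all_miss *= 1.0 - p_without_prior
    combined = (1.0 - p_all_miss) * (1.0 - prior) + prior
    return round(combined * 1000)


# rpsM row from the table above; channels below 400 are not listed there.
rpsm_channels = {
    "neighborhood": 578,
    "fusion": 673,
    "cooccurrence": 499,
    "coexpression": 926,
    "experimental": 808,
}

# Dropping the text-mining channel (absent for rpsM anyway) gives the
# "no text-mining" style of score.
rpsm_no_textmining = combine(
    score for name, score in rpsm_channels.items() if name != "textmining"
)
```

The design point is that several moderate channels reinforce one another: no single rpsM channel exceeds 926, yet the combined value lands close to the top of the scale.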

Evidence

  • Legacy H37Rv annotation: DNA-directed RNA polymerase subunit alpha
  • MTBC0 PGAP product: DNA-directed RNA polymerase subunit alpha
  • Pfam (hmmscan --cut_ga): RNA_pol_L PF01193.30 (E=7e-21), RNA_pol_A_bac PF01000.32 (E=1e-38), RNA_pol_A_CTD PF03118.22 (E=7e-26)
  • (auto-curated by rules from PGAP + Pfam + Foldseek; not hand-reviewed)

Sources

  • Ancestral sequence & coordinates: Harrison LB et al. (2024), An imputed ancestral reference genome for the MTBC, doi:10.1101/2023.09.07.556366
  • Product annotation: NCBI PGAP on MTBC0; legacy from H37Rv NC_000962.3 (RefSeq NP_217974.1)
  • Domains: Pfam-A via hmmscan --cut_ga — RNA_pol_L (PF01193.30), RNA_pol_A_bac (PF01000.32), RNA_pol_A_CTD (PF03118.22)
  • Sequence-level signal: ESM Atlas (EvolutionaryScale × BioHub) — exploratory
  • Controlled vocabulary: eggNOG-mapper 2.1.12 (Cantalapiedra et al. 2021, doi:10.1093/molbev/msab293), eggNOG 5.0 DB (Huerta-Cepas et al. 2019) — OG COG0202
  • Curated reference: UniProt P9WGZ1 (SwissProt, reviewed; Evidence at protein level)
  • Intra-MTBC selection: pN/pS and disruption from SPDI variants of 145 209 MTBC strains (this work, local collection vs H37Rv NC_000962.3)
  • Interaction network: STRING v12.0 (Szklarczyk et al. 2023, doi:10.1093/nar/gkac1000), taxon 83332, CC-BY 4.0 — 347 functional partner(s); context anchor sigA
  • Primary literature: none located yet; annotation rests on the domain/homology sources above.

Ancestral MTBC0 protein sequence

>mtbc0_003675|Rv3457c|rpoA
MLISQRPTLSEDVLTDNRSQFVIEPLEPGFGYTLGNSLRRTLLSSIPGAAVTSIRIDGVLHEFTTVPGVKEDVTEIILNLKSLVVSSEEDEPVTMYLRKQGPGEVTAGDIVPPAGVTVHNPGMHIATLNDKGKLEVELVVERGRGYVPAVQNRASGAEIGRIPVDSIYSPVLKVTYKVDATRVEQRTDFDKLILDVETKNSISPRDALASAGKTLVELFGLARELNVEAEGIEIGPSPAEADHIASFALPIDDLDLTVRSYNCLKREGVHTVGELVARTESDLLDIRNFGQKSIDEVKIKLHQLGLSLKDSPPSFDPSEVAGYDVATGTWSTEGAYDEQDYAETEQL
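
The record above can be checked against the header line of this entry (347 aa, coordinates 3903348–3904391) with a few lines of Python. This is a minimal sketch: `read_fasta` and `protein_length_from_span` are illustrative helpers, and the assumption that the annotated span includes the stop codon is not stated on this page.

```python
import io

# The record from this page, wrapped at 60 residues per line.
RECORD = """>mtbc0_003675|Rv3457c|rpoA
MLISQRPTLSEDVLTDNRSQFVIEPLEPGFGYTLGNSLRRTLLSSIPGAAVTSIRIDGVL
HEFTTVPGVKEDVTEIILNLKSLVVSSEEDEPVTMYLRKQGPGEVTAGDIVPPAGVTVHN
PGMHIATLNDKGKLEVELVVERGRGYVPAVQNRASGAEIGRIPVDSIYSPVLKVTYKVDA
TRVEQRTDFDKLILDVETKNSISPRDALASAGKTLVELFGLARELNVEAEGIEIGPSPAE
ADHIASFALPIDDLDLTVRSYNCLKREGVHTVGELVARTESDLLDIRNFGQKSIDEVKIK
LHQLGLSLKDSPPSFDPSEVAGYDVATGTWSTEGAYDEQDYAETEQL
"""


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def protein_length_from_span(start, end):
    """Protein length implied by inclusive genomic coordinates.

    Assumes the span covers the full coding sequence including the stop codon.
    """
    n_nt = end - start + 1
    if n_nt % 3 != 0:
        raise ValueError("span length is not a multiple of three")
    return n_nt // 3 - 1


header, sequence = next(read_fasta(io.StringIO(RECORD)))
mtbc0_id, locus_tag, gene = header.split("|")
implied_length = protein_length_from_span(3903348, 3904391)
```

Under that assumption the 1044 nt span implies 347 residues, which matches both the stated length and the sequence itself.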