Rv0428c · Family assigned · medium · auto-curated

H37Rv Rv0428c · MTBC0 mtbc0_000450 · 302 aa · 520260–521168 (-) · RefSeq NP_214942.1

Annotation: from legacy to revised

Legacy (H37Rv / Mycobrowser): GCN5-like N-acetyltransferase
MTBC0 PGAP re-annotation: GNAT family N-acetyltransferase
Revised (this work): GNAT family N-acetyltransferase. Pfam: SH3_Rv0428c (PF24551.2), Rv0428c_C (PF24553.3).

Auto-curated: this verdict and function were generated by rules from PGAP + Pfam + Foldseek and have not been hand-reviewed.

Curated reference (UniProt)

UniProt: P96274 · SwissProt · reviewed · Evidence at protein level
UniProt name: Probable histone acetyltransferase Rv0428c
EC (curated): EC 2.3.1.48
Curated function: Shows histone acetyl transferase (HAT) activity with recombinant eukaryotic H3 histone expressed in bacteria as substrate and acetyl-CoA as donor. May be involved in survival under stress conditions.

Functional vocabulary (eggNOG-mapper, orthology transfer)

COG category: K (Transcription)
eggNOG description: acetyltransferase
Orthologous group: COG0454
EC number: EC 2.3.1.1
KEGG orthology: K22476
KEGG pathways: map00220, map01210, map01230

Orthology-based transfer (eggNOG 5.0.2, diamond). EC/KO/GO/CAZy are computed annotations, not manual curation; cross-check against the primary literature before treating a specific reaction as established.

Conservation & selection (intra-MTBC, 145 209 strains)

pN/pS: 0.51 · relaxed/neutral
Polymorphic sites (≥ 0.1% of strains): 3 synonymous, 4 missense, 0 nonsense, 1 frameshift
Disruption: 1 distinct premature-stop/frameshift site; most common in 0.13% of strains (187) · clonal

pN/pS is computed from segregating SNPs (singletons removed) and normalised by the number of possible sites. A low pN/pS indicates purifying selection (a strong signal that a "hypothetical" is a real, constrained gene). A high pN/pS is ambiguous: both relaxed constraint and positive selection (drug resistance, antigenic variation) inflate it; for example, rpoB, katG and pncA score high here because of resistance, not loss of function. A clonal disruption (one allele across a clade) suggests lineage pseudogenisation; a convergent one (many independent alleles) is typical of resistance-conferring loss of function.
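The normalisation described above can be sketched as follows. This is a minimal illustration, not the pipeline used for this card: the function names are hypothetical, and the per-gene counts of possible nonsynonymous and synonymous sites are not given here, so the site counts below are placeholders. The result is therefore not expected to reproduce the reported 0.51, which was also computed on a different SNP set (singletons removed, rather than the ≥ 0.1% filter used for the listed polymorphic sites). The disruption frequency, by contrast, follows directly from the numbers on the card.

```python
def pn_ps(nonsyn_snps, syn_snps, nonsyn_sites, syn_sites):
    """Per-site nonsynonymous polymorphism divided by per-site synonymous polymorphism."""
    if syn_snps == 0:
        raise ValueError("no synonymous SNPs: pN/pS is undefined")
    p_n = nonsyn_snps / nonsyn_sites
    p_s = syn_snps / syn_sites
    return p_n / p_s


def carrier_percent(carriers, total_strains):
    """Share of strains carrying an allele, in percent."""
    return 100.0 * carriers / total_strains


# SNP counts are the polymorphic sites listed on the card (4 missense, 3 synonymous).
# The site counts are HYPOTHETICAL placeholders; the card does not report them.
ratio = pn_ps(nonsyn_snps=4, syn_snps=3, nonsyn_sites=680.0, syn_sites=226.0)
print(f"illustrative pN/pS = {ratio:.2f}")

# Most common disruptive allele: 187 carriers among 145 209 strains.
freq = carrier_percent(187, 145209)
print(f"disruption frequency = {freq:.2f}%")
```

Dividing by the number of possible sites matters because a typical coding sequence offers more nonsynonymous than synonymous opportunities, so raw SNP counts alone would overstate pN relative to pS.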

Domains (Pfam, hmmscan --cut_ga)

Pfam · Accession · i-Evalue · Residues · Description
SH3_Rv0428c · PF24551.2 · 9.0e-22 · 6–60 · Probable histone acetyltransferase Rv0428c SH3 domain
Rv0428c_C · PF24553.3 · 7.0e-97 · 69–292 · Probable histone acetyltransferase Rv0428c C-terminal
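As a quick reading aid, the domain table can be turned into a coverage summary. The sketch below reads the residue ranges as 6–60 and 69–292 on the 302 aa protein and lists the stretches that neither Pfam model covers; the helper name `uncovered_segments` is illustrative and not part of any tool used for this card.

```python
def uncovered_segments(length, domains):
    """Return 1-based inclusive residue ranges not covered by any domain."""
    gaps = []
    cursor = 1  # first residue not yet accounted for
    for start, end in sorted(domains):
        if start > cursor:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)  # max() keeps overlapping domains safe
    if cursor <= length:
        gaps.append((cursor, length))
    return gaps


PROTEIN_LENGTH = 302
DOMAINS = {"SH3_Rv0428c": (6, 60), "Rv0428c_C": (69, 292)}

gaps = uncovered_segments(PROTEIN_LENGTH, DOMAINS.values())
covered = PROTEIN_LENGTH - sum(end - start + 1 for start, end in gaps)

print("uncovered:", gaps)
print(f"covered: {covered}/{PROTEIN_LENGTH} residues ({covered / PROTEIN_LENGTH:.1%})")
```

The two family-specific models together account for most of the protein, leaving only short terminal stretches and an eight-residue linker between the domains.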

Functional interaction network (STRING v12, guilt-by-association)

Closest characterised functional partner: def (polypeptide deformylase), high confidence from genomic context alone (score 887 excluding text-mining).

Partner · Product · Score · No text-mining · Channels (≥400)
Rv1653 argJ · bifunctional glutamate N-acetyltransferase/amino-acid acetyltransferase · 940 · 905 · database:900
Rv1654 argB · acetylglutamate kinase · 914 · 905 · database:900
Rv2747 argA · L-glutamate alpha-N-acetyltransferase · 907 · 903 · database:900
Rv0337c aspC · aspartate aminotransferase · 905 · 901 · database:900
Rv2476c gdh · NAD-dependent glutamate dehydrogenase · 900 · 901 · database:900
Rv0429c def · polypeptide deformylase · 961 · 887 · ctx neighborhood:882 textmining:669
Rv0427c xthA · exodeoxyribonuclease III protein XthA · 946 · 883 · ctx neighborhood:881 textmining:562
Rv0495c hyp · hypothetical protein · 737 · 738 · ctx cooccurence:737
Rv0430 hyp · hypothetical protein · 699 · 700 · ctx neighborhood:698
Rv0432 sodC · superoxide dismutase · 669 · 669 · ctx neighborhood:664
Rv0433 · carboxylate-amine ligase · 665 · 666 · ctx neighborhood:664
Rv0434 hyp · hypothetical protein · 665 · 665 · ctx neighborhood:609
Rv1125 hyp · hypothetical protein · 653 · 654 · ctx cooccurence:645
Rv3724B cut5b · Rv3724B, (MTV025.072), len: 187 aa. Probable cut5b, truncated cutinase, similar to C-terminal end of others e.g. Q9XB09|RVD2-RV1758 protein ( · 586 · 586 · ctx cooccurence:586
Rv2326c · ABC transporter ATP-binding protein · 565 · 566 · ctx cooccurence:552

STRING combines evidence channels (neighborhood, fusion, cooccurrence, coexpression, experimental, database, text-mining) into a 0–1000 score. The ctx badge marks edges carried by the genomic-context channels (conserved neighborhood, fusion, phylogenetic co-occurrence), which are independent of orthology and structure and the strongest signal for an unknown gene. The no text-mining column recomputes the score from data alone, so a link that does not depend on the literature is visible. Association is a function hypothesis, not proof: corroborate with the operon context and the primary literature before assigning a function.
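The choice of def as the closest characterised partner can be reproduced from the table under one assumption, which this card does not state explicitly: that the anchor is the edge with the highest no-text-mining score among those carrying the ctx badge, after dropping partners whose product is "hypothetical protein". The `Edge` type and function name below are illustrative only.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    locus: str
    gene: str                  # empty when the table gives no gene name
    product: str
    score_no_textmining: int
    ctx: bool                  # True when the edge carries the ctx badge


# Rows transcribed from the partner table above.
EDGES = [
    Edge("Rv1653", "argJ", "bifunctional glutamate N-acetyltransferase", 905, False),
    Edge("Rv1654", "argB", "acetylglutamate kinase", 905, False),
    Edge("Rv2747", "argA", "L-glutamate alpha-N-acetyltransferase", 903, False),
    Edge("Rv0337c", "aspC", "aspartate aminotransferase", 901, False),
    Edge("Rv2476c", "gdh", "NAD-dependent glutamate dehydrogenase", 901, False),
    Edge("Rv0429c", "def", "polypeptide deformylase", 887, True),
    Edge("Rv0427c", "xthA", "exodeoxyribonuclease III protein XthA", 883, True),
    Edge("Rv0495c", "", "hypothetical protein", 738, True),
    Edge("Rv0430", "", "hypothetical protein", 700, True),
    Edge("Rv0432", "sodC", "superoxide dismutase", 669, True),
    Edge("Rv0433", "", "carboxylate-amine ligase", 666, True),
    Edge("Rv0434", "", "hypothetical protein", 665, True),
    Edge("Rv1125", "", "hypothetical protein", 654, True),
    Edge("Rv3724B", "cut5b", "Probable cut5b, truncated cutinase", 586, True),
    Edge("Rv2326c", "", "ABC transporter ATP-binding protein", 566, True),
]


def closest_characterised_context_partner(edges):
    """ASSUMED rule: best no-text-mining score among ctx edges with a named product."""
    candidates = [
        e for e in edges
        if e.ctx and "hypothetical" not in e.product.lower()
    ]
    return max(candidates, key=lambda e: e.score_no_textmining)


best = closest_characterised_context_partner(EDGES)
print(best.locus, best.gene, best.score_no_textmining)
```

Under this rule the arg-pathway partners are skipped even though their no-text-mining scores are higher, because their support comes from the database channel rather than from genomic context.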

Evidence

  • Legacy H37Rv annotation: GCN5-like N-acetyltransferase
  • MTBC0 PGAP product: GNAT family N-acetyltransferase
  • Pfam (hmmscan --cut_ga): SH3_Rv0428c PF24551.2 (E=9e-22), Rv0428c_C PF24553.3 (E=7e-97)
  • (auto-curated by rules from PGAP + Pfam + Foldseek; not hand-reviewed)

Sources

  • Ancestral sequence & coordinates: Harrison LB et al. (2024), An imputed ancestral reference genome for the MTBC, doi:10.1101/2023.09.07.556366
  • Product annotation: NCBI PGAP on MTBC0; legacy from H37Rv NC_000962.3 (RefSeq NP_214942.1)
  • Domains: Pfam-A via hmmscan --cut_ga — SH3_Rv0428c (PF24551.2), Rv0428c_C (PF24553.3)
  • Sequence-level signal: ESM Atlas (EvolutionaryScale × BioHub) — exploratory
  • Controlled vocabulary: eggNOG-mapper 2.1.12 (Cantalapiedra et al. 2021, doi:10.1093/molbev/msab293), eggNOG 5.0 DB (Huerta-Cepas et al. 2019) — OG COG0454
  • Curated reference: UniProt P96274 (SwissProt, reviewed; Evidence at protein level)
  • Intra-MTBC selection: pN/pS and disruption from SPDI variants of 145 209 MTBC strains (this work, local collection vs H37Rv NC_000962.3)
  • Interaction network: STRING v12.0 (Szklarczyk et al. 2023, doi:10.1093/nar/gkac1000), taxon 83332, CC-BY 4.0 — 54 functional partner(s); context anchor def
  • Primary literature: none located yet; annotation rests on the domain/homology sources above.

Ancestral MTBC0 protein sequence

>mtbc0_000450|Rv0428c|
MVSWPGLGTRVTVRYRRPAGSMPPLTDAVGRLLAVDPTVRVQTKTGTIVEFSPVDVVALRVLTDAPVRTAAIRALEHAAAAAWPGVERTWLDGWLLRAGHGAVLAANSAVPLDISAHTNTITEISAWYASRDLQPWLAVPDRLLPLPAGLAGERREQVLVRDVSTGEPDRSVTLLDHPDDTWLRLYHQRLPLDMATPVIDGELAFGSYLGVAVARAAVTDAPDGTRWVGLSAMRAADEQSATGSAGRQLWEALLGWGAGRGATRGYVRVHDTATSVLAESLGFRLHHHCRYLPAQSVGWDTF
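
The header values can be cross-checked against this sequence. The sketch below assumes that the coordinates 520260–521168 delimit the coding sequence including its stop codon, and that the ancestral MTBC0 protein has the same length as the H37Rv one; neither assumption is stated on the card.

```python
# Ancestral MTBC0 protein sequence, copied from the FASTA record above.
SEQ = (
    "MVSWPGLGTRVTVRYRRPAGSMPPLTDAVGRLLAVDPTVRVQTKTGTIVEFSPVDVVALR"
    "VLTDAPVRTAAIRALEHAAAAAWPGVERTWLDGWLLRAGHGAVLAANSAVPLDISAHTNT"
    "ITEISAWYASRDLQPWLAVPDRLLPLPAGLAGERREQVLVRDVSTGEPDRSVTLLDHPDD"
    "TWLRLYHQRLPLDMATPVIDGELAFGSYLGVAVARAAVTDAPDGTRWVGLSAMRAADEQS"
    "ATGSAGRQLWEALLGWGAGRGATRGYVRVHDTATSVLAESLGFRLHHHCRYLPAQSVGWD"
    "TF"
)

START, END = 520260, 521168   # coordinates from the header line (minus strand)
STATED_LENGTH_AA = 302        # protein length from the header line

span_nt = END - START + 1     # inclusive coordinates
codons, remainder = divmod(span_nt, 3)
expected_aa = codons - 1      # ASSUMPTION: the span includes one stop codon

print(f"span: {span_nt} nt, {codons} codons, remainder {remainder}")
print(f"expected protein length: {expected_aa} aa")
print(f"sequence length: {len(SEQ)} aa")
print("consistent:", remainder == 0 and expected_aa == len(SEQ) == STATED_LENGTH_AA)
```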