napA Resolved · high
H37Rv Rv0430 · MTBC0 mtbc0_000452 ·
102 aa · 522098–522406 (+) ·
RefSeq NP_214944.1
Annotation: from legacy to revised
| Legacy (H37Rv / Mycobrowser) | hypothetical protein |
|---|---|
| MTBC0 PGAP re-annotation | DUF3263 domain-containing protein |
| Revised (this work) | NapA, a nucleoid-associated protein (NAP) and DNA topology modulator; RefSeq still annotates this locus as 'hypothetical protein'. NapA (Rv0430) binds DNA in a length- and supercoil-dependent manner, prefers A/T-rich sequences, bridges distant DNA segments, coats DNA into rigid rods, protects DNA from damaging agents, modulates supercoiling and stimulates the DNA-relaxation activity of topoisomerase I. It is the first gene of an operon that also carries the virulence-associated genes virR and sodC; it controls their expression and stimulates its own promoter in a supercoiling-dependent manner, making it a supercoiling-responsive virulence regulator (Datta et al. 2019). Experimentally characterised. |
Curated reference (UniProt)
| UniProt | P96276 · TrEMBL · unreviewed · Evidence at protein level |
|---|---|
| UniProt name | DUF3263 domain-containing protein |
Functional vocabulary (eggNOG-mapper, orthology transfer)
| COG category | S (Function unknown) |
|---|---|
| eggNOG description | Protein of unknown function (DUF3263) |
| Orthologous group | 2CUM0 |
Orthology-based transfer (eggNOG 5.0.2, diamond). EC/KO/GO/CAZy are computed annotations, not manual curation; cross-check against the primary literature before treating a specific reaction as established.
Conservation & selection (intra-MTBC, 145 209 strains)
| pN/pS | 0.113 · strong purifying |
|---|---|
| Polymorphic sites (≥ 0.1% of strains) | 3 synonymous, 1 missense, 0 nonsense, 0 frameshift |
pN/pS from segregating SNPs (singletons removed) normalised by possible sites. Low pN/pS = purifying selection (a strong signal that a "hypothetical" is a real, constrained gene). A high pN/pS is ambiguous: relaxed constraint or positive selection (drug resistance, antigenic variation) inflate it; e.g. rpoB/katG/pncA score high here for resistance, not loss of function. A clonal disruption (one allele over a clade) suggests lineage pseudogenisation; a convergent one (many independent alleles) is typical of resistance loss-of-function.
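The normalisation by possible sites can be sketched as follows. This is a minimal illustration, assuming Nei-Gojobori-style site counting over the standard genetic code; the page does not state which counting scheme was actually used, and the function names and the toy sequence in the usage line are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_sites(cds: str) -> float:
    """Expected number of synonymous sites in a coding sequence.

    For each codon position, count the fraction of the three possible
    single-base changes that leave the encoded amino acid unchanged.
    """
    total = 0.0
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = cds[i:i + 3].upper()
        aa = CODON_TABLE[codon]
        for pos in range(3):
            syn = sum(
                CODON_TABLE[codon[:pos] + b + codon[pos + 1:]] == aa
                for b in BASES
                if b != codon[pos]
            )
            total += syn / 3
    return total


def pn_ps(n_nonsyn: int, n_syn: int, cds: str) -> float:
    """pN/pS = (nonsynonymous SNPs / nonsynonymous sites)
             / (synonymous SNPs / synonymous sites)."""
    s_sites = synonymous_sites(cds)
    n_sites = 3 * (len(cds) // 3) - s_sites
    if n_syn == 0 or s_sites == 0 or n_sites == 0:
        raise ValueError("pN/pS is undefined without synonymous variation or sites")
    return (n_nonsyn / n_sites) / (n_syn / s_sites)


# Toy input, not the Rv0430 coding sequence (which is not given on this page).
ratio = pn_ps(n_nonsyn=1, n_syn=1, cds="ATGGGT")
```

Because most single-base changes in a coding sequence are nonsynonymous, dividing by the number of possible sites keeps a gene with equal raw counts of missense and synonymous SNPs well below a ratio of 1.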
Domains (Pfam, hmmscan --cut_ga)
| Pfam | Accession | i-Evalue | Residues | Description |
|---|---|---|---|---|
| DUF3263 | PF11662.14 | 8.5e-37 | 21–92 | Protein of unknown function (DUF3263) |
Structural neighbours (Foldseek on the ESMFold model, exploratory)
ESMFold model confidence: mean pLDDT 63.1 (low). Low-confidence model: the fold may be unreliable, so treat these structural hits with caution.
Best matches against the PDB, ranked by Foldseek homology probability. A high probability / TM-score suggests a shared fold; unless flagged sig (E < 0.01) these are fold hypotheses, not assignments.
| Target | Prob | TM | E-value | Description |
|---|---|---|---|---|
| 8cyf-assembly1_B | 0.47 | 0.54 | 6.8e-01 | WhiB3 bound to SigmaAr4-RNAP Beta flap tip chimera and DNA |
| 8d5v-assembly2_D | 0.47 | 0.64 | 1.3e+00 | WhiB6 bound to the SigmaAr4-RNAP Beta flap tip chimera |
| 7kug-assembly1_B | 0.41 | 0.63 | 1.4e+00 | Fe-S cluster-bound transcription activator WhiB7 in complex with the SigmaAr4-RNAP Beta flap tip chimera |
| 3n97-assembly1_A | 0.30 | 0.72 | 3.4e+00 | RNA polymerase alpha C-terminal domain (E. coli) and sigma region 4 (T. aq. mutant) bound to (UP,-35 element) DNA |
| 7kuf-assembly1_B | 0.23 | 0.58 | 2.3e+00 | Transcription activation subcomplex with WhiB7 bound to SigmaAr4-RNAP Beta flap tip chimera and DNA |
| 6lts-assembly1_F | 0.18 | 0.43 | 1.9e+00 | Crystal structure of Thermus thermophilus transcription initiation complex comprising a truncated sigma finger |
| 7ye2-assembly1_F | 0.16 | 0.57 | 3.2e+00 | The cryo-EM structure of C. crescentus GcrA-TACdown |
| 6ono-assembly2_D | 0.13 | 0.57 | 5.2e+00 | Complex structure of WhiB1 and region 4 of SigA in C2221 space group |
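The significance rule stated above the table can be applied mechanically. The sketch below uses the probabilities, TM-scores and E-values from the table; the `Hit` type and the `flag_hits` helper are illustrative names and are not part of Foldseek.

```python
from dataclasses import dataclass

SIG_EVALUE = 0.01  # threshold quoted in the text: "sig (E < 0.01)"


@dataclass(frozen=True)
class Hit:
    target: str
    prob: float
    tm: float
    evalue: float


def flag_hits(hits, threshold=SIG_EVALUE):
    """Label each hit 'sig' if its E-value is below the threshold,
    otherwise 'fold hypothesis'."""
    return [
        (h.target, "sig" if h.evalue < threshold else "fold hypothesis")
        for h in hits
    ]


HITS = [
    Hit("8cyf-assembly1_B", 0.47, 0.54, 6.8e-01),
    Hit("8d5v-assembly2_D", 0.47, 0.64, 1.3e+00),
    Hit("7kug-assembly1_B", 0.41, 0.63, 1.4e+00),
    Hit("3n97-assembly1_A", 0.30, 0.72, 3.4e+00),
    Hit("7kuf-assembly1_B", 0.23, 0.58, 2.3e+00),
    Hit("6lts-assembly1_F", 0.18, 0.43, 1.9e+00),
    Hit("7ye2-assembly1_F", 0.16, 0.57, 3.2e+00),
    Hit("6ono-assembly2_D", 0.13, 0.57, 5.2e+00),
]

labels = flag_hits(HITS)
```

With the values in the table, every hit is labelled a fold hypothesis, since the smallest E-value is 6.8e-01.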
Functional interaction network (STRING v12, guilt-by-association)
Closest characterised functional partner: Rv0431 (tuberculin-like peptide), high confidence from genomic context alone (score 856 excluding text-mining). This association is the citable seed of a function hypothesis for this hypothetical protein.
| Partner | Product | Score | No text-mining | Channels (≥400) |
|---|---|---|---|---|
| Rv0431 | tuberculin-like peptide | 856 | 856 ctx | neighborhood:835 |
| Rv0433 | carboxylate-amine ligase | 822 | 822 ctx | neighborhood:795 |
| Rv0432 sodC | superoxide dismutase | 797 | 797 ctx | neighborhood:795 |
| Rv0434 hyp | hypothetical protein | 740 | 741 ctx | neighborhood:739 |
| Rv0429c def | polypeptide deformylase | 701 | 702 ctx | neighborhood:698 |
| Rv0428c | GCN5-like N-acetyltransferase | 699 | 700 ctx | neighborhood:698 |
| Rv0427c xthA | exodeoxyribonuclease III protein XthA | 699 | 700 ctx | neighborhood:698 |
| Rv3258c hyp | hypothetical protein | 522 | 504 | |
| Rv0819 mshD | mycothiol acetyltransferase | 504 | 504 ctx | cooccurence:486 |
| Rv2242 hyp | hypothetical protein | 477 | 478 ctx | cooccurence:476 |
| Rv2199c ctaF | cytochrome c oxidase polypeptide 4 | 476 | 477 ctx | cooccurence:470 |
| Rv2229c hyp | hypothetical protein | 452 | 452 ctx | cooccurence:449 |
| Rv3260c whiB2 | transcriptional regulator WhiB2 | 448 | 448 | |
| Rv2680 hyp | hypothetical protein | 463 | 443 ctx | cooccurence:424 |
| Rv3683 hyp | hypothetical protein | 436 | 437 ctx | cooccurence:434 |
STRING combines evidence channels (neighborhood, fusion, cooccurrence, coexpression, experimental, database, text-mining) into a 0–1000 score. The ctx badge marks edges carried by the genomic-context channels (conserved neighborhood, fusion, phylogenetic co-occurrence), which are independent of orthology and structure and the strongest signal for an unknown gene. The no text-mining column recomputes the score from data alone, so a link that does not depend on the literature is visible. Association is a function hypothesis, not proof: corroborate with the operon context and the primary literature before assigning a function.
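How per-channel scores can be merged, and how a "no text-mining" score can be recomputed, is sketched below. This is an approximation under stated assumptions: it treats channels as independent evidence and uses a prior of 0.041, following the score-combination recipe STRING describes in its documentation, and it omits any further corrections the real pipeline applies. The `combine` function and the example channel values are hypothetical, and the sketch is not expected to reproduce the table's scores exactly, because channels below 400 are not listed there.

```python
PRIOR = 0.041  # assumption: prior probability of a random association


def combine(channels, exclude=()):
    """Combine per-channel scores (0-1000) into one 0-1000 score.

    Each channel has the prior removed, the channels are merged as
    independent evidence (1 - product of complements), and the prior
    is added back once at the end.
    """
    remaining = 1.0
    for name, score in channels.items():
        if name in exclude:
            continue
        p = score / 1000
        p = max(0.0, (p - PRIOR) / (1 - PRIOR))  # strip the prior
        remaining *= 1 - p
    combined = 1 - remaining
    combined += PRIOR * (1 - combined)  # restore the prior once
    return round(combined * 1000)


# Hypothetical edge: one strong context channel plus weaker text-mining.
edge = {"neighborhood": 835, "textmining": 300}
full_score = combine(edge)
no_textmining = combine(edge, exclude=("textmining",))
```

Under this scheme a single channel returns its own score, adding a channel can only raise the combined score, and excluding text-mining shows how much of an edge rests on data alone.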
Evidence
- RefSeq: hypothetical protein
- NAP features: length/supercoil-dependent DNA binding, A/T-rich preference, DNA bridging/coating, protection (Datta 2019, PMID 30872139)
- Modulates supercoiling; stimulates topoisomerase I relaxation activity
- First gene of the virR-sodC operon; supercoiling-responsive virulence regulator; renamed NapA
- Curated from the literature screen (project 'Still unknown gene function', 2026-06-09)
Sources
- Ancestral sequence & coordinates: Harrison LB et al. (2024), An imputed ancestral reference genome for the MTBC, doi:10.1101/2023.09.07.556366
- Product annotation: NCBI PGAP on MTBC0; legacy from H37Rv NC_000962.3 (RefSeq NP_214944.1)
- Domains: Pfam-A via hmmscan --cut_ga — DUF3263 (PF11662.14)
- Sequence-level signal: ESM Atlas (EvolutionaryScale × BioHub) — exploratory
- Controlled vocabulary: eggNOG-mapper 2.1.12 (Cantalapiedra et al. 2021, doi:10.1093/molbev/msab293), eggNOG 5.0 DB (Huerta-Cepas et al. 2019); OG 2CUM0
- Curated reference: UniProt P96276 (TrEMBL, unreviewed; Evidence at protein level)
- Intra-MTBC selection: pN/pS and disruption from SPDI variants of 145 209 MTBC strains (this work, local collection vs H37Rv NC_000962.3)
- Model confidence: ESMFold per-residue pLDDT (mean 63.1, low)
- Interaction network: STRING v12.0 (Szklarczyk et al. 2023, doi:10.1093/nar/gkac1000), taxon 83332, CC-BY 4.0; 24 functional partner(s); context anchor Rv0431
- Primary literature: Datta C, Jha RK, Ganguly S, Nagaraja V (2019). NapA (Rv0430), a Novel Nucleoid-Associated Protein that Regulates a Virulence Operon in Mycobacterium tuberculosis in a Supercoiling-Dependent Manner. J Mol Biol 431(8):1576-1591. doi:10.1016/j.jmb.2019.02.029 PMID:30872139
Ancestral MTBC0 protein sequence
>mtbc0_000452|Rv0430|napA
MDSAMARAIRSGDDAEVADGLTRREHDILAFERQWWKFAGVKEEAIKELFSMSATRYYQVLNALVDRPEALAADPMLVKRLRRLRASRQKARAARRLGFEVT
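The header fields can be cross-checked against this sequence. The sketch below assumes the coordinates 522098–522406 are 1-based, inclusive, and include the stop codon; that reading is an assumption, and the helper names are illustrative.

```python
FASTA = """>mtbc0_000452|Rv0430|napA
MDSAMARAIRSGDDAEVADGLTRREHDILAFERQWWKFAGVKEEAIKELFSMSATRYYQVLNALVDRPEALAADPMLVKRLRRLRASRQKARAARRLGFEVT
"""


def read_fasta(text):
    """Parse a single-record FASTA string into (header, sequence)."""
    header, *seq_lines = text.strip().splitlines()
    if not header.startswith(">"):
        raise ValueError("not a FASTA record")
    return header[1:], "".join(line.strip() for line in seq_lines)


def expected_aa(start, end, has_stop=True):
    """Protein length implied by inclusive, 1-based CDS coordinates."""
    nt = end - start + 1
    if nt % 3:
        raise ValueError("CDS length is not a multiple of three")
    return nt // 3 - (1 if has_stop else 0)


header, seq = read_fasta(FASTA)
length_from_sequence = len(seq)
length_from_coords = expected_aa(522098, 522406)
```

Both values come to 102, matching the length given in the header, and the Pfam domain span (residues 21–92) falls inside that length.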