clpB Resolved · high auto-curated
H37Rv Rv0384c · MTBC0 - · 848 aa · 459456–462002 (-) · RefSeq NP_214898.1
Annotation: from legacy to revised
| Legacy (H37Rv / Mycobrowser) | chaperone protein ClpB |
|---|---|
| MTBC0 PGAP re-annotation | — |
| Revised (this work) | Chaperone protein ClpB. Pfam: Clp_N (PF02861.26), NBD_SMAX1 (PF23569.2), AAA (PF00004.36), AAA_lid_9 (PF17871.8), AAA_2 (PF07724.21), Sigma54_activat (PF00158.33), AAA_5 (PF07728.21), ClpB_D2-small (PF10431.16). |
Auto-curated: this verdict and function were generated by rules from PGAP + Pfam + Foldseek and have not been hand-reviewed.
Annotated on the H37Rv protein: this gene has no 1:1 ancestral MTBC0 anchor (PE/PPE, paralogue, IS element, or otherwise unanchored CDS).
Curated reference (UniProt)
| UniProt | P9WPD1 · SwissProt · reviewed · Evidence at protein level |
|---|---|
| UniProt name | Chaperone protein ClpB |
| Curated function | Part of a stress-induced multi-chaperone system, it is involved in the recovery of the cell from heat-induced damage, in cooperation with DnaK, DnaJ and GrpE. Acts before DnaK, in the processing of protein aggregates. Protein binding stimulates the ATPase activity; ATP hydrolysis unfolds the denatured protein aggregates, which probably helps expose new hydrophobic binding sites on the surface of ClpB-bound aggregates, contributing to the solubilization and refolding of denatured protein aggregates by DnaK (By similarity). |
Functional vocabulary (eggNOG-mapper, orthology transfer)
| COG category | O Post-translational modification, protein turnover, chaperones |
|---|---|
| Preferred name | clpB |
| eggNOG description | Part of a stress-induced multi-chaperone system, it is involved in the recovery of the cell from heat-induced damage, in cooperation with DnaK, DnaJ and GrpE |
| Orthologous group | COG0542 |
| KEGG orthology | K03695, K03696 |
| KEGG pathways | map01100, map04213 |
| Gene Ontology (10) | GO:0005575, GO:0005618, GO:0005623, GO:0005886, GO:0008150, GO:0016020, GO:0030312, GO:0040007, GO:0044464, GO:0071944 |
Orthology-based transfer (eggNOG 5.0.2, diamond). EC/KO/GO/CAZy are computed annotations, not manual curation; cross-check against the primary literature before treating a specific reaction as established.
Conservation & selection (intra-MTBC, 145 209 strains)
| pN/pS | 0.117 · strong purifying |
|---|---|
| Polymorphic sites (≥ 0.1% of strains) | 6 synonymous, 2 missense, 0 nonsense, 0 frameshift |
pN/pS is computed from segregating SNPs (singletons removed) and normalised by the number of possible sites. A low pN/pS indicates purifying selection (a strong signal that a "hypothetical" is a real, constrained gene). A high pN/pS is ambiguous: either relaxed constraint or positive selection (drug resistance, antigenic variation) can inflate it; for example, rpoB, katG and pncA score high here because of resistance, not loss of function. A clonal disruption (one allele across a clade) suggests lineage pseudogenisation; a convergent one (many independent alleles) is typical of loss-of-function mutations that confer resistance.
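The normalisation described above can be sketched as follows. This is a minimal illustration, not the pipeline used for this page: the function name `pn_ps` and the site counts in the usage line are hypothetical, and counting "possible sites" properly requires walking the codons of the reference sequence.

```python
def pn_ps(n_missense: int, n_synonymous: int,
          nonsyn_sites: float, syn_sites: float) -> float:
    """Ratio of nonsynonymous to synonymous polymorphism, each per possible site.

    n_missense / n_synonymous: counts of segregating SNPs (singletons removed).
    nonsyn_sites / syn_sites: numbers of possible sites of each class in the
    coding sequence (assumed to be supplied by the caller).
    """
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    if n_synonymous == 0:
        raise ValueError("pS is zero; the ratio is undefined")
    pn = n_missense / nonsyn_sites   # nonsynonymous SNPs per nonsynonymous site
    ps = n_synonymous / syn_sites    # synonymous SNPs per synonymous site
    return pn / ps


# Illustrative numbers only: the site counts below are made up.
example = pn_ps(n_missense=2, n_synonymous=6, nonsyn_sites=1900, syn_sites=644)
```

Values well below 1 mean that nonsynonymous changes are rarer per available site than synonymous ones, which is the purifying-selection pattern described above.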
Domains (Pfam, hmmscan --cut_ga)
| Pfam | Accession | i-Evalue | Residues | Description |
|---|---|---|---|---|
| Clp_N | PF02861.26 | 3.1e-32 | 6–127 | Clp repeat (R) N-terminal domain |
| NBD_SMAX1 | PF23569.2 | 2.0e-08 | 185–288 | SMAX1 nucleotide binding domain |
| AAA | PF00004.36 | 1.2e-11 | 203–317 | ATPase family associated with various cellular activities (AAA) |
| AAA_lid_9 | PF17871.8 | 2.4e-35 | 343–445 | AAA lid domain |
| AAA_2 | PF07724.21 | 9.6e-64 | 598–749 | AAA domain (Cdc48 subfamily) |
| Sigma54_activat | PF00158.33 | 8.0e-05 | 602–703 | Sigma-54 interaction domain |
| AAA_5 | PF07728.21 | 4.8e-12 | 603–736 | AAA domain (dynein-related subfamily) |
| ClpB_D2-small | PF10431.16 | 1.2e-27 | 756–835 | C-terminal, D2-small domain, of ClpB protein |
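Several of the hits above cover the same residues (for example AAA_2, Sigma54_activat and AAA_5 all fall within 598–749). One simple way to read such a table is to merge overlapping hits into regions and keep the lowest i-Evalue in each. The sketch below does that with the values from the table; the `Hit` type and `best_per_region` helper are hypothetical names, and this any-overlap merging rule is an assumption, not the rule used to build this page.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    name: str
    evalue: float  # i-Evalue from the table
    start: int     # first residue covered
    end: int       # last residue covered


HITS = [
    Hit("Clp_N", 3.1e-32, 6, 127),
    Hit("NBD_SMAX1", 2.0e-08, 185, 288),
    Hit("AAA", 1.2e-11, 203, 317),
    Hit("AAA_lid_9", 2.4e-35, 343, 445),
    Hit("AAA_2", 9.6e-64, 598, 749),
    Hit("Sigma54_activat", 8.0e-05, 602, 703),
    Hit("AAA_5", 4.8e-12, 603, 736),
    Hit("ClpB_D2-small", 1.2e-27, 756, 835),
]


def best_per_region(hits: List[Hit]) -> List[Hit]:
    """Group hits that overlap by at least one residue; keep the best per group."""
    groups = []
    for hit in sorted(hits, key=lambda h: h.start):
        if groups and hit.start <= groups[-1]["end"]:
            # Overlaps the current region: extend it.
            groups[-1]["hits"].append(hit)
            groups[-1]["end"] = max(groups[-1]["end"], hit.end)
        else:
            # Starts after the current region ends: open a new region.
            groups.append({"end": hit.end, "hits": [hit]})
    return [min(g["hits"], key=lambda h: h.evalue) for g in groups]


architecture = [h.name for h in best_per_region(HITS)]
```

With the table's values this leaves five non-overlapping regions: an N-terminal domain, two AAA modules separated by a lid domain, and the C-terminal D2-small domain.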
Functional interaction network (STRING v12, guilt-by-association)
Closest characterised functional partner: pyrE (orotate phosphoribosyltransferase), high confidence from genomic context alone (score 702 excluding text-mining).
| Partner | Product | Score | No text-mining | Channels (≥400) |
|---|---|---|---|---|
| Rv0350 dnaK | chaperone protein DnaK | 999 | 1000 | coexpression:965 experimental:999 textmining:937 |
| Rv0351 grpE | stress response protein GrpE | 996 | 957 | coexpression:948 textmining:933 |
| Rv1331 clpS | ATP-dependent Clp protease adapter protein ClpS | 864 | 850 | experimental:773 |
| Rv2460c clpP2 | ATP-dependent CLP protease proteolytic subunit 2 | 916 | 809 | coexpression:470 experimental:529 textmining:583 |
| Rv2264c hyp | hypothetical protein | 833 | 809 | coexpression:616 experimental:508 |
| Rv3446c hyp | hypothetical protein | 831 | 807 | coexpression:612 experimental:508 |
| Rv0312 hyp | hypothetical protein | 831 | 807 | coexpression:612 experimental:508 |
| Rv2461c clpP1 | ATP-dependent CLP protease proteolytic subunit 1 | 940 | 806 | coexpression:467 experimental:529 textmining:707 |
| Rv0251c hsp | heat shock protein | 898 | 771 | coexpression:727 textmining:576 |
| Rv0987 | adhesion component ABC transporter permease | 765 | 766 | coexpression:765 |
| Rv0383c ttfA hyp | hypothetical protein | 764 | 755 ctx | neighborhood:755 |
| Rv0352 dnaJ1 | chaperone protein DnaJ | 982 | 710 | coexpression:538 textmining:943 |
| Rv2031c hspX | alpha-crystallin | 801 | 702 | coexpression:645 |
| Rv0382c pyrE | orotate phosphoribosyltransferase | 762 | 702 ctx | neighborhood:699 |
| Rv2373c dnaJ2 | chaperone protein DnaJ | 894 | 701 | coexpression:523 textmining:660 |
STRING combines evidence channels (neighborhood, fusion, cooccurrence, coexpression, experimental, database, text-mining) into a 0–1000 score. The ctx badge marks edges carried by the genomic-context channels (conserved neighborhood, fusion, phylogenetic co-occurrence), which are independent of orthology and structure and the strongest signal for an unknown gene. The no text-mining column recomputes the score from data alone, so a link that does not depend on the literature is visible. Association is a function hypothesis, not proof: corroborate with the operon context and the primary literature before assigning a function.
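As a rough illustration of how a "no text-mining" score can be recomputed, the sketch below combines channel scores as independent evidence after removing a shared prior. It assumes the combination scheme STRING describes for its sub-scores, with a prior of 0.041; the function names are hypothetical, and the homology correction STRING applies to some channels is omitted. The table above lists only channels scoring at least 400, so recomputing from the displayed channels will not always reproduce the column exactly.

```python
from math import prod
from typing import Dict

PRIOR = 0.041  # assumed prior probability of an interaction (STRING convention)


def combine_channels(scores: Dict[str, int], prior: float = PRIOR) -> int:
    """Combine per-channel scores (0-1000) into one 0-1000 score."""
    no_prior = []
    for s in scores.values():
        p = max(s / 1000.0, prior)                    # never go below the prior
        no_prior.append((p - prior) / (1.0 - prior))  # remove the prior
    # Treat channels as independent: the link is missed only if every channel misses it.
    combined = 1.0 - prod(1.0 - x for x in no_prior)
    # Add the prior back once.
    return round((combined * (1.0 - prior) + prior) * 1000)


def without_textmining(scores: Dict[str, int]) -> int:
    """Recompute the combined score with the text-mining channel dropped."""
    return combine_channels({k: v for k, v in scores.items() if k != "textmining"})


dnak = {"coexpression": 965, "experimental": 999, "textmining": 937}
pyre = {"neighborhood": 699}
```

Under these assumptions, `without_textmining(dnak)` gives 1000, in line with the dnaK row, while `without_textmining(pyre)` gives 699 rather than the listed 702, presumably because channels below the 400 display cut-off also contribute.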
Evidence
- Annotation from H37Rv (no MTBC0 1:1 anchor; H37Rv protein used): chaperone protein ClpB
- Pfam (hmmscan --cut_ga): Clp_N PF02861.26 (E=3e-32), NBD_SMAX1 PF23569.2 (E=2e-08), AAA PF00004.36 (E=1e-11), AAA_lid_9 PF17871.8 (E=2e-35), AAA_2 PF07724.21 (E=1e-63), Sigma54_activat PF00158.33 (E=8e-05), AAA_5 PF07728.21 (E=5e-12), ClpB_D2-small PF10431.16 (E=1e-27)
- (auto-curated by rules from PGAP + Pfam + Foldseek; not hand-reviewed)
Sources
- Ancestral sequence & coordinates: Harrison LB et al. (2024), An imputed ancestral reference genome for the MTBC, doi:10.1101/2023.09.07.556366
- Product annotation: NCBI PGAP on MTBC0; legacy from H37Rv NC_000962.3 (RefSeq NP_214898.1)
- Domains: Pfam-A via hmmscan --cut_ga — Clp_N (PF02861.26), NBD_SMAX1 (PF23569.2), AAA (PF00004.36), AAA_lid_9 (PF17871.8), AAA_2 (PF07724.21), Sigma54_activat (PF00158.33), AAA_5 (PF07728.21), ClpB_D2-small (PF10431.16)
- Sequence-level signal: ESM Atlas (EvolutionaryScale × BioHub) — exploratory
- Controlled vocabulary: eggNOG-mapper 2.1.12 (Cantalapiedra et al. 2021, doi:10.1093/molbev/msab293), eggNOG 5.0 DB (Huerta-Cepas et al. 2019); OG COG0542
- Curated reference: UniProt P9WPD1 (SwissProt, reviewed; Evidence at protein level)
- Intra-MTBC selection: pN/pS and disruption from SPDI variants of 145 209 MTBC strains (this work, local collection vs H37Rv NC_000962.3)
- Interaction network: STRING v12.0 (Szklarczyk et al. 2023, doi:10.1093/nar/gkac1000), taxon 83332, CC-BY 4.0; 61 functional partner(s); context anchor pyrE
- Primary literature: none located yet; annotation rests on the domain/homology sources above.
Ancestral MTBC0 protein sequence
>H37Rv|Rv0384c|clpB
MDSFNPTTKTQAALTAALQAASTAGNPEIRPAHLLMALLTQNDGIAAPLLEAVGVEPATVRAETQRLLDRLPQATGASTQPQLSRESLAAITTAQQLATELDDEYVSTEHVMVGLATGDSDVAKLLTGHGASPQALREAFVKVRGSARVTSPEPEATYQALQKYSTDLTARAREGKLDPVIGRDNEIRRVVQVLSRRTKNNPVLIGEPGVGKTAIVEGLAQRIVAGDVPESLRDKTIVALDLGSMVAGSKYRGEFEERLKAVLDDIKNSAGQIITFIDELHTIVGAGATGEGAMDAGNMIKPMLARGELRLVGATTLDEYRKHIEKDAALERRFQQVYVGEPSVEDTIGILRGLKDRYEVHHGVRITDSALVAAATLSDRYITARFLPDKAIDLVDEAASRLRMEIDSRPVEIDEVERLVRRLEIEEMALSKEEDEASAERLAKLRSELADQKEKLAELTTRWQNEKNAIEIVRDLKEQLEALRGESERAERDGDLAKAAELRYGRIPEVEKKLDAALPQAQAREQVMLKEEVGPDDIADVVSAWTGIPAGRLLEGETAKLLRMEDELGKRVIGQKAAVTAVSDAVRRSRAGVSDPNRPTGAFMFLGPTGVGKTELAKALADFLFDDERAMVRIDMSEYGEKHTVARLIGAPPGYVGYEAGGQLTEAVRRRPYTVVLFDEIEKAHPDVFDVLLQVLDEGRLTDGHGRTVDFRNTILILTSNLGSGGSAEQVLAAVRATFKPEFINRLDDVLIFEGLNPEELVRIVDIQLAQLGKRLAQRRLQLQVSLPAKRWLAQRGFDPVYGARPLRRLVQQAIGDQLAKMLLAGQVHDGDTVPVNVSPDADSLILG