Rv0250c Family assigned · low

H37Rv Rv0250c · MTBC0 mtbc0_000266 · 97 aa · 302117–302410 (-) · RefSeq NP_214764.1

Annotation: from legacy to revised

Legacy (H37Rv / Mycobrowser): hypothetical protein
MTBC0 PGAP re-annotation: hypothetical protein
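Revised (this work): No Pfam domain above threshold. The top Foldseek hit is a sensor / HD-domain fold (CpxA-like, prob 0.95, TM=0.88), but its E-value (0.96) is not significant and the ESMFold model is low-confidence (mean pLDDT 61.1). The protein is therefore annotated as a putative signal-transduction / sensor-associated module. Structure-based; a fold hypothesis, not an assignment.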

Curated reference (UniProt)

UniProt O53672 SwissProt · reviewed · Evidence at protein level
UniProt name: Uncharacterized protein Rv0250c

UniProt still lists this protein as Uncharacterized protein Rv0250c; the revised annotation above is ahead of the current UniProt record.

Functional vocabulary (eggNOG-mapper, orthology transfer)

Orthologous group: 2DP0M

Orthology-based transfer (eggNOG 5.0.2, diamond). EC/KO/GO/CAZy are computed annotations, not manual curation; cross-check against the primary literature before treating a specific reaction as established.

Conservation & selection (intra-MTBC, 145 209 strains)

pN/pS n/a
Polymorphic sites (≥ 0.1% of strains) 0 synonymous, 2 missense, 0 nonsense, 0 frameshift

pN/pS is computed from segregating SNPs (singletons removed), normalised by the number of possible sites. A low pN/pS indicates purifying selection (a strong signal that a "hypothetical" is a real, constrained gene). A high pN/pS is ambiguous, because both relaxed constraint and positive selection (drug resistance, antigenic variation) inflate it; for example, rpoB, katG and pncA score high here because of resistance, not loss of function. A clonal disruption (one allele shared across a clade) suggests lineage pseudogenisation; a convergent one (many independent alleles) is typical of resistance-associated loss of function.
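The normalisation described above can be sketched as follows. This is a minimal illustration, not the pipeline used for this page: the function name `pn_ps` and the site counts in the usage lines are hypothetical placeholders (they are not computed from the Rv0250c coding sequence), and the polymorphic-site counts reported at the 0.1% threshold are not necessarily the same SNP set that enters the ratio. The sketch does show one property worth keeping in mind: with no synonymous SNP the ratio is undefined, which is consistent with the "n/a" shown for this gene, although the page does not state the reason.

```python
from typing import Optional


def pn_ps(n_nonsyn: int, n_syn: int,
          nonsyn_sites: float, syn_sites: float) -> Optional[float]:
    """Ratio of per-site nonsynonymous to per-site synonymous polymorphism.

    Returns None when no synonymous SNP is observed, because the ratio
    is then undefined (division by zero).
    """
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("possible-site counts must be positive")
    pn = n_nonsyn / nonsyn_sites   # nonsynonymous SNPs per possible nonsynonymous site
    ps = n_syn / syn_sites         # synonymous SNPs per possible synonymous site
    if ps == 0:
        return None
    return pn / ps


# Hypothetical placeholder site counts, for illustration only.
print(pn_ps(n_nonsyn=2, n_syn=0, nonsyn_sites=220.0, syn_sites=71.0))
print(pn_ps(n_nonsyn=2, n_syn=1, nonsyn_sites=200.0, syn_sites=100.0))
```

Returning `None` instead of infinity keeps "undefined" distinct from "very high", which matters because a high pN/pS carries its own interpretation.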

Domains (Pfam, hmmscan --cut_ga)

No Pfam-A domain above the gathering threshold.

Structural neighbours (Foldseek on the ESMFold model, exploratory)

ESMFold model confidence: mean pLDDT 61.1 (low). Low-confidence model: the fold may be unreliable, so treat these structural hits with caution.

Best matches against the PDB, ranked by Foldseek homology probability. A high probability / TM-score suggests a shared fold; unless flagged sig (E < 0.01) these are fold hypotheses, not assignments.

Target Prob TM E-value Description
4biy-assembly1_B 0.95 0.88 9.6e-01 4biy-assembly1_B Crystal structure of CpxAHDC (monoclinic form 2)
4biy-assembly2_D 0.89 0.71 6.9e-01 4biy-assembly2_D Crystal structure of CpxAHDC (monoclinic form 2)
6ahx-assembly1_A-2 0.75 0.81 2.0e+00 6ahx-assembly1_A-2 Copper-Sensing Operon Regulator Protein (CsoRGz)
7ls2-assembly1_b2 0.57 0.58 1.3e+00 7ls2-assembly1_b2 80S ribosome from mouse bound to eEF2 (Class I)
6z6l-assembly1_Lh 0.54 0.58 1.3e+00 6z6l-assembly1_Lh Cryo-EM structure of human CCDC124 bound to 80S ribosomes
8ir1-assembly1_H 0.51 0.57 1.3e+00 8ir1-assembly1_H human nuclear pre-60S ribosomal particle - State A
8ink-assembly1_H 0.38 0.58 2.1e+00 8ink-assembly1_H human nuclear pre-60S ribosomal particle - State D
7nfx-assembly1_h 0.28 0.59 3.4e+00 7nfx-assembly1_h Mammalian ribosome nascent chain complex with SRP and SRP receptor in early state A
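The rule stated above the table (a hit counts as "sig" only when E < 0.01) can be applied mechanically. The sketch below copies the probability, TM-score and E-value columns from the table; the names `HITS`, `SIG_E_CUTOFF` and `classify` are introduced here for illustration and are not part of Foldseek.

```python
# (target, probability, TM-score, E-value) copied from the table above.
HITS = [
    ("4biy-assembly1_B",   0.95, 0.88, 9.6e-01),
    ("4biy-assembly2_D",   0.89, 0.71, 6.9e-01),
    ("6ahx-assembly1_A-2", 0.75, 0.81, 2.0e+00),
    ("7ls2-assembly1_b2",  0.57, 0.58, 1.3e+00),
    ("6z6l-assembly1_Lh",  0.54, 0.58, 1.3e+00),
    ("8ir1-assembly1_H",   0.51, 0.57, 1.3e+00),
    ("8ink-assembly1_H",   0.38, 0.58, 2.1e+00),
    ("7nfx-assembly1_h",   0.28, 0.59, 3.4e+00),
]

SIG_E_CUTOFF = 0.01  # significance threshold quoted in the text


def classify(hits, e_cutoff=SIG_E_CUTOFF):
    """Label each hit 'sig' if its E-value is below the cutoff, else 'fold hypothesis'."""
    return [
        (target, "sig" if evalue < e_cutoff else "fold hypothesis")
        for target, _prob, _tm, evalue in hits
    ]


labels = classify(HITS)
n_sig = sum(1 for _target, label in labels if label == "sig")
print(f"{n_sig} of {len(labels)} hits are significant")
```

Under this rule none of the listed hits is flagged as significant, including the top CpxA-like match, so the high probability and TM-score of that hit should be read as a fold hypothesis.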

Functional interaction network (STRING v12, guilt-by-association)

Closest characterised functional partner: Rv0249c (succinate dehydrogenase membrane anchor subunit), high confidence from genomic context alone (score 879 excluding text-mining). This association is the citable seed of a function hypothesis for this hypothetical protein.

Partner Product Score No text-mining Channels (≥400)
Rv0249c succinate dehydrogenase membrane anchor subunit 940 879 ctx neighborhood:787 cooccurence:451 textmining:527
Rv0248c succinate dehydrogenase flavoprotein subunit 911 819 ctx neighborhood:734 textmining:531
Rv0247c succinate dehydrogenase iron-sulfur subunit 928 812 ctx neighborhood:735 textmining:637
Rv0431 tuberculin-like peptide 737 738 ctx cooccurence:736
Rv3415c hyp hypothetical protein 731 731 ctx cooccurence:730
Rv2138 lppL lipoprotein LppL 721 722 ctx cooccurence:721
Rv0882 transmembrane protein 716 717 ctx cooccurence:715
Rv1632c hyp hypothetical protein 704 705 ctx cooccurence:701
Rv2743c hyp hypothetical protein 704 704 ctx cooccurence:700
Rv0556 transmembrane protein 655 655 ctx cooccurence:653
Rv2732c transmembrane protein 640 640 ctx cooccurence:640
Rv2360c hyp hypothetical protein 627 627 ctx cooccurence:625
Rv1109c hyp hypothetical protein 626 626 ctx cooccurence:621
Rv1081c membrane protein 624 624 ctx cooccurence:622
Rv3212 hyp hypothetical protein 623 623 ctx cooccurence:621

STRING combines evidence channels (neighborhood, fusion, cooccurrence, coexpression, experimental, database, text-mining) into a 0–1000 score. The ctx badge marks edges carried by the genomic-context channels (conserved neighborhood, fusion, phylogenetic co-occurrence), which are independent of orthology and structure and the strongest signal for an unknown gene. The no text-mining column recomputes the score from data alone, so a link that does not depend on the literature is visible. Association is a function hypothesis, not proof: corroborate with the operon context and the primary literature before assigning a function.
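How channel scores combine, and what the "no text-mining" column recomputes, can be sketched for the Rv0249c row. This assumes the combination scheme STRING describes for its combined score (remove a prior from each channel, combine as independent evidence, add the prior back once) with a prior of 0.041; both the scheme as written here and the prior value are assumptions brought in from outside this page. The sketch ignores STRING's homology correction and any channel scoring below 400 (which the table does not show), so results should only be close to the tabulated 940 and 879, not identical.

```python
from math import prod

PRIOR = 0.041  # assumed STRING prior for a random association


def combine(channels, prior=PRIOR):
    """Combine per-channel scores (0-1000) into one 0-1000 score.

    Each channel has the prior removed, channels are combined as
    independent evidence, and the prior is added back once.
    """
    corrected = [
        max((score / 1000.0 - prior) / (1.0 - prior), 0.0)
        for score in channels.values()
    ]
    combined = 1.0 - prod(1.0 - s for s in corrected)
    return 1000.0 * (combined + prior * (1.0 - combined))


# Channel scores for Rv0249c as listed in the table (channels >= 400 only).
rv0249c = {"neighborhood": 787, "cooccurence": 451, "textmining": 527}

full = combine(rv0249c)
data_only = combine({k: v for k, v in rv0249c.items() if k != "textmining"})
print(round(full), round(data_only))
```

Dropping the `textmining` key reproduces the idea behind the "no text-mining" column: the link to Rv0249c stays in the high-confidence range on genomic-context evidence alone.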

Evidence

  • MTBC0 PGAP product: 'hypothetical protein'
  • Pfam: none above threshold
  • Foldseek on the ESMFold model: strong match to a CpxA-like sensor / HD-domain fold (TM=0.88)

ESM Atlas signal (exploratory)

Ancestral protein hash ca27799c8163182369fb3521d1974fcd · 10 ESM-space neighbours (max similarity 0.957). SAE features are orienting indices, not validated domains.

# Index Activation Interpretation
1 1448 0.54 Chromatin DNA-binding and scaffold domains
2 1101 0.49 Acidic N-terminal disordered region
3 16 0.46 Modified peptide core detector
4 8102 0.45 Membrane-proximal PTM-rich IDRs
5 8724 0.45 Amphipathic helical packing interfaces
6 1688 0.43 Acidic low-complexity interaction segments
7 10164 0.42 N-terminal helical assembly modules
8 6600 0.42 Disordered charged terminal tails

Sources

  • Ancestral sequence & coordinates: Harrison LB et al. (2024), An imputed ancestral reference genome for the MTBC, doi:10.1101/2023.09.07.556366
  • Product annotation: NCBI PGAP on MTBC0; legacy from H37Rv NC_000962.3 (RefSeq NP_214764.1)
  • Domains: Pfam-A via hmmscan --cut_ga — none above threshold
  • Sequence-level signal: ESM Atlas (EvolutionaryScale × BioHub) — exploratory
  • Controlled vocabulary: eggNOG-mapper 2.1.12 (Cantalapiedra et al. 2021, doi:10.1093/molbev/msab293), eggNOG 5.0 DB (Huerta-Cepas et al. 2019) — OG 2DP0M
  • Curated reference: UniProt O53672 (SwissProt, reviewed; Evidence at protein level)
  • Intra-MTBC selection: pN/pS and disruption from SPDI variants of 145 209 MTBC strains (this work, local collection vs H37Rv NC_000962.3)
  • Model confidence: ESMFold per-residue pLDDT (mean 61.1, low)
  • Interaction network: STRING v12.0 (Szklarczyk et al. 2023, doi:10.1093/nar/gkac1000), taxon 83332, CC-BY 4.0 — 67 functional partner(s); context anchor Rv0249c
  • Primary literature: none located yet; annotation rests on the domain/homology sources above.

Ancestral MTBC0 protein sequence

>mtbc0_000266|Rv0250c|
MSTTAELAELHDLVGGLRRCVTALKARFGDNPATRRIVIDADRILTDIELLDTDVSELDLERAAVPQPSEKIAIPDTEYDREFWRDVDDEGVGGHRY
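
As a quick consistency check, the sequence length can be compared with the length stated in the header (97 aa) and with the gene coordinates (302117–302410). The sketch assumes the coordinates are 1-based and inclusive and that the span includes the stop codon; neither is stated on this page. It also assumes that comparing the ancestral MTBC0 sequence against these coordinates is meaningful, that is, that the ancestral and H37Rv genes have the same length. The helper name `codons_in_span` is hypothetical.

```python
SEQ = (
    "MSTTAELAELHDLVGGLRRCVTALKARFGDNPATRRIVIDADRILTDIELLDTDVSELDLERAAV"
    "PQPSEKIAIPDTEYDREFWRDVDDEGVGGHRY"
)
START, END = 302117, 302410  # coordinates from the header line


def codons_in_span(start: int, end: int) -> int:
    """Number of codons in a 1-based, inclusive coordinate span."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"span of {span} bp is not a multiple of 3")
    return span // 3


codons = codons_in_span(START, END)
# One codon is assumed to be the stop codon, which is not translated.
consistent = (codons - 1) == len(SEQ)
print(len(SEQ), codons, consistent)
```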