Rv2164c · Still unknown · low · auto-curated

H37Rv Rv2164c · MTBC0 mtbc0_002300 · 384 aa · 2454449–2455603 (-) · RefSeq NP_216680.1

Annotation: from legacy to revised

Legacy (H37Rv / Mycobrowser): hypothetical protein
MTBC0 PGAP re-annotation: hypothetical protein
Revised (this work): Conserved hypothetical protein; no recognised domain. Function unknown. Foldseek best (non-significant) hit: 8j07-assembly1_c2, 96nm repeat of human respiratory doublet microtubule (prob 0.04, TM 0.14).

Auto-curated: this verdict and function were generated by rules from PGAP + Pfam + Foldseek and have not been hand-reviewed.

Curated reference (UniProt)

UniProt O06213 TrEMBL · unreviewed · Evidence at protein level
UniProt name: Probable conserved proline rich membrane protein

Functional vocabulary (eggNOG-mapper, orthology transfer)

COG category: D (Cell cycle control, cell division, chromosome partitioning)
eggNOG description: Essential cell division protein. May link together the upstream cell division proteins, which are predominantly cytoplasmic, with the downstream cell division proteins, which are predominantly periplasmic.
Orthologous group: COG2919

Orthology-based transfer (eggNOG 5.0.2, diamond). EC/KO/GO/CAZy are computed annotations, not manual curation; cross-check against the primary literature before treating a specific reaction as established.

Conservation & selection (intra-MTBC, 145 209 strains)

pN/pS 0.953 · relaxed/neutral
Polymorphic sites (≥ 0.1% of strains) 5 synonymous, 12 missense, 0 nonsense, 0 frameshift

pN/pS is computed from segregating SNPs (singletons removed), normalised by the number of possible sites. A low pN/pS indicates purifying selection (a strong signal that a "hypothetical" is a real, constrained gene). A high pN/pS is ambiguous: both relaxed constraint and positive selection (drug resistance, antigenic variation) inflate it; e.g. rpoB/katG/pncA score high here because of resistance, not loss of function. A clonal disruption (one allele over a clade) suggests lineage pseudogenisation; a convergent one (many independent alleles) is typical of resistance loss-of-function.
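The normalisation described above can be sketched as follows. This is a minimal illustration, not the pipeline used for this page: the site-counting scheme (each codon position contributes the fraction of its three possible substitutions that are synonymous, with changes to a stop codon counted as nonsynonymous) is an assumption, and the function names `possible_sites` and `pn_ps` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def possible_sites(cds):
    """Return (nonsynonymous_sites, synonymous_sites) for an in-frame CDS.

    Assumption: a position's synonymous share is the fraction of its three
    single-base substitutions that keep the amino acid unchanged.
    Stop codons and codons with ambiguous bases are skipped.
    """
    n_sites = 0.0
    s_sites = 0.0
    usable = len(cds) - len(cds) % 3
    for i in range(0, usable, 3):
        codon = cds[i:i + 3].upper()
        aa = CODON_TABLE.get(codon)
        if aa is None or aa == "*":
            continue
        for pos in range(3):
            syn = 0
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == aa:
                    syn += 1
            s_sites += syn / 3
            n_sites += 1 - syn / 3
    return n_sites, s_sites


def pn_ps(n_missense, n_synonymous, n_sites, s_sites):
    """pN/pS = (missense SNPs / nonsyn sites) / (synonymous SNPs / syn sites).

    Returns None when the ratio is undefined (no synonymous SNPs or sites).
    """
    if n_sites == 0 or s_sites == 0 or n_synonymous == 0:
        return None
    return (n_missense / n_sites) / (n_synonymous / s_sites)
```

To apply it to this entry, the SNP counts would be those that survive singleton removal and the site counts would come from the Rv2164c coding sequence, which is not reproduced on this page.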

Domains (Pfam, hmmscan --cut_ga)

No Pfam-A domain above the gathering threshold (or not yet scanned).

Structural neighbours (Foldseek on the ESMFold model, exploratory)

ESMFold model confidence: mean pLDDT 70.3 (confident). A confident model makes the fold comparison meaningful.

Best matches against the PDB, ranked by Foldseek homology probability. A high probability / TM-score suggests a shared fold; unless flagged sig (E < 0.01) these are fold hypotheses, not assignments.

Target | Prob | TM | E-value | Description
8j07-assembly1_c2 | 0.04 | 0.14 | 2.1e+00 | 8j07-assembly1_c2 96nm repeat of human respiratory doublet microtubule and associated axonemal complexes
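The flagging rule stated above (a hit is "sig" only when E < 0.01, otherwise it remains a fold hypothesis) can be written down in a few lines. This is a sketch of that rule only; the `FoldseekHit` record and `classify` function are hypothetical names, and no Foldseek API is being called.

```python
from dataclasses import dataclass


@dataclass
class FoldseekHit:
    """One row of the structural-neighbour table (hypothetical record type)."""
    target: str
    prob: float
    tm: float
    evalue: float


def classify(hit, e_cutoff=0.01):
    """Label a hit 'sig' when its E-value is below the cutoff,
    otherwise 'fold hypothesis', following the rule in the text."""
    return "sig" if hit.evalue < e_cutoff else "fold hypothesis"


# The single hit reported for this entry.
best = FoldseekHit("8j07-assembly1_c2", prob=0.04, tm=0.14, evalue=2.1)
label = classify(best)
```

For the hit listed here, E = 2.1 is well above the cutoff, so it stays a fold hypothesis, in line with the "non-significant" wording of the revised annotation.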

Functional interaction network (STRING v12, guilt-by-association)

Closest characterised functional partner: pbpB (penicillin-binding membrane protein PbpB), high confidence without text-mining (score 979), supported by conserved genomic neighborhood (881) and coexpression (806). This association is the citable seed of a function hypothesis for this hypothetical protein.

Partner | Product | Score | No text-mining | Channels (≥400)
Rv2163c pbpB | penicillin-binding membrane protein PbpB | 988 | 979 | ctx neighborhood:881 coexpression:806 textmining:475
Rv2151c ftsQ | cell division protein FtsQ | 931 | 922 | ctx neighborhood:544 experimental:829
Rv2165c rsmH | rRNA small subunit methyltransferase H | 969 | 910 | ctx neighborhood:882 textmining:673
Rv2166c mraZ | transcriptional regulator MraZ | 978 | 837 | ctx neighborhood:816 textmining:876
Rv0538 | membrane protein | 772 | 773 | ctx cooccurence:767
Rv3909 hyp | hypothetical protein | 771 | 772 | ctx cooccurence:769
Rv3835 hyp | hypothetical protein | 923 | 770 | ctx cooccurence:766 textmining:680
Rv2709 | transmembrane protein | 770 | 770 | ctx cooccurence:769
Rv3658c | transmembrane protein | 748 | 749 | ctx cooccurence:748
Rv3903c cpnT hyp | hypothetical protein | 747 | 748 | ctx cooccurence:746
Rv2843 hyp | hypothetical protein | 746 | 747 | ctx cooccurence:743
Rv2939 papA5 | phthiocerol/phthiodiolone dimycocerosyl transferase | 742 | 743 | ctx cooccurence:742
Rv0007 | membrane protein | 729 | 729 | ctx cooccurence:723
Rv1111c hyp | hypothetical protein | 724 | 725 | ctx cooccurence:722
Rv0339c iniR | transcriptional regulator | 723 | 723 | ctx cooccurence:719

STRING combines evidence channels (neighborhood, fusion, cooccurrence, coexpression, experimental, database, text-mining) into a 0–1000 score. The ctx badge marks edges carried by the genomic-context channels (conserved neighborhood, fusion, phylogenetic co-occurrence), which are independent of orthology and structure and the strongest signal for an unknown gene. The no text-mining column recomputes the score from data alone, so a link that does not depend on the literature is visible. Association is a function hypothesis, not proof: corroborate with the operon context and the primary literature before assigning a function.
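How channel scores merge into one combined score, and how a "no text-mining" score can be recomputed, can be sketched as below. The code assumes the commonly described STRING scheme: remove a prior from each channel, combine the channels as independent probabilities, then add the prior back. The prior value 0.041 and the function name `combine` are assumptions made for this illustration, and STRING's homology correction for the co-occurrence and text-mining channels is deliberately left out.

```python
PRIOR = 0.041  # assumed prior probability of a random association


def combine(channel_scores, prior=PRIOR):
    """Combine per-channel scores (0-1000) into one 0-1000 score.

    Each channel has the prior removed, the channels are merged as
    independent evidence (1 - product of complements), and the prior
    is added back once at the end.
    """
    remaining = 1.0
    for score in channel_scores:
        p = score / 1000.0
        p_no_prior = max(p - prior, 0.0) / (1.0 - prior)
        remaining *= 1.0 - p_no_prior
    combined = (1.0 - remaining) * (1.0 - prior) + prior
    return round(combined * 1000)


# Channels listed for pbpB in the table above (only those >= 400 are shown).
pbpb = {"neighborhood": 881, "coexpression": 806, "textmining": 475}

with_text = combine(pbpb.values())
without_text = combine(v for k, v in pbpb.items() if k != "textmining")
```

Under these assumptions the pbpB channels give 987 with text-mining and 976 without, close to the tabulated 988 and 979. The small gap is expected because channels scoring below 400 are not displayed and so cannot be included here.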

Evidence

  • Legacy H37Rv annotation: hypothetical protein
  • MTBC0 PGAP product: hypothetical protein
  • Foldseek best: 8j07-assembly1_c2 96nm repeat of human respiratory doublet microtubule and associated axonemal complexes (prob 0.04, E=2e+00, TM=0.14)
  • (auto-curated by rules from PGAP + Pfam + Foldseek; not hand-reviewed)

Sources

  • Ancestral sequence & coordinates: Harrison LB et al. (2024), An imputed ancestral reference genome for the MTBC, doi:10.1101/2023.09.07.556366
  • Product annotation: NCBI PGAP on MTBC0; legacy from H37Rv NC_000962.3 (RefSeq NP_216680.1)
  • Domains: Pfam-A via hmmscan --cut_ga — none above threshold
  • Sequence-level signal: ESM Atlas (EvolutionaryScale × BioHub) — exploratory
  • Controlled vocabulary: eggNOG-mapper 2.1.12 (Cantalapiedra et al. 2021, doi:10.1093/molbev/msab293), eggNOG 5.0 DB (Huerta-Cepas et al. 2019) — OG COG2919
  • Curated reference: UniProt O06213 (TrEMBL, unreviewed; Evidence at protein level)
  • Intra-MTBC selection: pN/pS and disruption from SPDI variants of 145 209 MTBC strains (this work, local collection vs H37Rv NC_000962.3)
  • Model confidence: ESMFold per-residue pLDDT (mean 70.3, confident)
  • Interaction network: STRING v12.0 (Szklarczyk et al. 2023, doi:10.1093/nar/gkac1000), taxon 83332, CC-BY 4.0 — 100 functional partner(s); context anchor pbpB
  • Primary literature: none located yet; annotation rests on the domain/homology sources above.

Ancestral MTBC0 protein sequence

>mtbc0_002300|Rv2164c|
MRAKREAPKSRSSDRRRRADSPAAATRRTTTNSAPSRRIRSRAGKTSAPGRQARVSRPGPQTSPMLSPFDRPAPAKNTSQAKARAKARKAKAPKLVRPTPMERLAARLTSIDLRPRTLANKVPFVVLVIGSLGVGLGLTLWLSTDAAERSYQLSNARERTRMLQQHKEALERDVREAASAPALAEAARRQGMIPTRDTAHLVQDPDGNWVVVGTPKPADGVPPPPLNTKLPEDPPPPPKPAAVPLEVPVRVTPGPDDPAPPARSGPEVLVRTPDGTATLGGATHLPTQAGPQLPGPVPIPGAPGPMPAPPLGAVPSPAPAENPVPLQVGAAPPAGLPGPAPVAATPGLSGGSQPMVAPPAPVPANGEQFGPVTAPVPTAPGAPR
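A quick consistency check between the header coordinates, the stated protein length and the sequence record can be sketched as follows. It assumes the coordinates are 1-based, inclusive and include the stop codon; all function names are hypothetical. The proline fraction is included only because the UniProt name describes the protein as proline rich.

```python
def protein_length_from_coords(start, end):
    """Expected protein length (aa) for a CDS given 1-based inclusive
    coordinates, assuming the span includes the stop codon."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError("coordinate span is not a multiple of 3")
    return span // 3 - 1


def read_fasta(text):
    """Parse a single-record FASTA string into (header, sequence)."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
        else:
            chunks.append(line)
    return header, "".join(chunks)


def residue_fraction(seq, residue="P"):
    """Fraction of the sequence made up of one residue (default: proline)."""
    return seq.count(residue) / len(seq) if seq else 0.0


# Coordinates from the header line of this entry.
expected_aa = protein_length_from_coords(2454449, 2455603)
```

The coordinates span 1155 bp, i.e. 385 codons, which gives 384 aa once the stop codon is subtracted, in agreement with the length in the header. Passing the record above through `read_fasta` and comparing `len(seq)` with `expected_aa` would complete the check.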