lepB Resolved · high auto-curated

H37Rv Rv2903c · MTBC0 mtbc0_003085 · 294 aa · 3233799–3234683 (-) · RefSeq NP_217419.1

Annotation: from legacy to revised

Legacy (H37Rv / Mycobrowser): signal peptidase
MTBC0 PGAP re-annotation: signal peptidase I
Revised (this work): Signal peptidase I. Pfam: Peptidase_S26 (PF10502.15).

Auto-curated: this verdict and function were generated by rules from PGAP + Pfam + Foldseek and have not been hand-reviewed.

Curated reference (UniProt)

UniProt: P9WKA1 · SwissProt · reviewed · Evidence at protein level
UniProt name: Signal peptidase I
EC (curated): EC 3.4.21.89

Functional vocabulary (eggNOG-mapper, orthology transfer)

COG category: U (Intracellular trafficking, secretion and vesicular transport)
Preferred name: lepB
eggNOG description: Belongs to the peptidase S26 family
Orthologous group: COG0681
EC number: EC 3.4.21.89
KEGG orthology: K03100
KEGG pathways: map02024, map03060
Gene Ontology (15): GO:0005575, GO:0005576, GO:0005623, GO:0005886, GO:0005887, GO:0008150, GO:0016020, GO:0016021, GO:0031224, GO:0031226, GO:0040007, GO:0044425 +3 more

Orthology-based transfer (eggNOG 5.0.2, diamond). EC/KO/GO/CAZy are computed annotations, not manual curation; cross-check against the primary literature before treating a specific reaction as established.

Conservation & selection (intra-MTBC, 145 209 strains)

pN/pS 0.115 · strong purifying
Polymorphic sites (≥ 0.1% of strains) 3 synonymous, 1 missense, 0 nonsense, 0 frameshift

pN/pS is computed from segregating SNPs (singletons removed), with nonsynonymous and synonymous counts each normalised by the number of possible sites. A low pN/pS indicates purifying selection (a strong signal that a "hypothetical" is a real, constrained gene). A high pN/pS is ambiguous: either relaxed constraint or positive selection (drug resistance, antigenic variation) can inflate it; for example, rpoB, katG and pncA score high here because of resistance, not loss of function. A clonal disruption (one allele across a clade) suggests lineage pseudogenisation; a convergent one (many independent alleles) is typical of resistance-conferring loss of function.
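The normalisation described above can be sketched as follows. This is a minimal illustration, not the pipeline used for this page: it assumes a Nei-Gojobori-style count of possible sites, counts changes to or from stop codons as nonsynonymous, and takes SNPs as hypothetical (effect, allele_count) pairs. The function names and the input layout are assumptions made for the example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in NCBI order (first base varies slowest).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def possible_sites(cds):
    """Return (nonsynonymous_sites, synonymous_sites) for an in-frame CDS.

    Assumes only A/C/G/T; an ambiguous base would raise KeyError.
    """
    syn_changes = 0
    n_codons = 0
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3].upper()
        n_codons += 1
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                    syn_changes += 1
    # Each position has 3 possible changes, so 3 synonymous changes = 1 site.
    syn_sites = syn_changes / 3
    return 3 * n_codons - syn_sites, syn_sites


def pn_ps(snps, cds):
    """Compute pN/pS from (effect, allele_count) pairs.

    effect is "syn" or "nonsyn"; SNPs seen in a single strain are dropped.
    Returns None when the ratio is undefined (no synonymous SNPs or sites).
    """
    kept = [effect for effect, count in snps if count > 1]  # remove singletons
    n_obs = kept.count("nonsyn")
    s_obs = kept.count("syn")
    n_sites, s_sites = possible_sites(cds)
    if s_obs == 0 or s_sites == 0:
        return None
    return (n_obs / n_sites) / (s_obs / s_sites)
```

On a toy two-codon sequence, `pn_ps([("nonsyn", 5), ("syn", 7), ("nonsyn", 1)], "ATGGCC")` drops the singleton and returns 0.2: one nonsynonymous SNP over 5 nonsynonymous sites, divided by one synonymous SNP over 1 synonymous site.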

Domains (Pfam, hmmscan --cut_ga)

Pfam | Accession | i-Evalue | Residues | Description
Peptidase_S26 | PF10502.15 | 4.4e-44 | 66–277 | Signal peptidase, peptidase S26

Functional interaction network (STRING v12, guilt-by-association)

Closest characterised functional partner: rnhB (ribonuclease HII), high confidence (score 975, unchanged when text-mining is excluded), carried by conserved neighborhood (872) and coexpression (813).

Partner | Product | Score | No text-mining | Channels (≥400)
Rv2902c rnhB | ribonuclease HII | 975 | 975 | ctx neighborhood:872 coexpression:813
Rv2904c rplS | 50S ribosomal protein L19 | 846 | 847 | ctx neighborhood:805
Rv2901c hyp | hypothetical protein | 812 | 812 | ctx neighborhood:802
Rv1305 atpE | ATP synthase subunit C | 703 | 670 | database:647
Rv0722 rpmD | 50S ribosomal protein L30 | 676 | 663 | database:588
Rv0938 ligD | multifunctional non-homologous end joining DNA repair protein/ATP dependent DNA ligase LigD | 664 | 648 | database:626
Rv3062 ligB | DNA ligase | 656 | 640 | database:626
Rv3731 ligC | DNA ligase C | 654 | 638 | database:626
Rv2404c lepA | GTP-binding protein LepA | 843 | 634 | coexpression:557 textmining:590
Rv2869c rip | zinc metalloprotease | 610 | 574 | coexpression:412
Rv2992c gltS | glutamate--tRNA ligase | 557 | 535 | (none)
Rv3921c yidC | membrane protein insertase YidC | 639 | 529 | coexpression:408
Rv2733c miaB | (dimethylallyl)adenosine tRNA methylthiotransferase | 518 | 518 | coexpression:418
Rv1385 pyrF | orotidine 5'-phosphate decarboxylase | 517 | 517 | database:501
Rv2444c rne | ribonuclease E | 516 | 517 | coexpression:439

STRING combines evidence channels (neighborhood, fusion, cooccurrence, coexpression, experimental, database, text-mining) into a 0–1000 score. The ctx badge marks edges carried by the genomic-context channels (conserved neighborhood, fusion, phylogenetic co-occurrence), which are independent of orthology and structure and are the strongest signal for an unknown gene. The no text-mining column recomputes the score without the text-mining channel, so links that do not depend on the literature stand out. Association is a function hypothesis, not proof: corroborate with the operon context and the primary literature before assigning a function.
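How per-channel scores fold into a combined score can be sketched as below. The sketch assumes the combination scheme described in the STRING documentation: remove a prior from each channel, combine the channels as independent evidence, then add the prior back once. The prior of 0.041, the function names, and the omission of the homology correction that STRING applies to some channels are assumptions of this example. Channels scoring below 400 are not listed on this page, so most rows cannot be reproduced from the listed values alone.

```python
PRIOR = 0.041  # assumed prior probability of a random association


def combine_channels(scores, prior=PRIOR):
    """Combine per-channel STRING scores (0-1000) into one 0-1000 score."""
    no_evidence = 1.0
    for s in scores:
        p = max(s / 1000.0, prior)        # a channel never falls below the prior
        p = (p - prior) / (1.0 - prior)   # remove the prior from this channel
        no_evidence *= 1.0 - p            # treat channels as independent
    combined = 1.0 - no_evidence
    combined += prior * (1.0 - combined)  # add the prior back once
    return round(combined * 1000)


def without_textmining(channels):
    """Recombine a {channel_name: score} dict, ignoring the text-mining channel."""
    return combine_channels(
        score for name, score in channels.items() if name != "textmining"
    )
```

With the two channels listed for rnhB (neighborhood 872, coexpression 813), `combine_channels([872, 813])` gives 975, which agrees with the tabulated score for that row.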

Evidence

  • Legacy H37Rv annotation: signal peptidase
  • MTBC0 PGAP product: signal peptidase I
  • Pfam (hmmscan --cut_ga): Peptidase_S26 PF10502.15 (i-Evalue 4.4e-44)
  • (auto-curated by rules from PGAP + Pfam + Foldseek; not hand-reviewed)

Sources

  • Ancestral sequence & coordinates: Harrison LB et al. (2024), An imputed ancestral reference genome for the MTBC, doi:10.1101/2023.09.07.556366
  • Product annotation: NCBI PGAP on MTBC0; legacy from H37Rv NC_000962.3 (RefSeq NP_217419.1)
  • Domains: Pfam-A via hmmscan --cut_ga — Peptidase_S26 (PF10502.15)
  • Sequence-level signal: ESM Atlas (EvolutionaryScale × BioHub) — exploratory
  • Controlled vocabulary: eggNOG-mapper 2.1.12 (Cantalapiedra et al. 2021, doi:10.1093/molbev/msab293), eggNOG 5.0 DB (Huerta-Cepas et al. 2019) — OG COG0681
  • Curated reference: UniProt P9WKA1 (SwissProt, reviewed; Evidence at protein level)
  • Intra-MTBC selection: pN/pS and disruption from SPDI variants of 145 209 MTBC strains (this work, local collection vs H37Rv NC_000962.3)
  • Interaction network: STRING v12.0 (Szklarczyk et al. 2023, doi:10.1093/nar/gkac1000), taxon 83332, CC-BY 4.0 — 95 functional partner(s); context anchor rnhB
  • Primary literature: none located yet; annotation rests on the domain/homology sources above.

Ancestral MTBC0 protein sequence

>mtbc0_003085|Rv2903c|lepB
MTETTDSPSERQPGPAEPELSSRDPDIAGQVFDAAPFDAAPDADSEGDSKAAKTDEPRPAKRSTLREFAVLAVIAVVLYYVMLTFVARPYLIPSESMEPTLHGCSTCVGDRIMVDKLSYRFGSPQPGDVIVFRGPPSWNVGYKSIRSHNVAVRWVQNALSFIGFVPPDENDLVKRVIAVGGQTVQCRSDTGLTVNGRPLKEPYLDPATMMADPSIYPCLGSEFGPVTVPPGRVWVMGDNRTHSADSRAHCPLLCTDDPLPGTVPVANVIGKARLIVWPPSRWGVVRSVNPQQGR