DEV Community

# ProtBFN: Bayesian Foundation Model for Protein Sequence Design

A recent paper in Nature Communications introduces ProtBFN, a 650-million-parameter foundation model for protein sequence design. ProtBFN uses Bayesian Flow Networks to generate diverse, structurally coherent sequences without relying on explicit structural data. It supports both unconditional and conditional generation, is reported to outperform leading autoregressive and diffusion models, and produces sequences whose lengths and amino acid frequencies match those of natural proteins.

A fine-tuned variant, AbBFN, targets antibody heavy chains and is demonstrated on the Observed Antibody Space (OAS). ProtBFN also supports zero-shot design: it produces valid proteins without retraining, which makes it applicable to both therapeutic and industrial enzyme design. The Bayesian flow formulation combines generative flexibility with structural consistency, two core requirements of protein engineering.

The open-source model is pip-installable, so investigators can benchmark it on their own tasks, such as stability prediction, binding design, or de novo therapeutic protein creation. Community contributions to expand the set of pretrained variants and evaluation metrics are underway. If these results hold up on such tasks, ProtBFN could become a widely used tool in protein engineering and design.
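One way to check the claim that generated sequences match natural length and amino acid distributions is to compare residue frequencies and mean lengths between a generated set and a reference set. The sketch below is a minimal illustration in plain Python and is not taken from the paper or the ProtBFN package. The helper names (`aa_frequencies`, `kl_divergence`, `mean_length`) and the toy sequences are hypothetical; in practice the generated set would come from the model and the reference set from a curated database.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues


def aa_frequencies(sequences, pseudocount=1e-6):
    """Normalized frequency of each canonical residue across all sequences.

    A small pseudocount keeps every frequency strictly positive so that
    the KL divergence below is always defined.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def kl_divergence(p, q):
    """KL(p || q) in nats between two residue-frequency dictionaries."""
    return sum(p[aa] * math.log(p[aa] / q[aa]) for aa in AMINO_ACIDS)


def mean_length(sequences):
    """Average sequence length."""
    return sum(len(s) for s in sequences) / len(sequences)


# Toy placeholder data (hypothetical, for illustration only).
generated = ["MKTAYIAKQR", "MKLVINGKTL", "MSDNGELEDK"]
reference = ["MKTAYIAKQRQISFVK", "MKVLAAGIVG", "MSTNPKPQRK"]

gen_freq = aa_frequencies(generated)
ref_freq = aa_frequencies(reference)

print("KL(generated || reference):", kl_divergence(gen_freq, ref_freq))
print("mean length, generated:", mean_length(generated))
print("mean length, reference:", mean_length(reference))
```

A KL divergence close to zero indicates similar residue usage. A fuller comparison would also look at the whole length histogram rather than only the mean.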