14 Foundation Models for Biology Research and Chemistry
A list of companies building foundation models focused on DNA, RNA, and proteins, as well as models for small-molecule drug discovery and organic synthesis.
Hi! I am Andrii Buvailo, and this is my weekly newsletter, ‘Where Tech Meets Bio,’ where I talk about technologies, breakthroughs, and great companies moving the biopharma industry forward.
Today is a Thursday newsletter for paid subscribers.
Foundation models represent a new paradigm in artificial intelligence (AI), changing how machine learning models are developed and deployed.
Foundation models are a class of large-scale machine learning models, typically based on deep learning architectures such as transformers, that are trained on massive datasets encompassing diverse types of data.
The term ‘foundation model’ was popularized by the Stanford Institute for Human-Centered Artificial Intelligence in 2021.
The most prominent examples of general-purpose foundation models are the GPT-3 and GPT-4 families of models, which underlie ChatGPT, and BERT, or Bidirectional Encoder Representations from Transformers. These are gigantic models trained on extensive volumes of data, often in a self-supervised or unsupervised manner (with little or no need for labeled data).
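To make "self-supervised" a bit more concrete, here is a minimal sketch of how training examples can be built from raw data with no human labels: hide part of a sequence and ask the model to predict what was hidden. This illustrates the general idea only, not how any particular model is trained. The function name, the `?` mask symbol, the 15% masking rate, and the toy DNA string are all assumptions made for the example.

```python
import random

MASK = "?"  # hypothetical mask symbol, chosen only for this illustration


def make_masked_example(sequence, mask_fraction=0.15, seed=0):
    """Hide a fraction of positions in a sequence.

    Returns the masked sequence and a dict mapping each hidden
    position to the token that was there. The dict plays the role
    of the "labels", and it is derived from the data itself.
    """
    rng = random.Random(seed)
    n_masked = max(1, round(len(sequence) * mask_fraction))
    positions = sorted(rng.sample(range(len(sequence)), n_masked))

    tokens = list(sequence)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]  # remember the true token
        tokens[pos] = MASK          # hide it from the model's input
    return "".join(tokens), targets


# Toy DNA fragment (made up for the example)
original = "ATGCGTACGTTAGC"
masked, targets = make_masked_example(original)
print(masked)
print(targets)
```

A model trained this way sees only the masked string and is scored on how well it recovers the hidden tokens, which is why unlabeled sequence data is enough for pre-training.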
Their scalability in terms of both model size and data volume enables them to capture intricate patterns and dependencies within the data.
The pre-training phase gives foundation models a broad knowledge base, making them highly effective in few-shot or zero-shot learning scenarios where minimal labeled data is available for specific tasks.
They are also highly versatile and transfer well: with additional training (fine-tuning), they can adapt to the nuances of particular tasks.
Despite some challenges with foundation models, and even controversy over whether they should be called by this term at all, it is clear that this is a drastically different approach to building and training models, one that gives them distinctive properties such as generalizability across various tasks.
Below is a list of models and the companies building domain-specific foundation models for biology research and related areas, such as chemistry.