“We note that these training runs frequently take >1 week on dedicated GPU resources (such as Polaris@ALCF). To enable training of the larger models on the full sequence length (10,240 tokens), we leveraged AI-hardware accelerators such as Cerebras CS-2, both in a stand-alone mode and as an inter-connected cluster, and obtained GenSLMs that converge in less than a day.”

Award-winning research

2022 ACM Gordon Bell Special Prize for COVID-19 Research

A team led by researchers from Argonne National Laboratory and Cerebras was recognized for developing the first genome-scale language model to study the evolutionary dynamics of SARS-CoV-2. Their work has the potential to transform how we identify and classify new and emergent variants of pandemic-causing viruses.

At Cerebras Systems, we love it when the CS-2 is vastly faster than large NVIDIA GPU clusters.

Read our blog
Read the paper on bioRxiv

Customer Case Study

GlaxoSmithKline: Epigenomic Language Models For Drug Discovery

A team of GSK researchers introduced a BERT model that learns representations based on both DNA sequence and paired epigenetic state inputs, which they named Epigenomic BERT (EBERT).

Training this complex model on a previously prohibitively large dataset was made possible for the first time by the partnership between GSK and Cerebras: the team trained the EBERT model in about 2.5 days, compared to an estimated 24 days on a 16-node GPU cluster.

Read our joint blog
Access the technical paper
Explore customer spotlight page


Kim Branson

SVP, Global Head of AI and ML
use case

Drug Discovery

Traditional laboratory-based drug screening is slow: it can take years for a compound to progress from research into trials. AI models such as Transformers, graph neural networks (GNNs), and multi-layer perceptrons (MLPs) can screen large libraries of candidate drug molecules in software, selecting only the most promising ones for subsequent trials. Screening computationally rather than in the laboratory advances the most promising candidate compounds more quickly, dramatically reducing drug development time. Deep neural networks have shown exceptional results and are now considered among the most powerful computational tools for virtual drug screening.
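The core idea behind neural-network virtual screening can be illustrated in a few lines. The sketch below is purely hypothetical: it assumes each candidate molecule has already been converted to a numeric feature vector (e.g. a fingerprint) and that a small MLP has already been trained to predict activity. The function names and toy weights are illustrative, not any Cerebras or customer API.

```python
import math

def mlp_score(features, w1, b1, w2, b2):
    """One-hidden-layer MLP: ReLU hidden layer, sigmoid output.
    Returns a probability-like activity score for one molecule."""
    hidden = [max(0.0, sum(f * w for f, w in zip(features, col)) + b)
              for col, b in zip(w1, b1)]
    logit = sum(h * w for h, w in zip(hidden, w2)) + b2
    return 1.0 / (1.0 + math.exp(-logit))

def screen(library, top_k, w1, b1, w2, b2):
    """Score every candidate in the library and keep the top_k
    most promising molecules for follow-up (e.g. lab assays)."""
    scored = [(name, mlp_score(feats, w1, b1, w2, b2))
              for name, feats in library]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy pretrained weights: 2 hidden units over 3 input features.
w1 = [[1.0, -1.0, 0.5], [0.2, 0.3, -0.4]]
b1 = [0.0, 0.1]
w2 = [1.0, -0.5]
b2 = 0.0

# Tiny "library" of molecules with precomputed binary fingerprints.
library = [("mol_a", [1, 0, 1]), ("mol_b", [0, 1, 0]), ("mol_c", [1, 1, 1])]
top = screen(library, top_k=2, w1=w1, b1=b1, w2=w2, b2=b2)
```

In practice the library holds millions of molecules and the model is far larger; the speedups quoted below come from running this scoring loop as massively parallel batched inference rather than one molecule at a time.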

In partnership with customers, the CS-2 system has demonstrated vast improvements in deep neural network-based virtual screening – in one case reducing screening time for a large library of compounds from 183 days on a GPU cluster to 3.5 days on the CS-2. That is roughly a 52x acceleration, cutting time to solution by about six months. Faster time to solution reduces time to cure.

Read Blog
use case

Text and Language Modeling

Neural networks like BERT and GPT can model semantic relationships across records, reports, and scientific literature, letting you answer questions instantly from that body of knowledge.

Today, the compute resources and expertise needed to efficiently work with large language models – such as BERT and GPT – and massive real-world text databases are only available in hyperscale datacenters. With a single CS-2, your organization can train models like these in hours or days rather than weeks or months.

use case

Genomics and data science

In genomics, AI has shown great potential for identifying subtle signatures of public health challenges as well as new opportunities for the treatment of rare diseases.

However, most work in this space has been limited to small clinical trials or local populations because the deep learning models used to classify sequences or predict phenotype — e.g., RNNs, Transformers, 1D CNNs — take too long to train on large, sparse datasets using GPUs. Use the CS-2 to bring 100x–1,000x more data to your models.
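To make the sequence-classification idea concrete, here is a minimal sketch of the building block behind a 1D CNN over DNA: one-hot encode the sequence, slide a convolutional filter along it, and global-max-pool the responses. Everything here is a toy illustration with hand-set weights (a filter tuned to the motif "GATA"); a real model learns thousands of such filters from data.

```python
# One-hot encoding for the four DNA bases (one channel per base).
ONE_HOT = {"A": [1, 0, 0, 0], "C": [0, 1, 0, 0],
           "G": [0, 0, 1, 0], "T": [0, 0, 0, 1]}

def one_hot(seq):
    """Encode a DNA string as a list of 4-channel vectors."""
    return [ONE_HOT[base] for base in seq]

def conv1d_max(encoded, kernel):
    """Slide a kernel (a list of 4-channel weight vectors) along the
    sequence and return the maximum response (global max pooling)."""
    k = len(kernel)
    best = float("-inf")
    for i in range(len(encoded) - k + 1):
        resp = sum(sum(c * w for c, w in zip(encoded[i + j], kernel[j]))
                   for j in range(k))
        best = max(best, resp)
    return best

# Toy filter that responds maximally to the motif "GATA".
kernel = [ONE_HOT["G"], ONE_HOT["A"], ONE_HOT["T"], ONE_HOT["A"]]

hit_score = conv1d_max(one_hot("CCGATACC"), kernel)   # contains "GATA"
miss_score = conv1d_max(one_hot("CCCCCCCC"), kernel)  # no motif present
```

A downstream classifier thresholds or combines many such pooled filter responses; training millions of these filters over genome-scale datasets is what becomes prohibitively slow on conventional hardware.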

GSK Research Blog

Ready to get started?

Contact Sales