system Archives - Cerebras

MediSwift: Efficient Sparse Pre-trained Biomedical Language Models

Tin Hoang — Mon, 20 May 2024 15:07:22 +0000

Large language models (LLMs) are typically trained on general source data for various domains, but a recent surge in domain-specific LLMs has shown their potential to outperform general-purpose models in domain-specific tasks (e.g., biomedicine).…

The post MediSwift: Efficient Sparse Pre-trained Biomedical Language Models appeared first on Cerebras.

Breaking the Molecular Dynamics Timescale Barrier Using a Wafer-Scale System

Tin Hoang — Thu, 16 May 2024 00:54:00 +0000

Molecular dynamics (MD) simulations have transformed our understanding of the nanoscale, driving breakthroughs in materials science, computational chemistry, and several other fields, including biophysics and drug design.…

Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment

Tin Hoang — Thu, 16 May 2024 00:53:03 +0000

Large language models (LLMs) have revolutionized Natural Language Processing (NLP), but their size creates computational bottlenecks. We introduce a novel approach to create accurate, sparse foundational versions of performant LLMs that achieve full accuracy recovery for fine-tuning tasks at up…

Efficient Algorithms for Monte Carlo Particle Transport on AI Accelerator Hardware

Udai Mody — Mon, 13 Nov 2023 18:52:26 +0000

The recent trend toward deep learning has led to the development of a variety of highly innovative AI accelerator architectures. One such architecture, the Cerebras Wafer-Scale Engine 2 (WSE-2), features 40 GB of on-chip SRAM, making it a potentially attractive…

Position Interpolation Improves ALiBi Extrapolation

Tin Hoang — Wed, 08 Nov 2023 22:57:03 +0000

Linear position interpolation helps pre-trained models using rotary position embeddings (RoPE) to extrapolate to longer sequence lengths. We propose using linear position interpolation to extend the extrapolation range of models using Attention with Linear Biases (ALiBi). We find position interpolation…
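
As a rough illustration of the idea in this excerpt, the sketch below applies linear position interpolation to ALiBi by shrinking each head's slope by the ratio of training length to inference length, so the largest bias seen at inference matches the largest bias seen in training. The excerpt is truncated, so the exact scaling rule is an assumption, and the function names (`alibi_slopes`, `interpolated_slopes`, `alibi_bias`) are hypothetical and not taken from the post.

```python
def alibi_slopes(num_heads):
    """Geometric per-head slopes as in the ALiBi paper (num_heads a power of two)."""
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (h + 1) for h in range(num_heads)]


def interpolated_slopes(slopes, train_len, infer_len):
    """Assumed rule: scale slopes by train_len / infer_len beyond the training length."""
    scale = min(1.0, train_len / infer_len)
    return [m * scale for m in slopes]


def alibi_bias(slope, seq_len):
    """Causal ALiBi bias matrix: -slope * (i - j) for j <= i, -inf for future positions."""
    return [
        [-slope * (i - j) if j <= i else float("-inf") for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    base = alibi_slopes(8)
    # Hypothetical lengths: trained at 2,048 tokens, evaluated at 8,192 tokens.
    scaled = interpolated_slopes(base, train_len=2048, infer_len=8192)
    print(base[0], scaled[0])
    print(alibi_bias(scaled[0], 4)[3])
```

Scaling the slope is equivalent to scaling the token distances by the same factor, which is why this can be read as interpolating positions rather than changing the attention mechanism.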

Scaling the “Memory Wall” for Multi-Dimensional Seismic Processing with Algebraic Compression on Cerebras CS-2 Systems

Tin Hoang — Tue, 26 Sep 2023 23:42:19 +0000

…

BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model

Tin Hoang — Fri, 22 Sep 2023 17:28:00 +0000

We introduce the Bittensor Language Model, called “BTLM-3B-8K”, a new state-of-the-art 3 billion parameter open-source language model. BTLM-3B-8K was trained on 627B tokens from the SlimPajama dataset with a mixture of 2,048 and 8,192 context lengths. BTLM-3B-8K outperforms all existing…

Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models

Tin Hoang — Thu, 31 Aug 2023 19:39:26 +0000

We introduce Jais and Jais-chat, new state-of-the-art Arabic-centric foundation and instruction-tuned open generative large language models (LLMs). The models are based on the GPT-3 decoder-only architecture and are pretrained on a mixture of Arabic and English texts, including source code…

Cerebras Architecture Deep Dive: First Look Inside the Hardware/Software Co-Design for Deep Learning

Rebecca Lewington — Mon, 22 May 2023 20:15:11 +0000

IEEE Micro Volume 43, Issue 3, focuses on papers from last year’s Hot Chips 34 conference. This article describes the Cerebras architecture and how it is designed specifically with this purpose, from the ground up, as a wafer-sized chip to…

GlaxoSmithKline and Cerebras are Advancing the State of the Art in AI for Drug Discovery

Rebecca Lewington — Wed, 26 Jan 2022 15:55:45 +0000

By Kim Branson, SVP & Global Head of Artificial Intelligence and Machine Learning, Meredith Trotter, and Stephen Young, GSK; and Natalia Vassilieva, Director of Product, Machine Learning, and Rebecca Lewington, Technology Evangelist, Cerebras Systems.

Artificial intelligence has the potential…

A Big Chip for Big Science: Watching the COVID-19 Virus in Action

Vishal Subbiah — Tue, 14 Dec 2021 14:00:42 +0000

…

Microprocessor at 50. The Path to Successful Wafer-Scale Integration: The Cerebras Story

Rebecca Lewington — Fri, 19 Nov 2021 22:38:04 +0000

IEEE Micro Volume 41, Issue 6, took a look back at the first 50 years of the microprocessor, and forward to what’s next. It featured this article by Gary Lauterbach, Co-Founder and Chief Technology Officer of Cerebras Systems, which…
