Week of July 24, 2023
What Self-Driving Cars Tell Us About AI Risks • 5 conclusions from an automation expert fresh off a stint with the U.S. highway safety agency: 1. Human errors in operation get replaced by human errors in coding; 2. AI failure modes are hard to predict; 3. Probabilistic estimates do not approximate judgment under uncertainty; 4. Maintaining AI is just as important as creating AI; and 5. AI has system-level implications that can’t be ignored. • (IEEE Spectrum, Mary L. “Missy” Cummings) / July 30
The Transformer Blueprint: A Holistic Guide to the Transformer Neural Network Architecture • In this comprehensive guide, we will dissect the transformer model to its core, thoroughly exploring every key component from its attention mechanism to its encoder-decoder structure. Not stopping at the foundational level, we will traverse the landscape of large language models that leverage the power of the transformer, delving into their unique design attributes and functionalities. Further expanding the horizons, we will explore the applications of transformer models beyond NLP and probe into the current challenges and potential future directions of this influential architecture. Additionally, a curated list of open-source implementations and supplementary resources will be provided for those intrigued to explore further. • (AI Research Blog, Jean Nyandwi) / July 29
Preparing for the era of 32K context: Early learnings and explorations • Today, we’re releasing LLaMA-2-7B-32K, a 32K context model built using Position Interpolation and Together AI’s data recipe and system optimizations, including FlashAttention-2. Fine-tune the model for targeted, long-context tasks—such as multi-document understanding, summarization, and QA—and run inference and fine-tune on 32K context with up to 3x speedup. • (Together.ai) / July 28
Researchers Discover New Vulnerability in Large Language Models • Researchers at Carnegie Mellon University’s School of Computer Science (SCS), the CyLab Security and Privacy Institute, and the Center for AI Safety in San Francisco have uncovered a new vulnerability, proposing a simple and effective attack method that causes aligned language models to generate objectionable behaviors at a high success rate. In their latest study, ‘Universal and Transferable Adversarial Attacks on Aligned Language Models,’ CMU Associate Professors Matt Fredrikson and Zico Kolter, Ph.D. student Andy Zou, and alumnus Zifan Wang found a suffix that, when attached to a wide range of queries, significantly increases the likelihood that both open- and closed-source LLMs will produce affirmative responses to queries that they would otherwise refuse. Rather than relying on manual engineering, their approach automatically produces these adversarial suffixes through a combination of greedy and gradient-based search techniques. • (Carnegie Mellon University, Ryan Noone) / July 28
Microsoft’s AI shopping announcement contains hallucinations in the demo • A few weeks ago, Microsoft announced their latest foray into e-commerce search: AI-powered buying guides in Bing. We were curious to dig in and see just how well (or not) this feature performed, since the problem with large language models like ChatGPT is that they tend to make up fake information – errors called “hallucinations.” It turns out we didn’t have to look very far. In fact, Microsoft’s own promotional materials include hallucinations about headphone quality. • (PerfectRec, Wally Nowinski) / July 28
Speaking robot: Our new AI model translates vision and language into robotic actions • Today, we’re introducing a new advancement in robotics that brings us closer to a future of helpful robots. Robotics Transformer 2, or RT-2, is a first-of-its-kind vision-language-action (VLA) model. A Transformer-based model trained on text and images from the web, RT-2 can directly output robotic actions. Just like language models are trained on text from the web to learn general ideas and concepts, RT-2 transfers knowledge from web data to inform robot behavior. In other words, RT-2 can speak robot. • (Google, Vincent Vanhoucke) / July 28
Introducing the Chie app • Chie is a cross-platform desktop app for LLMs like ChatGPT. It has the following advantages over similar apps: (1) open source and hackable, (2) supports extensions, (3) NOT an Electron app, and (4) NOT a webview wrapper of web pages. • (Chie.app) / July 28
So you want to build your own open source chatbot… • Assembling an open source LLM-powered chatbot turns out to be a complicated task, requiring many decisions at multiple layers of the technology stack. In this post, I’ll take you through each layer of that stack, the challenges we encountered, and the decisions we made to meet our own specific needs and deadlines. • (Mozilla Hacks, Stephen Hood) / July 27
Llama and ChatGPT Are Not Open-Source • Social media and advertising-technology company Meta recently released an update to its large language model Llama. Llama 2 was released as open source, providing users access to the model’s weights, evaluation code, and documentation. Meta states the open-source release was intended to make the model “accessible to individuals, creators, researchers, and businesses so they can experiment, innovate, and scale their ideas responsibly.” However, compared to other open-source LLMs and open-source software packages more generally, Llama 2 is considerably closed off. Though Meta has made the trained model available, it is not sharing the model’s training data or the code used to train it. While third parties have been able to create applications that build on the base model, aspiring developers and researchers have a limited ability to pick apart the model as is. • (IEEE Spectrum, Michael Nolan) / July 27
WebArena: A Realistic Web Environment for Building Autonomous Agents • WebArena is a standalone, self-hostable web environment for building autonomous agents. WebArena creates websites from four popular categories with functionality and data mimicking their real-world equivalents. To emulate human problem-solving, WebArena also embeds tools and knowledge resources as independent websites. WebArena introduces a benchmark on interpreting high-level, realistic natural language commands as concrete web-based interactions. We provide annotated programs designed to programmatically validate the functional correctness of each task. • (WebArena) / July 27
Monarch Mixer: Revisiting BERT, Without Attention or MLPs • Over the past six years, we’ve seen Transformers take the world by storm. Transformers have been the workhorse architecture behind modern foundation models and have seen impressive empirical success across diverse applications – from pretrained language models like BERT, ChatGPT, and Flan-T5, to image models like SAM and Stable Diffusion. We think Transformers are great (and have had lots of fun optimizing them), but we’ve also been thinking about a deeper question: Are Transformers the only way to get this amazing performance? Today we’re excited to present a little teaser of some work in this direction – Monarch Mixer BERT (M2-BERT). M2-BERT is sub-quadratic in sequence length and model dimension, has 25% fewer parameters/FLOPs than BERT, and matches it in quality (potentially exceeding it slightly when parameter-matched). • (Hazy Research, Dan Fu, Simran Arora, Chris Ré) / July 25