Week of March 25, 2024
WSJ: The AI industry spent 17x more on Nvidia chips than it brought in as revenue • In a presentation earlier this month, the venture-capital firm Sequoia estimated that the AI industry spent $50 billion on the Nvidia chips used to train advanced AI models last year but brought in only $3 billion in revenue, a roughly 17-to-1 ratio of spending to income. • (Reddit, /r/MachineLearning) / March 30
A Peter Thiel-Backed AI Startup, Cognition Labs, Seeks $2 Billion Valuation • Cognition Labs, a startup developing an artificial-intelligence tool for writing code, is in talks with investors to raise funding at a valuation of up to $2 billion, in a test of the investor frenzy around new AI technology. • (The Wall Street Journal, Berber Jin) / March 30
Headless, dog-sized robot to patrol Alaska airport to prevent bird strikes • A headless robot about the size of a labrador will be camouflaged as a coyote to ward off migratory birds and other wildlife at Alaska’s second-largest airport. The robot, named Aurora, can climb rocks, go up stairs, and make dance-like movements while flashing green lights. These tactics will be used to scare away wildlife. • (Sky News) / March 29
OpenAI and Microsoft reportedly planning $100 billion datacenter project for an AI supercomputer • Microsoft and OpenAI are reportedly working on a massive datacenter to house an AI-focused supercomputer featuring millions of GPUs. The Information reports that the project could cost “in excess of $115 billion” and that the supercomputer, currently dubbed “Stargate” inside OpenAI, would be U.S.-based. The report says that Microsoft would foot the bill for the datacenter, which could be “100 times more costly” than some of the biggest operating centers today. Stargate would be the largest in a string of datacenter projects the two companies hope to build in the next six years, and executives hope to have it running by 2028. • (Tom’s Hardware, Andrew E. Freedman) / March 29
NYC’s AI Chatbot Tells Businesses to Break the Law • In October, New York City announced a plan to harness the power of artificial intelligence to improve the business of government. The announcement included a surprising centerpiece: an AI-powered chatbot that would provide New Yorkers with information on starting and operating a business in the city. The problem, however, is that the city’s chatbot is telling businesses to break the law. Five months after launch, it’s clear that while the bot appears authoritative, the information it provides on housing policy, worker rights, and rules for entrepreneurs is often incomplete and in worst-case scenarios “dangerously inaccurate,” as one local housing policy expert told The Markup. • (The Markup, Colin Lecher) / March 29
OpenAI says it can clone a voice from just 15 seconds of audio • OpenAI just announced that it recently conducted a small-scale preview of a new tool called Voice Engine. This is a voice cloning technology that can mimic any speaker by analyzing a 15-second audio sample. The company says it generates “natural-sounding speech” with “emotive and realistic voices.” The technology is based on the company’s pre-existing text-to-speech API and it has been in the works since 2022. OpenAI has already been using a version of the toolset to power the preset voices available in the current text-to-speech API and the Read Aloud feature. • (Engadget, Lawrence Bonk) / March 29
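For context, the preset voices mentioned above are served by OpenAI’s existing public text-to-speech API, and the sketch below shows how that API is typically called from the Python SDK. Voice Engine itself is only in a limited preview, so nothing here performs 15-second voice cloning; the model name ("tts-1"), the preset voice ("alloy"), the output filename, and the write method are illustrative assumptions rather than details from the article.

```python
# Minimal sketch of OpenAI's existing text-to-speech API (the preset voices the
# article says are already powered by a version of Voice Engine). Voice Engine's
# 15-second cloning is NOT publicly available; nothing here clones a voice.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

speech = client.audio.speech.create(
    model="tts-1",   # public TTS model; Voice Engine is a separate, unreleased tool
    voice="alloy",   # one of the preset voices
    input="Hello! This is a sample of the text-to-speech API.",
)

# Write the returned audio to disk (older SDK versions expose stream_to_file instead).
speech.write_to_file("sample.mp3")
```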
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild • VoiceCraft is a token-infilling neural codec language model that achieves state-of-the-art performance on both speech editing and zero-shot text-to-speech (TTS) on in-the-wild data, including audiobooks, internet videos, and podcasts. To clone an unseen voice or edit a recording, VoiceCraft needs only a few seconds of that voice. • (VoiceCraft, Puyuan Peng, et al.) / March 29
Qwen1.5-MoE: Matching 7B Model Performance with 1/3 Activated Parameters • Since the surge in interest sparked by Mixtral, research on mixture-of-experts (MoE) models has gained significant momentum. Both researchers and practitioners are keenly interested in understanding how to effectively train such models and assessing their efficiency and effectiveness. Today, we introduce Qwen1.5-MoE-A2.7B, a small MoE model with only 2.7 billion activated parameters yet matching the performance of state-of-the-art 7B models like Mistral 7B and Qwen1.5-7B. Compared to Qwen1.5-7B, which contains 6.5 billion non-embedding parameters, Qwen1.5-MoE-A2.7B contains only 2.0 billion non-embedding parameters, approximately one-third of Qwen1.5-7B’s size. Notably, it achieves a 75% decrease in training expenses and accelerates inference speed by a factor of 1.74, offering substantial improvements in resource utilization without compromising performance. • (Qwen Team) / March 28
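As a rough illustration of what "2.7 billion activated parameters" means in practice, here is a minimal sketch of running the model with Hugging Face transformers. It assumes the checkpoint is published on the Hub as Qwen/Qwen1.5-MoE-A2.7B-Chat and that the installed transformers release includes Qwen1.5-MoE support; both details are assumptions, not taken from the Qwen post.

```python
# Hedged sketch: running Qwen1.5-MoE-A2.7B via Hugging Face transformers.
# The Hub ID below, a sufficiently recent transformers release (with MoE support),
# and accelerate (for device_map="auto") are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-MoE-A2.7B-Chat"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Only ~2.7B parameters are activated per token (the routed experts), but all
# experts must still be loaded, so memory use reflects the full model size.
messages = [{"role": "user", "content": "Summarize mixture-of-experts routing in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The cited inference speedup comes from routing each token through only a subset of experts rather than the full parameter set.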
Disillusioned Businesses Are Discovering That AI Kind of Sucks • By now, it seems clear that much of the hype around generative AI is overblown — if not a bubble that’s bound to burst — and some businesses that invested in the tech are learning that the hard way. The tech’s drawbacks are hard to overlook. Large language models like ChatGPT are prone to hallucinating and spreading misinformation. Both chatbots and AI image generators have been accused of plagiarizing writers and artists. And the hardware that generative AI runs on consumes enormous amounts of energy, at a heavy cost to the environment. Perhaps most of all, according to Gary Marcus, a cognitive scientist and notable AI researcher, businesses are finding out that the tech just can’t be depended on. • (The Byte, Frank Landymore) / March 28
Rationalism in the face of GPT hypes: Benchmarking the output of large language models against human expert-curated biomedical knowledge graphs • Biomedical knowledge graphs (KGs) hold valuable information regarding biomedical entities such as genes, diseases, biological processes, and drugs. KGs have been successfully employed in challenging biomedical areas such as the identification of pathophysiology mechanisms or drug repurposing. The creation of high-quality KGs typically requires labor-intensive multi-database integration or substantial human expert curation, both of which take time and add to the workload of data processing and annotation. The use of automatic systems for KG building and maintenance is therefore a prerequisite for the wide uptake and utilization of KGs. Technologies supporting the automated generation and updating of KGs typically make use of Natural Language Processing (NLP), which is optimized for extracting implicit triples described in relevant biomedical text sources. At the core of this challenge is how to improve the accuracy and coverage of the information-extraction module by utilizing different models and tools. The emergence of pre-trained large language models (LLMs) such as ChatGPT, whose popularity has grown dramatically, has revolutionized the field of NLP, making them potential candidates for text-based graph creation as well. So far, no previous work has investigated the ability of LLMs to generate cause-and-effect networks and KGs encoded in Biological Expression Language (BEL). In this paper, we present initial studies towards one-shot BEL relation extraction using two different versions of the Generative Pre-trained Transformer (GPT) models and evaluate their performance by comparing the extracted results to a highly accurate BEL KG manually curated by domain experts. • (ScienceDirect, Negin Sadat Babaiha, et al.) / February 7
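To make the "one-shot BEL relation extraction" setup concrete, here is a hedged sketch of how such an experiment could be wired up with the OpenAI Python SDK. The prompt wording, the single in-context example, the BEL statements, the model name, and the test sentence are all illustrative assumptions; they are not the authors’ actual prompt, models, or data.

```python
# Hedged sketch of one-shot BEL relation extraction in the spirit of the paper
# (not the authors' code). The example BEL statement, the prompt, and the model
# name are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

# One in-context example: a sentence paired with a BEL statement
# (p() = protein abundance, r() = RNA abundance, "increases" = BEL relation).
ONE_SHOT = (
    "Sentence: TNF-alpha treatment increased IL-6 expression in fibroblasts.\n"
    "BEL: p(HGNC:TNF) increases r(HGNC:IL6)\n"
)

def extract_bel(sentence: str) -> str:
    """Ask the model to convert one biomedical sentence into a BEL statement."""
    prompt = (
        "Convert the biomedical sentence into a Biological Expression Language "
        "(BEL) statement. Follow the format of the example.\n\n"
        f"{ONE_SHOT}\nSentence: {sentence}\nBEL:"
    )
    resp = client.chat.completions.create(
        model="gpt-4",  # the paper compares two GPT versions; exact models are assumed here
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()

print(extract_bel("Inhibition of BACE1 reduces amyloid-beta production in neurons."))
```

In an evaluation like the paper’s, the extracted statements would then be compared against the expert-curated BEL knowledge graph to measure accuracy and coverage.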