Introducing Meta Llama 3: the most capable openly available LLM to date.
Today we’re releasing 8B & 70B models that deliver on new capabilities such as improved reasoning and set a new state-of-the-art for models of their sizes.
Today's release includes the first two Llama 3…
We’re pleased to introduce Make-A-Video, our latest in #GenerativeAI research! With just a few words, this state-of-the-art AI system generates high-quality videos from text prompts.
Have an idea you want to see? Reply w/ your prompt using #MetaAI and we’ll share more results.
Today we're releasing the Segment Anything Model (SAM) — a step toward the first foundation model for image segmentation.
SAM is capable of one-click segmentation of any object from any photo or video + zero-shot transfer to other segmentation tasks ➡️
Today we’re releasing Code Llama 70B: a new, more performant version of our LLM for code generation — available under the same license as previous Code Llama models.
Download the models ➡️
• CodeLlama-70B
• CodeLlama-70B-Python
• CodeLlama-70B-Instruct
Announced by Mark Zuckerberg this morning — today we're releasing DINOv2, the first method for training computer vision models that uses self-supervised learning to achieve results matching or exceeding industry standards.
More on this new work ➡️
Meta AI presents CICERO — the first AI to achieve human-level performance in Diplomacy, a strategy game which requires building trust, negotiating and cooperating with multiple players.
Learn more about #CICERObyMetaAI:
Today we’re releasing Code Llama, a large language model built on top of Llama 2, fine-tuned for coding & state-of-the-art for publicly available coding tools.
Keeping with our open approach, Code Llama is publicly available now for both research & commercial use.
More ⬇️
Check out our latest breakthrough in machine translation that Mark Zuckerberg just announced. We built and open sourced a state-of-the-art AI model that now translates between 200 different languages.
Today we're publicly releasing LLaMA, a state-of-the-art foundational LLM, as part of our ongoing commitment to open science, transparency and democratized access to new research.
Learn more & request access ➡️
We believe an open approach is the right one for the development of today's AI models.
Today, we’re releasing Llama 2, the next generation of Meta’s open source Large Language Model, available for free for research & commercial use.
Details ➡️
Today we’re introducing SceneScript, a novel method for reconstructing environments and representing the layout of physical spaces from @RealityLabs Research.
Details ➡️
SceneScript is able to directly infer a room’s geometry using end-to-end machine…
Today we’re releasing V-JEPA, a method for teaching machines to understand and model the physical world by watching videos. This work is another important step towards @ylecun’s outlined vision of AI models that use a learned understanding of the world to plan, reason and…
New on @huggingface — CoTracker simultaneously tracks the movement of multiple points in videos using a flexible design based on a transformer network — it models correlation of the points in time via specialized attention layers.
🤗 Try CoTracker ➡️
Announcing OPT-IML: a new language model from Meta AI with 175B parameters, fine-tuned on 2,000 language tasks — openly available soon under a noncommercial license for research use cases.
Research paper & more details on GitHub ⬇️
Today we're releasing the Open Catalyst Demo to the public — this new service uses AI to let researchers accelerate work in materials science by simulating the reactivity of catalyst materials ~1000x faster than existing computational methods.
Demo ⬇️
By restructuring math expressions as a language, Facebook AI has developed the first neural network that uses symbolic reasoning to solve advanced mathematics problems.
Today Meta AI is sharing OPT-175B, the first 175-billion-parameter language model to be made available to the broader AI research community. OPT-175B can generate creative text on a vast range of topics. Learn more & request access:
Today we're sharing new research that brings us one step closer to real-time decoding of image perception from brain activity.
Using MEG, this AI system can decode the unfolding of visual representations in the brain with an unprecedented temporal resolution.
More details ⬇️
We're introducing an optimizer for deep learning, MADGRAD. This method matches or exceeds the performance of the Adam optimizer across a varied set of realistic large-scale deep learning training problems.
Introducing 'Prompt Engineering with Llama 2' — an interactive guide covering prompt engineering & best practices for developers, researchers & enthusiasts working with large language models.
Access the notebook in the llama-recipes repo ➡️
To better enable the community to build on our work — and contribute to the responsible development of LLMs — we've published further details about the architecture, training compute, approach to fine-tuning & more for Llama 2 in a new paper.
Full paper➡️
Today we’re sharing two new advances in our generative AI research: Emu Video & Emu Edit.
Details ➡️
These new models deliver exciting results in high quality, diffusion-based text-to-video generation & controlled image editing w/ text instructions.
🧵
Code Llama is free for both research and commercial use and we've made three different models available:
- Code Llama
- Code Llama - Python
- Code Llama - Instruct
More details on each of these models and how you can download them ➡️
Today we're sharing details on AudioCraft, a new family of generative AI models built for generating high-quality, realistic audio & music from text. AudioCraft is a single code base that works for music, sound, compression & generation — all in the same place.
More details ⬇️
We’re open-sourcing a new system to train computer vision models using Transformers. Data-efficient image Transformers (DeiT) is a high-performance image classification model requiring less data & computing resources to train than previous AI models.
Using HyperTree Proof Search we created a new neural theorem solver that was able to solve 5x more International Math Olympiad problems than any previous AI system & best previous state-of-the-art systems on miniF2F & Metamath.
More in our new post ⬇️
Introducing Voicebox, a new breakthrough generative speech system based on Flow Matching, a new method proposed by Meta AI. It can synthesize speech across six languages, perform noise removal, edit content, transfer audio style & more.
More details on this work & examples ⬇️
Introducing SeamlessM4T, the first all-in-one, multilingual multimodal translation model.
This single model can perform tasks across speech-to-text, speech-to-speech, text-to-text translation & speech recognition for up to 100 languages depending on the task.
Details ⬇️
Scientists at Facebook AI have done what was previously considered out of reach for deep learning models, solving complex, symbolic math equations using a neural network.
(1/3) Until now, AI translation has focused mainly on written languages. Universal Speech Translator (UST) is the 1st AI-powered speech-to-speech translation system for a primarily oral language, translating Hokkien, one of many primarily spoken languages.
We are releasing Detection Transformers (DETR), an important new approach to object detection and panoptic segmentation. It’s the first object detection framework to successfully integrate Transformers as a central building block in the detection pipeline.
Today we’re sharing some of our latest progress on AI-powered hypercompression of audio. Our researchers achieved ~10x compression rate vs. MP3 at 64 kbps with no quality loss — the first time this has been done with 48 kHz sampled stereo audio ➡️
1/5
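The compression claim above can be sanity-checked with simple bitrate arithmetic. These are my back-of-the-envelope numbers, not figures from the post: raw 48 kHz 16-bit stereo PCM runs at 1,536 kbps, the MP3 baseline runs at 64 kbps, and a further ~10x reduction lands around 6.4 kbps.

```python
# Back-of-the-envelope bitrate arithmetic for the ~10x-vs-MP3 claim.
# Illustrative numbers only; the bit depth is an assumption, not stated
# in the post.
sample_rate_hz = 48_000   # 48 kHz sampled audio, as in the post
bits_per_sample = 16      # typical PCM bit depth (assumed)
channels = 2              # stereo

raw_kbps = sample_rate_hz * bits_per_sample * channels / 1000
mp3_kbps = 64                    # the MP3 comparison point in the post
compressed_kbps = mp3_kbps / 10  # ~10x smaller than MP3

print(raw_kbps)         # 1536.0
print(compressed_kbps)  # 6.4
```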
We are releasing HiPlot, a lightweight interactive visualization tool to help AI researchers discover correlations and patterns in high-dimensional data.
Today we're sharing the next milestone in our Seamless Communication research — a new family of AI translation models that preserve expression and deliver near-real time streaming translations.
More on this new work ➡️
More on the individual models 🧵
Today we're releasing our work on I-JEPA — self-supervised computer vision that learns to understand the world by predicting it. It's the first model based on a component of @ylecun's vision to make AI systems learn and reason like animals and humans.
Details ⬇️
Introducing ImageBind by Meta AI: the first AI model capable of binding data from six modalities at once. This breakthrough brings machines one step closer to the human ability to bind together information from many different senses.
More on this new open source work ⬇️
We’ve built and open-sourced BlenderBot 2.0, the first #chatbot that can store and access long-term memory, search the internet for timely information, and converse intelligently on nearly any topic. It’s a significant advancement in conversational AI.
Starting today you can try our new foundation research model for audio generation. The demo includes Zero shot TTS, Text to sound effects, Infilling and more!
Try Audiobox ➡️
As @Meta’s technology partner, Oasis Labs built the platform that uses Secure Multi-Party Computation (SMPC) to safeguard information as Meta asks users on @Instagram to take a survey in which they can voluntarily share their race or ethnicity.
Does your child love to draw? Ever wished that their characters could “come to life” and move around the page? Using AI, we’ve developed automatic animation that can bring children’s one-of-a-kind characters to life! Learn more and try it out here:
Today, we’re introducing TextStyleBrush, the first self-supervised AI model that replaces text in existing images of both scenes and handwriting — in one shot — using just a single example word:
Read about new developments in deep learning with authors and researchers Daniel A. Roberts (@danintheory), Sho Yaida (@Shoyaida) and Boris Hanin (@BorisHanin) in their book The Principles of Deep Learning Theory: An Effective Theory Approach to Understanding Neural Networks. 👇
We're releasing Flashlight: A modern, open source machine learning library written entirely in C++. It's customizable to the core to support your research's needs — and it's pretty fast too.
SeamlessStreaming is an AI translation model that can deliver state-of-the-art results on streaming translation with <2 seconds of latency. One core piece of our latest Seamless Communication research work by teams at FAIR.
More on this project ➡️
In 2021, we created a research demo that brought amateur drawings to life through animation — today, we're open-sourcing the code + releasing a first-of-its-kind dataset of nearly 180K annotated amateur drawings to help researchers keep innovating in this space.
More details ⬇️
To close out 2023, here are 10 of the most interesting AI research advancements we shared on our feed this year — and where you can find more details on the work.
1️⃣ Segment Anything (SAM)
A step toward the first foundation model for image segmentation.
Details:…
Meta AI researchers show how current language models differ from the human brain & highlight the role of long-range & hierarchical predictions.
Read the open-access article in Nature ➡️
As part of our continued belief in the value of an open approach to today's AI, we've published a research paper with more information on Code Llama training, evaluation results, safety and more.
Code Llama: Open Foundation Models for Code ➡️
It's been exactly one week since we released Meta Llama 3. In that time, the models have been downloaded over 1.2M times, we've seen 600+ derivative models on @HuggingFace and much more.
More on the exciting impact we're already seeing with Llama 3 ➡️
Llama 2 is now available — open source, free for research and commercial use. These models are accessible to individuals, creators, researchers and businesses.
Download ⬇️
We’re introducing GSLM, the first language model that breaks free completely of the dependence on text for training. This “textless NLP” approach learns to generate expressive speech using only raw audio recordings as input. Learn more and get the code:
Introducing the next generation of the Meta Training and Inference Accelerator (MTIA), the next in our family of custom-made silicon, designed for Meta’s AI workloads.
Full details ➡️
Together with the Ego4D consortium, today we're releasing Ego-Exo4D, the largest ever public dataset of its kind to support research on video learning & multimodal perception — including 1,400+ hours of videos of skilled human activities.
Download ➡️
We’ve developed TransCoder, the first self-supervised neural transcompiler system for migrating code between programming languages. Transcoder can translate code from Python to C++, for example, and it outperforms rule-based translation programs.
Researchers at Meta recently shared MAGNeT, a single non-autoregressive transformer model for text-to-music & text-to-sound generation capable of generating audio on-par with the quality of SOTA models — at 7x the speed.
MAGNeT is open source as part of AudioCraft. Hear audio…
Today we’re releasing OpenEQA — the Open-Vocabulary Embodied Question Answering Benchmark. It measures an AI agent’s understanding of physical environments by probing it with open vocabulary questions like “Where did I leave my badge?”
More details ➡️ …
We're open sourcing PyRobot, a lightweight, high-level interface that lets #AI researchers get up and running with #robotics experiments in just hours. No specialized robotics expertise needed!
We’re open sourcing PyTorch-BigGraph, a tool that makes it much faster and easier to produce graph embeddings for extremely large graphs. Quickly produce high-quality embeddings without specialized computing resources like GPUs or huge amounts of memory.
Today we're sharing new progress on our AI speech work. Our Massively Multilingual Speech (MMS) project has now scaled speech-to-text & text-to-speech to support over 1,100 languages — a 10x increase from previous work.
Details + access to new pretrained models ⬇️
We're expanding access to DINOv2 by releasing the training code and model weights under the Apache 2.0 license.
Details on this and more of our recent work to advance computer vision research and fairness in AI ⬇️
3D computer vision research just got easier!
We’re releasing Implicitron, an extension of PyTorch3D that enables fast prototyping of 3D reconstruction and new-view synthesis methods based on rendering of implicit representations.
We're releasing code for a new approach to generating recipes directly from food images. This produces more compelling recipes than retrieval-based approaches and improves performance with respect to previous baselines for ingredient prediction. #CVPR2019
Continual-T0 (CT0) displays Continual Learning capabilities via self-supervision. This fine-tuned language model retains skills while learning new tasks across an unprecedented scale of 70 datasets. It can even combine instructions without prior training.
Announcing the ESM Metagenomic Atlas — the first comprehensive view of the ‘dark matter’ of the protein universe. Made possible by ESMFold, a new breakthrough model for protein folding from Meta AI.
More in our new blog ➡️
1/3
New in Nature Human Behavior, Meta AI researchers show how current language models differ from the human brain & highlight the role of long-range & hierarchical predictions.
We hope these findings will help inform the next generation of AI ➡️
Excited to announce Make-A-Scene, our latest research tool Mark Zuckerberg just shared. Make-A-Scene is an exploratory concept that gives creative control to anyone, artists & non-artists alike to use both text & sketches to guide AI image generation:
Today we’re announcing that Facebook AI has built and open-sourced Blender, the largest-ever open-domain chatbot. It outperforms others in terms of engagement and also feels more human, according to human evaluators.
We’ve made the code available for Mesh R-CNN, a state-of-the-art method that can reconstruct complex objects in three dimensions. Get it on GitHub here: . And learn more about Mesh R-CNN here:
We’ve completed the first fastMRI image reconstruction challenge to spur development of new AI techniques to make scans 10x faster. Congratulations to the top entrants, who’ve been invited to present at the Medical Imaging Meets #NeurIPS2019 workshop!
Facebook AI and @CarnegieMellon researchers have built Pluribus, the first AI bot to beat elite poker pros in 6-player Texas Hold’em. This breakthrough is the first major benchmark outside of 2-player games and we’re sharing specifics on how we built it.
New research from FAIR: Better & Faster Large Language Models via Multi-token Prediction
Research paper ➡️
We show that replacing next token prediction tasks with multiple token prediction can result in substantially better code generation performance…
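As a rough illustration of the objective described above — a sketch of the general idea, not FAIR's implementation — multi-token prediction gives each position several future tokens as targets, one per prediction head, instead of only the single next token:

```python
# Hypothetical sketch of building multi-token prediction targets.
# In the real setup these targets would feed k output heads on a shared
# trunk; here we only show the target construction.
def multi_token_targets(tokens, k):
    """For each position i, return the k future tokens tokens[i+1..i+k].

    Positions too close to the end of the sequence are dropped so that
    every head has a valid target.
    """
    targets = []
    for i in range(len(tokens) - k):
        targets.append(tokens[i + 1 : i + 1 + k])
    return targets

seq = [5, 9, 2, 7, 4]
print(multi_token_targets(seq, 2))  # [[9, 2], [2, 7], [7, 4]]
```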
In addition to Llama 3, today we’re also publishing a new paper: Imagine Flash: Accelerating Emu Diffusion Models with Backward Distillation ➡️
This work from GenAI researchers is enabling new image generation features in Meta AI on @WhatsApp & web.
Models are increasingly data hungry, but do we really need all that data?
@SuryaGanguli, @arimorcos and team show, in theory and practice, that we can beat power law scaling and achieve exponential scaling re: data if we can appropriately rank the importance of each datapoint.
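The ranking idea can be sketched as follows — a toy illustration with made-up scores, not the paper's importance metric: score every example, then keep only the most informative fraction rather than sampling uniformly.

```python
# Toy sketch of importance-based data pruning (illustrative only; the
# scores here are arbitrary, not a real difficulty metric).
def prune_by_importance(examples, scores, keep_fraction):
    """Keep the top keep_fraction of examples, ranked by score."""
    ranked = sorted(zip(scores, examples), reverse=True)
    n_keep = max(1, int(len(examples) * keep_fraction))
    return [ex for _, ex in ranked[:n_keep]]

data = ["a", "b", "c", "d"]
scores = [0.1, 0.9, 0.4, 0.7]
print(prune_by_importance(data, scores, 0.5))  # ['b', 'd']
```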
FAIR researchers recently released V-JEPA, a method for teaching machines to understand and model the physical world by watching videos. The code is available under a CC-BY-NC license to enable the research community to build on this work.
GitHub Repo ⬇️
We are sharing a new technique for self-supervised training of convolutional networks. Our method outperforms previous self-supervised approaches and surpasses supervised techniques on most transfer tasks.
In addition to the new model, we’re also releasing the SA-1B dataset, which is 400x larger than any existing segmentation dataset — we hope this work will help accelerate computer vision research and enable entirely new applications.
Get the dataset ⬇️
(1/2) We recently announced the release of OPT-175B and now, as promised, we’re releasing OPT-66B, the largest unrestricted open-sourced model to date. We're also releasing our logbook used for training all of our baselines: 125M through 66B.
This morning, Mark Zuckerberg announced that @PyTorch is moving to a new, independent #PyTorchFoundation as part of the #LinuxFoundation umbrella. Learn what’s next for one of the leading platforms for #AI research and production applications.
We're releasing mBART, a new seq2seq multilingual pretraining system for machine translation across 25 languages. It gives significant improvements for document-level translation and low-resource languages. Read our paper to learn more:
We created data2vec, the first general high-performance self-supervised algorithm for speech, vision, and text. When applied to different modalities, it matches or outperforms the best self-supervised algorithms. Read more and get the code:
We’ve just open-sourced AugLy, a new #Python library that will help AI researchers use data augmentations to evaluate and improve the robustness of their machine learning models. Read more:
We're introducing new stereo models for MusicGen. By extending the delay codebook pattern to cover tokens from both left & right channels, these models can generate stereo output with no extra computational cost vs previous models.
Try MusicGen on 🤗 ➡️
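As a rough sketch of the delay-codebook idea — my simplification, not the MusicGen code — each codebook level is shifted by an increasing delay so all levels can be predicted in one autoregressive stream, and the stereo extension interleaves the levels of both channels into that same stream:

```python
# Illustrative delay pattern for a two-channel (stereo) codebook stream.
# Level k of each channel is delayed by k timesteps; None marks padding.
# This is a hypothetical sketch, not Meta's implementation.
def delay_pattern(n_levels, n_steps, channels=("L", "R")):
    """Return one row per (channel, level), each a list of
    (channel, level, timestep) entries shifted right by `level` steps."""
    rows = []
    for ch in channels:
        for level in range(n_levels):
            row = [None] * level + [
                (ch, level, t) for t in range(n_steps - level)
            ]
            rows.append(row)
    return rows

grid = delay_pattern(2, 3)
# 4 rows: L level 0, L level 1, R level 0, R level 1 — so stereo adds
# rows to the pattern rather than a second decoding pass.
```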
Chief AI Scientist Yann LeCun (@ylecun) is sketching an alternate vision for building human-level AI. LeCun proposes that the ability to learn “world models” — internal models of how the world works — may be the key. Learn more:
Notice a new feature on @arxiv? Machine learning articles now have a Code tab to link official and community code thanks to a new partnership with @paperswithcode.
Announcing data2vec 2.0, a new general self-supervised algorithm built by Meta AI for speech, vision & text that can train models 16x faster than the most popular existing algorithm for images while achieving the same accuracy.
Read more & get the open source code ⬇️
Announcing Belebele, a first-of-its-kind multilingual reading comprehension dataset. This dataset is parallel for 122 language variants, enabling direct comparison of how well models understand different languages.
Dataset ➡️
Today we’re announcing two new updates in our computer vision work — a new, expanded license for our DINOv2 model and the release of FACET, a comprehensive new benchmark dataset to help evaluate and improve fairness in vision models.
More details ➡️
🧵
Today we’re sharing details on Meta Lattice, a new model architecture that improves Meta’s ads systems performance and efficiency.
More on this new work ➡️
Three ways that this new work is enhancing our ads system 🧵
With 4.5B parallel sentences in 576 language pairs, CCMatrix is the largest data set of high-quality, web-based bitexts for training translation models. Now Facebook AI is sharing tools for other researchers to use this corpus for their work.
Meta is announcing the AI Research SuperCluster (RSC), our latest AI supercomputer 💻 for AI research. RSC will allow our researchers to do new, groundbreaking experiments in #AI. Learn more about RSC and the important role it will play:
In the interest of open science and sharing our research, we've published a paper outlining the work on our recently released SeamlessM4T all-in-one multilingual & multimodal translation model.
Read the full paper ➡️
Today, Mark Zuckerberg shared the latest on our vision for building the most advanced AI products and services in our Q4 earnings call. Here are a few highlights on our progress in AI — and how our teams are positioned to deliver on this work.
🧵
To support innovation in computer vision, we’re making DINOv2 available under the Apache 2.0 license + releasing a collection of DINOv2-based dense prediction models for semantic image segmentation and monocular depth estimation.
Try our updated demo ➡️
We’re introducing M2M-100, the first multilingual machine translation model that translates between any pair of 100 languages without relying on English data. We’ve open sourced the model, training, & evaluation set up. Learn more #t9n #machinetranslation
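For a sense of scale — my arithmetic, not a claim from the post — covering every ordered pair among 100 languages directly means 100 × 99 = 9,900 translation directions, versus only 2 × 99 directions when everything pivots through English:

```python
# Counting translation directions for a 100-language model
# (illustrative arithmetic only). Direct many-to-many translation
# covers every ordered language pair; an English-pivot setup only
# needs the to-English and from-English directions.
n_langs = 100
direct_directions = n_langs * (n_langs - 1)  # every ordered pair
pivot_directions = 2 * (n_langs - 1)         # to/from English only

print(direct_directions, pivot_directions)  # 9900 198
```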