Our new-ish, neural, pure Python stanfordnlp package provides grammatical analyses of sentences in over 50 human languages! Version 0.2.0 brought sensibly small model sizes and an improved lemmatizer. Try it out: pip install stanfordnlp
CS224N Natural Language Processing with Deep Learning 2019 @Stanford course videos by @chrmanning, @abigail_e_see & guests are now mostly available (16 of 20). Big update from 2017. YouTube playlist: – new CS224N online hub: #NLProc
Out now: our new Python #NLProc package. StanfordNLP provides native, neural (PyTorch) tokenization, POS tagging and dependency parsing for 53 languages based on UD v2—and a Python CoreNLP interface. PyPI: – pip install stanfordnlp
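A minimal usage sketch, following the package's documented quick start (the example sentence is my own):

import stanfordnlp

# Download the English models once, then build the default neural pipeline.
stanfordnlp.download('en')
nlp = stanfordnlp.Pipeline()

# Tokenize, tag, and parse, then print the first sentence's dependency tree.
doc = nlp("The Stanford NLP Group builds neural pipelines.")
doc.sentences[0].print_dependencies()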
We’re gearing up for the 2019 edition of Stanford CS224N: Natural Language Processing with Deep Learning. Starts Jan 8—over 500 students enrolled—using PyTorch—new Neural MT assignments—new lectures on transformers, subword models, and human language.
A 2023 update of the CS224N Natural Language Processing with Deep Learning YouTube playlist is now available with new lectures on pretrained models, prompting, RLHF, natural language and code generation, linguistics, interpretability and more. #NLProc
Delete, Retrieve, Generate: A simple approach to doing neural style transfer on text, altering text for sentiment or style—Juncen Li, Robin Jia, He He & @percyliang #NAACL2018
It will take you 24+ hours of Stanford's CS224N NLP with Deep Learning course to understand how GPT works.
There's a better way.
I turned Stanford's course into a chatbot that can answer the questions you need answers for.
Try it out here:
Announcing Stanza v1.0.0, the new packaging of our Python #NLProc library for many human languages (now including mainland Chinese), greatly improved and including NER. Documentation – GitHub – PyPI (or conda)
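Quick start per the Stanza documentation (the sentence is illustrative):

import stanza

# Download English models, then run tokenization, tagging, parsing, and NER.
stanza.download('en')
nlp = stanza.Pipeline('en')

doc = nlp("Chris Manning teaches at Stanford University.")
for ent in doc.ents:
    print(ent.text, ent.type)  # prints each recognized entity and its type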
“Google & DeepMind have hired 23 professors, Amazon 17, Microsoft 13, and Uber, Nvidia & Facebook 7 each. Tech companies disagree that they are plundering academia. A Google spokesman said the company was an enthusiastic supporter of academic research.”
“One thing I don’t like about the reporting around AI is that journalists seem to think the progress is happening in companies, and that’s not true. They are part of it, but a lot of the progress is continuing to happen in academia.”—Yoshua Bengio.
DPO (Direct Preference Optimization, ) now completely owns top-of-leaderboard medium-sized neural language models!
(More experimentation with IPO, KTO, PPO, etc. would be great! – as hf seems to be trying: )
Congratulations to @danqi_chen on completing her dissertation on Neural Reading Comprehension and Beyond. She starts next year as an Asst Prof at @PrincetonCS: . But first, vacation in Hawai’i. 🌺 Thesis: #NLProc
Want to learn Natural Language Processing with Deep Learning a.k.a. Artificial Neural Network methods? Stanford’s @SCPD_AI is launching an online professional version of our #cs224n course with customized video content and online course assistant support:
“[BERT] is the single biggest positive change we’ve had [to @Google search rankings] in the last five years,” @PanduNayak said. #nlproc Google Search Now Reads at a Higher Level | WIRED
Deep Learning, Language and Cognition: Video of an introductory talk on computational linguistics for a broad audience—from hand-written rules to modern neural net models—by Christopher Manning (@chrmanning) at IAS, Princeton. #NLProc
People usually get information from others in a multi-turn conversation. To approach this, we’ve released CoQA 🍃—A Conversational Question Answering Challenge by @sivareddyg • @danqi_chen • @chrmanning. 127K Qs—free-form answers—with evidence—multi-domain.
“Stanford has another fantastic NLP course which is also freely available online, taught by a world-renowned NLP researcher, academic, and author. The course is From Languages to Information (CS124), and it is taught by Dan @Jurafsky.”
Stanford Researcher @simran_s_arora develops a simple prompting strategy that enables open-source LLMs with 30x fewer parameters to exceed the few-shot performance of GPT3-175B - MarkTechPost
Geoff Hinton on importance of university research—“One worry is that the most fertile source of genuinely new ideas is graduate students being well advised in a university. They have the freedom to come up with genuinely new ideas—we need to preserve that”
It’s the origin of attention! @DBahdanau & @kchonyc couldn’t afford Google’s large multi-GPU neural MT, so they thought of a better way. Us neither: @lmthang & @chrmanning introduced simpler multiplicative attention. Then @GoogleAI folk wondered if attention is all you need…
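For the formula-minded, the scoring functions at issue, plus where the story ended up (a sketch in the papers' notation; s is the decoder state, h_i the encoder states):

% Additive attention (Bahdanau et al., 2014): score from a one-layer MLP
e_{ti} = v_a^\top \tanh(W_a s_{t-1} + U_a h_i)
% Multiplicative attention (Luong et al., 2015): a single bilinear form
e_{ti} = s_t^\top W_a h_i
% Both normalize scores and mix encoder states into a context vector
\alpha_{ti} = \operatorname{softmax}_i(e_{ti}), \qquad c_t = \sum_i \alpha_{ti} h_i
% Scaled dot-product attention (Vaswani et al., 2017)
\operatorname{Attention}(Q,K,V) = \operatorname{softmax}\!\left(QK^\top/\sqrt{d_k}\right)V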
I prefer to operate in “GPU-Poor” mode.
I don’t agree with the take from the SemiAnalysis piece. Creative breakthroughs often occur under constraints—new systems, models, and methods that can better take advantage of even larger-scale compute.
After a meteoric rise, DSPy is now the @stanfordnlp repository with the most GitHub stars. Big congratulations to @lateinteraction and his “team”. DSPy: Programming—not prompting—Foundation Models
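A taste of the programming-not-prompting idea, as a minimal sketch against the early DSPy API (the model name and the question are my assumptions, not from the tweet):

import dspy

# Point DSPy at a language model backend (model choice here is an assumption).
lm = dspy.OpenAI(model="gpt-3.5-turbo")
dspy.settings.configure(lm=lm)

# Declare *what* you want as a signature; DSPy builds and manages the prompt.
qa = dspy.Predict("question -> answer")
print(qa(question="Who teaches Stanford CS224N?").answer)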
10 Open-Sourced AI Datasets—SyncedReview 2018 In Review. 3/10 from @stanfordnlp, 4/10 from @Stanford. Open Images V4—MURA—BDD100K—SQuAD 2.0—CoQA—Spider 1.0—HotpotQA—Tencent ML Images—Tencent AI Lab Embedding Corpus for Chinese—fastMRI.
We still see lots of links to old releases of CS224N. Make sure you're getting the latest goodness (RLHF, prompting, transformers) from the 2023 release!
YouTube:
Website:
For-fee cohort-based online class:
ELECTRA’s Replaced Token Detection pre-training is not only more compute-efficient but gives a new best single-model result on the SQuAD v2 benchmark! 6 Mar 2020 ELECTRA: 88.716 EM, 91.365 F1. By @clark_kev @lmthang @quocleix @chrmanning
Some people say that no one reads PhD dissertations any more. But (literally!) thousands of people wanted a copy of @danqi_chen’s recent dissertation, Neural Reading Comprehension and Beyond. Piece by @feefifofannah #nlproc
Since 2016, SQuAD has been the key textual question answering benchmark, used by top AI groups & featured in the AI Index—Today @pranavrajpurkar, Robin Jia & @percyliang release SQuAD 2.0 with 50K unanswerable Qs to test understanding:
Stanford CS224N Natural Language Processing with Deep Learning is gearing up for its 2021 edition—starting Tue Jan 12, 4:30 Pacific for enrolled students. New lectures—transformers, LMs & KBs, new assignments—transformers, Choctaw NMT. #NLProc
We’ve released a new Visual Question Answering dataset to drive progress on real-image relational/compositional visual and linguistic understanding: GQA. Questions, answers, images, and semantics are available; it will be used as a track in the VQA Challenge 2019.
Rasa open source chatbot API: Enterprises & customers want AI assistants, not FAQ chatbots, but they’re difficult to build. Since Google demoed Duplex, every developer, product manager, and executive wants their own that can handle contextual conversations.
We’ve just released Stanford CoreNLP v4.0.0, a new version of our widely used Java #NLProc package, after a long gap! Some big changes—i.e., compatibility problems but great for the future. Lots of bug fixes, probably a few new bugs, adds French NER.
In case you haven’t heard, the new unit for measuring computation runtime is TPU core years. But, if you missed that memo, since the numbers are already in the hundreds, you may as well get ahead of the game and start quoting your runtimes in TPU core centuries. #NLProc
Natural Language Inference (NLI) over tables by @WilliamWangNLP et al. Tables are a ubiquitous but little-studied human information source stuck between text and structured data—though see semantic parsing work, e.g., by @IcePasupat.
@GoogleAI’s BERT unleashed a new performance level on SQuAD 2.0 QA—top 7 systems now all use it and are 2%+ above non-BERT systems. Scores equal summer 2017 SQuAD 1.0 scores. But top HIT/iFLYTEK lab AoA system is now >2% better than raw BERT. HT @KCrosner
How can computers answer questions with multi-step information needs? How can it be done efficiently and interpretably? @qi2peng2 and colleagues explain at #emnlp2019. Paper: Blog post: #NLProc
Not only is @huggingface now hosting all our 🪶Stanza models (via @github LFS)—more reliable than our old fileserver, thx!—but thanks to work by @mervenoyann, 🤗🙏 you can now try out models in the browser using their Hosted Inference API: . #NLProc
Is this the end of @Google / @DeepMind as leading presences at machine learning conferences? @JeffDean said: Things had to change. Google would take advantage of its own AI discoveries, sharing papers only after the lab work had been turned into products.
When and why does king - man + woman = queen? In my #ACL2019 paper with @DavidDuvenaud and Graeme Hirst, we explain what conditions need to be satisfied by a training corpus for word analogies to hold in a GloVe or skipgram embedding space. 1/4 blog:
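The analogy test itself is easy to reproduce with gensim's pretrained GloVe vectors (a minimal sketch; any of gensim's standard GloVe downloads will do):

import gensim.downloader as api

# Load pretrained GloVe vectors (downloads on first use).
vectors = api.load("glove-wiki-gigaword-100")

# king - man + woman ≈ ? (gensim excludes the query words from the results)
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))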
It’s the 10th anniversary of @stanfordnlp on @Twitter, and approximately the 20th anniversary of the Stanford NLP Group, and to celebrate … well, actually that’s all coincidental, but at any rate, we’ve got a new logo!!! By @digitalbryce. #NLProc
New #ACL2018 paper “Neural Factor Graph Models for Cross-lingual Morphological Tagging”: not just for morphology, but a powerful & interpretable tool for sequence labeling that integrates graphical models and neural networks! Code:
tf.keras in @TensorFlow 2.1 adds a TextVectorization layer to flexibly map raw strings to tokens/word pieces/ngrams/vocab. An image is just a matrix of numbers, but text always needs extra work, and it’s cleaner having preprocessing inside the model 👍
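A minimal sketch of the layer as it shipped in TF 2.1, where it lived under the experimental preprocessing namespace (corpus and parameter values are illustrative):

import tensorflow as tf
from tensorflow.keras.layers.experimental.preprocessing import TextVectorization

# Learn a vocabulary from a toy corpus, then map strings to padded int ids.
vectorizer = TextVectorization(max_tokens=1000, output_mode="int",
                               output_sequence_length=8)
corpus = tf.constant(["the cat sat on the mat", "the dog ate my homework"])
vectorizer.adapt(corpus)

# The layer can be the first layer of a model, so preprocessing ships with it.
print(vectorizer(tf.constant(["the cat ate the homework"])))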
What’s new in @Stanford CS224N Natural Language Processing with Deep Learning for 2019? Question answering—1D CNNs—subword models—contextual word representations—transformers—generation—bias. YT playlist – CS224N online hub #NLProc
🆕 We've released Stanza v1.2—our Python neural NLP toolkit for PoS tagging, parsing, and NER for dozens of human languages. UD 2.7 models, multi-document support, faster tokenization, race-condition fixes, and fixes for “data gap bugs” with tokenization & deps in many languages.
Cross-View Training—A semi-supervised learning technique by @clark_kev @lmthang Quoc Le @chrmanning at #emnlp2018. Allows training for your #NLProc task on large-scale unannotated data, not only using such data for task-agnostic word representations.
We’re very excited to kick off our 2021 Stanford NLP Seminar series with Ian Tenney (@iftenney) of Google Research presenting on “BERTology and Beyond”! Thursday 10am PT. Open to the public; non-Stanford people register at
Looking for a series to binge-watch with more depth? We are delighted to make available the latest CS224N: Natural Language Processing with Deep Learning. New content on transformers, pre-trained models, NLG, knowledge, and ethical considerations. #NLProc
You know how to do NLP. But do you consider fairness and ethical implications in your #NLProc research? Learn the latest on Socially Responsible NLP from Yulia Tsvetkov, Vinod Prabhakaran and @rfpvjr—Jun 1 afternoon @NAACLHLT tutorial.
GLoMo: Unsupervisedly Learned Relational Graphs as Transferable Representations—learning a dependency graph to do deep transfer learning. Jake Zhao: “Perhaps this can also be seen as encoding some relational inductive bias into the machinery” 🤔 HT @ylecun
Wonderful to see some theory behind the great success of self-supervised learning. Still trying to get our slow brains around how strong the results are. Cameo for the Stanford Sentiment Treebank—can it become the MNIST of #NLProc? By @jasondeanlee & al.
.@stanfordnlp people’s #ICLR2020 papers #1—@ukhndlwl and colleagues (incl. at @facebookai) show the power of neural nets learning a context similarity function for kNN in LM prediction—almost 3 PPL gain on WikiText-103—maybe most useful for domain transfer
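The core interpolation of that kNN-LM idea fits in a few lines (a schematic numpy sketch, not the authors' code; the datastore lookup is assumed to have already returned neighbor tokens and distances, and the mixing weight is a toy value):

import numpy as np

def knn_lm_interpolate(p_lm, neighbor_tokens, neighbor_dists,
                       vocab_size, lam=0.25, temp=1.0):
    """Mix the LM's next-token distribution with a kNN distribution built
    from retrieved (context, next-token) neighbors, kNN-LM style."""
    # kNN distribution: softmax over negative distances, aggregated per token.
    w = np.exp(-np.asarray(neighbor_dists) / temp)
    w /= w.sum()
    p_knn = np.zeros(vocab_size)
    for tok, wi in zip(neighbor_tokens, w):
        p_knn[tok] += wi
    return lam * p_knn + (1 - lam) * p_lm

# Toy call: 5-word vocab, two retrieved neighbors both pointing at token 3.
print(knn_lm_interpolate(np.full(5, 0.2), [3, 3], [0.1, 0.4], vocab_size=5))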
.@stanfordnlp people’s #ICLR2020 papers #2—ELECTRA: @clark_kev and colleagues (incl. at @GoogleAI) show how to build a much more compute/energy-efficient discriminative pre-trainer for text encoding than BERT etc. using instead replaced token detection
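In outline, replaced token detection trains a small generator by masked LM and a discriminator that labels every token as original or replaced. A schematic PyTorch sketch under toy assumptions (real ELECTRA uses transformers, ties embeddings, and adds the generator's MLM loss, all omitted here):

import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM, BATCH, SEQ = 100, 32, 4, 16

# Toy stand-ins for the generator and discriminator networks.
generator = nn.Sequential(nn.Embedding(VOCAB, DIM), nn.Linear(DIM, VOCAB))
disc_embed = nn.Embedding(VOCAB, DIM)
disc_head = nn.Linear(DIM, 1)

tokens = torch.randint(0, VOCAB, (BATCH, SEQ))
mask = torch.rand(BATCH, SEQ) < 0.15  # positions to corrupt

# Generator proposes plausible replacements at masked positions (MLM-style).
gen_logits = generator(tokens)
sampled = torch.distributions.Categorical(logits=gen_logits).sample()
corrupted = torch.where(mask, sampled, tokens)

# Discriminator labels every token: original (0) or replaced (1). A sampled
# token that happens to equal the original counts as original, as in the paper.
disc_logits = disc_head(disc_embed(corrupted)).squeeze(-1)
is_replaced = (corrupted != tokens).float()
loss = F.binary_cross_entropy_with_logits(disc_logits, is_replaced)
print(loss.item())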
A new winner on the @huggingface Open LLM Leaderboard at the end of December … combining the goodness of SOLAR-10.7B and Direct Preference Optimization (DPO)
“The recent phenomenal success of language models has reinvigorated machine learning research…. One problem class that has remained relatively elusive however is purposeful adaptive behavior…. we show that it can be resolved by treating actions as causal interventions.”
Shaking the foundations: delusions in sequence models for interaction and control. I learned so much from Pedro Ortega in this thought-provoking AI project. Great way to spend time with a friend at a London pub.
The need for a quality site recording the state-of-the-art performance on many AI/ML/NLP/Vision tasks has been obvious for a decade—this one looks great and might actually achieve escape velocity!
We’ve just released the new Papers With Code! The site now has 950+ ML tasks, 500+ evaluation tables (including state-of-the-art results) and 8500+ papers with code. Explore the resource here: . Have fun!
At #acl2020nlp, @mhahn29 presents TACL paper Theoretical Limitations of Self-Attention in Neural Sequence Models: Transformer models seem all-powerful in #NLProc but they can’t even handle all regular languages—what does this say about human language?
Is your Vision-and-Language model really a Vision-AND-Language model? 👀 “Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal Transformers” 📄 🗣️ #EMNLP2021
🆕🪶Stanza 1.3 for Python #NLProc is out with a new language ID component and multilingual pipelines, a new transition-based constituency parser, a dictionary tokenizer feature esp. useful for East Asian languages, and model downloading from @huggingface.
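The multilingual pipeline runs language ID and routes each document to the matching models (a minimal sketch; the interface follows the Stanza docs for this release, and the example sentences are my own):

from stanza.pipeline.multilingual import MultilingualPipeline

# Language ID picks the model set per document, downloading models on demand.
nlp = MultilingualPipeline()
docs = nlp(["This is an English sentence.", "C'est une phrase française."])
for doc in docs:
    print(doc.lang, doc.text)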
Sharp Nearby, Fuzzy Far Away: An LSTM Neural Language Model remembers out to about 200 words, remembering word order for about 50 words & more results…. At #ACL2017 by @ukhndlwl, @hhexiy, Peng Qi & @jurafsky. #NLProc
“He and her team, which included Nanyun Peng and Percy Liang, tried to give their AI some creative wit, using insights from humor theory.” @hhexiy @percyliang. The Comedian Is in the Machine. AI Is Now Learning Puns | WIRED
Nice article on methods for using distributed representations to capture graph structure in @gradientpub, a new, accessible magazine by @Stanford AI students. The first methods drew from #NLProc but maybe with new GCN methods, we’re borrowing back.
Three goals of human-centered AI: capturing the breadth & nuance of human intelligence, AI that enhances & collaborates with humans, and guiding the effects of AI on human society. By @drfeifei
Just out in PNAS: A paper examining the emergent linguistic structure learned by artificial neural networks, such as BERT, trained by cloze task (word-in-context prediction) self-supervision. By @chrmanning, @clark_kev, @johnhewtt, @ukhndlwl, @omerlevy_.
TIL: @NVIDIAAI has an efficient, extensively benchmarked, and well-maintained version of our ELECTRA model—an efficient BERT-equivalent large pre-trained language model—for TF2, which exploits NVIDIA tensor cores, mixed-precision training, etc. #NLProc
An easy, bad PyTorch coding mistake: if you do Dataset preprocessing in the __getitem__ method using np.random & use a DataLoader with multiple workers, then each worker process inherits the same NumPy seed! Instead, set the seed in DataLoader’s worker_init_fn. HT @peteskomoroch
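A minimal sketch of the bug and the fix (the dataset and the seeding scheme are illustrative, not from the tweet):

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class NoisyDataset(Dataset):
    """Toy dataset whose __getitem__ draws random noise via np.random."""
    def __len__(self):
        return 8
    def __getitem__(self, idx):
        # Without per-worker seeding, every worker returns the same "noise".
        return np.random.rand()

def worker_init_fn(worker_id):
    # Derive a distinct NumPy seed per worker from PyTorch's per-worker seed.
    np.random.seed((torch.initial_seed() + worker_id) % 2**32)

if __name__ == "__main__":
    loader = DataLoader(NoisyDataset(), num_workers=2,
                        worker_init_fn=worker_init_fn)
    for batch in loader:
        print(batch)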
We’ve just released Stanza v1.1.1, our #NLProc package for many human languages. It adds sentiment analysis, medical English parsing & NER, more customizability of Processors, faster tokenizers, a new Thai tokenizer, bug fixes, etc.—try it out!
“the researchers [that’s us!] show that the cross-entropy loss for fitting the reward model in RLHF can be used directly to finetune the LLM. In benchmarks it's more efficient to use DPO and often also preferred over RLHF/PPO in terms of response quality.”
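The objective being described is compact enough to show (a minimal PyTorch sketch of the DPO loss from the paper, assuming you already have summed log-probs of the chosen and rejected responses under the policy and the frozen reference model; the numbers in the usage call are made up):

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO: a logistic loss on the difference of implicit rewards
    beta * log(pi/pi_ref) for chosen vs. rejected responses."""
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # -log sigmoid(chosen - rejected) pushes the policy toward chosen responses.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Toy usage: a batch of two preference pairs with made-up log-probabilities.
loss = dpo_loss(torch.tensor([-12.0, -9.5]), torch.tensor([-14.0, -9.0]),
                torch.tensor([-13.0, -10.0]), torch.tensor([-13.5, -9.8]))
print(loss.item())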