Will Merrill Profile Banner
Will Merrill Profile
Will Merrill

@lambdaviking

2,157
Followers
583
Following
677
Media
1,276
Statuses

Ph.D. student @ NYU🗽 Theoretical aspects of NLP and LMs /nætʃɹəl/🇮🇸 + formal🤵 languages + TCS🧮

New York, NY
Joined October 2011
@lambdaviking
Will Merrill
2 months
✨Excited to finally drop our new paper: SSMs “look like” RNNs, but we show their statefulness is an illusion🪄🐇 Current SSMs cannot express basic state tracking, but a minimal change fixes this! 👀 w/ @jowenpetty , @Ashish_S_AI
23
208
1K
@lambdaviking
Will Merrill
1 year
📣 @Ashish_S_AI and I prove that transformers can be translated to sentences in first-order logic with majority-vote quantifiers (FOM). FOM is a symbolic language that can capture computation inside transformers!
12
95
490
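[Editor's note] To give a flavor of FOM, here are two toy sentences. The notation is an assumption loosely following the first-order-logic-with-majority literature, not necessarily the paper's exact syntax: Maj i. φ(i) holds when φ is true at more than half of the positions i, and Q_a(i) means position i carries token a.

```latex
% Toy FOM sentences (notation illustrative, a sketch).
% "More than half of the tokens are a":
\[ \mathrm{Maj}\, i.\; Q_a(i) \]
% Ordinary quantifiers mix in, e.g. "every b is eventually followed by an a":
\[ \forall i.\; \bigl( Q_b(i) \rightarrow \exists j.\; ( j > i \,\wedge\, Q_a(j) ) \bigr) \]
```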
@lambdaviking
Will Merrill
2 years
[1/6] Excited to share a year-long project re: theory of language understanding in LMs w/ @a_stadt , @tallinzen. TLDR: Judging entailments (NLI) can be reduced to LMing over "Gricean data"* ∴ Learning distribution (perfectly) => learning semantics
2
37
270
@lambdaviking
Will Merrill
3 years
Is it possible for GPT-n to "understand" the semantics of English? What about Python? I'm excited to finally share work formalizing this question! We give formal languages that are *provably* un-understandable by LMs (within our setup, at least)
8
51
233
@lambdaviking
Will Merrill
2 years
How do we understand logical reasoning in non-symbolic models like transformers? 📣New preprint w/ Ashish Sabharwal shows any transformer can be translated to a fixed-size first-order logic formula (with majority quantifiers)
2
34
195
@lambdaviking
Will Merrill
3 months
📢 Preprint: We can predict entailment relations from LM sentence co-occurrence prob. scores. These results suggest predicting sentence co-occurrence may be one way that next-word prediction leads to (partial) semantic representations in LMs🧵
5
27
166
@lambdaviking
Will Merrill
3 years
What inductive biases does training impose on transformers? We find that T5, RoBERTa, etc. are well-approximated by saturated transformers (simplified attention patterns), and explain how this arises during training. w/ @RamanujanVivek @yoavgo @royschwartzNLP @nlpnoah
1
32
165
@lambdaviking
Will Merrill
6 months
🐊📣Still @ NeurIPS? Come by our poster to hear about how chain of thought/scratchpad steps increase the computational power of transformers. Room 242, 4pm (M3L workshop)
1
27
157
@lambdaviking
Will Merrill
1 year
[1/n]📢 More work on the *computational model of transformers* w/ Ashish Sabharwal in TACL. Takeaway: transformers are limited to expressing highly parallelizable functions (formally, they are in the complexity class uniform TC0)
5
36
153
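[Editor's note] For readers outside circuit complexity, this is the standard background definition (not new to the paper): TC0 is the class of problems decided by constant-depth, polynomial-size circuit families with unbounded fan-in threshold gates.

```latex
% Standard definition, paraphrased: L is in TC^0 iff some circuit family
% {C_n} decides it with
\[
  \mathrm{depth}(C_n) = O(1), \qquad \mathrm{size}(C_n) = n^{O(1)},
\]
% where gates are unbounded fan-in AND, OR, NOT, and MAJORITY.
% "Uniform" additionally requires the map n -> C_n to be computable by a
% weak device (e.g., a logspace Turing machine), ruling out hard-coded
% advice in the circuit wiring.
```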
@lambdaviking
Will Merrill
8 months
[1/n] How does a chain of thought change the expressive power of transformers? New work w/ @Ashish_S_AI studies how adding CoT/decoding steps extends the problems solvable by transformers as a fn of the # of steps.
2
27
130
@lambdaviking
Will Merrill
3 years
oh btw, I'm excited to be joining @NYUDataScience as a PhD student this fall!
@chunkbardey
charlie
3 years
if u go to grad school that’s weird like most ppl don’t do that
71
1K
29K
13
2
124
@lambdaviking
Will Merrill
2 years
Curious about the circuits inside transformers? 🧐 📢 Our new work shows how (saturated) transformers can be simulated by *threshold circuits*. Equivalently, this bounds the problems saturated transformers can solve within the class TC0. w/ Ashish Sabharwal, @nlpnoah
1
19
122
@lambdaviking
Will Merrill
5 months
“The Expressive Power of Transformers with Chain of Thought” w/ @Ashish_S_AI will appear at ICLR 🇦🇹
@srush_nlp
Sasha Rush
7 months
Props to Will Merrill @lambdaviking for having already fully formalized my nonsense thoughts (also for generally writing extremely interesting papers)
1
17
144
6
14
97
@lambdaviking
Will Merrill
4 months
If you teach NLP, please keep teaching automata 👇
@dottxtai
.txt
4 months
⚡️ Speed up LLM inference by 5x. ⚡️ We introduce a new framework, coalescence, that makes structured generation several times faster than standard generation. Coalescence is very flexible, and raises unexpected questions 🧐
6
78
333
3
7
91
@lambdaviking
Will Merrill
5 years
I wrote a blog post summarizing Sequential Neural Networks as Automata
1
17
82
@lambdaviking
Will Merrill
1 month
Stop by "The Expressive Power of Chain of Thought" poster tomorrow! Wednesday 10:45am #294
1
15
78
@lambdaviking
Will Merrill
6 months
🐊📣 Stop by tomorrow to hear from @Ashish_S_AI and me about: 1) how transformers can be expressed in logic, and 2) what this means about what transformers *can't* do. Thursday @ 5pm, #1008
0
10
68
@lambdaviking
Will Merrill
7 months
Our NeurIPS paper shows transformers can be expressed in first-order logic with majority quantifiers => Extends the theory of the expressiveness and limitations of transformers => Provides a "programming language" capturing transformers' computation
@lambdaviking
Will Merrill
1 year
📣 @Ashish_S_AI and I prove that transformers can be translated to sentences in first-order logic with majority-vote quantifiers (FOM). FOM is a symbolic language that can capture computation inside transformers!
12
95
490
0
16
67
@lambdaviking
Will Merrill
1 year
This result acts as an upper bound: transformers can't solve problems that can't be defined in FOM. This reveals some new problems transformers can't solve and provides a handy intuitive test for seeing whether a transformer can do something: try to define the problem in FOM.
1
3
61
@lambdaviking
Will Merrill
2 months
Finally, we were excited to find a minimal change to SSMs that improves their expressive power for state tracking: make the A matrix input-dependent. Empirically, this allows them to learn hard state tracking just as well as RNNs!
9
6
57
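[Editor's note] A minimal numpy sketch of the change described in the tweet above. The names, shapes, and the linear parameterization W of A(x) are illustrative assumptions, not the paper's implementation: a standard linear SSM applies a fixed transition h_t = A h_{t-1} + B x_t, while the fix lets the transition matrix depend on the current input.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_state = 3, 4
B = rng.normal(size=(d_state, d_in))
A_fixed = 0.5 * rng.normal(size=(d_state, d_state))
W = rng.normal(size=(d_state * d_state, d_in))  # hypothetical parameterization of A(x_t)

def ssm_fixed(xs):
    """Standard linear SSM: h_t = A h_{t-1} + B x_t, with A constant."""
    h = np.zeros(d_state)
    for x in xs:
        h = A_fixed @ h + B @ x
    return h

def ssm_input_dependent(xs):
    """The minimal change: h_t = A(x_t) h_{t-1} + B x_t, so successive
    state updates compose non-commutatively as a function of the input."""
    h = np.zeros(d_state)
    for x in xs:
        A_x = (W @ x).reshape(d_state, d_state)  # input-dependent transition
        h = A_x @ h + B @ x
    return h

xs = rng.normal(size=(10, d_in))
print(ssm_fixed(xs))
print(ssm_input_dependent(xs))
```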
@lambdaviking
Will Merrill
12 days
Love to see more theoretical work comparing the formal capabilities of SSMs and transformers (within TC0)!
@yashYRS
Yash Sarrof
13 days
We are excited to share our work on characterizing the expressivity of State Space Models (SSMs) with a theoretical lens, using a formal language framework, backed up by empirical findings. w/ Yana Veitsman, Dr. Michael Hahn. Paper link:
2
16
106
3
8
54
@lambdaviking
Will Merrill
2 months
We draw on theory to formalize hard state tracking problems and formally prove that SSMs, like transformers, cannot solve them. Empirically, SSMs and transformers struggle to learn hard state tracking, but RNNs learn it easily. We also propose a minimal fix.
1
0
46
@lambdaviking
Will Merrill
2 months
Thanks to the FLaNN Discord for recently inviting us to talk about this work. Recording: Also, stop by my poster at New England NLP next week!
1
4
45
@lambdaviking
Will Merrill
7 months
This is a good intuition to impart! There is no true “state” in a transformer (unlike DFA/RNN) and the # of state updates is bounded by the depth. We discuss this more formally here (+implications for what sequential stuff transformers can’t do):
@PreetumNakkiran
Preetum Nakkiran
7 months
One intuition for why Transformers have difficulty learning sequential tasks (eg parity), w/o scratchpad, is that they can only update their “internal state” in very restricted ways (as defined by Attention). In contrast to e.g RNNs, which can do essentially arbitrary updates.
2
16
138
0
7
43
@lambdaviking
Will Merrill
1 month
Was fun working on this! The cool takeaway imo is that we can characterize the type of reasoning that blank tokens can help with… it’s reduced compared to CoT but experiments show it’s likely more than with no extra tokens
@jacob_pfau
Jacob Pfau
1 month
Do models need to reason in words to benefit from chain-of-thought tokens? In our experiments, the answer is no! Models can perform on par with CoT using repeated '...' filler tokens. This raises alignment concerns: Using filler, LMs can do hidden reasoning not visible in CoT🧵
41
182
1K
2
4
40
@lambdaviking
Will Merrill
2 months
Inspired by formal language theory, we view state tracking as iterated multiplication in a monoid (whose elements represent state updates). The algebraic structure of the monoid determines the complexity of tracking state in a parallel computational model like a transformer.
1
2
40
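[Editor's note] A concrete instance of the monoid view in the tweet above, as a Python sketch (the setup and helper names are illustrative; the S5 encoding itself is standard): each input token is a permutation of 5 elements, and the state after t tokens is the product of the first t updates.

```python
from itertools import permutations
import random

# Elements of S5: permutations of (0,1,2,3,4); composition is the monoid op.
S5 = list(permutations(range(5)))

def compose(p, q):
    """Monoid operation: the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(5))

IDENTITY = tuple(range(5))

def track_state(updates):
    """State tracking as iterated multiplication: fold the updates
    left-to-right, exactly the one-token-at-a-time computation an RNN can do."""
    state = IDENTITY
    for u in updates:
        state = compose(u, state)
    return state

random.seed(0)
seq = [random.choice(S5) for _ in range(100)]
print(track_state(seq))  # the net permutation after 100 updates
```

Because S5 is non-solvable, this left-to-right fold provably cannot be flattened into constant depth (its word problem is NC1-complete), which is exactly the obstruction for TC0-bounded parallel models.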
@lambdaviking
Will Merrill
2 months
Our main theoretical result✨ is that linear state-space models (e.g., S4) and Mamba can only express computation in TC0. This means they cannot (exactly) solve hard state tracking problems like permutation composition, code eval, or chess!
1
4
40
@lambdaviking
Will Merrill
2 months
But can SSMs and transformers approximate state tracking in practice, despite our worst-case result? We show🧪 that they cannot on permutation composition: both transformers and SSMs require depth growing with sequence length, whereas RNNs need just 1 layer.
1
3
39
@lambdaviking
Will Merrill
2 months
In summary, we used theory to pin down “hard state tracking” and showed it poses a problem for current SSMs and transformers. Thus, the state in SSMs is an illusion! We proposed an SSM extension to overcome this and are eager to evaluate its practical viability.
1
0
37
@lambdaviking
Will Merrill
2 months
A canonical example of hard state tracking is *permutation composition*, or S5 (cf. Galois). We show “real” state tracking problems (code eval, chess with <source,target> notation) can be reduced to permutation composition. We thus use it to benchmark hard state tracking.
1
3
35
@lambdaviking
Will Merrill
1 year
Another contribution: we prove transformers need > log log n precision to have full expressive power over contexts of length n. With less precision, they cannot implement uniform attention! We hope this result can guide the development of long-context, low-precision LMs.
1
1
32
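[Editor's note] A back-of-the-envelope version of the bound in the tweet above, as a sketch under a simplifying assumption about floating point (that a p-bit float's exponent can reach about 2^p); the paper's argument is more careful.

```latex
% Smallest positive representable magnitude with p-bit floats (assumed):
\[ \epsilon_p \approx 2^{-2^{p}}. \]
% Uniform attention over n positions must represent the weight 1/n:
\[ \frac{1}{n} \;\ge\; \epsilon_p
   \iff 2^{p} \ge \log_2 n
   \iff p \ge \log_2 \log_2 n. \]
% With fewer than about log log n bits, the weight 1/n underflows to 0,
% so uniform attention cannot be implemented.
```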
@lambdaviking
Will Merrill
2 months
Some state tracking problems are known to be fully parallelizable (TC0) while others are inherently sequential (NC1-complete), requiring computation graphs of growing depth. The latter are easy for RNNs but can’t be represented by fixed-depth transformers.
2
1
31
@lambdaviking
Will Merrill
1 year
We also see natural implications of this work for understanding algorithms inside transformers ("mechanistic interpretability"). RASP/TRACR are languages for *compiling into* transformers. In contrast, any transformer can be *translated* to an FOM sentence.
1
0
30
@lambdaviking
Will Merrill
2 years
I'll be arriving in Abu Dhabi for EMNLP tomorrow! Would love to chat about formal semantics and LMs, expressive capacity/inductive biases of NNs/transformers, compositionality, or anything in between! Will respond to emails, Twitter DMs, carrier pigeons, etc.
4
0
29
@lambdaviking
Will Merrill
1 month
I’ll be at ICLR next week! 🇦🇹 Reach out if you’d like to talk about transformers and state-space models, training dynamics, etc.
2
0
25
@lambdaviking
Will Merrill
2 years
In Zürich Airport, ears perked for something cross-serial👂
3
0
24
@lambdaviking
Will Merrill
1 year
@_jasonwei @OpenAI I'm not sure what you mean by compositionality, but this example is a clear failure on the level of basic Shakespearean grammar. The subject/verb agreement is realllllly off (Should be: thou seekest, I know, shall guide)
0
0
24
@lambdaviking
Will Merrill
3 years
Our paper on the form/meaning debate has been updated thanks to discussions & outside input! v2 better reflects how understanding can be hard for an *LM*, but easy for a human. Thanks to Mark-Jan Nederhof and many others who shared their thoughts.
1
2
24
@lambdaviking
Will Merrill
1 year
RIP Drago. In the classes I took and TA'ed with him, Drago was a passionate, kind, and funny teacher and mentor. Gone too soon indeed.
@hmkyale
Harlan Krumholz
1 year
The #AI community, the #computerscience community, the @YaleSEAS community, and humanity have suddenly lost a remarkable person, @dragomir_radev - kind and brilliant, devoted to his family and friends... gone too soon. A sad day @Yale @YINSedge @YaleCompsci #NLP2023
41
87
389
0
3
24
@lambdaviking
Will Merrill
2 years
Today I learned: the context-sensitive languages are closed under complement. Via a fun, indirect argument that takes us away from the Chomsky hierarchy and into computational complexity (1/n)
1
2
23
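[Editor's note] The indirect argument being teased above is presumably the classical one, paraphrased here for reference: context-sensitive languages coincide with nondeterministic linear space (Kuroda), and nondeterministic space is closed under complement (Immerman–Szelepcsényi).

```latex
% Step 1 (Kuroda): CSLs are exactly the languages of linear-bounded automata:
\[ \mathsf{CSL} = \mathsf{NSPACE}(n). \]
% Step 2 (Immerman–Szelepcsényi): for any s(n) >= log n,
\[ \mathsf{NSPACE}(s(n)) = \mathsf{co\mbox{-}NSPACE}(s(n)). \]
% Hence the complement of any context-sensitive language is again in
% NSPACE(n) = CSL.
```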
@lambdaviking
Will Merrill
1 year
@katiedimartin Yes, how many languages do you speak?
1
0
22
@lambdaviking
Will Merrill
2 years
@_jasonwei So emergent phenomena are an empirical fact (and an interesting one, I agree). But it's a big jump to assume arbitrarily complex (~human-level) reasoning can emerge in transformers. Provably, transformers can only express shallow reasoning:
2
0
21
@lambdaviking
Will Merrill
2 months
Also see here for more details on the algebra behind the paper:
@jowenpetty
jackson petty
2 months
How does Galois theory help show that the state-tracking capabilities of current (!) SSMs are illusions? What makes S5 & A5 “hard”? And why do we consider A5 & friends here instead of S5? A thread on the algebra behind our paper!
2
23
130
1
0
22
@lambdaviking
Will Merrill
7 months
+1, the limitations of LMs workshop was fun and timely. Thanks to the organizers and other speakers! I spoke about complexity-theoretic limitations of transformers (vid will appear eventually). No photo of me in Bielefeld, but I did get a pic of another William
@LAWeissweiler
Leonie Weissweiler
7 months
I had a great time yesterday speaking about testing the limits of LLMs with Construction Grammar at a workshop on LLM limitations organised by @SAIL_network ! Thanks again to Özge Alacam, @bpaassen1 , and @MichielStraat for inviting me, and @lambdaviking for the fun company!
0
4
32
1
1
20
@lambdaviking
Will Merrill
1 year
[2/n] This implies a list of problems transformers cannot solve (under assumptions in footnotes):
4
1
19
@lambdaviking
Will Merrill
2 years
@yahave Semantic parsing is all you need
0
3
18
@lambdaviking
Will Merrill
11 months
Check out this poster if you’re interested in theoretical insights on the reasoning power and limitations of transformers! 👀
@davidweichiang
David Chiang
11 months
Our poster on "Tighter Bounds on the Expressivity of Transformer Encoders" has been rescheduled to Wednesday at 11am! Exhibit Hall 1 number 228 #ICML2023
1
1
26
0
4
18
@lambdaviking
Will Merrill
26 days
Forget GPT-4o, I'm just waiting for Chicha San Chen NYC to open😔
1
0
17
@lambdaviking
Will Merrill
7 months
Historical context is hard to get without a lot of experience or clear exposition (like this), but it can provide a broader perspective beyond the daily arXiv buzz. Also glad to see mention of the Stupid Backoff paper about "large language models" c. 2007:
@nsaphra
Naomi Saphra
7 months
It's not the first time! A dream team of @enfleisig (human eval expert), Adam Lopez (remembers the Stat MT era), @kchonyc (helped end it), and me (pun in title) are here to teach you the history of scale crises and what lessons we can take from them. 🧵
8
63
333
0
2
16
@lambdaviking
Will Merrill
2 years
Thanks to Meryl for covering our recent work on semantics and language models on the CDS blog! The paper proves entailment prediction can be reduced to language modeling, and shows how to extract entailment from an “ideal” LM. Check out the blog to learn more!
@NYUDataScience
NYU Data Science
2 years
Can language models learn meaning just by observing text? CDS PhD student William Merrill ( @lambdaviking ) and CDS Assistant Professor of Linguistics and Data Science Tal Linzen ( @tallinzen ) explore the question in a recent study. Read about it on our blog!
1
4
41
0
2
16
@lambdaviking
Will Merrill
2 years
Interested in foundational questions about the computational/linguistic abilities of neural nets? Check out our website/join our weekly remote talk series 🍮
@sleynas
Lena Strobl
2 years
FLaNN is online! 🍮 We organize weekly online seminars on Formal Languages 🤵 and Neural Networks 🧠 and related things. ✨ Visit our website to find out more! 🧑‍💻
1
13
46
1
1
15
@lambdaviking
Will Merrill
8 months
Thanks to Stephen for a great overview of our recent work on the reasoning limitations of transformers!
@NYUDataScience
NYU Data Science
8 months
In a recent #NeurIPS -accepted paper, CDS PhD student William Merrill ( @lambdaviking ), with @Ashish_S_AI at AI2, reveals the hidden limitations of transformer LLMs like #ChatGPT and how to detect their "hallucinations." #datascience #hallucinations
0
1
9
0
1
14
@lambdaviking
Will Merrill
6 months
I'll be at NeurIPS next week! Looking forward to chatting about the computational power + limitations of transformers, as well as other fundamental questions about LMs. Reach out if you'd like to chat! DMs open
1
0
14
@lambdaviking
Will Merrill
1 year
The coolest thing about GPT4 is I now have something to practice my broken Icelandic with
1
0
14
@lambdaviking
Will Merrill
8 months
Took a look today and this is very interesting stuff! The lower-bound direction is particularly cool: showing how LTL and counter-free automata can be simulated in a transformer through B-RASP.
@davidweichiang
David Chiang
8 months
New preprint! Dana Angluin, I, and Andy Yang @pentagonalize show that masked hard-attention transformers are exactly equivalent to the star-free regular languages.
2
16
74
1
2
13
@lambdaviking
Will Merrill
4 years
@tallinzen Not exactly what you're asking, but here's a dataset
1
1
12
@lambdaviking
Will Merrill
2 years
This result also solidifies the idea that (fixed-precision) transformer computation is "shallow": it can only nest a finite number of quantifiers (wrt input length), rather than recursing arbitrarily deep like a Turing machine.
2
3
12
@lambdaviking
Will Merrill
1 month
@egrefen @sleepinyourhat +1 the general sentiment that nothing mystical is happening in our paper: our choice of task is strongly motivated by theory + intuition about what synthetic tasks filler tokens could help on
1
0
11
@lambdaviking
Will Merrill
1 month
@jowenpetty Covered in syntax 1?
1
0
5
@lambdaviking
Will Merrill
1 year
Any Merrills interested in a replication study?
@kchonyc
Kyunghyun Cho
1 year
it took us two months to have this preprint archived... can you guess why? a fun project led by Won Ik Cho and Eunjung Cho! [Cho, Cho & Cho, 2023]
7
10
122
1
0
12
@lambdaviking
Will Merrill
6 months
@zouharvi (and it may still fail even then; Geiger et al. 2019) Or consider getting rid of the outer parentheses altogether
1
0
11
@lambdaviking
Will Merrill
1 year
[4/n] Our result suggests a *Parallelism Tradeoff*: parallelism makes transformers scalable but limits the complexity of their forward pass. Fundamentally serial computation must be broken down into a "chain" of parallelizable steps à la Scratchpad/CoT
1
1
12
@lambdaviking
Will Merrill
2 years
Second, formal analysis of transformers, showing limits on the functions they can express (w/ Ashish Sabharwal, @nlpnoah ):
0
3
11
@lambdaviking
Will Merrill
1 month
@Ashish_S_AI Also always excited to talk about state-space models and state tracking (Accepted at ICML 🇦🇹 w/ @jowenpetty @Ashish_S_AI )
1
1
11
@lambdaviking
Will Merrill
2 months
@srush_nlp Agree with the post that there is a distinction between (and often an implicit conflation of) behavioral and mechanistic induction heads. Having a behavioral definition seems more natural to me, followed by specific computational implementations of that def (eg on a transformer) 🧵
1
2
11
@lambdaviking
Will Merrill
1 year
The whole "SAT solver" thing seemed cool too but then I realized it was the boring kind of SAT
2
0
11
@lambdaviking
Will Merrill
2 months
@srush_nlp @jefrankle Sounds bad for rent
0
0
9
@lambdaviking
Will Merrill
2 years
[2/6] Specifically, the following relationship holds between text frequency and sentence entailment:
1
0
11
@lambdaviking
Will Merrill
2 years
* Gricean speaker = speaker who attempts to convey information efficiently to a listener. Think rational speech acts. This is a decent first-order model of human speech acts, but it would be interesting to see how extending it changes the theory!
1
1
10
@lambdaviking
Will Merrill
1 year
Does anyone have references for understanding the scaling of model size (# params) vs. context size (# tokens) for large language models? Are there standard/"optimal" ways to scale these in tandem? Or is context size bottlenecked by memory, etc. in practice, not scaling laws?
1
0
9
@lambdaviking
Will Merrill
1 year
@UnderwaterBepis @Ashish_S_AI This applies to full transformers, not just attention layers
0
0
10
@lambdaviking
Will Merrill
3 months
This looks super cool, will need to give it a careful read! Seems like an interesting implication of the softmax bottleneck, which @mattf1n is an expert on
@mattf1n
Matthew Finlayson
3 months
Wanna know gpt-3.5-turbo's embed size? We find a way to extract info from LLM APIs and estimate gpt-3.5-turbo’s embed size to be 4096. With the same trick we also develop 25x faster logprob extraction, audits for LLM APIs, and more! 📄 Here’s how 1/🧵
6
79
363
0
1
10
@lambdaviking
Will Merrill
6 months
Fruits of EMNLP is back at it again
@WeRateFruits
Fruit Guy
6 months
First durian in Singapore! Did I like it?......I'm still deciding.
2
4
22
0
0
10
@lambdaviking
Will Merrill
11 months
@CFGeek @sir_deenicus I’ve worked a lot on analyzing non-seq2seq transformers using circuit complexity. With realistic precision they are in TC0 - far from Turing complete! We’re currently working on extending the analysis to seq2seq transformers, which sounds a bit like what you’re suggesting
2
0
10
@lambdaviking
Will Merrill
1 year
@yoavgo Similar to “eigen-simulacra”, it’s misleading and just wrong to call conditional probabilities “amplitudes” (clear place where peer review would help). Ignoring the quantum fanboyism though, “simulator theory” reminds me of our model of LMs here:
1
0
10
@lambdaviking
Will Merrill
2 years
Takeaways:
1. An intuitive characterization of attention as majority quantification
2. Mechanistic interpretability: extracting and debugging the logical structure of a model
3. Efficiency: converting models to a format that is easier to work with at the hardware level
1
1
9
@lambdaviking
Will Merrill
7 months
@srush_nlp Thanks! Also recommend this concurrent related work:
0
0
9
@lambdaviking
Will Merrill
2 months
@srush_nlp ah, hadn't seen this post. I have some thoughts but am about to give a talk so will respond later today!
1
0
9
@lambdaviking
Will Merrill
6 months
@xuanalogue From an expressiveness point of view, one layer is basically a weighted finite automaton, which can express things like counting similar to LSTMs (requires log n precision in the state)
1
0
8
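[Editor's note] A toy illustration of the counting point in the reply above, as a sketch with an assumed 2-state encoding (not any particular paper's construction): the automaton's state is the vector (1, count), and a triangular per-symbol transition matrix accumulates the count.

```python
import numpy as np

# 2-state weighted automaton over {a, b}: state = (1, count_so_far).
# The 'a' matrix increments the counter; 'b' leaves it unchanged.
T = {
    "a": np.array([[1, 0],
                   [1, 1]]),   # (1, c) -> (1, c + 1)
    "b": np.eye(2, dtype=int), # (1, c) -> (1, c)
}

def run_wfa(string):
    state = np.array([1, 0])
    for ch in string:
        state = T[ch] @ state
    return state[1]  # the accumulated count of 'a's

assert run_wfa("abaab") == 3
```

After n steps the counter can reach n, so storing the state exactly requires about log n bits, matching the precision caveat in the reply.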
@lambdaviking
Will Merrill
2 years
[4/6] Thm2 can be taken to justify the Distributional Hypothesis. Text frequency (form) and meaning aren't orthogonal. Linguistic theory predicts: Learning distribution (perfectly) => learning semantics
1
1
9
@lambdaviking
Will Merrill
1 year
@generatorman_ai @Ashish_S_AI We mean bits per activation (similar but not quite the same thing). In other words, the precision used to carry out addition/multiplication
2
0
9
@lambdaviking
Will Merrill
2 months
@srush_nlp How to define the induction head behaviorally? It's something like: given `ab...a`, predict `b`. But this definition is underspecified in two ways: 1. b-underspec: a could occur many times with different b's 2. a-underspec: there are different suffix options for a
1
0
8
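[Editor's note] One way to pin down the behavioral definition from the tweet above in code, as a sketch; the two tie-breaking choices marked in the comments are exactly the underspecified points being discussed.

```python
def induction_head(tokens):
    """Behavioral induction head: given ...a b ... a, predict b.
    The choices below are precisely where the definition is underspecified:
      - b-underspec: earlier a's may be followed by different b's;
        here we arbitrarily take the most recent one.
      - a-underspec: we match only the single current token, though one
        could instead match a longer suffix ending in a.
    """
    a = tokens[-1]
    for i in range(len(tokens) - 2, -1, -1):  # most recent match first
        if tokens[i] == a:
            return tokens[i + 1]
    return None  # no earlier occurrence: behavior undefined

assert induction_head(list("xabya")) == "b"
```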
@lambdaviking
Will Merrill
2 years
[5/6] That said, it's unclear if large LMs trained on NL reflect Thm2. Our proofs analyze "ideal" LMs that perfectly fit their target distribution. Real LMs only approximate it, and even small noise greatly perturbs probabilities ~= 0.
1
0
9
@lambdaviking
Will Merrill
2 months
@typedfemale libertarian ideals are NC1-complete via Ayn Rand reductions (citation needed)
1
0
9
@lambdaviking
Will Merrill
2 months
@srush_nlp Yes but it’s diagonal and only input-dependent through delta. Turns out this isn’t enough to get greater expressive power
3
0
8
@lambdaviking
Will Merrill
2 years
[3/6] n-gram LMs trained on synthetic Gricean data learn to reflect Thm2 in their probability mass function:
1
0
8
@lambdaviking
Will Merrill
4 years
Awesome paper that goes beyond the Chomsky hierarchy to formalize RNNs' ability to represent bounded (!) hierarchical structure. In fact, RNNs and LSTMs can implement bounded stacks for Dyck languages in *optimal* space
@johnhewtt
John Hewitt
4 years
A simple communication complexity argument proves that O(m log k) hidden units is optimal -- even with unbounded computation (!!), it's impossible to use asymptotically fewer. That is, RNNs are fascinatingly well-suited (imo) to handling bounded-memory hierarchy.
2
0
6
0
0
8
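[Editor's note] A rough counting-argument version of the O(m log k) figure, as a sketch (the quoted thread's communication-complexity argument is the real proof): a Dyck-(k, m) prefix is summarized exactly by its stack, a string of at most m open brackets drawn from k types.

```latex
% Reachable stack configurations of Dyck-(k, m):
\[ \#\,\mathrm{configs} \;=\; \sum_{d=0}^{m} k^{d} \;=\; \Theta(k^{m}), \]
% and distinguishing them needs
\[ \log_2 \Theta(k^{m}) \;=\; \Theta(m \log k) \ \text{bits}, \]
% which is where the O(m log k) hidden-unit figure comes from.
```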
@lambdaviking
Will Merrill
1 year
@katiedimartin But more seriously, curious about the current generative take on innateness vs learnability (to what degree can universals be explained by what languages are easier to learn?)
0
0
8
@lambdaviking
Will Merrill
3 years
Considering throwing a pierogi symposium now (with beer ofc)
@AllEndlessKnot
The Endless Knot
3 years
The #ConnectedAtBirth #etymology of the week is SYMPOSIUM/BEER/PIEROGIES #wotd
0
7
11
0
1
8