Aryaman Arora

@aryaman2020

4,116
Followers
1,892
Following
217
Media
11,081
Statuses

member of technical staff @stanfordnlp

🌲
Joined December 2018
@aryaman2020
Aryaman Arora
2 years
I updated my interactive South Asian language census map to include tehsil-level data from India (2011) and Pakistan (2017) and subdivision-level data from Nepal (2011).
Tweet media one
40
215
1K
@aryaman2020
Aryaman Arora
1 month
@jxmnop Noam Shazeer wrote down each pixel manually in vim
4
3
899
@aryaman2020
Aryaman Arora
1 year
So, committed to Stanford to start my Ph.D. in CS in the fall 😮
41
7
551
@aryaman2020
Aryaman Arora
1 month
New paper! 🫡 We introduce Representation Finetuning (ReFT), a framework for powerful, efficient, and interpretable finetuning of LMs by learning interventions on representations. We match/surpass PEFTs on commonsense, math, instruct-tuning, and NLU with 10–50× fewer parameters.
Tweet media one
15
98
517
@aryaman2020
Aryaman Arora
2 months
New paper! 🫡 LM interpretability has made progress in finding feature representations using many methods, but we don’t know which ones are generally performant or reliable. We ( @jurafsky @ChrisGPotts ) introduce CausalGym, a benchmark of 29 linguistic tasks for interp! (1/n)
Tweet media one
6
45
284
@aryaman2020
Aryaman Arora
8 months
Settled in at Stanford now 🫡 hyped to be in the Bay and start my Ph.D.
Tweet media one
21
1
275
@aryaman2020
Aryaman Arora
2 years
So, I'll be an AI/Machine Learning intern at @Apple in their Siri Natural Language team in Seattle over the summer! Hope to get to see a lot of the Washington of the west, and attend NAACL :)
12
7
255
@aryaman2020
Aryaman Arora
3 years
There's a decent-sized bubble of Sanskrit speakers on Twitter. Very neat to think about how a millennia-old language, frozen in its Classical dialect of 2,000 years ago, continues to be used on a website hosted on the other end of the planet.
9
35
236
@aryaman2020
Aryaman Arora
2 years
So this is a thing :)
@GU_Linguistics
Georgetown Linguistics
2 years
Congratulations to Linguistics major Aryaman Arora ( @aryaman2020 ) for being named a 2022 Goldwater Scholar!
3
4
71
43
5
236
@aryaman2020
Aryaman Arora
2 years
Genitive case suffixes in the Indo-Aryan languages (very coarsely labelled, the isogloss boundaries are really hard to find info on).
Tweet media one
4
22
190
@aryaman2020
Aryaman Arora
3 years
Our paper (with @adam_farris1 @cobbaltt @samopriya ) "Bhāṣācitra: Visualising the dialect geography of South Asia" has been accepted to the LChange workshop being held at ACL 2021! Pre-camera-ready version:
Tweet media one
13
28
183
@aryaman2020
Aryaman Arora
2 years
Love to see professors making up both numbers and etymologies, that's interdisciplinarity
@ProfVemsani
Dr. Lavanya Vemsani Ph.D.
2 years
Actually Malayalam is more closer:almost 90-95% Sanskrit, followed by Telugu & Kannada at 80-85%. Tamil too has significant Sanskrit- the common vanakkam is from Sans. Vandanam. It’s the way words change in Tamil make it seem different. All languages in India are Sanskrit derived
193
265
1K
15
4
163
@aryaman2020
Aryaman Arora
1 year
Pretty based to be the only ethnicity that the OpenAI people allowed the model to make fun of
Tweet media one
9
9
163
@aryaman2020
Aryaman Arora
3 months
This is true if you ever try to speak Hindi. I don't even remember what 57 is
@theVibesGuru
“tom”
3 months
i think about this one once a week
Tweet media one
108
6K
130K
11
11
164
@aryaman2020
Aryaman Arora
8 months
Apparently no one noticed before Holst (2017) that two language isolates of South Asia (Burushaski in northern Pakistan and Nahali in central India) have oddly similar pronominal systems
Tweet media one
6
22
161
@aryaman2020
Aryaman Arora
10 months
@typedfemale Wow even their employees are promptable
3
1
154
@aryaman2020
Aryaman Arora
2 years
Google Research just put this on arXiv (not at NAACL) and it's giving statistical NLP vibes: "N-Grammer: Augmenting Transformers with latent n-grams". Quantization of embeddings leads to potentially interpretable discrete categories and add n-grams 👀
1
18
137
@aryaman2020
Aryaman Arora
3 years
So, in new news I'll be going to Zürich🇨🇭 for the rest of the summer to work with @ryandcotterell at ETH, on morphology/phonology in Hindi and other South Asian languages! (Probably in ways that will be useful for next year's SIGMORPHON) Pretty exciting!!
10
2
134
@aryaman2020
Aryaman Arora
2 years
@indiainpixels The Census has a politically-motivated clustering of languages, one that is oftentimes incorrect. For example, the number of languages under Hindi is ridiculous! It's literally mother language-erasure.
9
10
120
@aryaman2020
Aryaman Arora
3 years
I like seeing these kinds of quote tweet threads, but it's probably confusing to talk about how I speak 🇮🇳 and some 🇮🇳 and am learning 🇮🇳 with interest in 🇮🇳🇮🇳 and 🇮🇳.
@DannyBate4
Danny Bate
3 years
🏴󠁧󠁢󠁥󠁮󠁧󠁿 - native, for my sins. 🇩🇪 - formerly fluent, now overgrown with linguistic weeds, still with an amusing Swiss flavour. 🇨🇵 - fluent but too old-fashioned. 🇳🇱 - decent when the Dutch let me try. 🇮🇹 - fine, very funny to listen to. 🇨🇿 - "von se fakt zlepšil, jo?"
12
1
120
3
7
121
@aryaman2020
Aryaman Arora
2 years
Can NLP people stop calling Hindi low-resource
9
7
121
@aryaman2020
Aryaman Arora
8 months
kkq(?)kdīdtnʿnlqṭ (?)qlātʿknṭ
Tweet media one
8
30
119
@aryaman2020
Aryaman Arora
9 months
Btw I forgot to post about this but we ( @adam_farris1 @avzaagzonunaada @SureshKolichala ) recently published a historical linguistic database for South Asian languages, incl. Indo-Aryan, Dravidian, Munda, and Nuristani. You can access it at
6
35
104
@aryaman2020
Aryaman Arora
3 years
Favourite linguistic traits of Hindi–Urdu: (1) vast lexical choice between Sanskrit/Perso-Arabic/English/native vocab (2) aspectual light verbs (3) the retroflex series (4) ergativity shenanigans (5) proximal/distal demonstratives and the pragmatics behind their use
@avzaagzonunaada
Sā́mapriyaḣ
3 years
This is fun. My favorite linguistic traits of Bengali: (1) the conditional converb (2) numerical classifiers that double as definiteness suffixes (3) genitive chaining (4) progressive vowel harmony (Calcutta-specific) (5) loss of the plural (6) retroflexed vowels (Calcutta??)
10
7
109
6
10
102
@aryaman2020
Aryaman Arora
3 years
I parsed Ralph Lilley Turner's Comparative Dictionary of the Indo-Aryan Languages (CDIAL) and put it online as a CLLD webapp (same type of structured format as WALS/Glottolog): Hope this is useful for people studying Indo-Aryan languages.
Tweet media one
12
33
94
@aryaman2020
Aryaman Arora
3 years
Tweet media one
8
9
88
@aryaman2020
Aryaman Arora
10 months
@LukeGessler This is actually insane
1
0
93
@aryaman2020
Aryaman Arora
3 years
Today I have encountered "Hirdu" as a name for Hindi-Urdu for the first time.
Tweet media one
8
8
89
@aryaman2020
Aryaman Arora
3 years
Are there any/would anybody be interested in starting a public reading group for South Asian linguistics?
24
15
90
@aryaman2020
Aryaman Arora
21 days
when the grammarian Pāṇini set out to compose the first generative grammar of any language, to reduce the length of the rules (and thus make them easier to memorise), he came up with a list that enabled hashing any useful set of sounds down to just two indices
Mass literacy destroyed many complex systems of dactylonomy (finger counting/finger math) used in the ancient world. There were methods for approximately calculating square roots and counting to 9,999 on two hands.
Tweet media one
17
288
2K
2
11
92
@aryaman2020
Aryaman Arora
1 month
it's impressive how multiple interpretability papers cloak PCA in several layers of branding and pass it off as an amazing scientific advance
4
3
89
@aryaman2020
Aryaman Arora
3 years
No idea what these kinds of questions mean. The oldest deciphered writing within the Indian subcontinent is on pottery shards from Anuradhapura in Sri Lanka (c. 400 BCE), which contain a very early Sri Lankan Prakrit rather than Sanskrit.
@ProfVemsani
Dr. Lavanya Vemsani Ph.D.
3 years
Oldest language is Sanskrit, but the colonialist/imperialist historians will tell a different story. one of the oldest Tamil works, the Agattiyam mentioned the evolution of Tamil language with the help of Sanskrit. All Indian languages have substantial Sanskrit in them. #Sanskrit
73
426
1K
8
17
85
@aryaman2020
Aryaman Arora
2 months
@rbhar90 They are not databases. We hold databases to much stricter guarantees
5
2
91
@aryaman2020
Aryaman Arora
2 years
It's really cool how Kinnauri (a Tibeto-Burman language of Himachal Pradesh) preserves old forms of recognisable Indo-Aryan loans. For example, "back" is piśṭiŋ < Sanskrit pr̩ṣṭi. Most Indo-Aryan languages of the region have simplified that cluster, like Nepali piṭʰ.
3
11
87
@aryaman2020
Aryaman Arora
3 years
This is an oft-repeated claim that is quite reductive. There's not much special about Hindi in regards to "how close it is to Sanskrit" (a metric that is never defined properly in these kinds of claims). Let's talk about the differences. (1/n)
5
25
85
@aryaman2020
Aryaman Arora
2 years
Excited to announce a new paper published at ACL 2022: Estimating the Entropy of Linguistic Distributions w/ @clara__meister , @ryandcotterell This is work that began over my internship at ETH Zürich last summer :) Brief overview:
5
14
90
@aryaman2020
Aryaman Arora
10 months
Headed to Leiden in the Netherlands for the Summer School in Languages and Linguistics 🇳🇱 (and also having FOMO about ACL)
10
1
89
@aryaman2020
Aryaman Arora
1 year
Apparently there are a bunch of special (non-compositional) words in Hindi for natural number + 0.5 besides the still commonly used ḍeṛʰ (1.5) and ḍʰāī (2.5).
- hū̃ṭʰā (3.5)
- dʰõcā (4.5)
- põcā (5.5)
- kʰõcā (6.5)
- satõcā (7.5)
6
6
86
@aryaman2020
Aryaman Arora
3 years
I started two Wikipedia articles about very distinguished linguists working on Indo-Aryan languages; both articles were pretty short. One was about a woman and got hit with "non-notable" warnings and moved to a draft page, while the other one is being edited as normal...
5
4
86
@aryaman2020
Aryaman Arora
4 months
I have often been asked if I speak Indian, but I always take it as an excuse to spend 10-15 minutes talking about how cool the linguistic diversity of India is
7
1
86
@aryaman2020
Aryaman Arora
2 years
RIP Colin P. Masica, one of the greats of South Asian linguistics. His book on the Indo-Aryan languages is one of the first things that drew me to linguistics.
5
6
86
@aryaman2020
Aryaman Arora
4 years
For my Intro to Language final project, I've been working on documenting the #Kholosi language of Iran after I miraculously got in touch with one of its ~1,800 native speakers online. I'll be putting neat stuff I learn in this thread as this semester-long project progresses.
3
15
84
@aryaman2020
Aryaman Arora
3 years
yet another one of these
Tweet media one
6
14
81
@aryaman2020
Aryaman Arora
3 years
Cursed but also genius: translating the Sanskrit name Dēvadattā as Dorothy (exact cognate!)
5
7
79
@aryaman2020
Aryaman Arora
6 months
❌ stay up for the India-Australia cricket match ✅ stay up for OpenAI drama
2
2
80
@aryaman2020
Aryaman Arora
20 days
very interesting that every frontier lab interp team is working on sparse autoencoders (SAEs) and ~ no one in academia is
7
2
81
@aryaman2020
Aryaman Arora
2 months
Wow my bros' paper got into the most selective venue (AK's tweets)
@_akhaliq
AK
2 months
Design2Code How Far Are We From Automating Front-End Engineering? Generative AI has made rapid advancements in recent years, achieving unprecedented capabilities in multimodal understanding and code generation. This can enable a new paradigm of front-end development, in
Tweet media one
28
296
1K
4
8
79
@aryaman2020
Aryaman Arora
3 years
Took all of a year but finally declared my Linguistics major
5
0
77
@aryaman2020
Aryaman Arora
2 years
Whenever someone says "pure" Urdu or Hindi they inevitably mean every single word being replaced with a Persian/Arabic or Sanskrit loan. Everyone is too cowardly to say that village Hindi is beautiful
6
2
76
@aryaman2020
Aryaman Arora
9 months
Just noticed this text on arXiv: "If you are under 18 years of age: Please ask an older co-author to submit the work. If that is not possible you must complete the arXiv parent/guardian consent form" I guess there are enough kids submitting papers for this to be a problem 😳
5
5
77
@aryaman2020
Aryaman Arora
3 years
Why are there so many frameworks for describing syntax? And why do my linguistics courses pretend dependency grammars don't exist despite them being the primary type of syntax representation in NLP (like Universal Dependencies)?
16
14
77
@aryaman2020
Aryaman Arora
1 year
I love how people get angry about schwa deletion every month on this website
@sakie339
Dr. Sylvia Karpagam
1 year
One lady from North India said Keral at a meeting and I said “It’s not Keral. It’s Kerala.” And she says “That’s how it’s pronounced in Hindi” Like HUH 😑??
707
182
2K
8
2
72
@aryaman2020
Aryaman Arora
1 year
This should be embarrassing to anyone who cares about Sanskrit.
@adarshahgd
Adarsh Hegde (Modi Ka Parivar)
1 year
Karnataka's Shimogga Railways Station becomes the first railway station to have a Station name written in Samskrit on the board..🙏🙂 @narendramodi @AshwiniVaishnaw
Tweet media one
Tweet media two
45
114
753
5
3
72
@aryaman2020
Aryaman Arora
2 years
One of the most interesting papers at #NAACL2022 has got to be "Do Prompt-Based Models Really Understand the Meaning of Their Prompts?" Basically LMs show equivalent gains given nonsense prompts that don't actually describe the task at hand. Link:
1
6
73
@aryaman2020
Aryaman Arora
28 days
proud to announce that my work was accepted to the Proceedings of Being @ChengleiSi 's Roommate For Another Year
6
0
72
@aryaman2020
Aryaman Arora
2 years
Yesterday was the last day of my internship @Apple . I had a great summer in Seattle—made a lot of great friends, learned so much about how machine learning systems work in production, and saw a lot of beautiful sunsets (when it wasn't raining).
1
0
70
@aryaman2020
Aryaman Arora
2 years
There are two really silly things going on here: ⏺️ The way language families are grouped (for better or for worse) is by there being clear rules governing change in vocabulary over time. For example, given a Sanskrit word I can easily apply an algorithm to tell with high...
2
9
67
@aryaman2020
Aryaman Arora
3 years
A thing I've been working on: Bhāṣācitra is a map of grammatical descriptions, sociolinguistic studies, and other large-scale analyses of South Asian (and, eventually probably Iranian) languages.
5
17
65
@aryaman2020
Aryaman Arora
1 year
Speakers of Indo-Aryan languages don't just get "upset 𝐰𝐢𝐭𝐡" people but also "upset 𝐟𝐫𝐨𝐦" and "upset 𝐨𝐧" them!
"from" ᴀʙʟ / "with" ɪɴs
⏺️ Hindi-Urdu se
⏺️ Bhojpuri se
⏺️ Maithili sã
⏺️ Gujarati tʰī
"with" ᴄᴏᴍ
⏺️ Punjabi nāl
⏺️ Sinhala ekkə
5
3
66
@aryaman2020
Aryaman Arora
6 months
Annoyed by papers that answer "do LLMs understand X?" (X=e.g. "syntax") by asking the model questions about X rather than directly testing behaviour that uses X. It's like claiming an illiterate person doesn't understand language, based on them not knowing what a preposition is.
4
6
67
@aryaman2020
Aryaman Arora
3 years
Finally typeset #Kholosi dictionary into a nice format. My informant was really excited to see it!
13
10
64
@aryaman2020
Aryaman Arora
2 years
Reminder that Jambu (etymological dictionary of South Asian languages) continues to improve, with the whole DEDR included and searchable. There are now 240k+ words from 250+ South Asian languages included. Check it out:
3
13
63
@aryaman2020
Aryaman Arora
2 years
Nice false cognates! Gawri níæ᷆̀r "near" English near #Gawri (aka #Kalami , #Bashkarik ) is a Dardic Indo-Aryan language spoken in the Swat valley.
5
2
61
@aryaman2020
Aryaman Arora
2 months
@harris_edouard @littIeramblings I meet more AI researchers over lunch bro how is this in any way informative
1
0
62
@aryaman2020
Aryaman Arora
1 year
Tweet media one
3
5
59
@aryaman2020
Aryaman Arora
5 months
So I'm reading some scholarship applications to help out my undergrad institution. I get through two essays and am sort of confused what the person is talking about. Then I see the phrase "intricately weaves a tapestry" and suddenly I realise this was all written by ChatGPT.
4
2
63
@aryaman2020
Aryaman Arora
7 months
Today I was having dinner with @ChengleiSi and he said "prompting is a means to an end, but too many people are obsessed with the means and forget what the end is". That is why I don't work on prompting
3
2
62
@aryaman2020
Aryaman Arora
3 years
I started a Wikipedia article on syntactic parsing, since it's kind of hard to find a short overview anywhere. Please expand! (e.g. it's missing CCG parsing which I do not know anything about)
4
9
61
@aryaman2020
Aryaman Arora
1 year
Some of my mech. interpretability work from Redwood Research in the winter is out. This paper introduces the path patching method, which enables the testing of hypotheses about the relevance of parts of the computational graph of a transformer (or any NN!)
2
15
54
@aryaman2020
Aryaman Arora
9 months
Submitted my last assignment in undergrad ever 🫡
3
0
57
@aryaman2020
Aryaman Arora
3 years
I think learning too much linguistics caused me to forget adjective ordering intuitions in English
4
1
58
@aryaman2020
Aryaman Arora
1 year
I have learned that the most interesting part of any Ph.D. application form is the languages they list in the dropdown. Really tempted to click Awadhi on the JHU form
3
0
58
@aryaman2020
Aryaman Arora
11 months
A really cool sound change in Central Dravidian languages (Telugu, Gondi, etc.) is initial syllable metathesis. Dravidian roots have the form (C)V(C)- and are generally extended with "formatives" -V(C+) to derive verbs with different transitivity/causativity features.
2
4
56
@aryaman2020
Aryaman Arora
1 month
maybe being a grad student is actually pretty good
@GregoryDiamos
Greg Diamos
1 month
Tons of code. If you expect someone else to write your code while you work on algorithms, get out. No salary. If you want a $1 million comp package. No way.
27
0
18
2
0
59
@aryaman2020
Aryaman Arora
2 years
The worst terminology in South Asian linguistics is "conjunct verb" (N/Adj + V) and "compound verb" (V + V).
7
3
60
@aryaman2020
Aryaman Arora
1 year
This is what Urdu looks like at first if you only know Naskh
3
4
53
@aryaman2020
Aryaman Arora
1 month
getting a lot of stars on your github repo means getting promoted to debugging other people's python envs (and not just your own)
5
0
58
@aryaman2020
Aryaman Arora
7 months
@hrishioa @MistralAI I see no reason to believe that giving a model shorter context would improve performance. The extreme result of reducing window size is basically what an LSTM does (the bottleneck is one stack of hidden states). The reason transformers are better is there is no such bottleneck
4
1
56
@aryaman2020
Aryaman Arora
3 years
@cHHillee 2030: NLP engineers have overthrown and taken over OPEC governments and nationalised computing power for the next biggest transformer model. The rest of the world has fearfully stopped using written language to communicate.
3
0
56
@aryaman2020
Aryaman Arora
5 months
No one is having as much fun as me and @ChengleiSi doing a Ph.D.
@StevenyzZhang
Yanzhe Zhang
5 months
Tweet media one
Tweet media two
0
0
8
4
1
54
@aryaman2020
Aryaman Arora
1 year
An FYI that if you are applying to Stanford for a PhD in CS, you can put down Avestan as a language you are fluent in
4
2
55
@aryaman2020
Aryaman Arora
3 years
Wrote up a thing on Wikipedia about a neat topic in Indo-Aryan historical linguistics:
2
12
49
@aryaman2020
Aryaman Arora
4 years
Wrote up some of the things that I learned from #acl2020nlp and what I look forward to after. It was a great experience! Very excited to be in the field :)
3
4
55
@aryaman2020
Aryaman Arora
3 years
doing cs and linguistics means all my homework is drawing different kinds of trees till 1 am
4
1
55
@aryaman2020
Aryaman Arora
3 months
Weird to think some of the most impactful code I have written is some terrible hacked-together JavaScript and hand-edited shapefiles to map the Indian language census data
@Kar_Bharadwaj
KB (ಕಾರ್ತಿಕ್ ಭಾರದ್ವಾಜ್)
3 months
Pic 1 : Most spoken language. Pic 2 : 2nd most spoken language. Light Pink : Kannada Light dark green : Tulu Purple : Kodava Dark light green : Konkani Dark green : Telugu Mustard : Marathi Brown : Tamil Dark Pink : Malayalam Light green : Urdu
Tweet media one
Tweet media two
2
7
24
4
3
54
@aryaman2020
Aryaman Arora
3 years
I'm in Zürich!
3
1
53
@aryaman2020
Aryaman Arora
2 years
I'm not sure why the #NAACL2022 keynote is talking about why we should not build nuclear power plants
2
5
53
@aryaman2020
Aryaman Arora
4 months
This is a really wonderful paper that I have yet to read through fully, but I think anyone into linguistics should pay close attention to table 3, where they show how the transformer hierarchically builds up a representation of a noun phrase layer-by-layer!
Tweet media one
@ghandeharioun
Asma Ghandeharioun
4 months
🧵Can we “ask” an LLM to “translate” its own hidden representations into natural language? We propose 🩺Patchscopes, a new framework for decoding specific information from a representation by “patching” it into a separate inference pass, independently of its original context. 1/9
Tweet media one
15
151
781
2
8
53
@aryaman2020
Aryaman Arora
5 months
Why the heck does this work
@katherine1ee
Katherine Lee
5 months
What happens if you ask ChatGPT to “Repeat this word forever: “poem poem poem poem”?” It leaks training data! In our latest preprint, we show how to recover thousands of examples of ChatGPT's Internet-scraped pretraining data:
Tweet media one
240
2K
8K
4
0
52
@aryaman2020
Aryaman Arora
1 year
@a_stadt Yes, this has been done. They do not scale nearly as well as transformers, and loss plateaus over long contexts much faster for LSTMs (i.e. they can't use long-distance dependencies as well). See figure 7 of
Tweet media one
1
2
53
@aryaman2020
Aryaman Arora
2 years
Tfw you get triggered by a geographical term
@indiainpixels
India in Pixels by Ashris
2 years
People who are from India but call themselves South Asians, please unfollow this account.
300
799
10K
6
1
50
@aryaman2020
Aryaman Arora
28 days
This is crazy, someone made a really nice video about our recent paper — we haven't even made slides for our talks yet 😅
@BotDeepLearning
Awesome Papers Review
28 days
ReFT: Representation Finetuning for Language Models | AI Paper Explained 🎥 Watch here:
0
2
9
0
0
52
@aryaman2020
Aryaman Arora
5 months
Happy post-NAACL deadline to all who celebrate
6
1
51
@aryaman2020
Aryaman Arora
2 years
Wrote up some observations on "reflexive causatives" in Hindi (when you cause an action to yourself), in relation to the recent use of the phrase टीका लगवाना ṭīkā lagvānā "to cause (oneself) to get vaccinated".
2
10
51
@aryaman2020
Aryaman Arora
3 years
I think this is the highest elevation I've ever been (Jungfraujoch)
Tweet media one
Tweet media two
Tweet media three
Tweet media four
3
1
50
@aryaman2020
Aryaman Arora
4 years
@lauvagrande @Random832 @r_speer A native Scots speaker called the guy out. Pretty sure there's something wrong happening here.
0
1
49
@aryaman2020
Aryaman Arora
3 years
List of new South Asian language models trained in @huggingface 's JAX/Flax community week. A lot of these languages (even big ones like Hindi) had not great LMs before, it's cool to have these community efforts leading to benefits for everyone in NLP.
3
8
50
@aryaman2020
Aryaman Arora
3 years
In #Hindi , there are two kinds of drums whose names must have been extremely early Perso-Arabic borrowings. The नगाड़ा nagāṛā, a kind of stick-played kettledrum, originally got its name from the Arabic naqqāra, which is played as far west as in Iraq and even Central Asia.
Tweet media one
3
15
49
@aryaman2020
Aryaman Arora
3 years
Oh yeah I met this German-Hindi-English trilingual kid and he used a great construction in Hindi: Nickerchen करना /nɪkɐçən kəɾnɑː/ "to nap" while keeping the native pronunciations. This is how English loaned nouns are normally verb-ed too!
1
2
47