Lei Li Profile
Lei Li

@lileics

3,441
Followers
406
Following
64
Media
633
Statuses

Generative AI for language and science

Joined April 2010
@lileics
Lei Li
3 years
Very excited to join @ucsbcs . Excited to be working with great folks William and Shiyu, and many others! @WilliamWangNLP @CodeTerminator
@WilliamWangNLP
William Wang
3 years
UCSB #NLProc Group welcomes new faculty and co-Director Prof. Lei Li @lileics : Don’t forget to check out his #ACL2021NLP Best Paper on optimal transport for vocabulary reduction next week.
6
5
108
20
6
192
@lileics
Lei Li
3 years
Honored and flattered to receive the best paper award of #ACL2021NLP . Thanks to my wonderful co-authors Jingjing, Hao, Chun and Zaixiang, to the extraordinary MLNLC team at Bytedance AI Lab, and to Zhenyuan, Weiying, Hang, and the ACL committee for their support! to Olympic sprinter #SuBingtian
@aclmeeting
ACL 2024
3 years
Congratulations to the authors who have won our paper awards! #ACL2021NLP #NLProc
0
28
138
2
18
109
@lileics
Lei Li
1 year
I’m excited to share that I’ve received an Amazon Research Award for my proposal "Real-time robust simultaneous interpretation with few samples" at UCSB. Learn more about the program on the @AmazonScience web: #AmazonResearchAwards @ucsbcs @UCSBengineering
5
5
98
@lileics
Lei Li
9 months
Happy to give a talk at the @LTIatCMU Colloquium on "Empowering Responsible Use of Large Language Models" this Friday, Sep. 8, 12:30pm at Posner Hall A35. I will talk about how we can robustly detect AI-generated text and protect the copyright of LLMs. @SCSatCMU
1
12
96
@lileics
Lei Li
2 years
How to teach an AI model to reason about concepts? We release a benchmark, E-KAR, for analogical reasoning in English and Chinese. It is extremely challenging (see example next). Few AI models could do it well. Welcome to try the task at #ACL2022
2
14
73
@lileics
Lei Li
11 months
The session chair told me that the fully zero-shot talk by @xuandongzhao was the best-received talk in the LLM session, drawing an audience of hundreds at #ACL2023NLP . A line of people were asking questions. @siqi_ouyang
0
6
69
@lileics
Lei Li
7 months
Over the years, several of my former colleagues at Bytedance AI Lab left -- @weiyingma and Hao Zhou joined Tsinghua University as faculty, @QiuqiangK joined the Chinese University of HK, and I left two years ago to join UCSB and now CMU. I consider this as fulfilling one original mission
@ylecun
Yann LeCun
7 months
Ross Girshick ( @inkynumbers ) is leaving FAIR for AI2, following @sainingxie who joined the NYU faculty (yay!), and @georgiagkioxari who went to Caltech, and Hervé Jégou and @AlexDefosse who went to non-profit Kyutai. It's a loss for FAIR, but I'm happy for them. There is
23
58
1K
1
4
69
@lileics
Lei Li
11 months
[ACL2023] On the second day of the poster session, I presented the work WACO on speech translation. I managed to put up the other two posters on multi-language translation Lego-MT and automatic evaluation indicator SEScore2. Indeed, people asked to hear about all three papers.
@lileics
Lei Li
11 months
My group will present 5 papers at #ACL2023NLP . I will be onsite for all these papers. @xuandongzhao Siqi @WendaXu2 @jiangjie_chen Fei will join virtually on underline/gathertown. Welcome to talk to me or co-authors!
0
4
46
0
1
64
@lileics
Lei Li
6 months
We had a similar story. Our paper on vocabulary learning received all-reject reviews at ICLR 2021. We revised the paper and submitted it to ACL 2021, where it received the best paper award. But I do not complain about the reviews. We did take the comments and revised substantially.
@thegautamkamath
Gautam Kamath
6 months
You might have heard that word2vec, test of time award winner at #NeurIPS2023 , was rejected from #ICLR2013 back in the day. Interestingly, one reviewer felt so strongly that they recommended "Strong Reject" four times.
43
144
1K
0
4
57
@lileics
Lei Li
4 months
Using an LLM to evaluate its own generation quality leads to bias: it favors its own generations compared to other models'. What is the consequence of such bias in an LLM's generation? Check out the new paper:
@WendaXu2
Wenda Xu
4 months
[New paper!] Can LLMs truly evaluate their own output? Can self-refine/self-reward improve LLMs? Our study reveals that LLMs exhibit biases towards their output. This self-bias gets amplified during self-refine/self-reward, leading to a negative impact on performance. @ucsbNLP
5
53
210
2
11
49
@lileics
Lei Li
6 months
Anima Anandkumar delivered a highly impressive talk on generative models for predicting new mutations of coronavirus, generating sequences following the phylogeny of SARS-CoV-2 (genome-scale LM), and 2000x faster MD simulation with NeuralMD. @AnimaAnandkumar @genbio_workshop
@AnimaAnandkumar
Prof. Anima Anandkumar
6 months
Looking forward to my talk tomorrow afternoon at #NeurIPS2023 @genbio_workshop
1
2
25
0
4
45
@lileics
Lei Li
4 months
Introducing LingoLLM. LLMs like GPT4 can still fail to process an endangered language such as Gitksan and Manchu. We develop a generic and training-free method to enable language understanding and translation capability for any new language (with some linguistic description).
@kexun_zhang
Kexun Zhang
4 months
🚀Fire linguists in the LLM era? No! Excited to share LingoLLM, a novel method for processing endangered languages with linguistic resources. LingoLLM improves the translation of many endangered languages from 0 to 10.5 BLEU! It helps other tasks as well!
8
45
158
0
7
44
@lileics
Lei Li
5 months
Glad and excited to give a talk at @CarnegieMellon @CyLab on Jan 22. I will present the challenge of "how to detect AI generated text and images" and share our recent progress on AI watermarking techniques.
@CyLab
CyLab
5 months
We’re excited to welcome @CarnegieMellon ’s own @lileics , Assistant Professor at @LTIatCMU , to the @CyLab Seminar series next Monday, January 22 at 12 p.m. ET to discuss attacks and robust watermarking for generative #AI .
1
1
0
1
3
42
@lileics
Lei Li
6 months
My students Danqing Wang ( @dqwang122 ) and Wenda Xu ( @WendaXu2 ) and I will be attending #EMNLP2023 in Singapore this week. We will bring four papers to present (Cooperative LLM agents, Fine-grained Explainable Metric(reward model), Auto planning LLM, and Multilingual Extrapo).
0
1
41
@lileics
Lei Li
11 months
The generation metric paper SEScore2 had the best spot to present, right at the entrance of the exhibit hall at #ACL2023NLP . Huge enthusiasm even 30 mins after the session ended. Congrats to co-authors @WendaXu2 @WilliamWangNLP Mingxuan Wang and Xian Qian.
0
3
38
@lileics
Lei Li
8 months
Our latest ToolDec lets a pre-trained large language model use any external tool (via API calls) without requiring any in-context examples for demonstration or any fine-tuning. ToolDec generalizes to orders of magnitude more tools.
@kexun_zhang
Kexun Zhang
8 months
😭Tired of in-context demos & docs for LLM tool use? 💰Too GPU-poor to tune LLMs for unseen tools? 🤬Frustrated with frequent syntax errors in tool calls? Check out our new preprint 𝐓𝐨𝐨𝐥𝐃𝐞𝐜 that addresses all these issues from the decoding side! 1/5
4
32
105
1
7
35
@lileics
Lei Li
4 months
You may still use ChatGPT to evaluate, but be aware that ChatGPT favors generations from ChatGPT and GPT4 (so even if Gemini/LLaMA generate text of the same quality, they won't be preferred by ChatGPT). Such bias may be more severe if you use it for self-refinement.
@AlbertBoyangLi
Boyang "Albert" Li
4 months
So what happens to ChatGPT as an evaluation metric of text quality? Do we know exactly what we are evaluating?
0
0
1
3
5
36
@lileics
Lei Li
5 months
🚀🤖 Excited to drop some serious "Buckeye Brainpower" at #OSU ! 🌰📚 Join me as I unravel the secrets of "Self-assisting and Cooperative Large Language Models" at 12pm in POM301. 👾 I'll be spilling the beans on how LLMs can write code like it's a poetry slam and LLMs cooperate!
0
3
36
@lileics
Lei Li
1 year
Adapting a pre-trained language model to new tasks often requires in-context labeled examples for demonstration. Can an LLM be adapted to new tasks in a fully zero-shot setting (no fine-tuning, no in-context demo)? Check out NPPrompt (nonparametric prompt) for LLMs at #ACL2023NLP 👇
@xuandongzhao
Xuandong Zhao
1 year
Thrilled to announce that our paper "Pre-trained Language Models can be Fully Zero-Shot Learners" has been accepted to #ACL2023 !
1
8
32
0
9
35
@lileics
Lei Li
6 months
If you want to study LLMs, MT, AI4Science (drug/protein design), and broader NLP in grad school, please apply to @LTIatCMU ! We have a great group of faculty. Deadline: Dec 13!
@gneubig
Graham Neubig
7 months
If you want to study NLP, LLMs, or broader language technology in grad school, please apply to @LTIatCMU ! We have a great group of faculty covering many topics: I personally will be recruiting students on LLMs/agents/evaluation.
0
59
237
0
5
32
@lileics
Lei Li
3 years
Congratulations to Prof. Yu-Xiang Wang and his student Dheeraj of @ucsbcs on winning the best student paper award at #COLT2021 , the top theoretical machine learning conference! The paper is about Optimal Dynamic Regret in Exp-Concave Online Learning, very interesting! They’ll present on 8/16.
0
2
30
@lileics
Lei Li
11 months
My students and I are attending ICML in Honolulu, Hawaii. We will present three main conference papers and four workshop papers. Topics include a 2x faster diffusion algorithm, protein sequence design, and protecting language models with invisible watermarks. Welcome to ping me or stop by.
0
1
30
@lileics
Lei Li
2 years
Last quarter, students at UCSB CS taking my course on machine translation wrote a series of wonderful blog posts about recent work in MT. In the following series, I will share some of these posts. 1/N #NLProc @ucsbcs @ucsbNLP
0
4
29
@lileics
Lei Li
6 months
Highlights of Shuiwang Ji’s talk: generative models (g-spherenet, graphbp, symat) are effective for target based drug design, Crystalline Materials, and proteins! @ShuiwangJi @genbio_workshop
@genbio_workshop
GenBio Workshop @Neurips2023
6 months
Prof Wang is speaking now!
0
0
3
0
4
26
@lileics
Lei Li
1 year
Join us at the 2023 AAAI Doctoral Consortium. This year we have 20 PhD students from all areas of AI and 20 senior mentors. Don't miss the keynotes from Amy Zhang ( @yayitsamyzhang ) and Pulkit Agrawal ( @pulkitology ). All at AAAI are welcome. @RealAAAI #AAAI23 @j_foerst
0
3
24
@lileics
Lei Li
6 months
"4. The return of academia. Academia is back as we saw at NeurIPS 2023." This is consistent with my observation at #NeurIPS2023 . My impression is that this year's conferences (and ICML, ACL/EMNLP) are no longer focusing on the performance scores, but on the interestingness of ideas.
@Thom_Wolf
Thomas Wolf
6 months
Some predictions for 2024 – keeping only the more controversial ones. You certainly saw the non-controversial ones (multimodality, etc) already 1. At least 10 new unicorn companies building SOTA open foundation models in 2024 Stars are so aligned: - a smart, small and dedicated
19
77
411
0
1
25
@lileics
Lei Li
6 months
I will attend #NeurIPS2023 in New Orleans with my students @kexun_zhang @ZhenqiaoSong @xuandongzhao @qx_dong . Excited to meet and discuss our most recent work on program generation, knowledge assessment, LLM tool use, watermarking, goal-conditioned RL, etc.
0
2
24
@lileics
Lei Li
10 months
How to design drugs to kill bacteria? Danqing will present our LSSAMP work on antimicrobial peptide design Wed 2pm in 201A and Tue 6pm. The core idea is to generate the amino acid sequence based on secondary structure and a quantized latent space. #KDD2023 Paper:
1
2
22
@lileics
Lei Li
2 years
The fastest review! I am on the editorial board handling one submission to a journal. After I assigned the reviewers, one of the reviewers accepted, read the paper, and wrote a high-quality review within half a day. And it was Sunday afternoon! Amazing! Do these researchers need a holiday?
1
0
20
@lileics
Lei Li
1 year
@xwang_lk Well, I do not think the authors were aware of the bad implications of the name, since BAT is just short for bidirectional autoregressive talker. No need to over-interpret.
3
1
21
@lileics
Lei Li
1 year
Great mentor for our students!!! @WilliamWangNLP
@ucsbcs
UCSB Computer Science Department
1 year
This past January, Professor William Wang was awarded the 2023 CRA-E Undergraduate Research Faculty Mentoring Award. Read the COE news release, linked on the CS Website! Link in bio. Congratulations!
2
0
21
1
0
20
@lileics
Lei Li
6 months
🤓🚀Headed to #NUS to spill some digital tea LLM! Join my talk 'Self-assisting and Cooperative Large Language Models' where we'll decode how LLMs are turning into algorithm-solving ninjas, syntax-error police, and your personal study sidekick who learns from their blunders. 🦜💻
1
0
20
@lileics
Lei Li
7 months
The best outcome from a lab is its talent. I share similar gratitude toward Bytedance AI Lab. When @weiyingma and I started the lab seven years ago, one of the (three) missions we set, endorsed by the executive team, was to cultivate talent with the best supporting environment.
@ylecun
Yann LeCun
7 months
Ross Girshick ( @inkynumbers ) is leaving FAIR for AI2, following @sainingxie who joined the NYU faculty (yay!), and @georgiagkioxari who went to Caltech, and Hervé Jégou and @AlexDefosse who went to non-profit Kyutai. It's a loss for FAIR, but I'm happy for them. There is
23
58
1K
0
2
19
@lileics
Lei Li
2 years
Releasing MLGSum, another dataset for text summarization. It includes 1.2 million articles and summaries in 12 languages. Summaries have 30 tokens on average. Ideal for multilingual text generation. #NLProc 2/2
@lileics
Lei Li
3 years
Releasing two datasets for text summarization. The first is a large-scale document summarization dataset for Chinese news, with 300k articles, an average length of 730 characters, and human-written summaries. Additional annotation of adequacy & deducibility. #NLProc 1/2
0
1
7
0
3
19
@lileics
Lei Li
6 months
Here are the papers and the presentation schedule at #EMNLP2023
@lileics
Lei Li
6 months
My students Danqing Wang ( @dqwang122 ) and Wenda Xu ( @WendaXu2 ) and I will be attending #EMNLP2023 in Singapore this week. We will bring four papers to present (Cooperative LLM agents, Fine-grained Explainable Metric(reward model), Auto planning LLM, and Multilingual Extrapo).
0
1
41
0
1
20
@lileics
Lei Li
2 years
Glad to connect with many friends at #KDD2022 . I will give a talk on User engagement and activeness modelling for social media platforms (Tuesday morning room 202A). The paper is available at
0
2
20
@lileics
Lei Li
3 years
Mingxuan and I will be giving #ACL2021NLP tutorial on Pre-training methods for Neural Machine Translation. The video is pre-recorded. We will be live during the QA session 10-11pm July 31 and 9-10am August 1. More info at #NLProc
0
1
19
@lileics
Lei Li
1 year
Tonight, I will present @jiangjie_chen 's work on editing and correcting false natural language statements at #AAAI2023 , Feb 10, 7-9pm during the poster session, and on Sunday Feb 12, 2:00pm in room 147B. Paper:
0
3
18
@lileics
Lei Li
9 months
Evaluation is super important in NLP, recommender systems, search engines, etc. This Friday (tomorrow) 12:30pm Fernando Diaz will bring a very exciting talk on Preference-based evaluation at @LTIatCMU colloquium. Highly encouraged to attend! @SCSatCMU @MonaDiab77 @dmort27
0
1
16
@lileics
Lei Li
3 years
UCSB NLP group is admitting multiple Ph.D. students for 2022. Welcome to apply! With a beautiful campus, beach and sunshine, it is a great place to do research. @ucsbcs
@WilliamWangNLP
William Wang
3 years
UC Santa Barbara NLP Group is recruiting Ph.D. Students for 2022! PLEASE NOTE: The UCSB Computer Science department has eliminated the GRE requirement for Fall 2022 applicants. Everyone is welcome to apply! #NLProc
3
9
71
0
0
17
@lileics
Lei Li
3 years
The human brain tends to activate the same zones and paths when listening to voice and reading text. Inspired by this brain activity pattern, Chimera learns a shared semantic space for speech translation. #ACL2021 Check out more at
1
2
16
@lileics
Lei Li
9 months
Congratulations @dan_fried !
@LTIatCMU
Language Technologies Institute | @CarnegieMellon
9 months
Congrats to our own Daniel Fried on winning a 2023 Okawa Research Grant!
0
5
49
1
1
16
@lileics
Lei Li
6 months
Thanks to all the speakers, panelists, presenters, and participants! Thanks to my wonderful co-organizers (who put great effort) for @genbio_workshop @ZhenqiaoSong @WenxianShi @menghua_wu @MinkaiX @jure Regina @StefanoErmon Fan. And moderator Eric! #NeurIPS2023 see you next yr!
@genbio_workshop
GenBio Workshop @Neurips2023
6 months
On behalf of all the organizers, we'd like to say a huge THANK YOU to everyone who came out to our workshop this year!!! We were honored to have so many speakers, attendees, and amazing conversations. It was our first time organizing, so thank you so much for making it a blast 🎉
1
4
39
0
2
17
@lileics
Lei Li
2 years
Welcome to the career panel at the #AAAI2022 Doctoral Consortium today (2/23) 5:50pm - 6:30pm PST. Panelists include Yuandong Tian from FAIR, Matt Gombolay from Gatech, Ferdinando Fioretto from Syracuse U., and Ruben Glatt from LLNL. @tydsh @MatthewGombolay @nandofioretto @RealAAAI
@lileics
Lei Li
2 years
#AAAI2022 Doctoral Consortium schedule is available at
0
0
2
1
2
16
@lileics
Lei Li
3 years
Happening Monday 6-7am EDT (=6pm Beijing time) in the Machine Translation session of #ACL2021NLP : we will present two multilingual machine translation papers (mRASP2 and LaSS, 1st and 5th). Welcome to attend. More details to follow.
0
0
15
@lileics
Lei Li
10 months
Great keynote talk by @erichorvitz at #KDD2023 . I was already familiar with many of the results, yet the talk was truly inspirational and insightful! Really love the list of challenges for LLMs and AI broadly. Excellent topics for (multiple) PhD theses!
0
2
14
@lileics
Lei Li
3 years
Lihua Qian will present our paper “Glancing Transformer for non-autoregressive neural machine translation” during the MT session 5pm PDT (=8am Beijing time) at #ACL2021NLP . Notably Glancing Transformer (or GLAT) achieves the top BLEU score at machine translation contest WMT21!
1
1
14
@lileics
Lei Li
3 years
Welcome to attend Responsible Machine Learning Summit on AI and Social Good tomorrow. Keynote talks from Russell and Sandholm. Open to public (virtual).
@ucsbcrml
UCSB Center for Responsible Machine Learning
3 years
We are looking forward to tomorrow's virtual summit! Keynote Talk: Stuart Russell "Human Compatible AI" Keynote Talk: Tuomas Sandholm "Sample Complexity of (Automated) Mechanism Design" registration/info:
0
5
7
0
0
14
@lileics
Lei Li
8 months
When evaluating the quality of AI generated text, we do not just want a score. We also want an explanation! Why is this translation incorrect? What errors does it contain? How severe are they? We normally need a linguistic expert. Our InstructScore is exactly a model for that.
@WendaXu2
Wenda Xu
1 year
What is missing in the text generation evaluation for BERTScore, BLEURT, COMET, SEScore & SEScore2? Explanation! Can we build a metric that not only produces a well-correlated quality score but also tells you the rationales, error type, and error location? Check out InstructScore!
7
14
86
1
1
13
@lileics
Lei Li
11 months
Recognizing and associating knowledge from heterogeneous sources is challenging. I am honored to join the panel at the #ACL2023NLP Matching workshop. Thanks to @estevamhruschka and colleagues for organizing this wonderful workshop!
@Matching_wksh
Matching Workshop
1 year
@lileics will be joining our Matching workshop panel on July 13th at #ACL2023 ! #ai #MachineLearning #NLProc #data @aclmeeting
0
4
7
1
0
14
@lileics
Lei Li
1 year
Congratulations to all students who participated in the ICPC @ucsbcs ! Big congratulations to the UCSB NLP team (Siqi Ouyang @siqi_ouyang , Kexun Zhang, Alfonso Amayuelas @AlfonAmayuelas ) on finishing as runner-up! NO, they are not using ChatGPT... #NLProc
@lileics
Lei Li
1 year
Impressive! What is the final standing?
1
1
2
1
2
14
@lileics
Lei Li
3 years
This is wonderful! and truly shows the power of cutting-edge machine translation research for the extremely challenging problem. MT is saving culture and history. @mohitban47 @byryuer
@unccs
UNC Computer Science
3 years
Check out this @UNC article+video about the amazing work being done by our @uncnlp 's Shiyue Zhang ( @byryuer ) on helping revitalize Cherokee language using NLP research, with Prof. Mohit Bansal ( @mohitban47 ) and Prof. Ben Frey ( @AMST_UNC @UNCLinguistics )! @UNCResearch @unccollege
0
12
31
2
0
13
@lileics
Lei Li
2 years
AAAI-23 invites your participation in the Doctoral Consortium. Students are encouraged to apply. You will receive advice on both research and career planning from senior researchers. Senior members are welcome to join mentor team. @RealAAAI
0
3
12
@lileics
Lei Li
3 years
Wonderful year for #NeurIPS . Congratulations to all my colleagues and collaborators @ucsbcs ! In addition to those, @yuxiangw_cs and his team also have 5 papers on differential privacy to appear in #NeurIPS2021
@WilliamWangNLP
William Wang
3 years
Congratulations UCSB #NLProc student and faculty authors who have contributed to 8 papers at #NeurIPS this year! We will present in the areas of QA, robust learning, MT, dialogue, ASR, GAN, XAI, and Vision & Language. #NeurIPS2021
1
8
53
1
1
12
@lileics
Lei Li
11 months
We will present GINSEW paper at 11am tmr (Wed Jul 26) in Hall 1 #708 . Welcome to drop by if you want to hear more about watermarking an LLM to prevent unauthorized distillation or model stealing. Paper: #ICML2023 @xuandongzhao @yuxiangw_cs
@xuandongzhao
Xuandong Zhao
11 months
Heading to Hawaii for #ICML2023 ! Excited to present our work on text/image watermarking & privacy protection for LLMs. If you're interested in building trustworthy #GenrativeAI , my co-authors @yuxiangw_cs @lileics @kexun_zhang and I would love to meet up and chat!
1
4
43
0
2
12
@lileics
Lei Li
2 years
Congratulations to @YejinChoinka for the MacArthur genius award! Well deserved! I and my group have learned so much from her research, and her many inspiring talks! Great encouragement for the #NLProc community.
@etzioni
Oren Etzioni
2 years
I couldn't be more proud of my brilliant @allen_ai and @uwcse colleague @YejinChoinka who just won a MacArthur genius grant. I've learned so much from her, and the best is yet to come!
0
13
119
0
0
11
@lileics
Lei Li
3 years
or you want to meta-teach meta-learning meta-students? :-) I guess zooming is not at meta-level yet.
@kchonyc
Kyunghyun Cho
3 years
my friends, how do you feel to have become meta employees? ;)
10
3
180
0
0
11
@lileics
Lei Li
2 years
Given a statement, is it possible to verify its veracity and provide underlying rationale for the prediction? Our new work on LOREN tries to combine logical reasoning and neural inference together. Welcome to the poster session and talk at #AAAI2022 . 2/3
@lileics
Lei Li
2 years
#AAAI2022 show is on. Welcome to attend our talks at AAAI. The first paper is about counterfactual generation of stories. Using constrained Monte-Carlo sampling, EDUCAT is able to generate plausible counter-factual stories. Talk happens in 1hr. 1/3
2
0
6
3
4
11
@lileics
Lei Li
6 months
Excellent talk by Ellen Zhong @ZhongingAlong on structure prediction based on cryoEM imaging. It is similar to human pose estimation from images, but more complex and with many more keypoints (a protein has many more atoms).
@ZhongingAlong
Ellen Zhong
6 months
You can find me at #NeurIPS23 🎷🎶 this week! I will be @workshopmlsb 💖 on Fri and the @genbio_workshop and the Deep Learning & Inverse Problems workshop on Sat. Please reach out if you want to chat!
0
3
26
0
2
11
@lileics
Lei Li
2 years
@WilliamWangNLP Badge unlocked -- 100% acceptance rate for all my submissions (3) to a single AI conference. Tip: better to focus on fewer but important ones! (the last time so close was CVPR 2021 with 4/5 accepted and ACL 2021 with 11/16 accepted.)
0
0
11
@lileics
Lei Li
6 months
Agree with most except 3. Evaluation based on benchmarks will no longer be the best way to evaluate LLMs. but there are other methods based on statistical inference, e.g. the Karr approach
@Thom_Wolf
Thomas Wolf
6 months
Some predictions for 2024 – keeping only the more controversial ones. You certainly saw the non-controversial ones (multimodality, etc) already 1. At least 10 new unicorn companies building SOTA open foundation models in 2024 Stars are so aligned: - a smart, small and dedicated
19
77
411
0
1
12
@lileics
Lei Li
8 months
Looking forward to collaborating with co-presenters to deliver #NAACL2024 tutorial on Combating Security and Privacy Issues in the Era of Large Language Models @muhao_chen @ChaoweiX @hhsun1 @AnimaAnandkumar
@LeonDerczynski
Leon Derczynski ✍🏻🌱🌧️
8 months
Our LLMsec tutorial is accepted to NAACL 24! "Combating Security and Privacy Issues in the Era of Large Language Models" w/ @muhao_chen , @ChaoweiX , @hhsun1 , @lileics & @AnimaAnandkumar See you in Mexico and online :)
0
5
32
0
2
11
@lileics
Lei Li
6 months
Welcome to visit our poster #609 , Thursday morning 10:45am-12:45pm. How to tell if an LLM reliably knows some knowledge? We propose the Karr method (knowledge assessment risk ratio). Paper: @qx_dong @ikekong
@lileics
Lei Li
6 months
I will attend #NeurIPS2023 in New Orleans with my students @kexun_zhang @ZhenqiaoSong @xuandongzhao @qx_dong . Excited to meet and discuss our most recent work on program generation, knowledge assessment, LLM tool use, watermarking, goal-conditioned RL, etc.
0
2
24
0
3
10
@lileics
Lei Li
2 years
I echo many of the reflections from @AlonHalevy about working in and leading industry labs. I wish I could have read this piece 6 years ago while founding my previous lab in industry.
0
0
10
@lileics
Lei Li
2 years
What are all time classic papers? Suggestions are welcome. @yuxiangw_cs and I are compiling a list of classics for our grad #MachineLearning course.
@yuxiangw_cs
Yu-Xiang Wang
2 years
What are some *timeless classics* among #MachineLearning papers in your opinion? I am compiling a list of papers for new grad students in my ML course with @lileics , for them to develop good research taste. #NeurIPS #icml #iclr
37
37
224
0
0
9
@lileics
Lei Li
3 years
5050 language pairs! wow!!
@WilliamWangNLP
William Wang
3 years
Our #EMNLP2021 paper "A Massively Multilingual Analysis of Cross-linguality in Shared Embedding Space" is now available! Our intrepid intern Alex Jones studied various linguistic factors in x-ling pretrained LMs for 101 langs and 5,050 lang. pairs. #NLProc
Tweet media one
1
9
46
0
1
10
@lileics
Lei Li
9 months
Kinda funny, I asked ChatGPT to create a tweet for me. see below. 🎙️ Hey yinzers at @CarnegieMellon ! 🖐️ Get ready to turn "yinzpiration" into action with my talk on "Empowering Responsible Use of Large Language Models"! 🤖💡 1/3
@lileics
Lei Li
9 months
Happy to give a talk at the @LTIatCMU Colloquium on "Empowering Responsible Use of Large Language Models" this Friday, Sep. 8, 12:30pm at Posner Hall A35. I will talk about how we can robustly detect AI-generated text and protect the copyright of LLMs. @SCSatCMU
1
12
96
0
0
10
@lileics
Lei Li
1 year
Welcome to my talk on Multilingual Machine translation at the Stats Seminar today 3:30pm at HSSB 1174. Thanks @ucsbpstat for organizing this event!
@ucsbcs
UCSB Computer Science Department
1 year
IN ONE HOUR: Head down to HSSB 1174 for CS Professor Lei Li’s talk on his research in multilingual machine translation systems. See you there!
0
0
1
0
1
8
@lileics
Lei Li
6 months
How to advance Foundation model for bio? Le: borrow techniques from nlp/cv, multimodal data, new objectives, consider cell as a system! Evan: Far less bio data. (Expensive) Human in the loop. Max: more data. More expert knowledge. Combine and align! Kyunghyun:build feedback loop
@lileics
Lei Li
6 months
Impact of gen ai bio on market? Kyunghyun: early stage ai drug startups are not ambitious enough! Rethink the whole process of drug discovery! (Not just on one compound) @kchonyc @genbio_workshop
1
0
3
0
0
10
@lileics
Lei Li
6 months
The first author (Hongqiao Chen @hongqiao_chen ) of this ToolDec paper is a high schooler. He reached USACO platinum-level and already shows great talent in research.
@lileics
Lei Li
6 months
See you this afternoon at 3pm at the NeurIPS MATH-AI workshop in room 217-219. Kexun @kexun_zhang will present our latest work on enabling accurate and generalizable tool use for LLMs. Paper:
0
2
6
0
0
9
@lileics
Lei Li
2 years
Our recent work CRT with @xuandongzhao @yuxiangw_cs makes it possible to train language models such as GPT/T5 on a private corpus without memorizing confidential information (e.g. a user's name, phone number, and address). It provides a provable confidentiality bound. #NLProc @ucsbcs
@xuandongzhao
Xuandong Zhao
2 years
Happy to share that our paper "Provably Confidential Language Modelling" got accepted to #NAACL2022 ! with @yuxiangw_cs , @lileics We propose a method to train language generation models while protecting the confidential segments #NLProc
0
1
11
0
0
9
@lileics
Lei Li
11 months
[ACL2023] Introducing LegoMT: building machine translation models like stacking lego blocks. This paper introduces a novel approach to translation using a modular and detachable multilingual neural network structure inspired by Lego building blocks. 1/2
@lileics
Lei Li
11 months
[ACL2023] We present Word Aligned Contrastive Learning for speech translation, which only needs one hour of data to train speech translation, especially suitable for low-resource languages. Paper: Code: #ACL2023NLP
0
0
7
0
1
9
@lileics
Lei Li
2 years
Do users engage more on video-sharing platforms if they are provided with more diverse content? We analyze the user daily stay time and their content diversity on two popular video sharing platforms. #KDD2022 1/2
0
0
8
@lileics
Lei Li
3 years
Excellent tips from @yisongyue for CS academic job applications!!
@yisongyue
Yisong Yue
3 years
Just updated my tips for CS Faculty Applications! Hope people find it useful!
2
50
231
0
1
9
@lileics
Lei Li
6 months
thank you for sharing your great insights at @genbio_workshop ! Highlights From Dr. Le Song: important to borrow techniques from nlp/cv to build foundation model for biology/drug design. integrate multimodal data. designing new learning objectives. should model cell as a system
@BiomapI
BioMap Inc.
6 months
Dr. Le Song, our CTO, delivered an impressive discussion at Generative AI and Biology Workshop @NeurIPSConf ! Kudos to the @genbio_workshop for hosting the discussion! #GenerativeAI #Biology #NeurIPS2023 #BioMap
3
1
9
0
0
10
@lileics
Lei Li
6 months
I am looking for one postdoc on Large Language Model safety and security through the CBI program. If the topic fits your interest, please feel free to reach out and apply at
0
0
8
@lileics
Lei Li
1 year
Google could try our algorithm to correct the factual error ==> VENCE correction algorithm.
@Grady_Booch
Grady Booch
1 year
Misinformation at scale; bullshit as a service. (The European Very Large Telescope- not the JWST - took the first optical photograph of an exoplanet in 2004.)
18
63
384
0
1
8
@lileics
Lei Li
8 months
Strongly support open source for academic research. Research has to be examined and verified by the public. A project that only claims to be "novel", "best", or "state-of-the-art" without publicly verifiable material is called PR.
@atalwalkar
Ameet Talwalkar
8 months
I agree with @ylecun . Relatedly, open-source is the *only* way for academic researchers to contribute. I’m surprised this is even being debated.
0
2
14
0
1
8
@lileics
Lei Li
2 years
Can language understanding skills acquired in one language be transferred to other (even dissimilar) languages? We observe that cross-lingual transfer works best for high-resource and similar languages (not surprisingly), but not for low-resource and distant ones. 1/3 #ICLR2022
0
1
8
@lileics
Lei Li
2 years
Multilingual joint training proves beneficial for getting a unified translation model for many languages. Our work switch-GLAT shares an important insight: a truly multilingual translation model benefits from the capability of generating word-level switched codes. 1/3 #ICLR2022
@SongZhenqiao
Jocelyn Song
2 years
Our paper "switch-GLAT: Multilingual Parallel Machine Translation via Code-Switch Decoder" will be presented at ICLR 2022 poster session 1 on April 25th at 5:30 pm. Welcome to discuss in GatherTown!
0
0
1
0
0
8
@lileics
Lei Li
3 years
Finding novel chemical molecules for drug discovery is just like generating sentences in natural language. Check out the latest #ICLR2021 paper: MARS with code at
0
0
8
@lileics
Lei Li
5 months
@ysu_nlp Thank you for hosting me! Wonderful to meet with faculty and students at OSU. Great to learn all the exciting work here!
0
0
8
@lileics
Lei Li
3 years
Fast Transformer training and inference with LightSeq 2.0. The new release introduces accelerated training on GPU with optimized CUDA kernels (3x speedup). It is suitable for machine translation, text generation, text classification, matching, and many other NLP tasks.
0
1
8
@lileics
Lei Li
8 months
Beautiful place to do research! 👇
@cestlemieux
Caroline Lemieux
8 months
For those on the academic job market: @UBC_CS has two positions open in AI/ML this year! Come join us :). Full ad here:
3
7
56
0
0
7
@lileics
Lei Li
1 year
Excited to have Denny Zhou visiting UCSB!
@WilliamWangNLP
William Wang
1 year
We are super excited about Google Brain's @denny_zhou 's visit to UCSB Computer Science next week. He will talk about "Teach Language Models to Reason." Date: Monday, January 16th, 2022. Time: 3:30 - 4:30 pm. Location: Henley 1010. See you there!
2
9
58
0
0
7
@lileics
Lei Li
3 years
Or use ocean power, which we have a lot of 😄
@WilliamWangNLP
William Wang
3 years
Just right after the first week of a new academic quarter, our group brought down the entire rack of GPU servers due to excessive current draw... 😅 The need for #GreenAI and #GreenNLProc is imminent.
1
0
30
0
0
7
@lileics
Lei Li
3 years
Releasing two datasets for text summarization. The first is a large-scale document summarization dataset for Chinese news, with 300k articles, an average length of 730 characters, and human-written summaries. It includes additional annotations of adequacy & deducibility. #NLProc 1/2
0
1
7
@lileics
Lei Li
2 years
Shocked. I was explaining his ResNet to students in this year’s 165B deep learning course. A huge loss for the community. R.I.P Dr. Sun.
@YiMaTweets
Yi Ma
2 years
I was shocked to know that Dr. Jian Sun, my former colleague of the MSRA Visual Computing Group, has passed away. We will miss him dearly. May his soul rest in peace.
20
61
603
0
0
7
@lileics
Lei Li
11 months
[ACL2023] We present Word Aligned Contrastive Learning for speech translation, which only needs one hour of data to train speech translation, especially suitable for low-resource languages. Paper: Code: #ACL2023NLP
@lileics
Lei Li
11 months
[ACL2023] On the second day of the poster session, I presented our work WACO on speech translation. I also managed to put up two other posters: Lego-MT on multilingual translation and SEScore2, an automatic evaluation metric. Indeed, people asked to hear about all three papers.
0
1
64
0
0
7
@lileics
Lei Li
11 months
[ACL2023] LegoMT: We trained a unified translation model for 433 languages, using seven languages as pivots. This model covers the largest number of languages, more than NLLB. It is relatively compact, with only 1.2 billion parameters (NLLB has 54B). Paper:
@lileics
Lei Li
11 months
[ACL2023] Introducing LegoMT: building machine translation models like stacking lego blocks. This paper introduces a novel approach to translation using a modular and detachable multilingual neural network structure inspired by Lego building blocks. 1/2
0
1
9
0
1
7
@lileics
Lei Li
11 months
Come tomorrow (Tue July 25), 11-12:30, to Exhibit Hall 1, poster 544, for details of ReDi: a generic and efficient inference method for diffusion models. @kexun_zhang and I will be there. Welcome to drop by. Paper: Code: #ICML2023
@lileics
Lei Li
11 months
ReDi improves the inference speed of diffusion models by 2x and works with any underlying sampler. The core idea is to generate a few initial steps using the sampler, retrieve similar trajectories, and use the retrieved trajectory to continue polishing the image. #ICML2023
0
0
4
0
0
7
@lileics
Lei Li
2 years
Cool! PyTorch supports training on Mac's M1 GPU! Time to heat my computer!
@PyTorch
PyTorch
2 years
We’re excited to announce support for GPU-accelerated PyTorch training on Mac! Now you can take advantage of Apple silicon GPUs to perform ML workflows like prototyping and fine-tuning. Learn more:
79
710
3K
0
0
7
@lileics
Lei Li
8 months
Excellent initiative! Looking forward to this new dedicated conference on large language models!
@srush_nlp
Sasha Rush
8 months
Introducing COLM () the Conference on Language Modeling. A new research venue dedicated to the theory, practice, and applications of language models. Submissions: March 15 (it's pronounced "collum" 🕊️)
34
437
2K
0
0
7
@lileics
Lei Li
2 years
@roydanroy some biased stats: 4 former colleagues (including myself) moved to universities, and another is considering. one person was moving from university to industry. hard to compare the rate.
2
0
6
@lileics
Lei Li
2 years
Xianjun from @ucsbcs wrote a detailed blog about speech translation and a recent model, Chimera. The core idea is motivated by neural cognitive science -- a model produces similar representations for speech and its corresponding transcript text in a shared space. #NLProc 3/N
@lileics
Lei Li
2 years
Huake wrote a blog about the recently popular learned metrics for text generation and MT, BERTScore and COMET. These two have shown high correlation with recent human evaluation metrics such as MQM (of @markuseful ). #NLProc 2/N @ucsbNLP
1
0
3
1
3
6
@lileics
Lei Li
2 years
The #AAAI2022 show is on. Welcome to attend our talks at AAAI. The first paper is about counterfactual generation of stories. Using constrained Monte-Carlo sampling, EDUCAT is able to generate plausible counterfactual stories. The talk happens in 1hr. 1/3
2
0
6
@lileics
Lei Li
2 years
Not attending #ACL2022 in person? Do not miss the opportunity to interact with authors of seven papers at ACL in virtual sessions. I will definitely join May 24 11am session if not all. Here is a list.
0
1
6
@lileics
Lei Li
1 year
2023 AAAI Doctoral Consortium full schedule is here: @RealAAAI #AAAI23
@lileics
Lei Li
1 year
Join us at the 2023 AAAI Doctoral Consortium. This year we have 20 PhD students from all areas of AI and 20 senior mentors. Don't miss the keynotes from Amy Zhang ( @yayitsamyzhang ) and Pulkit Agrawal ( @pulkitology ). All at AAAI are welcome. @RealAAAI #AAAI23 @j_foerst
0
3
24
0
1
6
@lileics
Lei Li
6 months
Cannot wait to hear from our great lineup of speakers and panelists at the #NeurIPS2023 Generative AI for Biology workshop tomorrow, Dec 16, in room 265-268. @genbio_workshop full schedule:
@genbio_workshop
GenBio Workshop @Neurips2023
6 months
Hope to see you all tomorrow! #NeurIPS2023
0
5
30
0
2
6