Lei Li @lileics Twitter profile

Lei Li

@lileics

3 years

Very exciting to join @ucsbcs . Excited to be working with great folks William and Shiyu, and many others! @WilliamWangNLP @CodeTerminator

William Wang

@WilliamWangNLP

3 years

UCSB #NLProc Group welcomes new faculty and co-Director Prof. Lei Li @lileics : Don’t forget to check out his #ACL2021NLP Best Paper on optimal transport for vocabulary reduction next week.

6

5

108

20

6

192

Lei Li

@lileics

3 years

Honored and flattered to receive the best paper award of #ACL2021NLP . Thanks to wonderful co-authors Jingjing, Hao, Chun and Zaixiang, to the extraordinary MLNLC team at Bytedance AI Lab, and zhenyuan, weiying, hang, and ACL committee for support! to Olympic sprinter #SuBingtian

ACL 2024

@aclmeeting

3 years

Congratulations to the authors who have won our paper awards! #ACL2021NLP #NLProc

0

28

138

2

18

109

Lei Li

@lileics

1 year

I’m excited to share that I’ve received an Amazon Research Award for my proposal "Real-time robust simultaneous interpretation with few samples" at UCSB. Learn more about the program on the @AmazonScience web: #AmazonResearchAwards @ucsbcs @UCSBengineering

79 Amazon Research Awards recipients announced

Awardees, who represent 54 universities in 14 countries, have access to Amazon public datasets, along with AWS AI/ML services and tools.

www.amazon.science

5

98

Lei Li

@lileics

9 months

Happy to give a talk at @LTIatCMU Colloquium on "Empowering Responsible Use of Large Language Models" this Friday Sep. 8 12:30pm at Posner Hall A35. I will talk about how we can robustly detect AI-generated text and protect the copyright of LLMs. @SCSatCMU

1

12

96

Lei Li

@lileics

2 years

How to teach an AI model to reason about concepts? We release a benchmark, E-KAR, for analogical reasoning in English and Chinese. It is extremely challenging (see example next). Few AI models could do it well. Welcome to try the task at #ACL2022

2

14

73

Lei Li

@lileics

11 months

The session chair told me that the fully zero-shot talk by @xuandongzhao was the best-received talk in the LLM session, attracting an audience of hundreds at #ACL2023NLP . A line of people were asking questions. @siqi_ouyang

0

6

69

Lei Li

@lileics

7 months

Over the years, several of my former colleagues at Bytedance AI lab left -- @weiyingma and Hao Zhou joined Tsinghua University as faculty, @QiuqiangK joined Chinese university of HK, I left two years ago to join UCSB and now CMU. I consider this as fulfilling one original mission

Yann LeCun

@ylecun

7 months

Ross Girschick ( @inkynumbers ) is leaving FAIR for AI2, following @sainingxie who joined the NYU faculty (yay!), and @georgiagkioxari who went to Caltech, and Hervé Jégou and @AlexDefosse who went to non-profit Kyutai. It's a loss for FAIR, but I'm happy for them. There is

23

58

1K

1

4

69

Lei Li

@lileics

11 months

[ACL2023] On the second day of the poster session, I presented the work WACO on speech translation. I managed to put up the other two posters on multi-language translation Lego-MT and automatic evaluation indicator SEScore2. Indeed, people asked to hear about all three papers.

Lei Li

@lileics

11 months

My group will present 5 papers at #ACL2023NLP . I will be onsite for all these papers. @xuandongzhao Siqi @WendaXu2 @jiangjie_chen Fei will join virtually on underline/gathertown. Welcome to talk to me or co-authors!

0

4

46

0

1

64

Lei Li

@lileics

6 months

We had a similar story. Our paper on vocabulary learning received all rejections at ICLR2021. We revised the paper and submitted it to ACL 2021. It received the best paper award. But I do not complain about the reviews. We did take the comments and revised substantially.

Information-theoretic Vocabularization via Optimal Transport for...

It is well accepted that the choice of token vocabulary largely affects the performance of machine translation. One dominant approach to construct a good vocabulary is the Byte Pair Encoding method...

openreview.net

Gautam Kamath

@thegautamkamath

6 months

You might have heard that word2vec, test of time award winner at #NeurIPS2023 , was rejected from #ICLR2013 back in the day. Interestingly, one reviewer felt so strongly that they recommended "Strong Reject" four times.

43

144

1K

0

4

57

Lei Li

@lileics

4 months

Using an LLM to evaluate its own generation quality will lead to bias --- it favors its own generation compared to other models'. What is the consequence of such bias in LLM's generation? Check out our new paper:

Perils of Self-Feedback: Self-Bias Amplifies in Large Language Models

Recent studies show that self-feedback improves large language models (LLMs) on certain tasks while worsens other tasks. We discovered that such a contrary is due to LLM's bias towards their own...

arxiv.org

Wenda Xu

@WendaXu2

4 months

[New paper!] Can LLMs truly evaluate their own output? Can self-refine/self-reward improve LLMs? Our study reveals that LLMs exhibit biases towards their output. This self-bias gets amplified during self-refine/self-reward, leading to a negative impact on performance. @ucsbNLP

5

53

210

2

11

49

Lei Li

@lileics

6 months

Anima Anandkumar delivered a highly impressive talk on generative models for predicting new mutations of coronavirus, generating sequence following phylogeny of SARS-COV-2 (genome-scale LM), and 2000x faster md simulation by NeuralMD @AnimaAnandkumar @genbio_workshop

Prof. Anima Anandkumar

@AnimaAnandkumar

6 months

Looking forward to my talk tomorrow afternoon at #NeurIPS2023 @genbio_workshop

1

2

25

0

4

45

Lei Li

@lileics

4 months

Introducing LingoLLM. LLMs like GPT4 can still fail to process an endangered language such as Gitksan and Manchu. We develop a generic and training-free method to enable language understanding and translation capability for any new language (with some linguistic description).

Kexun Zhang

@kexun_zhang

4 months

🚀Fire linguists in the LLM era? No! Excited to share LingoLLM, a novel method for processing endangered languages with linguistic resources. LingoLLM improves the translation of many endangered languages from 0 to 10.5 BLEU! It helps other tasks as well!

8

45

158

0

7

44

Lei Li

@lileics

5 months

Glad and excited to give a talk in @CarnegieMellon @CyLab Jan 22. I will present the challenge about "how to detect AI generated text and images" and share our recent progress on AI watermark techniques.

CyLab

@CyLab

5 months

We’re excited to welcome @CarnegieMellon ’s own @lileics , Assistant Professor at @LTIatCMU , to the @CyLab Seminar series next Monday, January 22 at 12 p.m. ET to discuss attacks and robust watermarking for generative #AI .

1

0

1

3

42

Lei Li

@lileics

6 months

My students Danqing Wang ( @dqwang122 ) and Wenda Xu ( @WendaXu2 ) and I will be attending #EMNLP2023 in Singapore this week. We will bring four papers to present (Cooperative LLM agents, Fine-grained Explainable Metric(reward model), Auto planning LLM, and Multilingual Extrapo).

0

1

41

Lei Li

@lileics

11 months

The generation metric paper SEScore2 was at the best spot to present, right at the entrance of the exhibit hall at #ACL2023NLP . Huge enthusiasm even 30 mins after the session ended. Congrats to co-authors @WendaXu2 @WilliamWangNLP Mingxuan Wang and Xian Qian.

0

3

38

Lei Li

@lileics

8 months

Turning a pre-trained large language model into using any external tools (by api call), our latest ToolDec does not require any in-context example to demonstrate, nor any fine-tuning. ToolDec generalizes to orders of magnitude more tools.

Kexun Zhang

@kexun_zhang

8 months

😭Tired of in-context demos & docs for LLM tool use? 💰Too GPU-poor to tune LLMs for unseen tools? 🤬Frustrated with frequent syntax errors in tool calls? Check out our new preprint 𝐓𝐨𝐨𝐥𝐃𝐞𝐜 that addresses all these issues from the decoding side! 1/5

4

32

105

1

7

35

Lei Li

@lileics

4 months

you may still use ChatGPT to evaluate. but be aware that ChatGPT favors ChatGPT, GPT4's generation. (so even if Gemini/LLaMA may generate the same quality text, they won't be preferred by chatGPT). Such bias may be more severe if you use it for self-refinement.

Boyang "Albert" Li

@AlbertBoyangLi

4 months

So what happens to ChatGPT as an evaluation metric of text quality? Do we know exactly what we are evaluating?

0

1

3

5

36

Lei Li

@lileics

5 months

🚀🤖 Excited to drop some serious "Buckeye Brainpower" at #OSU ! 🌰📚 Join me as I unravel the secrets of "Self-assisting and Cooperative Large Language Models" at 12pm in POM301. 👾 I'll be spilling the beans on how LLMs can write code like it's a poetry slam and LLMs cooperate!

0

3

36

Lei Li

@lileics

1 year

Adapting a pre-trained language model to new tasks often requires in-context labeled examples for demonstration. Can LLM be adapted to new tasks in fully zero-shot setting (no fine-tuning, no in-context demo)? Check out NPPrompt (nonparametric prompt) for LLM at #ACL2023NLP 👇

Xuandong Zhao

@xuandongzhao

1 year

Thrilled to announce that our paper "Pre-trained Language Models can be Fully Zero-Shot Learners" has been accepted to #ACL2023 !

1

8

32

0

9

35

Lei Li

@lileics

6 months

If you want to study LLMs, MT, AI4Science(Drug/Protein design), and broader NLP in grad school, please apply to @LTIatCMU We have a great group of faculty: ddl: Dec 13!

Graham Neubig

@gneubig

7 months

If you want to study NLP, LLMs, or broader language technology in grad school, please apply to @LTIatCMU ! We have a great group of faculty covering many topics: I personally will be recruiting students on LLMs/agents/evaluation.

0

59

237

0

5

32

Lei Li

@lileics

3 years

Congratulations to Prof. Yu-Xiang Wang and his student Dheeraj of @ucsbcs winning the best student paper in #COLT2021 , the top theoretical machine learning conference! It is about Optimal Dynamic Regret in Exp-Concave Online Learning, very interesting! They’ll present on 8/16.

0

2

30

Lei Li

@lileics

11 months

My students and I are attending ICML at Honolulu Hawaii. We will present three main conference papers and four workshop papers. Topics are 2x faster diffusion algorithm, protein sequence design, protecting language model with invisible watermark. Welcome to ping me or stop by.

0

1

30

Lei Li

@lileics

2 years

Last quarter, students at UCSB CS taking my course on machine translation wrote a series of wonderful blogs about recent works in MT. In the following series, I will share some of these blogs. 1/N #NLProc @ucsbcs @ucsbNLP

0

4

29

Lei Li

@lileics

6 months

Highlights of Shuiwang Ji’s talk: generative models (g-spherenet, graphbp, symat) are effective for target based drug design, Crystalline Materials, and proteins! @ShuiwangJi @genbio_workshop

GenBio Workshop @Neurips2023

@genbio_workshop

6 months

Prof Wang is speaking now!

0

3

0

4

26

Lei Li

@lileics

1 year

Join us at 2023 AAAI Doctoral Consortium. This year we have 20 phd students from all areas of AI and 20 senior mentors. Don't miss the Keynotes from Amy Zhang( @yayitsamyzhang ) and Pulkit Agrawal ( @pulkitology ). All at AAAI are welcome. @RealAAAI #AAAI23 @j_foerst

0

3

24

Lei Li

@lileics

6 months

4. The return of academia. Academia is back as we saw at NeurIPS 2023. This is consistent with my observation at #NeurIPS2023 . My impression is that this year's conferences (and ICML, ACL/EMNLP) are no longer focusing on the performance scores, but on the interestingness of ideas.

Thomas Wolf

@Thom_Wolf

6 months

Some predictions for 2024 – keeping only the more controversial ones. You certainly saw the non-controversial ones (multimodality, etc) already 1. At least 10 new unicorn companies building SOTA open foundation models in 2024 Stars are so aligned: - a smart, small and dedicated

19

77

411

0

1

25

Lei Li

@lileics

6 months

I will attend #NeurIPS2023 in New Orleans with my students @kexun_zhang @ZhenqiaoSong @xuandongzhao @qx_dong excited to meet and discuss about our most recent work on program generation, knowledge assessment, LLM tool use, watermark, goal-conditioned RL etc.

0

2

24

Lei Li

@lileics

10 months

How to design drugs to kill bacteria. Danqing will present LSSAMP work on antimicrobial peptides design Wed 2pm in 201A and Tue 6pm. The core idea is generating the amino acid sequence based on secondary structure and quantized latent space. #KDD2023 Paper:

1

2

22

Lei Li

@lileics

2 years

The fastest review! I am on the editorial board for one submission to a journal. After I assigned the reviewers, one of the reviewers accepted, read and wrote a high-quality review within half a day. And it was Sunday afternoon! Amazing! Do these researchers need a holiday?

1

0

20

Lei Li

@lileics

1 year

@xwang_lk Well, I do not think the authors were aware of the bad implications of the name, since BAT is just short for bidirectional autoregressive talker. No need to over-interpret.

3

1

21

Lei Li

@lileics

1 year

Great mentor for our students!!! @WilliamWangNLP

UCSB Computer Science Department

@ucsbcs

1 year

This past January, Professor William Wang was awarded the 2023 CRA-E Undergraduate Research Faculty Mentoring Award. Read the COE news release, linked on the CS Website! Link in bio. Congratulations!

2

0

21

1

0

20

Lei Li

@lileics

6 months

🤓🚀Headed to #NUS to spill some digital tea LLM! Join my talk 'Self-assisting and Cooperative Large Language Models' where we'll decode how LLMs are turning into algorithm-solving ninjas, syntax-error police, and your personal study sidekick who learns from their blunders. 🦜💻

1

0

20

Lei Li

@lileics

7 months

The best outcome from a lab is its talent. I share a similar gratitude to Bytedance AI Lab. When @weiyingma and I started the lab seven years ago, one of the (three) missions we set, endorsed by the executive team, was to cultivate talents with the best supporting environment.

Yann LeCun

@ylecun

7 months

Ross Girschick ( @inkynumbers ) is leaving FAIR for AI2, following @sainingxie who joined the NYU faculty (yay!), and @georgiagkioxari who went to Caltech, and Hervé Jégou and @AlexDefosse who went to non-profit Kyutai. It's a loss for FAIR, but I'm happy for them. There is

23

58

1K

0

2

19

Lei Li

@lileics

2 years

Releasing MLGSum, another dataset for Text Summarization. It includes 1.2 million articles and summaries for 12 languages. Summaries have 30 tokens on average. Ideal for multilingual text generation. #NLProc 2/2

Contrastive Aligned Joint Learning for Multilingual Summarization

dqwang122.github.io

Lei Li

@lileics

3 years

Release two datasets for text summarization. The first is a large-scale document summarization for Chinese news, with 300k articles, average length 730 characters, and human written summary. Additional annotation of adequacy & deducibility. #NLProc 1/2

0

1

7

0

3

19

Lei Li

@lileics

6 months

Here are the papers and the presentation schedule at #EMNLP2023

Lei Li

@lileics

6 months

My students Danqing Wang ( @dqwang122 ) and Wenda Xu ( @WendaXu2 ) and I will be attending #EMNLP2023 in Singapore this week. We will bring four papers to present (Cooperative LLM agents, Fine-grained Explainable Metric(reward model), Auto planning LLM, and Multilingual Extrapo).

0

1

41

0

1

20

Lei Li

@lileics

2 years

Glad to connect with many friends at #KDD2022 . I will give a talk on User engagement and activeness modelling for social media platforms (Tuesday morning room 202A). The paper is available at

0

2

20

Lei Li

@lileics

3 years

Mingxuan and I will be giving #ACL2021NLP tutorial on Pre-training methods for Neural Machine Translation. The video is pre-recorded. We will be live during the QA session 10-11pm July 31 and 9-10am August 1. More info at #NLProc

0

1

19

Lei Li

@lileics

1 year

I will present @jiangjie_chen 's work on editing and correcting false natural language statements at #AAAI2023 tonight Feb 10, 7-9pm during the poster session and on Sunday Feb 12, 2:00pm in room 147B. Paper:

0

3

18

Lei Li

@lileics

9 months

Evaluation is super important in NLP, recommender systems, search engines, etc. This Friday (tomorrow) 12:30pm Fernando Diaz will bring a very exciting talk on Preference-based evaluation at @LTIatCMU colloquium. Highly encouraged to attend! @SCSatCMU @MonaDiab77 @dmort27

0

1

16

Lei Li

@lileics

3 years

UCSB NLP group is admitting multiple Ph.D. students for 2022. Welcome to apply! With a beautiful campus, beach and sunshine, it is a great place to do research. @ucsbcs

William Wang

@WilliamWangNLP

3 years

UC Santa Barbara NLP Group is recruiting Ph.D. Students for 2022! PLEASE NOTE: The UCSB Computer Science department has eliminated the GRE requirement for Fall 2022 applicants. Everyone is welcome to apply! #NLProc

3

9

71

0

17

Lei Li

@lileics

3 years

Human brain tends to activate the same zone and paths when listening to voice and reading text. Inspired by this brain activity pattern, Chimera learns shared semantic space for speech translation. #ACL2021 check out more at

[ACL Findings 2021] Chimera: Uniformly Representing Speech and Text...

# Published on ACL Findings 2021:Learning Shared Semantic Space for Speech-to-Text Translation; Arxiv: https://arxiv.org/abs/2105.03095; Github: https://gith...

www.youtube.com

1

2

16

Lei Li

@lileics

9 months

Congratulations @dan_fried !

Language Technologies Institute | @CarnegieMellon

@LTIatCMU

9 months

Congrats to our own Daniel Fried on winning a 2023 Okawa Research Grant!

0

5

49

1

16

Lei Li

@lileics

6 months

Thanks to all the speakers, panelists, presenters, and participants! Thanks to my wonderful co-organizers (who put great effort) for @genbio_workshop @ZhenqiaoSong @WenxianShi @menghua_wu @MinkaiX @jure Regina @StefanoErmon Fan. And moderator Eric! #NeurIPS2023 see you next yr!

GenBio Workshop @Neurips2023

@genbio_workshop

6 months

On behalf of all the organizers, we'd like to say a huge THANK YOU to everyone who came out to our workshop this year!!! We were honored to have so many speakers, attendees, and amazing conversations. It was our first time organizing, so thank you so much for making it a blast 🎉

1

4

39

0

2

17

Lei Li

@lileics

2 years

Welcome to career panel at #AAAI2022 Doctoral Consortium Today (2/23) 5:50pm - 6:30pm PST. Panelists include Yuandong Tian from FAIR, Matt Gombolay from Gatech, Ferdinando Fioretto from Syracuse U. Ruben Glatt from LLNL @tydsh @MatthewGombolay @nandofioretto @RealAAAI

Lei Li

@lileics

2 years

#AAAI2022 Doctoral Consortium schedule is available at

0

2

1

2

16

Lei Li

@lileics

3 years

Happening Monday 6-7am EDT (=6pm Beijing time) in the Machine Translation session of #ACL2021NLP , we will present two multilingual machine translation papers (mRASP2 and LaSS, 1st and 5th). Welcome to attend. More details to follow.

0

15

Lei Li

@lileics

10 months

Great keynote talk by @erichorvitz at #KDD2023 . I am already familiar with many of the results, yet the talk was truly inspirational and insightful! Really love the list of challenges for LLM and AI broadly. Excellent topics for (multiple) PhD theses!

0

2

14

Lei Li

@lileics

3 years

Lihua Qian will present our paper “Glancing Transformer for non-autoregressive neural machine translation” during the MT session 5pm PDT (=8am Beijing time) at #ACL2021NLP . Notably Glancing Transformer (or GLAT) achieves the top BLEU score at machine translation contest WMT21!

1

14

Lei Li

@lileics

3 years

Welcome to attend Responsible Machine Learning Summit on AI and Social Good tomorrow. Keynote talks from Russell and Sandholm. Open to public (virtual).

UCSB Center for Responsible Machine Learning

@ucsbcrml

3 years

We are looking forward to tomorrow's virtual summit! Keynote Talk: Stuart Russell "Human Compatible AI" Keynote Talk: Tuomas Sandholm "Sample Complexity of (Automated) Mechanism Design" registration/info:

0

5

7

0

14

Lei Li

@lileics

8 months

When evaluating the quality of AI-generated text, we do not just want a score. We also want an explanation! Why is this translation incorrect? What errors does it contain? How severe are they? We normally need a linguistic expert. Our InstructScore is exactly a model for that.

Wenda Xu

@WendaXu2

1 year

What is missing in the text generation evaluation for BERTScore, BLEURT, COMET, SEScore & SEScore2? Explanation! Can we build a metric that not only produces a well-correlated quality score but also tells you the rationales, error type, and error location? Check out InstructScore!

7

14

86

1

13

Lei Li

@lileics

11 months

Recognizing and associating knowledge from heterogeneous sources is challenging. I am honored to join the panel at the #ACL2023NLP Matching workshop. Thanks to @estevamhruschka and colleagues for organizing this wonderful workshop!

MATCHING WORKSHOP @ ACL 2023

The goal of this workshop is to bring together the research communities (from academia and industry) of these related areas, that are interested in the development and the application of novel...

megagon.ai

Matching Workshop

@Matching_wksh

1 year

@lileics will be joining our Matching workshop panel on July 13th at #ACL2023 ! #ai #MachineLearning #NLProc #data @aclmeeting

0

4

7

1

0

14

Lei Li

@lileics

1 year

Congratulations to all students who participated in the ICPC @ucsbcs ! Big congratulations to the UCSB NLP team (Siqi Ouyang @siqi_ouyang , Kexun Zhang, Alfonso Amayuelas @AlfonAmayuelas ) on winning the runner-up! NO, they are not using ChatGPT... #NLProc

Lei Li

@lileics

1 year

Impressive! What is the final standing?

1

2

1

2

14

Lei Li

@lileics

3 years

This is wonderful! and truly shows the power of cutting-edge machine translation research for the extremely challenging problem. MT is saving culture and history. @mohitban47 @byryuer

UNC Computer Science

@unccs

3 years

Check out this @UNC article+video about the amazing work being done by our @uncnlp 's Shiyue Zhang ( @byryuer ) on helping revitalize Cherokee language using NLP research, with Prof. Mohit Bansal ( @mohitban47 ) and Prof. Ben Frey ( @AMST_UNC @UNCLinguistics )! @UNCResearch @unccollege

0

12

31

2

0

13

Lei Li

@lileics

2 years

AAAI-23 invites your participation in the Doctoral Consortium. Students are encouraged to apply. You will receive advice on both research and career planning from senior researchers. Senior members are welcome to join mentor team. @RealAAAI

0

3

12

Lei Li

@lileics

3 years

Wonderful year for #NeurIPS . Congratulations to all my colleagues and collaborators @ucsbcs ! In addition to those, @yuxiangw_cs and his team also have 5 papers on differential privacy to appear in #NeurIPS2021

William Wang

@WilliamWangNLP

3 years

Congratulations UCSB #NLProc student and faculty authors who have contributed to 8 papers at #NeurIPS this year! We will present in the areas of QA, robust learning, MT, dialogue, ASR, GAN, XAI, and Vision & Language. #NeurIPS2021

1

8

53

1

12

Lei Li

@lileics

11 months

We will present GINSEW paper at 11am tmr (Wed Jul 26) in Hall 1 #708 . Welcome to drop by if you want to hear more about watermarking an LLM to prevent unauthorized distillation or model stealing. Paper: #ICML2023 @xuandongzhao @yuxiangw_cs

Xuandong Zhao

@xuandongzhao

11 months

Heading to Hawaii for #ICML2023 ! Excited to present our work on text/image watermarking & privacy protection for LLMs. If you're interested in building trustworthy #GenrativeAI , my co-authors @yuxiangw_cs @lileics @kexun_zhang and I would love to meet up and chat!

1

4

43

0

2

12

Lei Li

@lileics

2 years

Congratulations to @YejinChoinka for the MacArthur genius award! Well deserved! I and my group have learned so much from her research, and her many inspiring talks! Great encouragement for the #NLProc community.

Oren Etzioni

@etzioni

2 years

I couldn't be more proud of my brilliant @allen_ai and @uwcse colleague @YejinChoinka who just won a MacArthur genius grant. I've learned so much from her, and the best is yet to come!

0

13

119

0

11

Lei Li

@lileics

3 years

or you want to meta-teach meta-learning meta-students? :-) I guess zooming is not at meta-level yet.

Kyunghyun Cho

@kchonyc

3 years

my friends, how do you feel to have become meta employees? ;)

10

3

180

0

11

Lei Li

@lileics

2 years

Given a statement, is it possible to verify its veracity and provide underlying rationale for the prediction? Our new work on LOREN tries to combine logical reasoning and neural inference together. Welcome to the poster session and talk at #AAAI2022 . 2/3

Lei Li

@lileics

2 years

#AAAI2022 show is on. Welcome to attend our talks at AAAI. The first paper is about counterfactual generation of stories. Using constrained Monte-Carlo sampling, EDUCAT is able to generate plausible counter-factual stories. Talk happens in 1hr. 1/3

2

0

6

3

4

11

Lei Li

@lileics

6 months

Excellent talk by Ellen Zhong @ZhongingAlong on structure prediction based on cryoEM imaging. It is similar to human pose estimation from images but more complex and more keypoints (many more atoms on a protein)

Ellen Zhong

@ZhongingAlong

6 months

You can find me at #NeurIPS23 🎷🎶 this week! I will be @workshopmlsb 💖 on Fri and the @genbio_workshop and the Deep Learning & Inverse Problems workshop on Sat. Please reach out if you want to chat!

0

3

26

0

2

11

Lei Li

@lileics

2 years

@WilliamWangNLP Badge unlocked -- 100% acceptance rate for all my submissions (3) to a single AI conference. Tip: better to focus on fewer but important ones! (the last time so close was CVPR 2021 with 4/5 accepted and ACL 2021 with 11/16 accepted.)

0

11

Lei Li

@lileics

6 months

Agree with most except 3. Evaluation based on benchmarks will no longer be the best way to evaluate LLMs. but there are other methods based on statistical inference, e.g. the Karr approach

Statistical Knowledge Assessment for Large Language Models

Given varying prompts regarding a factoid question, can a large language model (LLM) reliably generate factually correct answers? Existing LLMs may generate distinct responses for different...

arxiv.org

Thomas Wolf

@Thom_Wolf

6 months

Some predictions for 2024 – keeping only the more controversial ones. You certainly saw the non-controversial ones (multimodality, etc) already 1. At least 10 new unicorn companies building SOTA open foundation models in 2024 Stars are so aligned: - a smart, small and dedicated

19

77

411

0

1

12

Lei Li

@lileics

8 months

Looking forward to collaborating with co-presenters to deliver #NAACL2024 tutorial on Combating Security and Privacy Issues in the Era of Large Language Models @muhao_chen @ChaoweiX @hhsun1 @AnimaAnandkumar

Leon Derczynski ✍🏻🌱🌧️

@LeonDerczynski

8 months

Our LLMsec tutorial is accepted to NAACL 24! "Combating Security and Privacy Issues in the Era of Large Language Models" w/ @muhao_chen , @ChaoweiX , @hhsun1 , @lileics & @AnimaAnandkumar See you in Mexico and online :)

0

5

32

0

2

11

Lei Li

@lileics

6 months

Welcome to visit our poster. #609 Thursday morning 10:45am-12:45pm. How to tell if an LLM reliably knows some knowledge? We propose Karr method (knowledge assessment risk ratio). Paper: @qx_dong @ikekong

Lei Li

@lileics

6 months

I will attend #NeurIPS2023 in New Orleans with my students @kexun_zhang @ZhenqiaoSong @xuandongzhao @qx_dong excited to meet and discuss about our most recent work on program generation, knowledge assessment, LLM tool use, watermark, goal-conditioned RL etc.

0

2

24

0

3

10

Lei Li

@lileics

2 years

I echo many of the reflections from @AlonHalevy about working in and leading industry labs. I wish I could have read this piece 6 years ago when founding my previous lab in industry.

0

10

Lei Li

@lileics

2 years

What are all time classic papers? Suggestions are welcome. @yuxiangw_cs and I are compiling a list of classics for our grad #MachineLearning course.

Yu-Xiang Wang

@yuxiangw_cs

2 years

What are some *timeless classics* among #MachineLearning papers in your opinion? I am compiling a list of papers for new grad students in my ML course with @lileics , for them to develop good research taste. #NeurIPS #icml #iclr

37

224

0

9

Lei Li

@lileics

3 years

5050 language pairs! wow!!

William Wang

@WilliamWangNLP

3 years

Our #EMNLP2021 paper "A Massively Multilingual Analysis of Cross-linguality in Shared Embedding Space" is now available! Our intrepid intern Alex Jones studied various linguistic factors in x-ling pretrained LMs for 101 langs and 5,050 lang. pairs. #NLProc

1

9

46

0

1

10

Lei Li

@lileics

9 months

Kinda funny, I asked ChatGPT to create a tweet for me. see below. 🎙️ Hey yinzers at @CarnegieMellon ! 🖐️ Get ready to turn "yinzpiration" into action with my talk on "Empowering Responsible Use of Large Language Models"! 🤖💡 1/3

Lei Li

@lileics

9 months

Happy to give a talk at @LTIatCMU Colloquium on "Empowering Responsible Use of Large Language Models" this Friday Sep. 8 12:30pm at Posner Hall A35. I will talk about how we can robustly detect AI-generated text and protect the copyright of LLMs. @SCSatCMU

1

12

96

0

10

Lei Li

@lileics

1 year

Welcome to my talk on Multilingual Machine translation at the Stats Seminar today 3:30pm at HSSB 1174. Thanks @ucsbpstat for organizing this event!

UCSB Computer Science Department

@ucsbcs

1 year

IN ONE HOUR: Head down to HSSB 1174 for CS Professor Lei Li’s talk on his research in multilingual machine translation systems. See you there!

0

1

0

1

8

Lei Li

@lileics

6 months

How to advance Foundation model for bio? Le: borrow techniques from nlp/cv, multimodal data, new objectives, consider cell as a system! Evan: Far less bio data. (Expensive) Human in the loop. Max: more data. More expert knowledge. Combine and align! Kyunghyun:build feedback loop

Lei Li

@lileics

6 months

Impact of gen ai bio on market? Kyunghyun: early stage ai drug startups are not ambitious enough! Rethink the whole process of drug discovery! (Not just on one compound) @kchonyc @genbio_workshop

1

0

3

0

10

Lei Li

@lileics

6 months

The first author (Hongqiao Chen @hongqiao_chen ) of this ToolDec paper is a high schooler. He reached USACO platinum-level and already shows great talent in research.

Lei Li

@lileics

6 months

See you this afternoon 3pm at the NeurIPS MATH-AI workshop in room 217-219. Kexun @kexun_zhang will present our latest work on enabling accurate and generalizable tool use for LLMs. Paper:

0

2

6

0

9

Lei Li

@lileics

2 years

Our recent work CRT with @xuandongzhao @yuxiangw_cs make it possible to train language models such as GPT/T5 on private corpus without memorizing confidential information (e.g. user's name, phone number and address). It provides a provably confidential bound. #NLProc @ucsbcs

Xuandong Zhao

@xuandongzhao

2 years

Happy to share that our paper "Provably Confidential Language Modelling" got accepted to #NAACL2022 ! with @yuxiangw_cs , @lileics We propose a method to train language generation models while protecting the confidential segments #NLProc

0

1

11

0

9

Lei Li

@lileics

11 months

[ACL2023] Introducing LegoMT: building machine translation models like stacking lego blocks. This paper introduces a novel approach to translation using a modular and detachable multilingual neural network structure inspired by Lego building blocks. 1/2

Lei Li

@lileics

11 months

[ACL2023] We present Word Aligned Contrastive Learning for speech translation, which only needs one hour of data to train speech translation, especially suitable for low-resource languages. Paper: Code: #ACL2023NLP

0

7

0

1

9

Lei Li

@lileics

2 years

Do users engage more on video-sharing platforms if they are provided with more diverse content? We analyze the user daily stay time and their content diversity on two popular video sharing platforms. #KDD2022 1/2

0

8

Lei Li

@lileics

3 years

Excellent tips from @yisongyue for CS academic job applications!!

Yisong Yue

@yisongyue

3 years

Just updated my tips for CS Faculty Applications! Hope people find it useful!

2

50

231

0

1

9

Lei Li

@lileics

6 months

thank you for sharing your great insights at @genbio_workshop ! Highlights From Dr. Le Song: important to borrow techniques from nlp/cv to build foundation model for biology/drug design. integrate multimodal data. designing new learning objectives. should model cell as a system

BioMap Inc.

@BiomapI

6 months

Dr. Le Song, our CTO, delivered an impressive discussion at Generative AI and Biology Workshop @NeurIPSConf ! Kudos to the @genbio_workshop for hosting the discussion! #GenerativeAI #Biology #NeurIPS2023 #BioMap

3

1

9

0

10

Lei Li

@lileics

6 months

I am looking for one postdoc on Large Language Model safety and security through CBI program. If the topic fits your interest, please feel free to reach out and apply at

CBI Fellowship Program

Announcement Applications are invited for Carnegie Bosch Postdoctoral Fellows for the Fall Semester of 2024 supporting research that seeks to have a positive impact on society (e.g. sustainability)...

carnegiebosch.cmu.edu

0

8

Lei Li

@lileics

1 year

Google could try our algorithm to correct the factual error ==> VENCE correction algorithm.

Grady Booch

@Grady_Booch

1 year

Misinformation at scale; bullshit as a service. (The European Very Large Telescope- not the JWST - took the first optical photograph of an exoplanet in 2004.)

18

63

384

0

1

8

Lei Li

@lileics

8 months

Strongly support open source for academic research. Research has to be examined and verified by the public. A project that only claims to be "novel" or "best" or "state-of-the-art" but offers no publicly verifiable material is called PR.

Ameet Talwalkar

@atalwalkar

8 months

I agree with @ylecun . Relatedly, open-source is the *only* way for academic researchers to contribute. I’m surprised this is even being debated.

0

2

14

0

1

8

Lei Li

@lileics

2 years

Can language understanding skills acquired in one language be transferred to other (even dissimilar) languages? We observe that cross-lingual transfer works best for high-resource and similar languages (not surprisingly), but not for low-resource and distant ones. 1/3 #ICLR2022

0

1

8

Lei Li

@lileics

2 years

Multilingual joint training proves beneficial for getting a unified translation model for many languages. Our work switch-GLAT shares an important insight: a truly multilingual translation model benefits from the capability of generating word-level switched codes. 1/3 #ICLR2022

Jocelyn Song

@SongZhenqiao

2 years

Our paper "switch-GLAT: Multilingual Parallel Machine Translation via Code-Switch Decoder" will be presented at ICLR 2022 poster session 1 on April 25th at 5:30 pm. Welcome to discuss in GatherTown!

0

1

0

8

Lei Li

@lileics

3 years

Finding novel chemical molecules for drug purposes is just like generating sentences in natural language. Check out the latest #ICLR2021 paper: MARS with code at

0

8

Lei Li

@lileics

5 months

@ysu_nlp Thank you for hosting me! Wonderful to meet with faculty and students at OSU. Great to learn all the exciting work here!

0

8

Lei Li

@lileics

3 years

Fast Transformer training and inference with LightSeq 2.0. The new release introduces accelerated training on GPU with optimized CUDA kernels (3x speedup). It works for machine translation, text generation, text classification, matching and many other NLP tasks.

GitHub - bytedance/lightseq: LightSeq: A High Performance Library for Sequence Processing and...

LightSeq: A High Performance Library for Sequence Processing and Generation - bytedance/lightseq

github.com

0

1

8

Lei Li

@lileics

8 months

Beautiful place to do research! 👇

Caroline Lemieux

@cestlemieux

8 months

For those on the academic job market: @UBC_CS has two positions open in AI/ML this year! Come join us :). Full ad here:

3

7

56

0

7

Lei Li

@lileics

1 year

Excited to have Denny Zhou visiting UCSB!

William Wang

@WilliamWangNLP

1 year

We are super excited about Google Brain's @denny_zhou 's visit to UCSB Computer Science next week. He will talk about "Teach Language Models to Reason." Date: Monday, January 16th, 2022. Time: 3:30 - 4:30 pm. Location: Henley 1010. See you there!

2

9

58

0

7

Lei Li

@lileics

3 years

or use the ocean power, which we have a lot 😄

William Wang

@WilliamWangNLP

3 years

Just right after the first week of a new academic quarter, our group brought down the entire rack of GPU servers due to excessive current draw... 😅 The need for #GreenAI and #GreenNLProc is imminent.

1

0

30

0

7

Lei Li

@lileics

3 years

Releasing two datasets for text summarization. The first is a large-scale document summarization dataset for Chinese news, with 300k articles, an average length of 730 characters, and human-written summaries. Additional annotation of adequacy & deducibility. #NLProc 1/2

A Large-scale Chinese News Summarization Dataset with Human-annotated Adequacy and Deducibility...

A Large-scale Chinese News Summarization Dataset with Human-annotated Adequacy and Deducibility Level

dqwang122.github.io

0

1

7

Lei Li

@lileics

2 years

Shocked. I was explaining his ResNet to students in this year’s 165B deep learning course. A huge loss for the community. R.I.P Dr. Sun.

Yi Ma

@YiMaTweets

2 years

I was shocked to know that Dr. Jian Sun, my former colleague of the MSRA Visual Computing Group, has passed away. We will miss him dearly. May his soul rest in peace.

20

61

603

0

7

Lei Li

@lileics

11 months

[ACL2023] We present Word Aligned Contrastive Learning for speech translation, which only needs one hour of data to train speech translation, especially suitable for low-resource languages. Paper: Code: #ACL2023NLP

Lei Li

@lileics

11 months

[ACL2023] On the second day of the poster session, I presented the work WACO on speech translation. I managed to put up the other two posters, on the multilingual translation model Lego-MT and the automatic evaluation metric SEScore2. Indeed, people asked to hear about all three papers.

0

1

64

0

7

Lei Li

@lileics

11 months

[ACL2023] LegoMT: We trained a unified translation model for 433 languages, using seven languages as pivots. This model covers the largest number of languages -- more than NLLB. It is relatively compact, with only 1.2 billion parameters (NLLB has 54B). Paper:

Lei Li

@lileics

11 months

[ACL2023] Introducing LegoMT: building machine translation models like stacking lego blocks. This paper introduces a novel approach to translation using a modular and detachable multilingual neural network structure inspired by Lego building blocks. 1/2

0

1

9

0

1

7

Lei Li

@lileics

11 months

Come tomorrow (Tue July 25), 11-12:30, to Exhibit Hall 1, poster 544, for details of ReDi: a generic and efficient inference method for diffusion models. @kexun_zhang and I will be there. Welcome to drop by. Paper: Code: #ICML2023

Lei Li

@lileics

11 months

ReDi makes diffusion inference 2 times faster and works with any underlying diffusion sampler. The core idea is to generate a few initial steps using the sampler, retrieve similar trajectories, and use the retrieved trajectory to continue polishing the image. #ICML2023

0

4

0

7

Lei Li

@lileics

2 years

Cool! PyTorch supports training on Mac's M1 GPU! Time to heat my computer!

PyTorch

@PyTorch

2 years

We’re excited to announce support for GPU-accelerated PyTorch training on Mac! Now you can take advantage of Apple silicon GPUs to perform ML workflows like prototyping and fine-tuning. Learn more:

79

710

3K

0

7

Lei Li

@lileics

8 months

Excellent initiative! Looking forward to this new dedicated conference on large language models!

Sasha Rush

@srush_nlp

8 months

Introducing COLM () the Conference on Language Modeling. A new research venue dedicated to the theory, practice, and applications of language models. Submissions: March 15 (it's pronounced "collum" 🕊️)

34

437

2K

0

7

Lei Li

@lileics

2 years

@roydanroy some biased stats: 4 former colleagues (including myself) moved to universities, and another is considering. one person was moving from university to industry. hard to compare the rate.

2

0

6

Lei Li

@lileics

2 years

Xianjun from @ucsbcs wrote a detailed blog about speech translation and a recent model Chimera. The core idea is motivated by neural cognitive science -- a model produces similar representations for a speech and its corresponding transcript text in a shared space. #NLProc 3/N

Lei Li

@lileics

2 years

Huake wrote a blog about recently popular learned metrics for text generation and MT, BERTScore and COMET. These two have shown high correlation with recent human evaluation metrics like MQM (of @markuseful ). #NLProc 2/N @ucsbNLP

1

0

3

1

3

6

Lei Li

@lileics

2 years

#AAAI2022 show is on. Welcome to attend our talks at AAAI. The first paper is about counterfactual generation of stories. Using constrained Monte-Carlo sampling, EDUCAT is able to generate plausible counter-factual stories. Talk happens in 1hr. 1/3

2

0

6

Lei Li

@lileics

2 years

Not attending #ACL2022 in person? Do not miss the opportunity to interact with authors of seven papers at ACL in virtual sessions. I will definitely join the May 24 11am session, if not all of them. Here is a list.

0

1

6

Lei Li

@lileics

1 year

2023 AAAI Doctoral Consortium full schedule is here: @RealAAAI #AAAI23

AAAI-DC 2023

AAAI/SIGAI-23 Doctoral Consortium

aaaidc.github.io

Lei Li

@lileics

1 year

Join us at the 2023 AAAI Doctoral Consortium. This year we have 20 PhD students from all areas of AI and 20 senior mentors. Don't miss the keynotes from Amy Zhang ( @yayitsamyzhang ) and Pulkit Agrawal ( @pulkitology ). All at AAAI are welcome. @RealAAAI #AAAI23 @j_foerst

0

3

24

0

1

6

Lei Li

@lileics

6 months

Cannot wait to hear from our great lineup of speakers and panelists at #NeurIPS2023 Generative AI for Biology workshop tomorrow Dec 16 at room 265-268. @genbio_workshop full schedule:

GenBio Workshop @Neurips2023

@genbio_workshop

6 months

Hope to see you all tomorrow! #NeurIPS2023

0

5

30

0

2

6