Haoyi Qiu @HaoyiQiu Twitter profile

Pinned Tweet

Haoyi Qiu

@HaoyiQiu

28 days

🥳 Excited to share that our work on LVLMs evaluation will appear at #ACL2024 ! 🇹🇭

Haoyi Qiu

@HaoyiQiu

2 months

🔍Hallucination or informativeness? 🤔Our latest research unveils a multi-dimensional benchmark and an LLM-based metric for measuring faithfulness and coverage in LVLMs. Explore our new method for a more reliable understanding of model outputs! 📣

10

26

111

2

6

75

Haoyi Qiu

@HaoyiQiu

6 months

🏆 Thrilled to share that in my first time as a paper reviewer, I have been honored with the Best Reviewer Award for the Ethics in NLP track. Heartfelt thanks to the Area Chairs and EMNLP PC Chairs for this recognition. It is a fantastic learning experience! 🙏

6

5

106

Haoyi Qiu

@HaoyiQiu

2 months

🔥 Unlocking the power of Abstract Meaning Representations, AMRFact generates coherent, factually inconsistent summaries with high error-type coverage to improve factuality evaluation for abstractive summarization! 📣 Check out our new #NAACL2024 🇲🇽 work:

5

31

102

Haoyi Qiu

@HaoyiQiu

6 months

🌟 Exciting times at EMNLP! Just wrapped up a fantastic experience presenting a poster. A big shoutout to my amazing advisor @VioletNPeng for the incredible support and guidance. Also thanks to everyone from @uclanlp 🩷

0

3

78

Haoyi Qiu

@HaoyiQiu

8 months

🚨Model-based evaluation metrics like CLIPScore can unintentionally favor gender-biased captions in image captioning tasks! 📣 Check out our new #EMNLP2023 work: A joint effort with @ZiYiDou @Tianlu_Wang @real_asli and my amazing advisor @VioletNPeng .

5

15

66

Haoyi Qiu

@HaoyiQiu

3 months

🤗 Please check out our work 💥 It's a fantastic collaboration!!!

Kung-Hsiang Steeve Huang

@steeve__huang

3 months

📢 Excited to share our latest work: a comprehensive survey on chart understanding! We dive into the evolution of datasets, vision-language models, challenges, and future directions in this vibrant field 📊. 📝: 💻: 1/n

4

25

68

0

4

27

Haoyi Qiu

@HaoyiQiu

2 months

1️⃣Proposed AMRFact that uses AMR-based perturbations to generate factually inconsistent summaries, which allows for more coherent generation with high error-type coverage.

0

11

Haoyi Qiu

@HaoyiQiu

6 months

Received my certificate in advance since I will leave Singapore earlier to prepare for my visa renewal 🫶🏻🥰

0

7

Haoyi Qiu

@HaoyiQiu

2 months

2️⃣Devised a data validation module, NegFilter, that filters out invalid negative summaries to enhance the quality of the generated data.

0

9

Haoyi Qiu

@HaoyiQiu

2 months

3️⃣Our approach achieves state-of-the-art performance on the AggreFact-FTSOTA dataset.

0

9

Haoyi Qiu

@HaoyiQiu

2 months

A joint effort between @uclanlp and @uiuc_nlp with @steeve__huang @jingnong_qu and my amazing advisor @VioletNPeng 🩷

0

9

Haoyi Qiu

@HaoyiQiu

21 days

How does corrective feedback from interactions influence neural language acquisition? 🤔 Please check out this interesting paper by Martin!!!! 💥

Martin Ziqiao Ma ✈️ CVPR 2024

@ziqiao_ma

21 days

Happy to share some “slow research” - it's been 15 months and it's now (finally) on arXiv! Our language development is different from LLMs. We're asking: How do you interactively babysit a language model from scratch, and would it help?🤔 🔗 @Michigan_AI

6

18

110

1

0

8

Haoyi Qiu

@HaoyiQiu

3 months

@steeve__huang @PhilippeLaban @elgreco_winter I took this picture of him 😆

1

0

7

Haoyi Qiu

@HaoyiQiu

2 months

𝓒𝓸𝓷𝓰𝓻𝓪𝓽𝓼!! 𝓓𝓻. 𝓗𝓾𝓪𝓷𝓰 ᕕ( ᐛ )ᕗ

Kung-Hsiang Steeve Huang

@steeve__huang

2 months

Officially Dr. Huang! 🎓 Thrilled to share that I've successfully defended my PhD thesis. Immensely grateful for my inspiring advisor @hengjinlp , and the guidance of my thesis committee: ChengXiang Zhai, Kathleen McKeown, @haopeng_nlp , @hanzhao_ml , and @JotyShafiq #PhDDone

15

8

151

1

5

Haoyi Qiu

@HaoyiQiu

8 months

@ZiYiDou @Tianlu_Wang @real_asli @VioletNPeng @AIatMeta @UCLAengineering @uclanlp 1️⃣ We collected a dataset of profession, activity, and object concepts with stereotypical gender associations.

0

5

Haoyi Qiu

@HaoyiQiu

8 months

@ZiYiDou @Tianlu_Wang @real_asli @VioletNPeng @AIatMeta @UCLAengineering @uclanlp 2️⃣ We dived deep into this issue & found: Biased metrics can’t differentiate between unbiased & biased captions, affecting generation models.

0

5

Haoyi Qiu

@HaoyiQiu

8 months

@ZiYiDou @Tianlu_Wang @real_asli @VioletNPeng @AIatMeta @UCLAengineering @uclanlp 4️⃣ Additionally, we proposed a method to reduce metric bias, aligning closer with human judgments. Pushing for more inclusive evaluations!

0

5

Haoyi Qiu

@HaoyiQiu

8 months

@ZiYiDou @Tianlu_Wang @real_asli @VioletNPeng @AIatMeta @UCLAengineering @uclanlp 3️⃣ We also found that biased metrics can amplify the model-encoded gender biases through reinforcement learning.

0

4

Haoyi Qiu

@HaoyiQiu

2 months

@steeve__huang @hengjinlp @haopeng_nlp @hanzhao_ml @JotyShafiq @uiuc_nlp @IllinoisCS @uofigrainger 𝓒𝓸𝓷𝓰𝓻𝓪𝓽𝓼!! 𝓓𝓻. 𝓗𝓾𝓪𝓷𝓰 🥰

0

4

Haoyi Qiu

@HaoyiQiu

2 months

1️⃣We propose an LLM-based two-stage evaluation framework VALOR-EVAL that generalizes previous methods by introducing semantic matching and incorporates both the faithfulness and coverage aspects into our evaluation.

0

3

Haoyi Qiu

@HaoyiQiu

2 months

🖥️ GitHub: haoyiq114/VALOR (github.com)

Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models

0

2

Haoyi Qiu

@HaoyiQiu

2 months

3️⃣ (Cont.) Our benchmark highlights the critical balance between faithfulness and coverage of model outputs, and encourages future works to address hallucinations in LVLMs while keeping their outputs informative.

0

2

Haoyi Qiu

@HaoyiQiu

2 months

4️⃣To compare our LLM-based framework with LLM-free evaluation, we measured the average accuracy of the hallucinated and covered objects detected by each metric. Results demonstrate that VALOR-EVAL outperforms LLM-free evaluation by a large margin in both faithfulness and coverage accuracy.

0

2

Haoyi Qiu

@HaoyiQiu

2 months

2️⃣ We introduce a comprehensive multi-dimensional benchmark, named VALOR-BENCH, dedicated to the evaluation of LVLMs, with a particular focus on measuring hallucinations in generative tasks.

0

2

Haoyi Qiu

@HaoyiQiu

2 months

3️⃣We evaluate 10 mainstream LVLMs on VALOR-BENCH, focusing on the balance between faithfulness and coverage scores.

0

2

Haoyi Qiu

@HaoyiQiu

6 months

@roeschinc @uclanlp Thanks ☺️

0

1

Haoyi Qiu

@HaoyiQiu

2 months

A joint effort in @uclanlp with @gordonhu608 (contributed equally), @ZiYiDou and our amazing advisor @VioletNPeng .

0

1

Haoyi Qiu

@HaoyiQiu

2 months

@steeve__huang 😊

0

1

Haoyi Qiu

@HaoyiQiu

2 months

1️⃣ (Cont.) Our VALOR-EVAL can handle complex hallucination types involving objects, attributes, and relations in open-vocabulary captions generated by large vision-language models.

0

1

Haoyi Qiu

@HaoyiQiu

6 months

@steeve__huang @uclanlp Thanks Steeve 😁

0

1

Haoyi Qiu

@HaoyiQiu

2 months

2️⃣ (Cont.) Our benchmark categorizes hallucinations into three distinct types – object, attribute, and relation – offering a detailed understanding of model inaccuracies.

0

1

Haoyi Qiu

@HaoyiQiu

3 years

@skychwang Congratulations Sky !!!!!! 🎉🎉

1

0

1