Haoyi Qiu

@HaoyiQiu

525
Followers
589
Following
18
Media
73
Statuses

PhD student @UCLANLP 💙 BS in CS&Math @UMich 〽️ #NLP 🌷Trustworthy AI

Los Angeles, CA
Joined October 2018
Pinned Tweet
@HaoyiQiu
Haoyi Qiu
28 days
🥳 Excited to share that our work on LVLM evaluation will appear at #ACL2024! 🇹🇭
@HaoyiQiu
Haoyi Qiu
2 months
🔍Hallucination or informativeness? 🤔Our latest research unveils a multi-dimensional benchmark and an LLM-based metric for measuring faithfulness and coverage in LVLMs. Explore our new method for a more reliable understanding of model outputs! 📣
10
26
111
2
6
75
@HaoyiQiu
Haoyi Qiu
6 months
🏆 Thrilled to share that, in my first time as a paper reviewer, I have been honored with the Best Reviewer Award for the Ethics in NLP track. Heartfelt thanks to the Area Chairs and EMNLP PC Chairs for this recognition. It was a fantastic learning experience! 🙏
6
5
106
@HaoyiQiu
Haoyi Qiu
2 months
🔥 Unlocking the power of Abstract Meaning Representations, AMRFact generates coherent, factually inconsistent summaries with high error-type coverage to improve factuality evaluation of abstractive summarization! 📣 Check out our new #NAACL2024 🇲🇽 work:
5
31
102
@HaoyiQiu
Haoyi Qiu
6 months
🌟 Exciting times at EMNLP! Just wrapped up a fantastic experience presenting a poster. A big shoutout to my amazing advisor @VioletNPeng for the incredible support and guidance. Also thanks to everyone from @uclanlp 🩷
0
3
78
@HaoyiQiu
Haoyi Qiu
8 months
🚨Model-based evaluation metrics like CLIPScore can unintentionally favor gender-biased captions in image captioning tasks! 📣 Check out our new #EMNLP2023 work: A joint effort with @ZiYiDou @Tianlu_Wang @real_asli and my amazing advisor @VioletNPeng .
5
15
66
@HaoyiQiu
Haoyi Qiu
3 months
🤗 Please check out our work 💥 It's a fantastic collaboration!!!
@steeve__huang
Kung-Hsiang Steeve Huang
3 months
📢 Excited to share our latest work: a comprehensive survey on chart understanding! We dive into the evolution of datasets, vision-language models, challenges, and future directions in this vibrant field 📊. 📝: 💻: 1/n
4
25
68
0
4
27
@HaoyiQiu
Haoyi Qiu
2 months
1️⃣ Proposed AMRFact, which uses AMR-based perturbations to generate factually inconsistent summaries, allowing for more coherent generation with high error-type coverage.
0
0
11
@HaoyiQiu
Haoyi Qiu
6 months
Received my certificate in advance since I am leaving Singapore early to prepare for my visa renewal 🫶🏻🥰
0
0
7
@HaoyiQiu
Haoyi Qiu
2 months
2️⃣ Devised a data validation module, NegFilter, that filters out invalid negative summaries to improve the quality of the generated data.
0
0
9
@HaoyiQiu
Haoyi Qiu
2 months
3️⃣Our approach achieves state-of-the-art performance on the AggreFact-FTSOTA dataset.
0
0
9
@HaoyiQiu
Haoyi Qiu
2 months
A joint effort between @uclanlp and @uiuc_nlp with @steeve__huang @jingnong_qu and my amazing advisor @VioletNPeng 🩷
0
0
9
@HaoyiQiu
Haoyi Qiu
21 days
How does corrective feedback from interactions influence neural language acquisition? 🤔 Please check out this interesting paper by Martin!!!! 💥
@ziqiao_ma
Martin Ziqiao Ma ✈️ CVPR 2024
21 days
Happy to share some “slow research” - it's been 15 months and it's now (finally) on arXiv! Our language development is different from LLMs. We're asking: How do you interactively babysit a language model from scratch, and would it help?🤔 🔗 @Michigan_AI
6
18
110
1
0
8
@HaoyiQiu
Haoyi Qiu
3 months
Tweet media one
1
0
7
@HaoyiQiu
Haoyi Qiu
2 months
𝓒𝓸𝓷𝓰𝓻𝓪𝓽𝓼!! 𝓓𝓻. 𝓗𝓾𝓪𝓷𝓰 ᕕ( ᐛ )ᕗ
@steeve__huang
Kung-Hsiang Steeve Huang
2 months
Officially Dr. Huang! 🎓 Thrilled to share that I've successfully defended my PhD thesis. Immensely grateful for my inspiring advisor @hengjinlp , and the guidance of my thesis committee: ChengXiang Zhai, Kathleen McKeown, @haopeng_nlp , @hanzhao_ml , and @JotyShafiq #PhDDone
15
8
151
1
1
5
@HaoyiQiu
Haoyi Qiu
8 months
@ZiYiDou @Tianlu_Wang @real_asli @VioletNPeng @AIatMeta @UCLAengineering @uclanlp 1️⃣ We collected a dataset comprising profession, activity, and object concepts associated with stereotypical gender associations.
0
0
5
@HaoyiQiu
Haoyi Qiu
8 months
@ZiYiDou @Tianlu_Wang @real_asli @VioletNPeng @AIatMeta @UCLAengineering @uclanlp 2️⃣ We dived deep into this issue & found: Biased metrics can’t differentiate between unbiased & biased captions, affecting generation models.
0
0
5
@HaoyiQiu
Haoyi Qiu
8 months
@ZiYiDou @Tianlu_Wang @real_asli @VioletNPeng @AIatMeta @UCLAengineering @uclanlp 4️⃣ Additionally, we proposed a method to reduce metric bias, aligning closer with human judgments. Pushing for more inclusive evaluations!
0
0
5
@HaoyiQiu
Haoyi Qiu
8 months
@ZiYiDou @Tianlu_Wang @real_asli @VioletNPeng @AIatMeta @UCLAengineering @uclanlp 3️⃣ We also found that biased metrics can amplify the model-encoded gender biases through reinforcement learning.
0
0
4
@HaoyiQiu
Haoyi Qiu
2 months
0
0
4
@HaoyiQiu
Haoyi Qiu
2 months
1️⃣We propose an LLM-based two-stage evaluation framework VALOR-EVAL that generalizes previous methods by introducing semantic matching and incorporates both the faithfulness and coverage aspects into our evaluation.
0
0
3
@HaoyiQiu
Haoyi Qiu
2 months
3️⃣ (Cont.) Our benchmark highlights the critical balance between faithfulness and coverage of model outputs, and encourages future work to address hallucinations in LVLMs while keeping their outputs informative.
0
0
2
@HaoyiQiu
Haoyi Qiu
2 months
4️⃣ To compare our LLM-based framework to LLM-free evaluation, we measured the average accuracy of hallucinated and covered objects detected by each metric. Results demonstrate that VALOR-EVAL outperforms LLM-free evaluation by a large margin in both faithfulness and coverage accuracy.
0
0
2
@HaoyiQiu
Haoyi Qiu
2 months
2️⃣ We introduce a comprehensive multi-dimensional benchmark, named VALOR-BENCH, dedicated to the evaluation of LVLMs, with a particular focus on measuring hallucinations in generative tasks.
0
0
2
@HaoyiQiu
Haoyi Qiu
2 months
3️⃣ We evaluate 10 mainstream LVLMs on VALOR-BENCH, focusing on the balance between faithfulness and coverage scores.
0
0
2
@HaoyiQiu
Haoyi Qiu
6 months
0
0
1
@HaoyiQiu
Haoyi Qiu
2 months
A joint effort in @uclanlp with @gordonhu608 (contributed equally), @ZiYiDou and our amazing advisor @VioletNPeng .
0
0
1
@HaoyiQiu
Haoyi Qiu
2 months
0
0
1
@HaoyiQiu
Haoyi Qiu
2 months
1️⃣ (Cont.) Our VALOR-EVAL can handle complex hallucination types involving objects, attributes, and relations in open-vocabulary captions generated by large vision-language models.
0
0
1
@HaoyiQiu
Haoyi Qiu
6 months
@steeve__huang @uclanlp Thanks Steeve 😁
0
0
1
@HaoyiQiu
Haoyi Qiu
2 months
2️⃣ (Cont.) Our benchmark categorizes hallucinations into three distinct types – object, attribute, and relation – offering a detailed understanding of model inaccuracies.
0
0
1
@HaoyiQiu
Haoyi Qiu
3 years
@skychwang Congratulations Sky !!!!!! 🎉🎉
1
0
1