Ben Blaiszik @BenBlaiszik Twitter profile | Pikagi

Ben Blaiszik

@BenBlaiszik

4,235 Followers

2,706 Following

507 Media

4,313 Statuses

Group Leader - AI and data infrastructure for science at @uchicago / @argonne / @globus. Materials, chemistry, physics. Opinions are my own. 🤖🔬

Chicago, IL

https://t.co/gsmto3IMOp

Joined March 2021

Pinned Tweet

@BenBlaiszik

Ben Blaiszik

1 year

We wrapped up the first LLM hackathon for applications in materials and chemistry last week. The results to me were astounding. We are at the point now where some tasks that took years can now be completed in days. Here is a list of the fantastic submissions!

Tweet media one

54

384

2K

@BenBlaiszik

Ben Blaiszik

2 years

Did you know you can create a Google Scholar profile for your research group? While GS is mostly used to track individual metrics, this process allows you to track and highlight team and project metrics. Here are 5 steps to get there 👇 #AcademicChatter #AcademicTwitter

Tweet media one

38

336

2K

@BenBlaiszik

Ben Blaiszik

2 years

It's been a wild week already in #ML / #AI for science. Advancements in using diffusion models for protein folding, using learned potentials to discover new catalyst materials, a proposed battery data genome to speed energy storage material discovery, and so much more! 🧵 (1/8)

9

155

853

@BenBlaiszik

Ben Blaiszik

3 months

Spent 2 hours this morning with Claude 3, and it's the most intensely I've been shocked yet. The Claude 3 Opus understanding of complex scientific topics is far ahead of GPT-4 on my self-made qualitative evals. I'd guess mid to advanced PhD level understanding of the topics

12

56

414

@BenBlaiszik

Ben Blaiszik

5 months

Had to recommend rejecting a paper today, which I hate doing. It was an ML application paper and after multiple rounds of asking, authors refused to share ~any~ data or code. 🚩How can such a paper be reasonably reviewed??

31

17

313

@BenBlaiszik

Ben Blaiszik

2 years

@MaximZiatdinov @antoniogm I think they are a lot further down the curve.

1

0

223

@BenBlaiszik

Ben Blaiszik

26 days

The academic feeling where everything is due all at once. 🫠

6

26

195

@BenBlaiszik

Ben Blaiszik

4 months

Looking for a way to easily access the highest quality ML-ready datasets in materials science and chemistry? Look no further! 🎉 Foundry-ML 🎉 Read the paper or visit the website below. 📰: Web ✨: A great partnership of

Tweet media one

4

32

121

@BenBlaiszik

Ben Blaiszik

9 months

@hankgreen A foul isn’t a third strike unless it’s a swinging foul tip that goes into the catcher’s glove - but a bunt attempt on a second strike that goes foul is a strikeout. Also if a catcher drops the third strike the runner must be tagged out or thrown out at first base. 😵‍💫

4

0

120

@BenBlaiszik

Ben Blaiszik

1 year

Andrew White talking about how to find papers as a chemist in 2023. @andrewwhite01 Recordings will be available soon!

Tweet media one

5

13

113

@BenBlaiszik

Ben Blaiszik

12 days

🚀How can we use LLMs to accelerate scientific discovery? Let's find out! This year, hundreds of people from across the globe worked together in a hackathon to BUILD groundbreaking prototypes — showing the path to breakthroughs in next generation batteries, sustainability,

Tweet media one

3

20

104

@BenBlaiszik

Ben Blaiszik

2 years

Post a picture of the @matplotlib plot you are most proud of. Extra credit: link to the code to reproduce the plot and any papers it may have appeared in! 📈

16

15

99

@BenBlaiszik

Ben Blaiszik

1 year

Defne Circi and Shruti Badhwar developed GraphInsight to automatically create materials knowledge graphs via entity and relationship extraction with GPT4 backed by @neo4j . Their prototype example showed development of a materials knowledge graph for polymer nanocomposites. 👏

@DCirci

Defne Circi

1 year

I had an amazing time being part of the LLM Materials / Chemistry Hackathon! I learned so much, and it was a pleasure to work with @yborg_2014 . Together, we worked on a tool that uses #LLMs and data visualization to create materials knowledge graphs. Check out our video!

10

16

133

3

9

93

@BenBlaiszik

Ben Blaiszik

3 months

✨Trillion Parameter Models in Science✨ We present an initial vision for a shared ecosystem to take the next step in large language models for scientific research – Trillion Parameter Models (TPMs). #LLM are becoming more powerful by the day. But, there is still work done to

Tweet media one

Tweet media two

3

17

84

@BenBlaiszik

Ben Blaiszik

1 year

@SamCox822 and Mark Caldras developed composable tools that combine structured queries to the Materials Project with LLM context creation, synthesis prediction, property prediction and more. @LangChainAI 🎉

@SamCox822

Sam Cox

1 year

Check out our submission for the LLM March Madness Hackathon! We created some tools for materials to use with LLMs, using LangChain and the Materials Project API. Check it out here! @Kyam888

2

7

38

2

6

80

@BenBlaiszik

Ben Blaiszik

6 months

It's been quite sad to see so many remote science collaboration and learning opportunities removed this year. I am all for in person, but remote options dramatically increase accessibility and visibility.

3

8

75

@BenBlaiszik

Ben Blaiszik

2 years

For the past few years, I've been tracking the number of AI/ML-related publications in scientific domains. As expected, it's been a record year for AI/ML in the sciences. Here is my final official update covering 2021 publications. #AI4Science #MachineLearning #OpenScience

Tweet media one

3

18

74

@BenBlaiszik

Ben Blaiszik

2 months

💡Unlocking New Frontiers: 2nd LLM Hackathon for Applications in Materials and Chemistry💡 Join us on May 8-9th for the 2nd Large Language Model Hackathon for Applications in Materials and Chemistry! This hybrid event is designed to connect brilliant minds to explore

3

29

72

@BenBlaiszik

Ben Blaiszik

1 year

@jakublala and Sean Warren developed a conversational interface to 3dmol.js showing the power of incorporating language models into future interface design. 🥳

@jakublala

Jakub Lála 🦸‍♂️

1 year

As a part of the #llmhack , me and Sean Warren developed MolVerse, a language interface to 3dmol.js: a web app for all chemists and biologists that want to visualise their (bio)molecules without needing to code. Here we load the structure of benzene:

6

34

183

1

6

69

@BenBlaiszik

Ben Blaiszik

2 years

We published a 780 GB dataset of quantum calculations this morning, and a 7.1 TB peptide assembly dataset this afternoon. Technical details on these soon. The Materials Data Facility offers researchers unrivaled capabilities to share materials data!

2

12

70

@BenBlaiszik

Ben Blaiszik

2 months

@MurerCorrigan @HeejungChung Were you interested in their work? Can always go either way with old research threads. 😅

0

0

69

@BenBlaiszik

Ben Blaiszik

2 years

What if you could access large datasets that are #machinelearning ready with just 2 lines of code? Or share your own ML-ready dataset with others just as easily? Well, now you can! 🧵 #datascience #openscience #matsci #data #academictwitter #materialsdatamonth @NIST @NSF

Tweet media one

3

21

64

@BenBlaiszik

Ben Blaiszik

1 month

🚀Excited to announce the speaker list for the LLM Hackathon for Applications in Materials and Chemistry. This year, we will hear from experts from industry and academia including 🔶Elsa Olivetti ( @OlivettiGroup , @MIT ), 🔷Marwin Segler ( @marwinsegler , @Microsoft ), 🔶Michael

Tweet media one

2

13

65

@BenBlaiszik

Ben Blaiszik

3 months

@dhuang26 @MaximZiatdinov Right, but the thing that shocked me is that it was able to come up with the solution we found that took top tier chemists ~1 year to formulate through various in-lab failures. Claude did this in one shot - for 5 cents. So, it gets potentially much easier to find fruitful paths

4

9

65

@BenBlaiszik

Ben Blaiszik

1 year

The #LLM hack kicked off with 3️⃣ inspiring speakers 🔷 @andrewwhite01 discussed ways to find chem information in 2023, a paper-qa agent extension, direct synthesis to property prediction via LLMs, and the answer to all my questions - it's not 42. #ML #AI

3

16

62

@BenBlaiszik

Ben Blaiszik

1 year

I deeply appreciate everyone who is here and engaging. But, I’m feeling quite heavy about the loss of a significant fraction of the Twitter science community and all that we’ve already collectively missed out on and what we will miss out on. 😔

6

3

62

@BenBlaiszik

Ben Blaiszik

2 years

For drug discovery we need to understand molecular orientation in 3 dimensions. The GEOM dataset (Simon Axelrod and Rafa Gómez-Bombarelli) contains 37M conformations for >450k molecules - quickly becoming a benchmark in the field of molecular generative modeling. @mit @Harvard

Tweet media one

2

7

62

@BenBlaiszik

Ben Blaiszik

2 years

I’m thrilled to announce the @NSF Garden project🌱. The AI4Science community has already seen tremendous successes. Yet, these advances are available to only a few specialists. We will make these advances ethically accessible to all researchers. 🧵

Tweet card media

UChicago/Argonne Researchers Will Cultivate AI Model “Gardens” With $3.5M NSF Grant - Department of...

Artificial intelligence (AI) is an increasingly essential tool for scientific discovery, helping researchers unlock new insights from growing pools of data. But AI also creates new barriers in...

cs.uchicago.edu

4

15

61

@BenBlaiszik

Ben Blaiszik

11 months

By far my favorite thing this summer is that our kids (9 and 5) have become best friends. 💜💜

Tweet media one

Tweet media two

3

0

59

@BenBlaiszik

Ben Blaiszik

2 years

We recently set up a Google Scholar profile for our research group! The idea is to allow the team - and others - to see one type of progress as a cohesive whole rather than only at the individual level. @ianfoster @chard_kyle

Tweet media one

4

5

59

@BenBlaiszik

Ben Blaiszik

1 year

📢ATTENTION! 🚨 Virtual hackathon alert! 🔥 Excited by @openai 's release of GPT-4? 💡We're bringing together the brightest minds to tackle problems via #LLM and create datasets in energy storage, materials/drug discovery and more. #LLMHackathon @andrewwhite01 @rachel_l_woods

Tweet media one

2

24

59

@BenBlaiszik

Ben Blaiszik

1 year

✨ Yay, I can finally announce Globus Compute! ✨ 💻 Globus Compute enables cloud-managed fire and forget computation, running functions on computers ranging from laptops to cloud to supercomputers. This sounds abstract, but it's already enabling insane productivity boosts.👇

Tweet media one

3

12

58
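A rough local analogy for the "fire and forget" pattern described in the Globus Compute announcement above: submit functions first, collect results later. This is a minimal sketch that uses only Python's standard-library `concurrent.futures`; it is not the Globus Compute API, and the `simulate` and `run_batch` names are hypothetical stand-ins chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(x: int) -> int:
    """Hypothetical workload standing in for a real science function."""
    return x * x


def run_batch(inputs):
    """Submit every task up front, then gather results in input order."""
    # Assumption: a local thread pool stands in for a remote endpoint
    # (laptop, cloud, or supercomputer in the announcement's wording).
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(simulate, x) for x in inputs]
        # Each future is a handle to work that may still be running;
        # result() blocks only when the value is actually needed.
        return [f.result() for f in futures]


if __name__ == "__main__":
    print(run_batch([1, 2, 3]))
```

The point of the design is that the caller holds lightweight futures rather than waiting on each call, which is what lets the same code shape scale from one machine to many.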

@BenBlaiszik

Ben Blaiszik

1 year

@sciliz @DEC0L0NIZE Add pizza and I'm totally onboard. I've really enjoyed doing 20x more outside than before.

1

0

57

@BenBlaiszik

Ben Blaiszik

10 months

It was a very scary last month. A series of events and, in particular, people lined up perfectly to very likely save my life. More details on that sometime later, but the good news is that I’m expected to make a complete and durable recovery with no long-term effects. It’s easy

23

1

57

@BenBlaiszik

Ben Blaiszik

2 years

Today I hit 1000 Twitter followers. ✨I know this is a small amount but it’s been amazing getting to know you all on #AcademicTwitter . Thank you everyone! With this new network, some amazing things in #OpenScience have already happened here…

1

4

54

@BenBlaiszik

Ben Blaiszik

1 month

📢 Attention students! 📢 We're a team of scientists, researchers, and software engineers dedicated to creating software and services to accelerate scientific progress. We build tools used by thousands of researchers to publish datasets and models, create state-of-the-art AI

1

21

53

@BenBlaiszik

Ben Blaiszik

1 month

Very intriguing work by Daniel Schwalbe-Koda et al. where they propose a new framework grounded in information theory that unifies key aspects of: 1. predictions of phase transformations, 2. kinetic events, 3. dataset optimality, and 4. model-free UQ from atomistic

Tweet media one

4

6

52

@BenBlaiszik

Ben Blaiszik

3 months

@MaximZiatdinov It still can't actually "do" anything, so you're safe for a while. But it did correctly guess the solution to what I thought was the hardest unpublished aspect of a very tricky materials/chem problem from my grad school days. It also had two other ideas, but the idea we

7

3

52

@BenBlaiszik

Ben Blaiszik

1 year

We are less than a week from the virtual hackathon for #LLM applications in materials science and chemistry. Here is some inspiration from others' applications to drive you! 🗓️ Register and start teaming today: #hackathon #openscience #ML #AI

LLM March Madness Materials / Chemistry Hackathon

Join a virtual hackathon exploring large language model applications in materials science & chemistry. Innovate, collaborate, & discover!

www.eventbrite.com

1

16

52

@BenBlaiszik

Ben Blaiszik

24 days

@bielleogy It’s actually 120% of my feed.

2

0

51

@BenBlaiszik

Ben Blaiszik

1 year

I really dislike large kickoff meetings (>10) that spend half the time with individual intros. Nobody is listening because they are figuring out how to describe themselves! Instead, set up a doc, spend 5 min having people write/read there. Also this is searchable in the future.

6

4

50

@BenBlaiszik

Ben Blaiszik

1 year

A team led by @QaiAlex fine tuned GPT-3 on examples from the open reaction database (ORD) and applied this to extract structured data from synthesis sections of papers. 🧑‍🔬

@QaiAlex

Qianxiang Ai

1 year

Use LLM to parse free text synthesis recipes to structured data! With only 300 training pairs, the fine tuned model can already pick up chemical identities/amounts and generate valid JSON in ORD schema. Check out our video demo!

4

10

70

1

3

49

@BenBlaiszik

Ben Blaiszik

2 years

@KevinKaichuang et al use diffusion models to generate novel foldable protein structures. (2/8)

@KevinKaichuang

Kevin K. Yang 楊凱筌

2 years

We present Folding Diffusion: a diffusion model for protein structure inspired by physical protein folding. Led by @Kevin_E_Wu during his internship, with @alexijielu @vdbergrianne @james_y_zou @avapamini Code: Preprint:

13

168

806

1

2

49

@BenBlaiszik

Ben Blaiszik

3 years

For the past few years, I've been tracking the number of AI/ML publications in scientific domains. With months of data yet for 2021, materials science, chemistry, and physics have surpassed their ML publications from 2020. Usually the numbers settle in by February... #AI4Science

Tweet media one

3

9

46

@BenBlaiszik

Ben Blaiszik

2 years

Pleased to announce the Materials Research Coordination Network (MaRCN)! 👾MaRCN👾 is an @NSF funded project focused on concepts in metadata standards, open science, shared benchmark problems, FAIR as applied to ML models and training protocols, and more.

Tweet card media

ARL Applauds NSF Open Science Investment - Association of Research Libraries

The Association of Research Libraries (ARL) commends the ongoing commitment of the US National Science Foundation (NSF) to open science. NSF today announced awards for 10 new projects focused on...

3

7

47

@BenBlaiszik

Ben Blaiszik

6 months

@kevinmawright @DataFacility publishes and hosts multi TB datasets - I think our largest is 10 TB. We see a lot of reuse in materials science and chemistry adjacent applications at least! For many datasets, the question of what to preserve long term is challenging though.

1

1

46

@BenBlaiszik

Ben Blaiszik

3 years

I'm writing a proposal on making #machinelearning in science more accessible. What are the biggest things that keep you from using ML in your research? Finding models, lack of expertise, adjusting for a new task, software and hardware incompatibilities, unknown model quality?

9

10

45

@BenBlaiszik

Ben Blaiszik

2 years

@DrAnneCarpenter Maybe it’s worth considering just posting the preprint and authors copy on your website? Google scholar makes these links automatically show up in search. 40k is too much!

1

0

44

@BenBlaiszik

Ben Blaiszik

6 months

@pauldauenhauer Between that income stream and all of the patent residuals, no more grant writing.

2

0

43

@BenBlaiszik

Ben Blaiszik

2 years

@_akhaliq presents NeuralPLexer to predict protein-ligand structures. This will help understand the interaction between small molecules and proteins. Also with diffusion models ✨ (3/8)

@_akhaliq

AK

2 years

Dynamic-Backbone Protein-Ligand Structure Prediction with Multiscale Generative Diffusion Models abs:

Tweet media one

2

35

163

3

3

44

@BenBlaiszik

Ben Blaiszik

2 years

I absolutely love the fact that there are people in our group that know more than me about many technical topics. It’s not a sign of weakness or insufficiency, it’s how you know you’re building a real team.

2

1

43

@BenBlaiszik

Ben Blaiszik

2 years

In academia this can be almost impossible to achieve, but I'm thankful every day that my parents live 10 min away and my kids are able to see them almost every day. Do I regret not applying to TT positions at random Unis? Nope. We're building something special in the lab and life

@SahilBloom

Sahil Bloom

2 years

Last year, I had a conversation that changed my life. It caused me to upend everything and move across the country. The lesson from it may change yours:

572

3K

19K

2

1

42

@BenBlaiszik

Ben Blaiszik

9 months

I’ve been working late nights to help my kid’s new school understand and optimize their air quality. With just a few free and simple tweaks, measurements today showed that we reduced CO2 concentration 2.5x already. This will improve attendance, improve learning, and lessen the

2

4

42

@BenBlaiszik

Ben Blaiszik

1 year

Twitter is clearly tweaking the algorithm, and not in a good way. Yesterday, I was getting a lot of replays of tweets I'd already seen. Also, noticed all week that I'm not seeing content from people I respect (that are tweeting), and more from random "influencer" accounts. 🙄

6

4

42

@BenBlaiszik

Ben Blaiszik

2 years

I have no idea how this works or how it could be healthy for a specific field. Counted roughly 170 authored publications in the first 9.5 months of 2022. Anyone know the story?

@DrMarkGriffiths

Mark Griffiths

2 years

Delighted to see that I had over 5000 citations in a 2-month period on Google Scholar to take me over 125,000 citations and an h-index of 170. Big thanks to all my co-authors @PsychologyNTU @NTUNews @BPSCyberPsych @ntu_research @daraghmcdermott @ProfEdwardPeck @NTUSocSciences

Tweet media one

22

3

31

11

1

41

@BenBlaiszik

Ben Blaiszik

23 days

The whole sky in Chicago area is pink with aurora. Wow.

6

2

41

@BenBlaiszik

Ben Blaiszik

1 year

Does it feel like you are seeing more impactful #ML and #AI for science publications? This probably explains it. 🚀 📈 We've seen strong continued growth in #AI and #ML for science across a broad set of domains including materials science, chemistry, physics and more.

Tweet media one

3

8

40

@BenBlaiszik

Ben Blaiszik

3 months

@Brttnyblm Had the same issue once. Ultimately found an unused one in the garage in a cabinet from previous owner. 👻

2

0

38

@BenBlaiszik

Ben Blaiszik

2 years

@WardLT2 and colleagues at @argonne , @INL , and more propose a Battery Data Genome to speed development of energy storage materials. Exploring in depth the needs for data, software, models, and community. @ENERGY @jam3es #OpenScience (7/8)

@WardLT2

Logan Ward

2 years

So excited our paper is published! It describes our on-going discussions started in Nov'19 on how to make battery data science possible and easier for a larger community of researchers

4

7

57

1

6

38

@BenBlaiszik

Ben Blaiszik

2 years

@Deepmind @AxelrodSimon show a learned force field performs as well as or better than traditional DFT methods for finding interesting configurations at catalyst surfaces via #OpenCatalyst. 🚀 (5/8)

@GoogleDeepMind

Google DeepMind

2 years

New research shows that learned force fields are ready for use in catalyst discovery. Here, ‘Easy Potentials’ outperform classical quantum chemistry tools on the #OpenCatalyst challenge and find lower energy structures outside of the training set: 1/2

Tweet media one

8

124

504

1

2

36

@BenBlaiszik

Ben Blaiszik

1 year

We are closing in on breakthroughs in fusion (maybe 15 years until deployment). Wind, solar, and energy storage costs are plummeting. AI breakthroughs every day. Technical problems can be solved, but how do we make similar progress on social issues?

11

0

36

@BenBlaiszik

Ben Blaiszik

3 years

Materials Genome Initiative (MGI) 2.0 is released! The MGI effort to speed the discovery and deployment of new materials enters its second decade boasting notable successes. @NIST @ENERGY @NSF @DARPA @NASA @DeptofDefense PDF: Web:

Tweet media one

1

9

36

@BenBlaiszik

Ben Blaiszik

3 months

Ok here is one example set. Microencapsulation of adhesive materials (e.g., cyanoacrylate and epoxy curing agent). Starting with a general question of how to encapsulate cyanoacrylate, Claude first identifies 3 of the main encapsulation techniques interfacial, in situ,

Tweet media one

@BenBlaiszik

Ben Blaiszik

3 months

Spent 2 hours this morning with Claude 3, and it's the most intensely I've been shocked yet. The Claude 3 Opus understanding of complex scientific topics is far ahead of GPT-4 on my self-made qualitative evals. I'd guess mid to advanced PhD level understanding of the topics

12

56

414

1

5

36

@BenBlaiszik

Ben Blaiszik

2 years

I miss in-person conferences, but have to admit something. I’ve made more diverse contacts and started more collaborations in 3 months of #AcademicTwitter than I’d expect from 2 yrs of a full slate of conferences. Online communication with the right audience is just so scalable.

2

2

37

@BenBlaiszik

Ben Blaiszik

10 months

Amazing to have @argonne called out specifically by @ericschmidt as one of the places advancing automated labs in his piece on how #Ai will transform science. Our team will be submitting an expansive article Monday to @digital_rsc - @A_Aspuru_Guzik reporting our team's efforts

Tweet media one

4

3

37

@BenBlaiszik

Ben Blaiszik

9 months

Taking notes: do not call @arxiv a cancer. I personally submit everything that I can to ArXiv because it means the important information gets to start percolating MONTHS earlier and perpetually for free to other researchers. Thank you to all those who tirelessly run these

3

1

35

@BenBlaiszik

Ben Blaiszik

1 year

Kevin Jablonka described his latest work on LLMs applied to chemical discovery. He is building software to make using these models ridiculously simple. @kmjablonka #llmhack @EPFL_en Recording available later

Tweet media one

1

4

36

@BenBlaiszik

Ben Blaiszik

4 months

🌟 Exciting News for Open Source Enthusiasts in Science! We're launching an innovative online hub to bridge the gap between talented developers and cutting-edge open source science projects. 🚀 Are you leading a project and eager to expand your team of contributors? Connect

3

12

35

@BenBlaiszik

Ben Blaiszik

2 years

Proteins are hot! A protein/structure generative model makes generation possible at previously inaccessible scales - from way back at the end of May. 🥵 (4/8)

@tachim

Tudor Achim

2 years

Super excited about all the possible applications for these new protein diffusion generative models from @namrata_anand2 ! The contrast between diffusion for 3D structures vs images is fascinating. Great to see this progress on the structural side.

3

66

252

1

2

35

@BenBlaiszik

Ben Blaiszik

3 years

It's amazing what a simple web interface widget can do to change user behavior. As such, Google Scholar has figured out that most researchers will do hours of work just to complete a progress bar... #OpenScience

Tweet media one

0

5

34

@BenBlaiszik

Ben Blaiszik

2 years

@andrewwhite01 You're doing some very creative work, and that is often lost on others. But we see you!

1

0

34

@BenBlaiszik

Ben Blaiszik

1 year

Elias Moubarak and team developed ClipDigest to summarize videos and fetch structured information from known databases (e.g., information on specific molecules mentioned) and speed learning. 📖

1

3

33

@BenBlaiszik

Ben Blaiszik

2 years

💫 That's it. Now you can track citations and impact for your research group or shared research projects! I hope this can help encourage focus and celebration of team metrics instead of just individual metrics. ✅ Follow me for more tips and tricks like this #MaterialsDataMonth

1

3

33

@BenBlaiszik

Ben Blaiszik

2 years

Looking to find materials informatics resources? 😎 Evgeny Blokhin and co. maintain an "Awesome List" of software, cloud services, dataset repositories, and standards. What are your favorite resources? Post them, and I will add them! #MaterialsDataMonth

Tweet media one

4

5

33

@BenBlaiszik

Ben Blaiszik

1 year

ScholarBERT, a science focused BERT model trained on 221B tokens from a scientific corpus, openly available on @huggingface. @LabsGlobus 🤗 link:

globuslabs (Globus Labs)

@BenBlaiszik

Ben Blaiszik

2 years

Zhi Hong et al have created ScholarBERT, the largest and most diverse scientific language model - trained on a 221B token scientific lit. dataset spanning disciplines. Interestingly, performance was similar among all BERT models across benchmark tasks.

Tweet media one

2

1

8

2

7

32

@BenBlaiszik

Ben Blaiszik

1 year

My mother called me today super excited ~3 times~. She wanted to tell me that she found a website that can write letters, do some math, and answer almost any question. So yeah, ChatGPT must be getting pretty close to the top of the saturation curve. 😅

2

0

32

@BenBlaiszik

Ben Blaiszik

2 years

Today I learned that Olympic figure skating gold medalist Nathan Chen's sister is a founder of likely CRISPR unicorn bio startup @mammothbiosci . It would be so interesting to hear about families like this in detail.

Tweet media one

1

1

32

@BenBlaiszik

Ben Blaiszik

2 years

We start off the month with a quick profile of the Center for Hierarchical Materials Design (CHiMaD). CHiMaD is a @NIST-funded flagship Materials Genome Initiative (MGI) research center, strongly embracing MGI AI/ML/data approaches #MaterialsDataMonth

Tweet media one

1

7

32

@BenBlaiszik

Ben Blaiszik

1 year

Just scripted a task that would have taken me 3h manually or 2h to automate previously. Now, took 15 min of scripting by passing the API documentation to ChatGPT and asking it for the script + minor debug time. Balance of time for writing scripts has drastically changed. 🪄

3

2

32

@BenBlaiszik

Ben Blaiszik

2 years

What are the most interesting open datasets you've used in materials science and chemistry? Share them here! We're always looking for more data to work with and share with others 🤓 #data #machinelearning #artificialintelligence #datascience #openscience #MaterialsDataMonth

2

13

30

@BenBlaiszik

Ben Blaiszik

1 year

@drecmb and @60jaHa developed BOLLaMa to provide a new interface to chemical reaction optimization using Bayesian Optimization and LLMs. @EPFL 🦙

@drecmb

Andrés M Bran

1 year

Meet BOLLaMA! If you have ever wondered how to optimize a chemical reaction quickly, Bayesian Optimization might be for you! It's not very accessible tho: BOLLaMA tackles this by introducing an LLM-powered assistant, running @6ojaHa 's BO in the backend! #llmhack

4

17

82

1

3

31

@BenBlaiszik

Ben Blaiszik

2 years

Great interview with Rafa Gomez-Bombarelli ( @MIT ), one of my favorite collaborators, discussing how #MachineLearning is used for materials discovery. Some light moments discussing ML-driven zeolite discovery, peptide "invention", drug discovery, and more.

Tweet card media

Designing New Energy Materials with Machine Learning with Rafael...

Today we’re joined by Rafael Gomez-Bombarelli, an assistant professor in the department of material science and engineering at MIT. In our conversation with ...

www.youtube.com

0

7

28

@BenBlaiszik

Ben Blaiszik

7 months

Announcing a novel framework that merges Time-Dependent DFT and #ML to expedite electronic stopping power predictions by a factor of 10M! 📄 @WardLT2 @aschleife @ianfoster @UofIllinois @UChicago @argonne @argonne_lcf #openscience

Tweet media one

1

6

30

@BenBlaiszik

Ben Blaiszik

1 year

@andrewwhite01 and I have something cooking up for Tuesday next week. If you're interested in applications of large language models for materials science and chemistry you're going to love it. More details as soon as possible! #LLM @rachel_l_woods

3

5

31

@BenBlaiszik

Ben Blaiszik

3 months

🌟 Updated AI/ML Publication Data for 2023! 🌟 It feels like we are hearing about new advancements in #ML and #AI every day now. But, is that translating to publications in the sciences? 💡 2023 Numbers 🚀 Materials Science soared with an 18.5% increase in publications over

Tweet media one

Tweet media two

1

5

30

@BenBlaiszik

Ben Blaiszik

1 year

@andrewwhite01 An under appreciated aspect of LLMs is that we may potentially solve new _classes_ of problems considered hopelessly complex by being able to connect the dots across millions of publications and other sources. The battle begins against aging, cancer, brain related illnesses...

1

2

30

@BenBlaiszik

Ben Blaiszik

3 months

I’ve been here on Twitter for 3 years now. It’s been so rewarding getting to know many of you and learn about research I never would have found through other venues. Thank you all! 💜💙💜 👩‍🔬🔬👨‍🔬 💜💙💜

0

0

30

@BenBlaiszik

Ben Blaiszik

2 months

As part of the Bayesian Optimization Hackathon today, I've started an "Awesome List" of relevant tutorials, software, tools, datasets, and more. If you have favorite resources, reply with them here, or issue a pull request to the repo Repo:

Tweet card media

GitHub - materials-data-facility/awesome-bayesian-optimization

Contribute to materials-data-facility/awesome-bayesian-optimization development by creating an account on GitHub.

@SterlingBaird1

Sterling G. Baird

2 months

I'm pumped for the BO hackathon tomorrow! Nearly 400 registrants and 36 confirmed projects and counting 🚀 The hackathon website with all the necessary info is at . See you soon!

3

2

41

2

7

27

@BenBlaiszik

Ben Blaiszik

11 months

I love this city in the summer. #chicago

Tweet media one

1

1

28

@BenBlaiszik

Ben Blaiszik

1 year

Batteries are a cornerstone of future clean energy and transportation. 😎 🌬️ 🔋 🚗 Today @argonne released electrochem data from >600 Li-ion pouch cells w/ varying chemistries. This is a treasure trove of data to speed #ML / #AI applications in energy storage. @ENERGY #OpenScience

@WardLT2

Logan Ward

1 year

My team and I just published 10 years of @argonne battery testing data on @DataFacility . 600 Li-ion cells with many chemistries:

1

10

64

3

5

29

@BenBlaiszik

Ben Blaiszik

2 months

This was a first for me. A new PI title - "Principal Investor".

Tweet media one

4

2

29

@BenBlaiszik

Ben Blaiszik

1 year

February 2023 #ML and #AI for Science thread includes: 🔷 Machine learned MOF potentials 🔶 Open brain MRI dataset 🔷 ML for protein design (Hot topic🔥) 🔶 New open solar material database 🔷 ChemNLP and #GPT3 modeling 🔶 National AI Research Resource Report +more #OpenScience

2

3

28

@BenBlaiszik

Ben Blaiszik

3 years

I'm just getting started with #AcademicTwitter . If you are interested in topics of #DataScience , #AI4Science , #AI , and data infrastructure especially as applied to discovery in materials science, physics, chemistry, and more, give me a follow!

0

5

28

@BenBlaiszik

Ben Blaiszik

2 years

If you EVER see the line “Data available upon reasonable request” in a manuscript you are working on, please send the lead researcher this way. There are petabytes of storage available for scientists, and your research is too valuable to remain hidden. We've got you. #OpenScience

3

8

29

@BenBlaiszik

Ben Blaiszik

2 years

It is thus no surprise that #ML4PS2022 @NeurIPSConf is hot sauce. We can only expect these breakthroughs to accelerate in the coming months and years. Submission has passed here, but please consider participating! (8/8)

@KyleCranmer

Kyle Cranmer

2 years

We got an astounding 253 submissions to our #ML4PS2022 workshop @NeurIPSConf ! Dear reviewers, we are counting on you! Don't forget the review deadline is October 14. Thank you to the reviewers, organizers, and authors for contributing to this workshop & fostering the community!

2

18

120

1

2

28

@BenBlaiszik

Ben Blaiszik

9 months

Imagine a future where labs combine human ingenuity, AI, HPC, and robotics with reconfigurable modules to realize endless discovery. From education to materials, to biology, we've prototyped 5 use cases. Let's take a look at some concepts! 🧵 @argonne @argonne_lcf @ENERGY

Tweet media one

Tweet media two

2

6

28

@BenBlaiszik

Ben Blaiszik

1 year

In 2023, we should incentivize and reward data products and software equivalently to journal pubs. As we approach an #AI -centric research enterprise, software, services, #openscience , and data products are key components that will automate, connect, and unify distributed effort.

Tweet media one

1

8

27

@BenBlaiszik

Ben Blaiszik

2 years

I finally convinced Logan Ward ( @WardLT2 ) to join Twitter! He does amazing work in applied #machinelearning for materials/chemistry, and development of #openscience software like Matminer, Colmena, DLHub, and many more. Give him a follow! ✨🤖🙌 #AcademicTwitter

3

4

26

@BenBlaiszik

Ben Blaiszik

1 year

Have to admit that I've never seen a figure with panels that go from A-Z and then AA-AG. Don't even ask about the caption...🥴

Tweet media one

4

1

27

@BenBlaiszik

Ben Blaiszik

2 years

@PeakSquirrel @CT_Bergstrom @stripe @patrickc It's hard to imagine a reasonable or fair process that would allow for shedding 50% of a company the size of Twitter in a week. It's intentionally antagonistic and capricious to show what is coming.

0

0

27