VN-EGNN: E(3)-Equivariant GNNs with Virtual Nodes Enhance Protein Binding Site Identification
New method to find binding pockets of proteins. Virtual nodes make it possible to employ distance losses directly.
Paper:
Code:
🤗:
The ELLIS ML4Molecules workshop will also happen this year on November 28 in VIRTUAL format!
Please find the announcement and the call for papers here:
Looking forward to your contributions!
🎾🥳DEEP LEARNING GRAND SLAM🥳🎾
(get a paper accepted at ICLR, ICML, NeurIPS within one year 😜)
ICLR2023: [MHNfs]()
ICML2023: [CLAMP]()
NeurIPS2023: Initialization for Input-Convex Nets
Credits should go to the PhD students 🎓👏
Finally, someone made decision-tree learning differentiable. Reformulation of the classification function into dense representations; approximation of the step function with sigmoids, the entmax function, etc.
Good results on a large number of datasets.
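The core trick (replacing hard threshold splits with sigmoids) can be sketched in a few lines of numpy. This is a toy single-split "soft stump", not the paper's exact formulation; the temperature and leaf values are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_tree_predict(x, threshold, leaf_left, leaf_right, temperature=0.1):
    """Soft decision stump: a sigmoid replaces the hard step x > threshold,
    so the routing probability (and hence the prediction) is differentiable."""
    p_right = sigmoid((x - threshold) / temperature)   # soft routing to right leaf
    return (1.0 - p_right) * leaf_left + p_right * leaf_right

# Far from the threshold, the soft tree matches the hard tree
print(soft_tree_predict(np.array([0.0, 10.0]), threshold=5.0,
                        leaf_left=-1.0, leaf_right=1.0))   # ≈ [-1, 1]
```

Lowering the temperature sharpens the sigmoid towards the original hard split; raising it makes gradients flow more easily.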
Generating molecules with a given 3D shape.
The method ShapeMol uses a conditional diffusion probabilistic model and equivariant layers. Benchmarked on experiments suggested by SQUID.
Self-supervised pre-trained networks are usually more robust to distribution shifts than networks pre-trained in a supervised or unsupervised fashion. Thorough analysis of this behaviour:
Multi-Modal Representation Learning for Molecular Property Prediction: Sequence, Graph, Geometry
Three chemical modalities are contrasted against each other and used for property prediction.
Unfortunately, it is only evaluated on the MoleculeNet benchmarks.
Bayesian Deep Learning (BDL) Library released:
Use BDL easily: e.g., MC-Dropout with a vision transformer can readily be coded in a few lines.
Paper:
Repo:
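Since the post does not show the library's API, here is a library-agnostic numpy sketch of the MC-Dropout idea itself: keep dropout active at test time, run several stochastic forward passes, and use the spread of the predictions as an uncertainty estimate. The toy one-layer net stands in for the vision transformer:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 1))   # weights of a toy one-layer "network"

def forward(x, drop_p=0.5):
    """One stochastic forward pass: dropout stays active at inference."""
    mask = rng.random(x.shape) > drop_p      # random dropout mask
    return (x * mask / (1.0 - drop_p)) @ W   # inverted-dropout scaling

x = rng.normal(size=(1, 16))
samples = np.array([forward(x) for _ in range(100)])   # 100 MC passes
mean, std = samples.mean(), samples.std()              # prediction + uncertainty
print(f"prediction {mean:.3f} +/- {std:.3f}")
```

The mean over passes is the prediction; the standard deviation is a cheap epistemic-uncertainty proxy.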
New physics-inspired descriptors for non-bonded interactions. Overcoming limitations of local geometry-based ML approaches, this method incorporates long-range effects with a focus on diverse non-bonded potentials. 🚀📊
Honored to be invited to talk about Artificial Intelligence in the Austrian parliament!
Will provide a re-recording of my talk here soon.
Should it be English or German?
PREFER, a Python-based framework powered by AutoSklearn, assists in molecular property prediction. Effortlessly compare diverse representations and ML models for accelerated discovery. Open-source on GitHub. 🧪🔬
#Cheminformatics
#ML
Details:
At NeurIPS,
@HochreiterSepp
criticized LLMs for using their parameters for storing phrases; there would be better ways, such as a modern Hopfield network, to store these. Well, here is a retrieval system with 500x fewer parameters that follows this approach:
What makes CLIP work?
The contrast with negatives via softmax?
The more negatives, the better -> large batch-size?
We'll answer "no" to both in our ICCV oral🤓
By introducing SigLIP, a simpler CLIP that also works better and is more scalable, we can study the extremes.
Hop in🧶
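The key change in SigLIP is replacing the batch-wise softmax contrast with an independent sigmoid loss per image-text pair, so the loss no longer needs a global view of the batch. A minimal numpy sketch, with the learnable temperature and bias fixed to illustrative values:

```python
import numpy as np

def siglip_loss(img_emb, txt_emb, t=10.0, b=-10.0):
    """Sigmoid loss: every image-text pair is an independent binary problem.
    Matching pairs (the diagonal) get label +1, all others -1."""
    logits = t * img_emb @ txt_emb.T + b           # pairwise similarities
    labels = 2.0 * np.eye(len(img_emb)) - 1.0      # +1 on diagonal, -1 off
    # binary cross-entropy per pair: -log sigmoid(label * logit)
    return np.mean(np.log1p(np.exp(-labels * logits)))

rng = np.random.default_rng(0)
img = rng.normal(size=(4, 8)); img /= np.linalg.norm(img, axis=1, keepdims=True)
txt = img.copy()                # perfectly aligned pairs
print(siglip_loss(img, txt))    # lower than for shuffled (mismatched) pairs
```

Because each pair contributes independently, the loss is insensitive to batch composition, which is what removes the "more negatives -> bigger batch" pressure.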
Geometric Learning concepts (equivariance, invariance) carried over to interpretability methods: if a neural net is invariant to a certain transformation of the input, the feature importance should also be invariant.
Overly complicated notation, imho
AlphaFold is also vulnerable to adversarial attacks: small changes in the protein sequence can lead to drastic changes in folding structure. Shown on data with controlled changes to the protein sequence and on COVID-19 protein variants:
One GNN encodes the graph in the usual way; a second GNN gets the input graph with an adjacency matrix in which edges are determined by node similarity (kNN). Contrastive learning forces the GNNs to preserve the structure:
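The second view above needs a similarity-based graph. A minimal numpy sketch of building a kNN adjacency matrix from node features (Euclidean distance and k are illustrative choices):

```python
import numpy as np

def knn_adjacency(features, k=2):
    """Adjacency matrix where each node connects to its k nearest
    neighbours in feature space (self-loops excluded)."""
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # no self-loops
    nn = np.argsort(d, axis=1)[:, :k]      # k nearest neighbours per node
    A = np.zeros_like(d)
    np.put_along_axis(A, nn, 1.0, axis=1)
    return np.maximum(A, A.T)              # symmetrise

X = np.array([[0.0], [0.1], [5.0], [5.1]])
print(knn_adjacency(X, k=1))   # pairs (0,1) and (2,3) get connected
```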
This work reports strong improvements at "protein design":
This task is defined as predicting amino-acid sequence from 3D structure (coordinates) -- that means it's a kind of *inverse-AlphaFold*.
The 3rd edition of the ELLIS ML4Molecules Workshop has been announced!
⏰ Virtual event on December 8 ⏰
Call for contributions open!
Participation is free -- join us!!!
WOW! Diffusion models..
"We redesign the network layers
to preserve activation, weight, and update magnitudes on expectation. [..] systematic application of this philosophy [..] results in considerably better networks at equal computational complexity"
Machine learning and deep learning methods compared on a large number of tasks on molecules (ADME prediction, retro-synthesis, ...) with high class imbalance:
Large language model pre-trained on masked SMILES. Then applied to MoleculeNet.
Obviously, we can now predict clinical tox with AUC 99.7. Basically all wet-lab tox and clinical trials unnecessary now - LOL.
The community must stop deceiving itself..
Prediction of drug synergies with Deep Learning: now there seems to be sufficient data to predict the synergistic effects for new compounds. Advances the ideas of DeepSynergy ( ).
🗜️CLAMP (Contrastive Language-Assay-Molecule Pre-training), a new 🚀 method for *zero-shot drug discovery* that utilizes textual assay descriptions for molecular property prediction.
Shows that scientific language models are bad at activity prediction.
New self-supervised learning: instead of predicting the masked part of an image in pixel space, the method tries to match the embedding of the masked region with the embedding predicted from the context for that region. Sry, difficult without formulas.. ;)
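In code it is actually compact. A toy numpy sketch of the idea, with linear "encoders" standing in for the real networks (all names and shapes are illustrative): the context embedding is pushed towards the target embedding of the masked region, not towards the masked pixels themselves.

```python
import numpy as np

rng = np.random.default_rng(0)
W_ctx = rng.normal(size=(8, 4))   # toy context encoder + predictor
W_tgt = rng.normal(size=(8, 4))   # toy target encoder

def embed(patch, W):
    return patch @ W              # linear stand-in for a deep encoder

image = rng.normal(size=(2, 8))            # two patches: context + masked
z_context = embed(image[0], W_ctx)         # prediction from the visible context
z_target = embed(image[1], W_tgt)          # target embedding of the masked patch
loss = np.mean((z_context - z_target) ** 2)  # match in EMBEDDING space,
print(loss)                                  # not in pixel space
```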
A message-passing network (MPNN) with one virtual node, connected to all nodes, can approximate a Transformer layer. Many groups have already used such virtual nodes in the past...
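A virtual node gives every node access to a global aggregate in one hop, which is what lets it mimic a Transformer layer's all-pairs communication. Minimal numpy sketch (mean aggregation and the additive update are illustrative choices):

```python
import numpy as np

def virtual_node_step(H):
    """One message-passing step through a virtual node connected to all
    nodes: aggregate globally, then broadcast back to every node."""
    v = H.mean(axis=0)   # virtual node collects messages from all nodes
    return H + v         # every node receives the global message back

H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(virtual_node_step(H))
```

After one step, information from any node can influence any other node, regardless of graph distance.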
We've used CLOOB to develop a search engine that unlocks querying a bioimaging database with chemical structures. The CLOOME encoder produces bioimage embeddings that can clearly distinguish new cell phenotypes:
Search engine:
We are excited to announce that our latest work, CLOOB, was accepted at this year's NeurIPS 🎉
CLOOB consistently outperforms CLIP at zero-shot transfer learning on a large variety of datasets.
🧵 1/4
Can a molecular graph be reconstructed from chemical fingerprints using machine learning? In this study ( ) a Transformer reconstructs molecules (SMILES) from ECFP4. Only 1-4% of SMILES could be correctly reconstructed -- this should work much better...
Great news! We are excited to announce that our
#ELLISforEurope
research activities are expanding further! The proposals ‘Machine Learning for Molecule Discovery’ and ‘Learning for Graphics and Vision’ have been accepted as new
#ELLISPrograms
!
#AI
#ML
🔬A “Google” for microscopy images and molecules🔬
Given an image of cells treated with a molecule, CLOOME can correctly identify this molecule – a task considered impossible even for human experts!
Work by
@ana_sanchezf
of
@AiddOne
.
Paper:
Oh, finally someone else also calling for more scientific rigor in the GNN-community :)
Methods with and without hyperparameter selection are compared on the long-range graph benchmark. In 2018, we also described this "hyperparameter selection bias"..
In-Context Learning for Drug Discovery:
Embedding-based few-shot learning methods are equivalent to "in-context learning" of LLMs.
Here this concept is used again (but introduced before in MHNfs by Schimunek et al, 2023):
Excited to share our paper "A community effort in SARS-CoV-2 drug discovery" (after 3 years in the making)!
We report the results of an open science community effort to identify small-molecule inhibitors against SARS-CoV-2.
Paper:
LAST REMINDER
Tomorrow (Dec 8), 9am CET, this year's
@ELLISforEurope
Machine Learning for Molecules Workshop starts!
Open for everyone to join for free!
Schedule and registration:
Predicting molecular activity with XGBoost. 📊🧪
Study on feature importance, highlighting the need for expert interpretation. Hyperparameter optimization is crucial. Valuable guidelines for
#cheminformatics
practitioners.
#ML
Paper:
Beam enumeration: an approach for goal-directed molecular generation.
Substructures with high likelihood are kept during learning and others are discarded. Language model as basis for the generation process.
Paper:
HyperPCM for predicting drug target interactions:
Join us today at the poster session of the
@AI_for_Science
workshop at
#NeurIPS2022
. 12:05pm-1pm and 5:10pm-6pm, room 388!
@AiddOne
Key elements underlying molecular property prediction?
Here self-supervised learning methods for small molecules are compared against descriptor-based methods.
Another call for scientific rigor and to not over-hype GNNs for small mols..
Pretraining a GNN encoder for molecular structures on a conformation sampling task.
Builds on the smart idea (that has been around) to use a diffusion model on the atom coordinates.
NVIDIA is also generating small molecules now: ;).
VAE-like approach to generate molecules and optimize their properties by sampling from latent space. They are aware of the problems of these optimization cycles ( ).
Happy Easter! Luckily, it's Monday and arxiv does not take vacation:
A graph neural network approach to predict chemical properties using an initial (bad) 3D conformation. Then gradually improves this conformation with de-noising. Good results on QM9:
Study of drug-ranking with anti-cancer properties for particular cell lines characterized by their gene expression profile.
Interesting fact 1: bilinear regression is used to combine modalities
Fact 2: Morgan FPs plus MLP worked better than GNN (sry folks)
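Bilinear regression combines the two modalities (cell-line expression profile and drug fingerprint) through a learned interaction matrix. A minimal numpy sketch with illustrative dimensions and random stand-in data:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 3))   # learned interaction matrix (toy size)

def bilinear_score(expr, fp, W):
    """Score = expr^T W fp: every (gene, fingerprint-bit) pair
    gets its own interaction weight."""
    return expr @ W @ fp

expr = rng.normal(size=5)     # cell-line gene expression (toy)
fp = rng.normal(size=3)       # drug fingerprint (toy)
print(bilinear_score(expr, fp, W))
```

In a real model, W would be fit to observed drug responses; the bilinear form is what couples the two modalities multiplicatively rather than by simple concatenation.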
Neural architecture search (NAS) for new graph neural network architectures. The found architectures (Table 8 in the Appendix) are relatively shallow, and often LSTMs are selected to aggregate layers. Tested on non-molecule tasks (CiteSeer, PubMed).
New way of goal-directed optimization of molecular structure: GNNs reformulated as mixed integer linear programming (MILP) problems. Solvers for those problems then provide chemical structures.
Paper:
OPEN POSITIONS!
We have open positions for PhD students and PostDocs at my lab both in core machine learning as well as in "Machine Learning in Life Sciences":
🔥🔥🔥 Contrastive learning unleashed 🔥🔥🔥
A powerful, transferable MICROSCOPY IMAGE and MOLECULE ENCODER has been trained on CellPainting data through self-supervised learning (SSL).
Paper:
Code:
App:
HyperPCM: Robust Task-Conditioned Modeling of Drug–Target Interactions
Hypernetworks can carry over information from one protein activity prediction task to another.
work by
@EmmaJMSvensson
in
@AiddOne
Large dataset for ACTIVITY CLIFF modeling consists of 400K matched molecular pairs. Deep Learning models perform ok; again, ECFP plus deep multi-layer perceptrons perform best -- in accordance with works by other groups...
Exploring Causality in Single-Cell Genomics 🧬🤖 ML adapts causal techniques to high-dimensional single-cell data, addressing challenges and paving the way for informed experimental design. 📊🔬
Paper:
Multi-modal anything to anything :)
NextGPT can take text, image, audio or video as input and generate output in these modalities. Quite efficient because only input projectors and output are trained.
Self-supervised representation learning for time series:
a sequence is split into past and future, and the learned representations have to be similar. Does not need negatives -- I don't see why mode collapse should not happen..
It's a matter of the initial molecular representations. That's why I always argue for using ECFP/Morgan plus MLPs as a baseline in all studies. Would have saved us from the pile of zillions of molecule encoders that we are now facing...
Activity cliff: two molecules have similar structure, but a big difference in bioactivity.
ML approaches don't do well with activity cliffs, but 'old-school' ML models tend to work better than deep learning.
@DerekvTilborg
@korney34
@fra_grisoni
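The activity-cliff definition above is easy to operationalize: flag molecule pairs whose fingerprint similarity is high but whose activity difference is large. Pure-Python sketch with made-up bit sets and illustrative thresholds:

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    return len(fp_a & fp_b) / len(fp_a | fp_b)

def is_activity_cliff(fp_a, act_a, fp_b, act_b,
                      sim_thresh=0.8, act_thresh=2.0):
    """Similar structure (high Tanimoto) but large activity gap (e.g. in
    log units) -> activity cliff. Thresholds are illustrative."""
    return tanimoto(fp_a, fp_b) >= sim_thresh and abs(act_a - act_b) >= act_thresh

fp1, fp2 = {1, 2, 3, 4, 5}, {1, 2, 3, 4, 5, 6}   # toy fingerprints, sim = 5/6
print(is_activity_cliff(fp1, 5.0, fp2, 9.0))      # → True
```

This pairwise structure is exactly why activity cliffs are hard for models that assume smooth structure-activity landscapes.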
🚀 Transforming molecules discovery with few-shot learning 🚀! Our new method **MHNfs** enriches molecule representations with CONTEXT. MHNfs sets a new state-of-the-art on the FS-Mol benchmark dataset.
#drugdiscovery
#fewshotlearning
#AI
Paper:
Context dependent molecular representations are enabled by LLM-like architectures and self-supervised learning strategies.
Performance and benchmarking are still limited.
Paper:
I really appreciate this result about the expressiveness of the molecular fingerprint representations.
The authors used transformers to translate from chemical fingerprints to SMILES and showed that this conversion reconstructs the connectivity of the molecule.
Please consider submitting your work to our special issue "AI meets Toxicology"!
We are happy to get submissions until November 30 (will remind you again ;)) !
Researchers have been striving to bridge BIOLOGY 🧫 and CHEMISTRY ⚗️ via transcriptomics, metabolomics, and whatever....
... funny that the bridge turned out to be the oldest biological technique: CELL MICROSCOPY.
Generative model for molecules in 3D space: using latent diffusions on point-structured latent spaces.
As for all generative models, evaluation metrics are difficult: atom stability & molecule stability, validity, and validity & uniqueness are used.
How good are large language models at answering questions about PATHWAYS, MOLECULAR INTERACTIONS, and MECHANISMS?
Answer: Not too good, but LLMs trained on bio-literature (BioGPT) are better than the usual GPTs, etc..
Study:
Offline black-box optimization with diffusion models. The denoising process is conditioned on the label (the output of the black-box function). Also used for optimizing molecules (ChEMBL), but it's unclear how exactly this was done...
Based on seminal work of using Normalizing Flows to learn Boltzmann distributions, here is an approach that tries to alleviate the problem of the costly simulation of training data:
Combining Bayesian Neural Networks and Contrastive Learning: Unlabelled samples are used to learn a prior distribution of weights in a style similar to SimCLR.
From
@patwalters
at ELLIS ML4Molecules:
**STOP using MoleculeNet and TDC!**
I fully support this statement (and have criticized those datasets myself for a long time)
Modeling long-range intramolecular forces by extending message-passing networks: in addition to the local messages, nodes are mapped to non-local Fourier space and updated there (Ewald message passing). Improves the energy MAE on OC20:
Almost identical work to CLAMP "Contrastive Language-Assay-Molecule-Pretraining" (ICML2023): common space for natural language and mols.
CLAMP makes it possible to steer activity prediction with natural language models for zero-shot activity/property prediction.
Introducing our new work:
GIMLET: A Unified Graph-Text Model for Instruction-Based Molecule Zero-Shot Learning
Say goodbye to supervised molecule property prediction and embrace the instruction-based zero-shot paradigm with GIMLET!
Since the LLM community has an evolutionary tree, I thought we should have one, too.. :)
Sharing my personal perspective on the evolution of Deep Activity prediction Models (DAMs).
PDF w links
Bib
A survey of LLMs with a practical guide and evolutionary tree.
Number of LLMs from Meta = 7
Number of open source LLMs from Meta = 7
The architecture nomenclature for LLMs is somewhat confusing and unfortunate.
What's called "encoder only" actually has an encoder and a decoder…
Self-supervised learning for molecular graphs is still in its infancy according to this paper: . For example, on the MUV dataset, the methods don't even beat a random baseline. Downstream task: molecular property prediction. The authors suggest a new suite of probe tasks.
Transformer architecture learns chemical substructures together with natural language and can answer questions about chemistry:
(similar to our BioassayCLR method, but less focused on activity/property prediction).
Wow, this is really cool:
Contrastive pre-training on DNA sequences and the human genome to build a retrieval system. Then making a DNA vector database -- allows aligning reads via maximum-inner-product search!
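Maximum-inner-product search over such an embedding database reduces to an argmax over dot products (exact brute force below; real systems use approximate indexes). All names and the toy data are illustrative:

```python
import numpy as np

def mips(database, query, top_k=3):
    """Exact maximum-inner-product search: score every stored embedding
    against the query and return the indices of the best matches."""
    scores = database @ query                 # inner products with all entries
    return np.argsort(scores)[::-1][:top_k]   # highest scores first

rng = np.random.default_rng(0)
db = rng.normal(size=(1000, 32))
db /= np.linalg.norm(db, axis=1, keepdims=True)   # unit-norm embeddings
query = db[42]                                    # a stored "DNA read" embedding
print(mips(db, query))                            # entry 42 ranks first
```

With unit-norm embeddings, MIPS coincides with cosine-similarity search, which is the common setup for contrastively trained encoders.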