NW-CP

Intersecting Graph Representation Learning and Cell Profiling

A Novel Approach to Analyzing Complex Biomedical Data

Nima Chamyani

Supervisor: Wesley Schaal

Introduction
Aim of the study
Methods
Results
Future outlook

Cell profiling with chemical perturbation

Sample Preparation

Cell Culture

Chemical Perturbation

Fixation

Staining (Painting)

Microscopy

Graphs

Nodes (Vertices)

Edges (Links)

Multimodal Graphs

Chemicals

Proteins

Pathways

Degree of Nodes

Highly connected

Typically connected

Less connected
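As a minimal sketch of how node degree separates highly connected from less connected nodes, the example below counts the edges touching each node of a small undirected graph. The edge list and node names (`chem_A`, `prot_1`, `path_X`) are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict

# Hypothetical toy edge list; node names are illustrative only.
edges = [
    ("chem_A", "prot_1"), ("chem_A", "prot_2"), ("chem_A", "prot_3"),
    ("chem_B", "prot_1"), ("prot_1", "path_X"), ("prot_2", "path_X"),
    ("chem_C", "prot_3"),
]

def node_degrees(edge_list):
    """Count how many edges touch each node of an undirected graph."""
    degree = defaultdict(int)
    for u, v in edge_list:
        degree[u] += 1
        degree[v] += 1
    return dict(degree)

degrees = node_degrees(edges)

# Sort from most to least connected (ties broken by name).
ranked = sorted(degrees.items(), key=lambda kv: (-kv[1], kv[0]))
for node, deg in ranked:
    print(f"{node}: degree {deg}")
```

In this toy graph, `chem_A` and `prot_1` would count as highly connected, while `chem_B` and `chem_C` are less connected.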

Graph's Modules

Community 1

Community 2

Network Analysis (Mathematics)

Community Detection

Centrality Measures

Path Analysis

Subgraph Mining

Graph Clustering

Role Discovery

Graph Representation Learning (ML)

Learning Node Embeddings

Preservation of Graph Structure

Scalability

Edge and Graph-Level Embedding

Incorporation of Node and Edge Attributes

Combination with Deep Learning

Aim of the Study

Analyze cell painting biomedical data using graphs.

Enhance understanding of the relationships between chemical structure and cellular phenotype.

Apply advanced graph-based machine learning in place of conventional ML techniques.

Develop drug repurposing, drug combination, and generative drug models with GRL.

Introduce a new workflow for drug discovery and development.

Data

BioData Aggregation

BioData Featurization

Graphs

Graph-Level Molecular Predictor (GLMP)

Bio-Graph Integrative Predictor (BioGIP)

BioGIP on HeteroGraphs

Optimized Molecular Generator (OMG)

Combination prediction of nodes

Inactive chemicals

Selection should make sense

PCA1 Range Selection Method

Louvain Community Detection Method

96795-89-0 + 322473-89-2	eperisone + PIK-75	P7C3 + JNJ-38877605	BENZYDAMINE + 1257628-77-5
Y 134 + 67198-19-0	terbinafine + SNAP-94847	Encenicline + BMS-794833	Homochlorcyclizine + 1130067-06-9
ML298 + JTC-801 free base	lumateperone + drofenine	tolperisone + hydroquinidine	54635-62-0 + XANOMELINE

OMG learning from ordinal regression

GCPN

GraphAF

Future Perspectives

Investigate disparities between regression and classification
Validate the drug combination discovery approach
Explore further network analysis methods
Improve the OMG model's components
Use graphs for interpretation

Acknowledgements

Special thanks to Wes and Ola
Sincere appreciation to David, Martin, Anders, Jonne and Phil
Gratitude to all master's students, especially Erik and Victor

Thank you all for listening

InfoGraph

Maximizing the mutual information

Attribute masking

Capturing domain knowledge by learning the regularities of the node/edge attributes distributed over graph structure
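The idea of attribute masking can be sketched as follows: some node attributes are hidden, and the hidden values become the targets a model would be trained to recover from the surrounding graph. This is a data-preparation sketch only; the `[MASK]` token, the `mask_rate` parameter, and the toy atom labels are assumptions for illustration, and no model is trained here.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token

def mask_node_attributes(node_attrs, mask_rate=0.15, seed=0):
    """Hide a random subset of node attributes.

    Returns the masked copy (model input) and a dict of the hidden
    values (prediction targets). The input dict is left unchanged.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(mask_rate * len(node_attrs)))
    masked_nodes = rng.sample(sorted(node_attrs), n_mask)
    masked_input = dict(node_attrs)
    targets = {}
    for node in masked_nodes:
        targets[node] = node_attrs[node]
        masked_input[node] = MASK
    return masked_input, targets

# Toy molecule: node id -> atom type (illustrative values only).
atoms = {0: "C", 1: "C", 2: "O", 3: "N", 4: "C", 5: "C"}
masked_input, targets = mask_node_attributes(atoms, mask_rate=0.3)
print(masked_input)
print(targets)
```

A graph neural network would then be asked to predict `targets` from `masked_input` plus the graph's edges, which pushes it to learn which attributes tend to occur in which structural contexts.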

GCPN

Graph Convolutional Network (GCN) as a Policy Network: A policy network, in reinforcement learning terms, dictates the action to be taken at each step. Here, the actions are the addition of new nodes and edges to the graph being generated.

Reward Function: This function scores potential actions (i.e., adding a new edge or node) based on their quality.

Optimization with Policy Gradient: GCPN uses a technique called policy gradient to optimize the policy network. The idea is to increase the probability of actions that lead to higher rewards. This is done by iteratively updating the policy network's parameters to maximize the expected cumulative reward.
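The policy gradient idea can be illustrated with a deliberately small sketch. It uses plain REINFORCE with a running-average baseline on a softmax policy over three graph-building actions. It is not GCPN's training procedure: there is no graph convolutional network, and the action names and reward values in `TOY_REWARD` are hypothetical stand-ins for a real scoring function.

```python
import math
import random

ACTIONS = ["add_node", "add_edge", "stop"]
# Hypothetical rewards standing in for a real quality score.
TOY_REWARD = {"add_node": 0.2, "add_edge": 1.0, "stop": 0.0}

def softmax(logits):
    """Turn unnormalized scores into action probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_policy(steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * len(ACTIONS)  # the policy's only parameters
    baseline = 0.0                 # running average reward
    for _ in range(steps):
        probs = softmax(logits)
        a = rng.choices(range(len(ACTIONS)), weights=probs)[0]
        reward = TOY_REWARD[ACTIONS[a]]
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # Gradient of log pi(a) w.r.t. logit k is 1[k == a] - probs[k].
        for k in range(len(ACTIONS)):
            grad = (1.0 if k == a else 0.0) - probs[k]
            logits[k] += lr * advantage * grad
    return softmax(logits)

final_probs = train_policy()
for name, p in zip(ACTIONS, final_probs):
    print(f"{name}: {p:.3f}")
```

Actions that earn more than the running average have their probability raised, and the rest are lowered, so the policy drifts toward the highest-reward action.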

GraphAF

Autoregressive Approach: The term 'autoregressive' means that the model uses its own previous outputs as input for the next step. For GraphAF, this means generating a new edge in the graph based on the edges that were generated in the previous steps. Essentially, the graph is generated incrementally, with each new edge being influenced by the structure of the graph up to that point.

Flow Model: The 'flow' part of GraphAF refers to the concept of normalizing flows, which is a method used in machine learning to create complex probability distributions. This allows the model to learn a complex distribution over possible graphs, which can then be sampled to generate new graphs.

Sequential Generation: The graph is built step by step. At each step, GraphAF adds a new node and then decides which edges connect it to the existing nodes, conditioning each decision on the current graph structure.
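The combination of autoregressive generation and an invertible flow can be sketched in a few lines. The example below is a simplification, not the GraphAF model: the `conditioner` function is a hand-written stand-in for a learned network, the flow is a single affine transform, and an edge is kept whenever the transformed value is positive. All function names and constants are assumptions for illustration.

```python
import random

def conditioner(edges, j):
    """Toy stand-in for a learned network: current graph -> (mu, sigma)."""
    degree_j = sum(1 for u, v in edges if j in (u, v))
    mu = 0.5 - 0.4 * degree_j  # discourage edges to already busy nodes
    sigma = 1.0
    return mu, sigma

def flow_forward(z, mu, sigma):
    """Affine flow: map Gaussian noise z to a data-space value x."""
    return mu + sigma * z

def flow_inverse(x, mu, sigma):
    """Exact inverse of flow_forward; this is what makes it a flow."""
    return (x - mu) / sigma

def generate_graph(num_nodes=6, seed=0):
    rng = random.Random(seed)
    edges = []
    # Autoregressive loop: each decision sees the graph built so far.
    for new_node in range(1, num_nodes):
        for old_node in range(new_node):
            mu, sigma = conditioner(edges, old_node)
            z = rng.gauss(0.0, 1.0)
            x = flow_forward(z, mu, sigma)
            if x > 0.0:  # keep the edge when the transformed sample is positive
                edges.append((old_node, new_node))
    return edges

edges = generate_graph()
print(edges)
```

Because the transform is invertible, the noise value behind any generated quantity can be recovered exactly with `flow_inverse`, which is what allows flow models to be trained by exact likelihood.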

Graphs

Nodes (Vertices)

Edges (Links)

Multimodal Graphs

Chemicals

Proteins

Pathways

Degree of Nodes

Highly connected

Typically connected

Less connected

Graph's Modules

Community 1

Community 2

Jeancarlo C. Leão et al.