Biocomputing 2022 - Proceedings Of The Pacific Symposium.

Bibliographic Details
Place / Publishing House:Singapore : World Scientific Publishing Company, 2021.
©2022.
Year of Publication:2021
Edition:1st ed.
Language:English
Physical Description:1 online resource (431 pages)
Table of Contents:
  • Intro
  • Content
  • Preface
  • AI-DRIVEN ADVANCES IN MODELING OF PROTEIN STRUCTURE
  • Session Introduction: AI-Driven Advances in Modeling of Protein Structure
  • 1. A short retrospect
  • 2. A brief outline of current research
  • 3. Future developments (complexes, ligand interactions, other molecules, dynamics, language models, geometry models, sequence design)
  • 4. What is needed for further progress?
  • 5. Overview of papers in this session
  • 5.1. Evaluating significance of training data selection in machine learning
  • 5.2. Geometric pattern transferability
  • 5.3. Supervised versus unsupervised sequence to contact learning
  • 5.4. Side chain packing using SE(3) transformers
  • 5.5. Feature selection in electrostatic representations of ligand binding sites
  • References
  • Training Data Composition Affects Performance of Protein Structure Analysis Algorithms
  • 1. Introduction
  • 2. Methods
  • 2.1. Experimental Design
  • 2.2. Task-specific Methods
  • 3. Results
  • 3.1. Performance on NMR and cryo-EM structures is consistently lower than performance on X-ray structures, independent of training set
  • 3.2. Inclusion of NMR data in the training set improves performance on held-out NMR data and does not degrade performance on X-ray data
  • 3.3. Known biochemical and biophysical effects are replicated in trained models
  • 3.4. Downsampling X-ray structures during training negatively affects performance on all types of data
  • 4. Conclusion
  • 5. Acknowledgments
  • References
  • Transferability of Geometric Patterns from Protein Self-Interactions to Protein-Ligand Interactions
  • 1. Introduction
  • 2. Related Work
  • 3. Methods
  • 3.1. Datasets
  • 3.2. Contact extraction
  • 3.3. Representing contact geometry
  • 4. Results
  • 4.1. Protein self-contacts exhibit clear geometric clustering.
  • 4.2. Many geometric patterns transfer to protein-ligand contacts
  • 4.3. Application to protein-ligand docking
  • 5. Conclusion and Future Work
  • Supplemental Material, Code, and Data Availability
  • Acknowledgments
  • References
  • Interpreting Potts and Transformer Protein Models Through the Lens of Simplified Attention
  • 1. Introduction
  • 2. Background
  • 3. Methods
  • 3.1. Potts Models
  • 3.2. Factored Attention
  • 3.3. Single-layer attention
  • 3.4. Pretraining on Sequence Databases
  • 3.5. Extracting Contacts
  • 4. Results
  • 5. Discussion
  • Acknowledgements
  • References
  • Side-Chain Packing Using SE(3)-Transformer
  • 1. Introduction
  • 2. Methods
  • 2.1. Neighborhood Graph Representation
  • 2.2. The SE(3)-Transformer Architecture
  • 2.3. Node Features
  • 2.4. Final Layer
  • 2.5. Rotamer Selection
  • 2.6. Experiments
  • 3. Results
  • 4. Conclusion
  • 5. Acknowledgements
  • 6. References
  • DeepVASP-E: A Flexible Analysis of Electrostatic Isopotentials for Finding and Explaining Mechanisms that Control Binding Specificity
  • 1. Introduction
  • 2. Methods
  • 2.1. Convolutional Neural Network
  • 2.2. Experimental Design
  • 2.3. Comparison with Existing Methods
  • 3. Results
  • 4. Conclusions
  • Acknowledgements
  • References
  • BIG DATA IMAGING GENOMICS
  • Session Introduction: Big Data Imaging Genomics
  • 1. Introduction
  • 2. Overview of Contributions
  • References
  • A New Mendelian Randomization Method to Estimate Causal Effects of Multivariable Brain Imaging Exposures
  • 1. Introduction
  • 2. Methods
  • 2.1. Step 1 : Mendelian randomization analysis on a single imaging exposure
  • 2.2. Step 2: Joint instrumental variables and imaging exposures selection
  • 2.3. Step 3: Causal effect identification for multiple imaging exposures
  • 3. Application to evaluate the causal effect of white matter microstructure integrity on cognitive function.
  • 3.1. Data and study cohort
  • 3.2. Results
  • 4. Simulation
  • 5. Discussion
  • Funding
  • Availability of data and materials
  • Authors' contributions
  • References
  • Efficient Differentially Private Methods for a Transmission Disequilibrium Test in Genome Wide Association Studies
  • 1. Introduction
  • 2. Preliminaries
  • 2.1. TDT
  • 2.2. Differential Privacy
  • 3. Methods
  • 3.1. Exact Algorithm
  • 3.2. Approximation Algorithm
  • 4. Experiments
  • 4.1. Simulation Data
  • 4.2. Results
  • 4.2.1. Run Time
  • 4.2.2. Accuracy
  • 4.3. Real Data
  • 5. Conclusion
  • Acknowledgement
  • References
  • Identifying Imaging Genetic Associations via Regional Morphometricity Estimation
  • 1. Introduction
  • 2. Methods
  • 3. Materials
  • 4. Experimental Design
  • 5. Results and Discussion
  • 6. Conclusion
  • Acknowledgements
  • References
  • Identifying Highly Heritable Brain Amyloid Phenotypes Through Mining Alzheimer's Imaging and Sequencing Biobank Data
  • 1. Introduction
  • 2. Method
  • 3. Materials
  • 4. Experimental Workflow
  • 5. Results and Discussion
  • 6. Conclusion
  • Acknowledgements
  • References
  • Effects of ApoE4 and ApoE2 Genotypes on Subcortical Magnetic Susceptibility and Microstructure in 27,535 Participants from the UK Biobank
  • 1. Introduction
  • 2. Methods
  • 2.1. UK Biobank Participants
  • 2.2. T1-Weighted MRI
  • 2.3. Quantitative Magnetic Susceptibility
  • 2.4. Diffusion-Weighted MRI
  • 2.5. Statistical Analyses
  • 3. Results
  • 3.1. ApoE4 Microstructural Associations
  • 3.2. ApoE2 Microstructural Associations
  • 3.3. ApoE-by-Age Interactions
  • 3.3.1. ApoE Associations Stratified by Age
  • 4. Discussion
  • References
  • Separating Clinical and Subclinical Depression by Big Data Informed Structural Vulnerability Index and Its impact on Cognition: ENIGMA Dot Product
  • 1. Introduction
  • 2. Methods.
  • 2.1 Participants.
  • 2.2 Major Depressive Disorder Classification
  • 2.3 Imaging Protocol and Processing
  • 2.4 Calculation of linear indices of similarity
  • 2.5 Calculation of QRI
  • 2.7 Cognitive assessment
  • 2.8 Statistics
  • 3. Results
  • 3.1 Group differences in symptoms and biomarkers
  • 3.2 Effects of MDD on cognition.
  • 3.3. Cognitive association
  • 4. Discussion.
  • 5. Conclusion
  • 6. Acknowledgement
  • References
  • Generalizing Few-Shot Classification of Whole-Genome Doubling Across Cancer Types
  • 1. Introduction
  • 2. Related Work
  • 3. Cohort
  • 3.1. Cohort Selection
  • 3.2. Feature Extraction
  • 4. Methods
  • 4.1. Model
  • 4.2. Training
  • 4.2.1. Pre-Training
  • 4.2.2. Meta-Training
  • 4.3. Meta-Validation and Meta-Test
  • 4.4. Experiments
  • 4.4.1. Cancer Types
  • 4.4.2. Batch Effects
  • 5. Results
  • 5.1. Cancer Types
  • 5.2. Batch Effects
  • 5.2.1. Image Resolution
  • 5.2.2. Image Brightness
  • 6. Discussion
  • Software and Data
  • References
  • HUMAN INTRIGUE: META-ANALYSIS APPROACHES FOR BIG QUESTIONS WITH BIG DATA WHILE SHAKING UP THE PEER REVIEW PROCESS
  • Session Introduction: Human Intrigue: Meta-Analysis Approaches for Big Questions with Big Data While Shaking Up the Peer Review Process
  • 1. Introduction
  • 2. The Crowd Peer Review Process
  • 2.1 Reviewer's Feedback
  • 2.2 Conclusions
  • 3. Meta-Analysis in Biocomputing
  • 3.1 Novel Methods for Meta-Analysis of 'Omics Data
  • 3.2 Using Publicly Available Data in Methods Development
  • 3.3 Studying the Structure of Publicly Available Data
  • 3.4 Conclusions
  • Acknowledgements
  • References
  • Multitask Group Lasso for Genome Wide Association Studies in Diverse Populations
  • 1. Introduction
  • 2. Methods
  • 2.1. Population stratification
  • 2.2.1. Adjacency-constrained hierarchical clustering
  • 2.2.2. LD-groups across populations
  • 2.3. Multitask group Lasso.
  • 2.3.1. General framework and problem formulation
  • 2.3.2. Related work
  • 2.3.3. Gap safe screening rules
  • 2.4. Stability selection
  • 3. Experiments
  • 3.1. Data
  • 3.2. Preprocessing
  • 3.3. Comparison partners
  • 4. Results
  • 4.1. MuGLasso draws on both LD-groups and the multitask approach to recover disease SNPs
  • 4.2. MuGLasso provides the most stable selection
  • 4.3. MuGLasso selects both task-specific and global LD-groups
  • 5. Discussion and Conclusions
  • Acknowledgments
  • Supplementary Materials and code
  • References
  • Mixed Effects Machine Learning Models for Colon Cancer Metastasis Prediction Using Spatially Localized Immuno-Oncology Markers
  • 1. Introduction
  • 2. Motivation for Comparison Study
  • 2.1. Review of Prior Spatial Omics Analysis Methods
  • 2.2. Motivation for Mixed Effects Machine Learning Approaches
  • 3. Materials and Methods
  • 3.1. Data Acquisition and Preprocessing
  • 3.2. Experimental Design: Prediction Tasks and Modeling Approaches
  • 4. Results
  • 4.1. Macro: Inter-Tumoral Prediction
  • 4.2. METS: Nodal and Distant Metastasis Prediction
  • 5. Discussion
  • 6. Conclusion
  • 7. Acknowledgements
  • 8. References
  • Improving QSAR Modeling for Predictive Toxicology Using Publicly Aggregated Semantic Graph Data and Graph Neural Networks
  • 1. Introduction
  • 2. Methods
  • 2.1. Obtaining toxicology assay data
  • 2.2. Aggregating publicly available multimodal graph data
  • 2.3. Heterogeneous graph neural network
  • 2.3.1. Node classification
  • 2.4. Baseline QSAR classifiers
  • 3. Results
  • 3.1. GNN node classification performance vs. baseline QSAR models
  • 3.2. Ablation analysis of graph components' influence on the trained model
  • 4. Discussion
  • 4.1. GNNs versus traditional ML for QSAR modeling
  • 4.2. Interpretability of GNNs in QSAR
  • 4.3. Sources of bias and their effects on QSAR for toxicity prediction.
  • 5. Conclusions.