Enabling Scientific Discovery Cluster

Data-enabled Computational Science
Researchers in this group are drawn together from the ongoing Center for Computational Mathematics, which administers the campus-wide Computational Science, Mathematics and Engineering (CSME) graduate program. With the rise of data, the CSME area has evolved into Data-enabled Computational Science, which seeks to advance and make available integrated approaches to massively parallel computation, from architectures to algorithms, as building blocks for scientists and engineers.
- Nuno Bandeira Computer Science & Engineering and Skaggs School of Pharmacy & Pharmaceutical Sciences
- Randolph Bank Mathematics
- Chaitan Baru San Diego Supercomputer Center
- Jelena Bradic Mathematics
- Timothy Brady Psychology
- Manmohan Chandraker Computer Science & Engineering
- Alex Cloninger Mathematics
- George Fuller Physics
- Philip Gill Mathematics
- Ron Graham Computer Science & Engineering and Mathematics
- Melissa Gymrek Medicine and Computer Science & Engineering
- Michael Holst Mathematics and Physics
- Ryan Kastner Computer Science & Engineering
- Melvin Leok Mathematics
- Stefan Llewellyn Smith Scripps Institution of Oceanography
- Jill Mesirov Medicine
- Michael Norman Physics and San Diego Supercomputer Center
- Tajana Rosing Computer Science & Engineering
- Rayan Saab Mathematics
- Debashish Sahoo Pediatrics and Computer Science & Engineering
- Nuno Vasconcelos Electrical & Computer Engineering
- Bradley Voytek Cognitive Science and Neurosciences
- Ed Vul Psychology
- Frank Wuerthwein Physics and San Diego Supercomputer Center
- Kesong Yang NanoEngineering
- Sonia Martinez Diaz Mechanical and Aerospace Engineering
Understanding and Predicting Dynamics of Complex Biological Systems - Biomedical Informatics
Although advances in biomedical technology have led to the first direct glimpses of how distributed biological networks of many kinds support our experience, behavior, and cognition, these views are still highly limited in spatiotemporal resolution and detail. Continued advances in microelectronics and computing capabilities make possible the measurement of the dynamics of biological systems with increasingly high spatial and temporal resolution. Clinical windows even afford the possibility of recording this activity on multiple scales simultaneously. Signal processing of high-dimensional data to model relationships between complex brain dynamic patterns, experience, and behavior (i.e., natural cognition in complex environments) is an increasing challenge. Its impact on medicine can be enormous, for instance in methods to identify, validate, and optimize usable biomarkers of neurological and psychiatric diseases. Tackling these and related problems must make use of continuing advances in data science across fields.
- Ludmil Alexandrov Bioengineering
- Rommie Amaro Chemistry & Biochemistry
- Nuno Bandeira Computer Science & Engineering and Skaggs School of Pharmacy & Pharmaceutical Sciences
- Jelena Bradic Mathematics
- Li-Tien Cheng Mathematics
- Todd Coleman Bioengineering
- Jade d’Alpoim Guedes Anthropology and Scripps Institution of Oceanography
- Virginia de Sa Cognitive Science
- Barry Grant Molecular Biology
- Rob Knight Pediatrics and Computer Science & Engineering
- Sergey Kryazhimskiy Biological Sciences
- Bo Li Mathematics
- J. Andrew McCammon Chemistry & Biochemistry and Pharmacology
- Jill Mesirov Medicine
- David Moore Psychiatry
- Lucila Ohno-Machado Medicine
- Kim Prather Chemistry & Biochemistry
- Padmini Rangamani Mechanical and Aerospace Engineering
- Debashish Sahoo Pediatrics and Computer Science & Engineering
- Terrence Sejnowski Neurobiology
- Alan Simmons Psychiatry
- Nick Spitzer Neurobiology
- Shankar Subramaniam Bioengineering
- Ruth Williams Mathematics
Statistics in Biology and Health Sciences
We aim to address the twin challenges of inferential rigor and scientific understanding in the biological and health sciences. Scientific discovery requires rigorous inference to ensure the validity of the studies conducted and the scientific conclusions drawn from them. The validity of the inference depends on explicit and implicit assumptions about the data, and rigorous verification of those assumptions is required. Statistical rigor is imperative for reproducibility of scientific discoveries. Scientific understanding requires formulating answerable questions and interpretable models. While every scientific study involves experts on the particular subject matter, even experts can have difficulty formulating precise testable hypotheses, designing studies that address those hypotheses, and developing and fitting appropriate statistical models that directly address them. We will combine strategies for developing interpretable and appropriate statistical models with statistically rigorous inference to address these data science challenges.
- Ian Abramson Mathematics
- Ludmil Alexandrov Bioengineering
- Nuno Bandeira Computer Science & Engineering and Skaggs School of Pharmacy & Pharmaceutical Sciences
- Jelena Bradic Mathematics
- Melissa Gymrek Medicine and Computer Science & Engineering
- Sergey Kryazhimskiy Biological Sciences
- Thomas Liu Center for fMRI and Radiology Psychiatry and Bioengineering
- Jill Mesirov Medicine
- Bhaskar Rao Electrical & Computer Engineering
- Debashish Sahoo Pediatrics and Computer Science & Engineering
- Armin Schwartzman Biostatistics
- Alan Simmons Psychiatry
- Wesley Thompson Biostatistics and Family Medicine & Public Health
- Xin Tu Biostatistics and Family Medicine & Public Health
- Ronghui (Lily) Xu Mathematics and Family Medicine & Public Health
- Angela Yu Cognitive Science
- Kun Zhang Bioengineering
- Wenxin Zhou Mathematics
Open Collaborative Ecosystem for Advanced Neuroimaging eXploration (OCEANX)
We aim to address challenges in the analysis and processing of large and complex neuroimaging datasets that are acquired across a diverse range of studies of brain physiology, function, and structure. Our approach is to create a flexible storage and computing environment that will enable investigators to readily take advantage of state-of-the-art data science tools and to collaborate more easily across studies. We envision that OCEANX will serve as a unique testbed for discovery, collaboration, and training in the neuroimaging data sciences. It will also enable research aimed at determining the environments, processes, and methods that best foster discovery and collaboration.
- Christine Fennema-Notestine Psychiatry and Radiology
- Terry Jernigan Cognitive Science and Psychiatry
- Thomas Liu Center for fMRI and Radiology Psychiatry and Bioengineering
- Scott Makeig Institute for Neural Computation
- Nick Spitzer Neurobiology
- Xin Tu Biostatistics and Family Medicine & Public Health
- Eric Wong Radiology and Psychiatry
- Angela Yu Cognitive Science
Computational Discovery in Material Sciences
We aim to develop data science as a powerful accelerator for materials discovery and design across all scales, from atomistic manipulation to nanoscale properties to device-level integration. In recent years, data science has emerged as an increasingly important tool for materials science, owing to the advent of vast computing power and efficient, accurate quantum chemical codes as well as the development of combinatorial experimental techniques. The consequence of both trends is an explosion in the quantity of materials data generated. Increasingly, the key challenge is the development of analytics to generate useful insights and design principles from this large body of materials data, that is, the decoding of the Materials Genome.
- Manmohan Chandraker Computer Science & Engineering
- Todd Coleman Bioengineering
- Massimiliano Di Ventra Physics
- Jian Luo NanoEngineering and Materials Science & Engineering
- Shyue Ping Ong NanoEngineering
- Francesco Paesani Chemistry & Biochemistry and San Diego Supercomputer Center
- Tod Pascal NanoEngineering
- Kesong Yang NanoEngineering
Computational Neurosciences
Computational neuroscience is a unique source of data science innovation, both because it requires advancing data analytic methods to grapple with high-dimensional, structured, dynamic neural signals and because its subject of study, the human mind and brain, is the most effective natural learning system we know of. Thus computational neuroscience offers two avenues for advancing data science: innovation in analytic methods to understand the neural substrates of the human mind, and the design of computational models to emulate natural intelligence.
- Timothy Brady Psychology
- Gert Cauwenberghs Bioengineering
- Todd Coleman Bioengineering
- Virginia de Sa Cognitive Science
- Massimiliano Di Ventra Physics
- Thomas Liu Center for fMRI and Radiology Psychiatry and Bioengineering
- David Moore Psychiatry
- Padmini Rangamani Mechanical and Aerospace Engineering
- Bhaskar Rao Electrical & Computer Engineering
- Alan Simmons Psychiatry
- Bradley Voytek Cognitive Science and Neurosciences
- Ed Vul Psychology
- Eric Wong Radiology and Psychiatry
- Angela Yu Cognitive Science
- Peter Gerstoft San Diego Supercomputer Center, Structural Bioinformatics Laboratory
Geosciences and Climate/Weather Predictions
Physical models and hypotheses are central to Geosciences and Climate/Weather Predictions. Earth science data sets tend to be poorly sampled, noisy, and incomplete, and are often difficult to use in standard machine learning algorithms. It would be transformative if we could develop a modeling framework that combines machine learning and physical modeling. In this group we use data science tools to solve inverse problems in ocean and earth observations. This will improve geophysical predictions such as those for weather and wildfires.
- Jennifer Burney Global Policy & Strategy
- Richard Carson Economics
- Bruce Cornuelle Scripps Institution of Oceanography
- Jade d’Alpoim Guedes Anthropology and Scripps Institution of Oceanography
- Peter Gerstoft Scripps Institution of Oceanography and Electrical & Computer Engineering
- Stefan Llewellyn Smith Scripps Institution of Oceanography
- Thomas G. Masters Scripps Institution of Oceanography
- Michael Norman Physics and San Diego Supercomputer Center
- Armin Schwartzman Biostatistics
- Frank Vernon Scripps Institution of Oceanography
- Peter Gerstoft San Diego Supercomputer Center, Structural Bioinformatics Laboratory
Bioinformatics
Understanding large biological datasets from high-throughput methods such as DNA sequencing and mass spectrometry poses considerable challenges. Advanced analytical methods are required for problems ranging from genomic evolution to gene expression to studies of human, animal and environmental microbiomes. Typical issues include sparse and/or compositional data, highly multivariate data with far more dimensions than samples, and the need to integrate with heterogeneous phenotype data including imaging and clinical records. Solving these problems has the potential to revolutionize our understanding of the living world, and our ability to control it in medical and technological applications.
- Ludmil Alexandrov Bioengineering
- Nuno Bandeira Computer Science & Engineering and Skaggs School of Pharmacy & Pharmaceutical Sciences
- Barry Grant Molecular Biology
- Melissa Gymrek Medicine and Computer Science & Engineering
- Rob Knight Pediatrics and Computer Science & Engineering
- Thomas Liu Center for fMRI and Radiology Psychiatry and Bioengineering
- Jill Mesirov Medicine
- Siavash Mirarab Electrical & Computer Engineering
- Pavel Pevzner Computer Science & Engineering
- Bing Ren Cellular and Molecular Medicine
- Debashish Sahoo Pediatrics and Computer Science & Engineering
- Wei Wang Chemistry & Biochemistry and Cellular & Molecular Medicine
- Ronghui (Lily) Xu Mathematics and Family Medicine & Public Health
- Kun Zhang Bioengineering
- Sheng Zhong Bioengineering
- Peter Rose San Diego Supercomputer Center
Sensing Data and Sensor Networks
Sensor networks are used in most observation systems. Examples range from classical seismic sensor networks to modern 5G telecommunications using wave propagation data. Other types of networks carry temperature or textual information, and the sensor networks might evolve with observations. Array processing has been the main workhorse here. Graph signal processing might provide a more general approach for processing the data.
Traditional approaches to traffic engineering and network deployments rely on generic modelling assumptions and rule-of-thumb over-provisioning. Future generation systems, such as 5G systems, aspire to network a vastly larger variety of devices to support highly diverse applications. The design and operation of these expensive, complex interconnected systems will be increasingly data driven and can benefit from advances in machine learning algorithms. Our goals in this regard include (1) creation of datasets to capture city-scale data traffic and mobility patterns and (2) algorithms to infer numerous measures of value to network designers and operators as well as multiple disciplines, including public health, mental health, environment, transportation and energy usage.
- Chaitan Baru San Diego Supercomputer Center
- Manmohan Chandraker Computer Science & Engineering
- Bruce Cornuelle Scripps Institution of Oceanography
- William Griswold Computer Science & Engineering and Design Lab
- Ryan Kastner Computer Science & Engineering
- David Moore Psychiatry
- Bhaskar Rao Electrical & Computer Engineering
- Tajana Rosing Computer Science & Engineering
- Glenn Tesler Mathematics
- Frank Vernon Scripps Institution of Oceanography
- Sonia Martinez Diaz Mechanical and Aerospace Engineering