- Curriculum Review with HDSI Program Vice Chair
- Capstone Overview with Industry Partner
- How to engage and recruit our talent
- Industry Partnership Alliance Program
EnCORE will tackle important problems in foundations of Data Science
From left to right, Prof. Yusu Wang, Prof. Barna Saha, Prof. Gal Mishne, Prof. Arya Mazumdar, and Outreach Leader Saura Naderi.
A new National Science Foundation initiative has created a $10 million institute led by computer and data scientists at the University of California San Diego that aims to transform the core fundamentals of the rapidly emerging field of Data Science.
Called the Institute for Emerging CORE Methods in Data Science (EnCORE), the institute will be housed in the Department of Computer Science and Engineering (CSE), in collaboration with The Halıcıoğlu Data Science Institute (HDSI), and will tackle a set of important problems in theoretical foundations of Data Science.
EnCORE will join three other NSF-funded institutes in the country dedicated to the exploration of data science through the NSF’s Transdisciplinary Research in Principles of Data Science Phase II (TRIPODS) program.
“The NSF TRIPODS Institutes will bring advances in data science theory that improve health care, manufacturing, and many other applications and industries that use data for decision-making,” said NSF Division Director for Electrical, Communications and Cyber Systems Shekhar Bhansali.
UC San Diego Chancellor Pradeep K. Khosla said UC San Diego’s highly collaborative, multidisciplinary community is the perfect environment to launch and develop EnCORE. “We have a long history of successful cross-disciplinary collaboration on campus and off campus, with renowned research institutions across the nation. UC San Diego is also home to the San Diego Supercomputer Center, the HDSI, and leading researchers in artificial intelligence and machine learning,” Khosla said. “We have the capacity to house and analyze a wide variety of massive and complex data sets by some of the most brilliant minds of our time, and then share that knowledge with the world.”
Barna Saha, the EnCORE project lead and an associate professor in UC San Diego’s Department of Computer Science and Engineering and HDSI, said: “We envision EnCORE will become a hub of theoretical research in computing and Data Science in Southern California. This kind of national institute was lacking in this region, which has a lot of talent. This will fill a much-needed gap.”
The other UC San Diego faculty members in the institute include professors Arya Mazumdar, Gal Mishne, and Yusu Wang from HDSI; Kamalika Chaudhuri and Sanjoy Dasgupta from CSE; and Fan Chung Graham from Mathematics. Saura Naderi of HDSI will spearhead the outreach activities of the institute.
UC San Diego team members will work with researchers from three partnering institutions – University of Pennsylvania, University of Texas at Austin and University of California, Los Angeles – to transform four core aspects of data science: complexity of data, optimization, responsible computing, and education and engagement.
“Professor Barna Saha has assembled a team of exceptional scholars across UC San Diego and across the nation to explore the underpinnings of data science. This kind of institute, focused on groundbreaking research, innovative education and effective outreach, will be a model of interdisciplinary initiatives for years to come,” said Department of Computer Science and Engineering Chair Sorin Lerner.
The EnCORE Institute will be directed by HDSI and CSE Associate Professor Barna Saha.
CORE Pillars of Data Science
The EnCORE Institute seeks to investigate and transform three research aspects of Data Science:
- C, for Complexities of Data: the data researchers are dealing with is complex, massive, and noisy. They will investigate what new tools and approaches are needed to address data complexity, including an overhaul of the concepts of algorithms, statistics and machine learning.
- O, for Optimization: a very old and traditional field, it now needs to be data driven, which brings new challenges. Modern data and technology have created a large gulf between theory and practice of optimization. Adaptive methods and human intervention can lead to major advancement in machine learning.
- R, for Responsible Learning: the ethical responsibilities researchers bear when working with massive data and data containing sensitive information, and when using that data to make decisions, need to be reoriented to adapt to an uncertain world.
“EnCORE represents exactly the kind of talent convergence that is necessary to address the emerging societal need for responsible use of data. As a campus hub for data science, HDSI is proud of a compelling talent pool to work together in advancing the field,” said HDSI founding director Rajesh K. Gupta.
Team members expressed excitement about the opportunity for interdisciplinary research that the institute will provide. They will work together to improve the state of the art in privacy-preserving machine learning and robust learning, and to integrate geometric and topological ideas with algorithms and machine learning methodologies to advance the frontier of taming complexity in modern data. They envision a new era in optimization with the presence of strong statistical and computational components adding new challenges.
“One of the exciting research thrusts at EnCORE is data science for accelerating scientific discoveries in domain sciences,” said Gal Mishne, a professor at HDSI. As part of EnCORE, the team will be developing fast, robust low-distortion visualization tools for real-world data in collaboration with domain experts. In addition, the team will be developing geometric data analysis tools for neuroscience, a field which is undergoing an explosion of data at multiple scales.
From K-12 and Beyond
A distinctive aspect of EnCORE will be the “E,” education and engagement, component.
The institute will engage students at all levels, from K-12 to postdoctoral students, as well as junior faculty, and will conduct extensive outreach activities at all four of its sites.
The geographic span of the institute in three regions of the United States will be a benefit as the institute executes its outreach plan, which includes regular workshops, events, and the hiring of students and postdoctoral students. Online and joint courses between the partner institutions will also be offered.
Activities to reach out to high school, middle school and elementary students in Southern California are also part of the institute’s plan, with the first engagement planned for this summer with the Sweetwater Union High School District to teach students about the foundations of data science.
There will also be mentorship and training opportunities with researchers affiliated with EnCORE, helping to create a pipeline of data scientists and broadening the reach and impact of the field. Additionally, collaboration with industry is being planned.
Arya Mazumdar, an EnCORE co-principal investigator and associate professor in the HDSI and an affiliated faculty member in CSE, said the team has already put much thought and effort into developing data science curricula across all levels. “We aim to create a generation of experts while being mindful of the needs of society and recognizing the demands of industry,” he said.
“We have made connections with numerous industry partners, including prominent data science techs and also with local Southern California industries including start-ups, who will be actively engaged with the institute and keep us informed about their needs,” Mazumdar added.
An interdisciplinary, diverse field, and team
Data science has footprints in computer science, mathematics, statistics and engineering. In that spirit, the researchers from the four participating institutions who make up the core team come from varied backgrounds across these four disciplines.
“Data science is a new, and a very interdisciplinary area. To make significant progress in Data Science you need expertise from these diverse disciplines. And it’s very hard to find experts in all these areas under one department,” said Saha. “To make progress in Data Science, you need collaborations from across the disciplines and a range of expertise. I think this institute will provide this opportunity.”
And the institute will further diversity in science, as EnCORE is being spearheaded by women who are leaders in their fields.
From KUSI Newsroom:
The Afrofuturism Lounge creates a positive space for Black futurists to connect
SAN DIEGO (KUSI) – The Afrofuturism Lounge, in partnership with Comic-Con, will be back in person for the 5th year this Thursday, July 21st.
The lounge is a “unique aspirational global experience where Black Comix, cosplay fashionistas, and web-comic artists, advance their art centered in a space connecting a diverse Black geek community of creatives, critical thinkers and community builders to inspire futurist thought and industry opportunities.”
The goal of this lounge is to bring people together in a positive space to connect with diverse, creative thinkers who “form the diaspora of Black futurists.”
Dr. LaWana Richmond, Founder of the Afrofuturism Lounge, joined KUSI’s Paul Rudy on “Good Morning San Diego” to discuss their upcoming experience partnering with Comic-Con.
Tickets can be purchased at: afrofuturismlounge.com
Counting things: it’s harder than you think
Modern organizations leverage machine learning, data science, and AI to build predictive, responsive, and personalized applications. BUT! Most are bad at counting things. That’s where dbt comes in. dbt is an open source framework used to define, test, and document datasets. In this talk, we will discuss the what, why, and how behind dbt and data warehousing in the year 2022.
Drew Banin is the co-founder and former Chief Product Officer at dbt Labs, a Philadelphia startup pioneering the practice of modern analytics engineering. dbt is used by over 9,000 companies every week to organize, catalog, and distill knowledge in their data warehouses. Drew works with open source maintainers, contributors, and users to build dbt and strike fear in the hearts of database optimizers.
Data Insights cycle 2
Deadline: Aug 25
Funding: $200,000 for grants that primarily support the effort of one to two full-time employees (FTEs) working on a given project. $400,000 for networked grants that will require the participation of two to four FTEs.
The goal of this opportunity is to create a network of projects that address broad computational challenges and needs within single-cell biology at a variety of scales. Applications are encouraged from computational experts outside the field of single-cell biology but with expertise relevant to overcoming current bottlenecks. Projects may include dedicated efforts to refine existing computational tools, benchmark classes of tools, improve standards, integrate available data that enables greater biological insight, develop new features that support interoperability of data or tools, and other major challenges brought forward. Projects must propose and rely on existing data that is openly and freely available (count matrices at minimum) at the time of application.
Please join me in congratulating Terry Sejnowski, a winner of the 2022 Neuroscience Prize from the Gruber Foundation. As the citation below notes, the winners have literally made the field of neuroscience.
Terry is being recognized for his work on information processing by neural circuits in the brain, independent component analysis (ICA) and its use in brain imaging, and the discoveries he has made on brain wave patterns. The La Jolla mesa is particularly proud to host Terry and his research group, including the Institute for Neural Computation at UC San Diego (just a couple of floors below in the SDSC building).
Terry has been among the founding thought leaders behind the formation of the Data Science initiative at UC San Diego, and has been a key and consistent contributor to building HDSI. We couldn’t be more proud of our founding DNA and his continued guidance to HDSI faculty and leadership!
2022 Neuroscience Prize
Larry Abbott, PhD, of Columbia University, Emery Neal Brown, MD, PhD, of the Massachusetts Institute of Technology and Massachusetts General Hospital, Terrence Sejnowski, PhD, of the Salk Institute for Biological Studies and UC San Diego, and Haim Sompolinsky, PhD, of the Hebrew University of Jerusalem and Harvard University, have been pioneers in the fields of computational and theoretical neuroscience, fields that have become crucial to helping scientists unravel the complexities of neural networks in the brain. Using mathematics, physics, statistics, and machine learning, they have generated theories, models, and computational tools that have transformed the field of neuroscience and provided profound insights into the nature of the brain and the mind.