“A Difficult Message” from Our Founding Director

Photo of Rajesh Gupta HDSI UCSD Founding Director
Computer Science and Engineering – UC San Diego – End of Year Awards 2015

As I write this, I search for the right words, and ways to connect these words to actions. That video of George Floyd being killed in broad daylight by a policeman, employed and sworn to protect him, haunts us. It comes right after a careless dog-owner called the cops on Chris Cooper, whose only crime was being black in a public park. An insult that could have just as easily resulted in another black man’s death.

Death and insults; we live in a society starkly divided in its basic guarantee of life and liberty to its black American citizens. Outrage does not capture the sense of wrong here, nor does sadness. We reiterate the call by Chancellor Khosla in rejecting this continuing dehumanization of fellow Black Americans. Melody Cooper, sister of Chris Cooper, writes, “Stand with us. Bear witness. Continue the discussion and support legal action. Refuse to accept racism in your midst, even in small ways—call out a cruel joke or rude behavior.”

We can do more than just reject racism. As an organization, we can leverage the skills and resources at our disposal to turn ideas into action. We believe data science can serve as a powerful tool for transparency and support for systemic change. As our advisor, DJ Patil, writes, it can expose inequality and injustice while providing data-driven solutions for change, such as data-driven justice, monitoring policing and use of force, and removing the reliance on and adoption of racist and biased algorithms from our criminal justice system.

In that spirit, we are setting up a task force with Professor Brad Voytek to develop ideas and proposals on how best to leverage our resources to launch projects that educate us, improve diversity and representation in data science, and engage the community and government leaders on how data science, machine learning, and artificial intelligence are used to improve lives, and how they are misused to propagate and amplify bias, racism, and division.

In the meantime, on behalf of HDSI faculty, staff, students, industry and community partners, kindly allow me to express our solidarity and support for members of our community who have been and continue to be hurt by these events and the continuing injustices. We stand with you and refuse to accept the status quo.

Rajesh Gupta

Distinguished Professor, HDSI Founding Director

HDSI awards 27 scholarships to undergraduate researchers

The Halıcıoğlu Data Science Institute at University of California San Diego has announced its second annual undergraduate research scholarships. HDSI has awarded funding to 27 students across 20 projects. The Undergraduate Research Scholarship Program awards $2,500 to each student project, all of them guided by experienced mentors. The yearlong fellowships focus on providing hands-on training to emerging data science talent. All UC San Diego undergraduate students enrolled full-time are eligible to apply for research funding. Each researcher works with a mentor to develop analytical skills, build a data science portfolio, and foster novel data-driven approaches to problem solving. The research program is geared to producing results, with students submitting final reports highlighting accomplishments. Learn more about the HDSI Undergraduate Scholarship program. Projects and awardees:
  • Students’ gender biases toward professors
    • Catherine Eng, Cognitive and Behavioral Neuroscience
  • Single-cell transcriptomics and web mining for rapid reverse genetics in plants
    • Kian Faizi, Molecular Biology
  • Vista: An end-to-end declarative transfer learning system for multimodal analytics with deep neural networks
    • Advitya Gemawat, Data Science
  • Quantifying the effect of redlining policies on racial segregation in America’s urban centers
    • Arunav Gupta, Data Science
  • Quantifying tennis player performance
    • Karthik Guruvayurappan, Data Science
    • Eric Jiang, Data Science
    • Shweta Kumar, Data Science
  • Mangrove mapping, image segmentation, and classification
    • Stanley Hicks, Electrical Engineering
    • Arden Ma, Electrical Engineering
    • Katherine Qi, Marine Biology
  • Using machine learning for diagnosis in oculofacial plastic surgery
    • Zhaoyi Hou, Data Science
  • Predicting the outcome of an NFL play using pursuit curve features
    • Justin Kang, Data Science
    • Shone Patil, Data Science
  • A supervised machine learning approach to predicting mild cognitive impairment (MCI) among diverse Latinos
    • Alexandra Keamy, Data Science
  • Building an inference model to classify Chinese computational propaganda bots
    • Nicole Lee, Data Science
    • Nikolas Racelis-Russell, Data Science
  • Effects of battle and journey metaphors on charitable donations for cancer patients
    • Alex Liebscher, Cognitive Science
  • Weather forecasting using hybrid model
    • Xiang Lu, Applied Mathematics
  • Reassessment of P2P credit risk modeling with macroeconomic factors
    • Samson Qian, Data Science
  • Video games and player retention
    • Sasami Scott, Data Science
  • How do humans convey what they perceive and know about objects in visual form?
    • Justin Yang, Cognitive Science and Math – Computer Science
  • Using geographic data and social media to predict the potential health problems
    • Shuyi Judy Yang, Data Science
  • Identifying biomarkers in recurrent Chronic Thromboembolic Pulmonary Hypertension (CTEPH) patients
    • Eric Yu, Data Science
  • How does visual context influence sketched representations of faces?
    • Julia Xu, Cognitive Science – Machine Learning and Neural Computation
  • ASL (American Sign Language) signs recognition
    • Xinrui Zhan, Data Science
  • 3D Wireframe Detection
    • Haoming Zhang, Data Science

HDSI Industry Partnership in Action: Matthew Levy from NIWC Pacific

An information technology expert with an academic background, Matthew Levy of Naval Information Warfare Center Pacific (NIWC Pacific) comes to HDSI as part of an Industry Partner Alliance (IPA) Fellowship to work directly with faculty and students.


Over the next two years, Dr. Levy will be stationed at HDSI’s campus headquarters, partnering on research and academic projects. His fellowship goals are to help find answers for projects that fulfill needs of the U.S. Navy and to learn more about the latest data science tools to bring back to NIWC. Since starting on campus in late spring 2019, he has been working from HDSI headquarters in the Supercomputer Center-249E on problems in the cybersecurity domain.


Dr. Levy is one of several Fellows working with HDSI’s Industry Partnership, an Institute program that strengthens collaboration between working industry professionals and front-line academic researchers and students on real-world problems. Coordinating the IPA program is HDSI’s Industry Relations Manager Erik Mjoen, who helps facilitate working partnerships.


“This is a perfect environment for coming out of academia and industry, to get the chance to work directly with researchers on the forefront of the discipline, and getting involved with students who are so excited about the future,” said Dr. Levy. He hopes to build up more core competencies in data science to bring back to NIWC Pacific, and also work closely on recruitment and interaction with data science students, setting up events like hackathons, and running scenarios using game theory.


Dr. Levy also enjoys returning to a campus setting. Before joining NIWC Pacific, Dr. Levy was an Assistant Professor at San Francisco State University, and at Hawaiʻi Pacific University, teaching and researching in the areas of open source and data analytics, and running the Master of Science in Information Systems program. He has also worked in private industry as a software engineer, CEO, and CTO in the defense, homeland security, and travel and tourism industries. He earned his Ph.D. in information systems from Louisiana State University, and an MBA from San Diego State University.


Among the projects he’s working on now is one in the area of cyber-deception, analyzing how hackers behave in cyberspace, what they do when faced with deception and deterrent mechanisms, and the efficacy of the deception mechanisms themselves. The hope is that he can work at the crossroads of technology and human behavior with the researchers at UCSD to improve upon this work. In his words, “My hope during my time as an industry fellow at HDSI is to bring novel thinking and research to NIWC Pacific to help us solve some of our nation’s most pressing problems in the cyber domain.”

HDSI Awards 10 Graduate Prize Fellowships

The Halıcıoğlu Data Science Institute (HDSI) has awarded 10 new Graduate Prize Fellowships to incoming Ph.D. students to support their data science-related research at UC San Diego for the next four years. This influx of data science expertise will boost the research and teaching capabilities of HDSI.

The Fellows will work directly with our data science students as teaching assistants for one course, helping strengthen what has become one of the largest data science education programs in the nation.

The Graduate Prize Fellowships provide substantial financial support to Fellows throughout their four-year award. The awardees are drawn from multiple academic disciplines, in keeping with the multidisciplinary mission of HDSI. Doctoral student appointments will be shared by HDSI and five departments: Bioinformatics and Systems Biology-biomedical informatics track; Computer Science & Engineering (CSE); Mathematics; Cognitive Science; and Electrical and Computer Engineering (ECE).

Awardees are:

  • Fatemeh Amrollahi–Bioinformatics
  • Yeohee Im–ECE
  • Ranak Roy Chowdhury–CSE
  • Matthew Feiglis–Cognitive Science
  • Zhankui He–CSE
  • Side Li–CSE
  • Tara Mirmira–CSE
  • Yanyi Wang–Math
  • Yuyao Wang–Math
  • Weiwei Wu–Math

HDSI Celebrates 1st Year Accomplishments and Vision

Undergraduate scholarship winner presents poster


Capping the celebration of its first year in operation, the Halıcıoğlu Data Science Institute (HDSI) has released a video that spotlights its role as a national leader in data science programs.

The video, which features a day-long symposium held on HDSI’s first anniversary, highlights leading figures in data science research, industry and academia, as well as current students. The symposium focused on both first-year accomplishments and discussions of the future of the data science field.

In its first year, HDSI demonstrated the historic vision of University of California San Diego by taking the lead on one of the biggest forces in modern life, noted Bob Continetti, UC San Diego Senior Associate Vice Chancellor. With the innovative HDSI approach, he said, “This is how we’re going to respond to the ever-increasing rate of change around us.”

The symposium was accompanied by another video summing up HDSI’s first-year accomplishments since launching March 1, 2018. Among the first-year achievements:

  • Establishing itself as one of the largest academic data science programs in the nation
  • Launching a full data science major and minor, drawing students from every class at the university
  • Creating the Institute’s first postdoctoral scholar program
  • Building a core of more than 200 affiliated faculty members across more than 20 academic disciplines, from mathematics to the medical school
  • Attracting more than a dozen industry partners who help enhance educational programs, research opportunities and a job pipeline for students

Another leading academician in the data science field appearing on one of the symposium panels was David Culler, UC Berkeley Dean for Data Science and the former Department Chair, Electrical Engineering and Computer Sciences. Culler spoke on the urgent need for data science to be taught and used responsibly in research, under the guidance of experienced, leading universities.

“We’re in a moment where we’re trying to understand the complicated dynamics of the planet when we also have the ability to see every square yard of it through remote sensing and so forth,” said Culler on the challenges. “What we are seeing are the frontiers of knowledge in essentially every domain are integrated in character.”

The video features undergraduate Luyanda Mdanda accepting his prize for winning the best scientific poster among the inaugural group of HDSI Undergraduate Research Scholarship winners.

Mdanda, a junior, was one of more than 40 undergraduates who won competitive scholarships to conduct data science research projects during the academic year. HDSI funded the scholarships that drew students from more than 15 different academic majors, from political science to computer science.



Death and Data Science: Discovering the Reality vs the Reporting

How much do people obsess about violent death, and how did HDSI Fellow Brad Voytek’s students use data science to prove those obsessions defy reality?

The surprising answers discovered by Voytek’s students were delivered in their final class project, “Death: Reality vs. Reported.” Based on their analysis of millions of data points over a 17-year period, they found that in reality, passive causes of death like heart disease are among the most common killers. And yet rarities like lethal terrorism grab most of the headlines. They found that terrorism, compared to the number of deaths it causes, was over-represented in the media by a factor of 3,906.

The team of four undergraduates used data science tools like Python to zero in on the nexus of: What do Americans actually die from? What do they search the most for on Google about death? What does the major media report about death? Their findings proved so dramatic that their school project on the media itself drew headlines. And their work recently drew endorsement from the world’s most famous tech guru, Bill Gates.

“I’m always amazed by the disconnect between what we see in the news and the reality of the world around us,” Gates tweeted in mid-June, linking to their work. Considering the philanthropist claims 47.5 million Twitter followers, that’s pretty impressive recognition.

UC San Diego rising junior Owen Shen reacted with cautious pride to this latest praise. “This thing really started snowballing,” said Shen from his family’s home in Northern California’s East Bay. The recognition from Gates is especially meaningful to him because this week Shen started his summer internship working for Microsoft, the tech giant co-founded by Gates.

Shen calls himself lucky for the widespread attention their data science Death project has garnered from luminaries like Gates. Still, Shen, who is 20 years old, seemed more impressed with the earlier praise his team gained from his own more Millennial and GenX heroes like Liv Boeree, a British professional poker player and philanthropist (with a degree in astrophysics), as well as Jeff Dean, a computer scientist who leads the Google Brain project in the company’s Research Group.

Shen and team did the work for Voytek’s class, Data Science in Practice. Voytek is a neuroscientist and professor in the Department of Cognitive Science who helped pioneer the Institute’s undergraduate education program, and the interdisciplinary draw of more than 200 faculty members affiliated with HDSI, the university’s data science hub.

Shen is a computer science major at UC San Diego who took Voytek’s class last year as a freshman because of his interest in analyzing data as well as coding. Shen teamed with classmates Hasan Al-Jamaly, Maximillian Siemers, and Nicole Stone. Together they searched death records from the Centers for Disease Control and Prevention for 12 causes of death from 1999-2016, including cancer, stroke, overdose, car accidents, homicide and terrorism. They searched headlines from 1999-2016 in both the New York Times and the Guardian news publication from the United Kingdom. Then they looked at Google search trends for their 12 cause-of-death markers from 2004-16 (starting with the first year available).

Once they got the raw counts, one of the challenges was to clean up the data, explained Shen, by relativizing all the columns so they could compare the correct proportions. In data terms, comparing apples to pineapples doesn’t deliver the sharpest results. Among their findings: Cancer and suicide were by far the enduring leading search interests, with terrorism being perennially a Top 5 obsession.
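For readers curious what “relativizing” might look like in practice, here is a minimal Python sketch. It is not the students’ code, and the counts are invented purely for illustration. It assumes each source (deaths, headlines) is a table of raw counts per cause, converts each table to shares of its own total, and then divides the headline share by the death share to see how over- or under-represented each cause is.

```python
# Hypothetical raw counts per cause; these numbers are made up for
# illustration and are NOT taken from the students' project.
raw_counts = {
    "deaths": {"heart disease": 600, "cancer": 550, "terrorism": 1},
    "headlines": {"heart disease": 30, "cancer": 120, "terrorism": 350},
}


def to_proportions(counts):
    """Convert raw counts into shares of the column total."""
    total = sum(counts.values())
    return {cause: n / total for cause, n in counts.items()}


def representation_ratio(source_share, death_share):
    """Share of coverage divided by share of actual deaths, per cause."""
    return {
        cause: source_share[cause] / death_share[cause]
        for cause in death_share
    }


death_share = to_proportions(raw_counts["deaths"])
headline_share = to_proportions(raw_counts["headlines"])
ratios = representation_ratio(headline_share, death_share)

# A ratio above 1 means a cause gets more coverage than its share of deaths.
for cause, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{cause}: {ratio:.2f}x")
```

Putting every column on the same 0-to-1 scale is what makes the sources comparable: a raw headline count and a raw death count are in different units, but two shares can be divided directly.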

Said Voytek, their professor, “It’s fantastic work.”

What intrigued Stone was how deeply people’s fears about death, as reflected in what they searched Google for, diverged from the reality of death the team found. “The things people think they need to worry about are more far apart from what the data shows they should be worrying about. Why? We didn’t get into that, but someone should,” said Stone, who majored in Cognitive Science specializing in machine learning, and minored in Computer Science.

Among the Death Reporting vs. Reality team, Al-Jamaly just graduated and took a job as a software engineer in the Bay Area. And Siemers, who was a visiting international student, returned to Europe to pursue a Ph.D.

For his part, Shen didn’t stop with the final report for Voytek’s class, but kept developing their project into a more mainstream format complete with interactive color-coded graphics. As a result of his more user-friendly data presentation, the wider recognition for their original work keeps growing, with milestones like the project earning top honors on the popular Reddit internet site as one of the most popular data presentations of all time. Shen plans to continue working in data analysis at UC San Diego and beyond, saying, “I have more questions I would like to get answers for.”