Some of the most important, most interesting, and most difficult open problems in the modern world involve amazingly complex systems. Problems like: How do we develop artificially intelligent machines that can think and behave like humans? How do we design a vaccine that prevents cancer? And: How will a new economic policy influence human behavior? The systems and processes that these problems concern — the brain, the immune system, and the economy — involve a staggering number of moving pieces and variables. While science is gaining ground, our understanding remains limited. However, unprecedented progress has been made recently on these problems, largely driven by one factor: data.

As it turns out, gathering massive data sets and searching them for underlying patterns has been a surprisingly effective way to solve problems involving complex systems that we don’t yet fully understand. We might build an artificial intelligence, for example, by gathering billions of pieces of text written by humans and “training” a computer to recognize and mimic the patterns in human writing. Or we might develop a vaccine for cancer by gathering large amounts of data on the environments and genetics of people who seem to be less susceptible to the disease and searching that data for clues to what makes them so. While data has long been a key part of science, using it at this scale has only recently become possible, thanks to the Internet, the proliferation of cheap sensors, and fast computers.

In short, data science is the study of the mathematical techniques capable of finding meaningful patterns in messy, real-world data and the computational methods that enable their use on huge data sets. But it is also the study of the ethical implications of these methods and how to use them responsibly for the benefit of everyone.

Majors in Data Science at UCSD can expect a curriculum that balances theory and practice. In courses in the theoretical foundations of data science, students will analyze the mathematical underpinnings of data analysis techniques in a quest to answer: Why do they work? When are they applicable? What are their limitations? And can they be improved? In studying the application of these techniques, students will learn about the practical realities of working with incomplete and messy data from a variety of domains, how to communicate technical ideas clearly and concisely, and how to build and use computational systems that enable storing and processing very large amounts of data. Beyond this, students will have opportunities to learn how to apply data science ethically and responsibly. If learning about all of these things sounds interesting to you, you might find a major in data science rewarding!

Our alumni currently work as data scientists, data analysts, machine learning engineers, business analysts, artificial intelligence scientists, software developers, and more. Many enter the workforce directly after graduation, while others go on to graduate school.

According to the Bureau of Labor Statistics, the long-term job outlook for data science is very good. The number of data science jobs in the US is projected to grow by 36% in the next decade, compared to 17% growth for software developers and 4% growth for all occupations.

The exciting growth of the field means that more and more data science degree programs are starting every year, and you have an increasing number of options to choose from. But because data science as an academic discipline is relatively new, each program has its own unique take on what it is and how it should be taught. This variation means that, when comparing two different data science programs, you will want to look at the details to determine which is best for you and your goals. Questions you might ask include: What courses are required? What department offers the courses? Who are the faculty that I’ll be interacting with the most? And what opportunities will I have to get hands-on experience?

To that end, here are some of the details for UCSD’s Data Science undergraduate program:

  • Almost all of the major’s required courses are Data Science courses, taught by faculty appointed in the Halıcıoğlu Data Science Institute. This means that the Data Science program has control over the content of the courses and can ensure that they fit together into a cohesive curriculum. For more about our curriculum, including the required courses and electives offered, see our website.
  • Every one of our data science majors graduates with a portfolio of work that demonstrates their capabilities to potential employers and graduate schools. We offer several ways that students can add to this portfolio, but all students will have an opportunity to do so through our Senior Capstone. Our Capstone is a two-quarter project where students work as part of a small group to solve a real-world data science problem in a domain of their choosing. Projects are mentored by data science faculty and/or industry experts, and the student-to-mentor ratio is small (7:1 in 2024). See https://dsc-capstone.org/showcase-24/ for examples of previous projects.
  • The undergraduate Data Science program is housed in the Halıcıoğlu Data Science Institute (HDSI), an interdisciplinary organization that serves as a meeting ground for faculty from across campus whose work involves data science in some shape or form. Many faculty in HDSI have joint appointments with other departments, such as Mathematics, Computer Science, Electrical Engineering, Cognitive Science, Philosophy, Medicine, and Political Science. For students, this means that there is a wide array of interesting work going on in the Institute, more classes offered at the intersection of data science and whatever domain you are interested in, and more opportunities to interact with faculty working in your particular area of interest.
  • The program has strong ties to industry through HDSI’s Industry Partner Alliance. Industry Partners run capstone projects, teach classes, help shape the curriculum and ensure that it is up-to-date, and hold “office hours” where they advise students on career paths.

Data Science and Computer Science are related disciplines, and there is considerable overlap in both the topics that they cover and the career paths that their graduates follow. Their similarity means that it can be difficult to decide between them, but it also means that the choice may not be as critical as one might think. At the same time, there is also significant variation between computer science programs, and so it can be difficult to say something that is true of all of them. If you’re choosing between a computer science program and a data science program, we recommend that you look at the details and ask the same questions as when choosing between two data science programs. Questions like: What courses are required? Who are the faculty that I’ll be interacting with the most? And what opportunities will I have to get hands-on experience?

Generally speaking, the main difference between computer science programs and our Data Science program lies in their focus. While machine learning, artificial intelligence, and data science are components of many computer science departments, they are at the center of the Halıcıoğlu Data Science Institute and our Data Science program. This difference might show up in the courses you take: Data Science students can generally expect to take more machine learning and statistics courses, whereas CS majors might take more courses in computer architecture, software development, and discrete mathematics. It may also appear in the opportunities available to you: because all of HDSI’s faculty work in data science, machine learning, and AI, there might be more chances to find the sub-specialization within data science that you are most interested in. For example, while many computer science departments have faculty working in AI, fewer have faculty who work at the intersection of AI and ethics, AI and climate change, or AI and medicine, whereas HDSI has several. And remember HDSI’s Senior Capstone: all of our majors will complete a project in machine learning, data science, or artificial intelligence before graduating. Not all CS programs have capstones, and those that do might have fewer project opportunities in these areas.

That said, majoring in computer science with a specialization in machine learning (or similar) can be the ideal choice for many people — for example, those who have a significant interest in software development, security, or another subfield of CS. Again, we recommend looking at the specific courses you would be required to take as part of both majors and deciding based on what interests you most.