Requirements for Doctor of Philosophy (Ph.D.) in Data Science

The goal of the doctoral program is to create leaders in the field of Data Science who will lay the foundation and expand the boundaries of knowledge in the field. The program provides a research-oriented education that gives students the knowledge, skills, and awareness required to perform data-driven research and, building on this shared background, to carry out research that expands the boundaries of knowledge in Data Science. The doctoral program spans from foundational aspects, including computational methods, machine learning, mathematical models, and statistical analysis, to applications in data science.

The goal of the preliminary assessment examination is to assess students’ preparation for pursuing a PhD in data science, in terms of core knowledge and readiness for conducting research. The preliminary assessment is an advisory examination.

The preliminary assessment is an oral presentation that must be completed before the end of Spring quarter of the second academic year. To qualify for the assessment, students must have a GPA of 3.0 or above and have completed three of the four required core courses. The student will choose a committee consisting of three members, one of whom will be the student's HDSI academic advisor. The other two committee members must be HDSI faculty members with 0% or more appointments; we encourage the student to select the second faculty member based on compatibility of research interests and topic of the presentation. The student is responsible for scheduling the meeting and making a room reservation.

The student may choose to be evaluated on either (A) a scientific literature survey and data analysis or (B) a previous rotation project. The student will propose the topic of the presentation.

  1. If the student chooses the survey theme, they should select a broad area that is well represented among HDSI faculty members, such as causal inference, responsible AI, or optimization. The student should survey at least 10 peer-reviewed conference or journal papers representative of at least the last 5 years of the field. The student should present a novel and rigorous original analysis using publicly available data from the surveyed literature; this analysis may aim to answer a related or new research question.
  2. If the student chooses the rotation project theme, they should prepare to discuss the motivation for the project, the analysis undertaken, and the outcome of the rotation.

For both themes, the student will describe their topic to the committee in a 1-2 page proposal that must then be approved by the committee. We emphasize that this is not a research proposal. The student will have 50 minutes to give an oral presentation, which should include a comprehensive overview of previous work, motivation for the presented work or state-of-the-art studies, a critical assessment of previous work and of their own work, and a future outlook including logical next steps or unanswered questions. The presentation will be followed by a Q&A session with the committee members; the entire exam is expected to finish within two hours.

The committee will assess both the oral presentation and the student’s academic performance so far (especially in the required core courses). The committee will evaluate preparedness, technical skills, comprehension, critical thinking, and research readiness. Students who do not receive a satisfactory evaluation will receive a recommendation from the Graduate Program Committee regarding ways to remedy the gaps in their preparation, or an opportunity to receive a terminal MS in Data Science degree, provided the student can meet the degree requirements of the MS program. If the lack of preparation is course-based, the committee can require that additional course(s) be taken to pass the exam. If the lack of preparation is research-based, the committee can require an evaluation after another quarter of research with an HDSI faculty member; the faculty member will provide this evaluation. The preliminary assessment must be successfully completed no later than the completion of two years (or the sixth quarter of enrollment) in the Ph.D. program.

The oral presentation must be completed in person. We recommend the following timeline so that students can plan their preliminary assessments:

  • Middle of winter quarter of second year: Student selects committee and proposes preliminary exam topic.  
  • Beginning of spring quarter of second year: Scheduling of exam is completed. 
  • End of spring quarter of second year: Exam. 

A research qualifying examination (UQE) is conducted by the dissertation committee, which consists of five or more members approved by the Graduate Division as per Senate Regulation 715(D). One senate faculty member must have a primary appointment in a department outside of HDSI. Faculty with a partial appointment of 25% or less in HDSI may be considered for meeting this requirement on an exceptional basis upon approval from the Graduate Division.

The goal of the UQE is to assess the ability of the candidate to perform independent critical research, as evidenced by a presentation and a written technical report at the level of a peer-reviewed journal or conference publication. The examination is taken after the student and their adviser have identified a topic for the dissertation and an initial demonstration of feasible progress has been made. The candidate is expected to describe their accomplishments to date as well as future work. The research qualifying examination must be completed no later than the fourth year, or 12 quarters from the start of the degree program; the UQE is equivalent to the advancement to PhD candidacy exam.

A petition to the Graduate Committee is required for students who take the UQE after the 12-quarter deadline. Students who fail the research qualifying examination may file a petition to retake it; if the petition is approved, they will be allowed to retake it one (and only one) more time. Students who fail the UQE may also petition to transition to an MS in Data Science track.

Students must successfully complete a final dissertation defense, consisting of an oral presentation and examination, before the Dissertation Committee, which consists of five or more members approved by the Graduate Division as per Senate Regulation 715(D). One senate faculty member on the Dissertation Committee must have a primary appointment in a department outside of HDSI. Partially appointed faculty in HDSI (at 25% or less) are acceptable in meeting this outside-department requirement as long as their main (lead) department is not HDSI.

A dissertation within the scope of Data Science is required of every candidate for the PhD degree. Dissertations in the HDSI PhD program must meet the requirements of Regulation 715(D). The final form of the dissertation document must comply with the guidelines published by the Graduate Division.

The dissertation topic will be selected by the student, under the advice and guidance of the thesis adviser and the Dissertation Committee. The dissertation must contain an original contribution of a quality that would be acceptable for publication in the academic literature, and that either extends the theory or methodology of data science or uses data science methods to solve a scientific problem in applied disciplines.

The entire dissertation committee will conduct a final oral examination, which will deal primarily with questions arising out of the relationship of the dissertation to the field of Data Science. The final examination will be conducted in two parts. The first part consists of a presentation by the candidate followed by a brief period of questions pertaining to the presentation; this part of the examination is open to the public. The second part of the examination will immediately follow the first part; this is a closed session between the student and the committee and will consist of a period of questioning by the committee members.

Special Requirements: Generalization, Reproducibility and Responsibility
A candidate for the doctoral degree in data science is expected to demonstrate evidence of generalization skills as well as evidence of reproducibility in research results. Evidence of generalization skills may take the form of (but is not limited to) generalization of results across domains or across applications within a domain, generalization of the applicability of the proposed method(s), or generalization of thesis conclusions rooted in formal or mathematical proof or in quantitative reasoning supported by robust statistical measures. The reproducibility requirement may be satisfied by supplementary material consisting of a code and data repository. The dissertation will also be reviewed for responsible use of data.

All graduate students in the doctoral program are required to complete at least one quarter of experience in the classroom as teaching assistants, regardless of their eventual career goals. Effective communication and the ability to explain deep technical subjects are considered key measures of a well-rounded doctoral education. Thus, Ph.D. students are also required to take the 1-unit course DSC 295 (Academia Survival Skills) and earn a Satisfactory grade.

PhD students may obtain an MS degree in Data Science along the way, or a terminal MS degree, provided they complete the requirements for the MS degree.