Research Questions
The research component of the project aims to answer these questions: What is data science? What skills are essential for data scientists? How should we best teach data science?
Research Methods
The research component of the project primarily involves a literature review of published studies on data science education and an analysis of data science degree syllabi that are freely available online. The empirical data consist of views solicited from real-world data scientists and academics, all of whom are working professionals. We identify them by attending national and international conferences on data science and by searching public profiles of data scientists listed on institutional websites, such as those of the Alan Turing Institute and its partner universities throughout the UK.
We ask those professionals questions about their jobs as data scientists, their views on the skills that are essential for teenagers to learn, and any messages they may wish to send to young people who are yet to embark on a degree in data science. These views will be collected as replies to open-ended questions in a survey sent to their institutional email addresses. We will also ask them whether they would be happy to be interviewed via Skype or Zoom, or in person if they are in or near Exeter.
All interviews will first be recorded on password-protected iPhones. They will then be saved on password-protected laptops for further analysis and backed up to the University of Exeter’s secure cloud server.
Participants
The participants will be real-world data scientists and academics with teaching and research interests in data science. School teachers will take part in the project as well; however, their views mainly serve to help us design the course for the students they know best. That said, they are likely to have opinions on data science education that are equally, if not more, important for the research component of the study. We will therefore contact a few teachers first and have informal conversations with them. We will then use that information to assess whether we should recruit more teachers for formal semi-structured interviews. If that proves necessary, the format and questions of the interviews will be similar to those for data scientists and academics.
We aim to interview, as a minimum, ten data scientists, ten academics, and possibly five teachers for the research component of the project. The data scientists and academics should represent a diverse range of industries and disciplines and have varied years of experience in their specific fields.
At this stage, we would hope for the outputs of the project to include the following:
- Conference presentations
- Journal articles
- A dedicated website (www.dseducation.net) with course materials and major findings made freely available on the web
Voluntary Nature of Participation
We will search the web and attend conferences to establish connections with data scientists and academics. In the Graduate School of Education at the University of Exeter, we work closely with local schools, so teachers will be recruited through our professional networks.
All data scientists, academics, and teachers will be told that, should they choose to take part in the study, they are free to withdraw at any stage without any penalty. As most questions will be delivered to them from a University of Exeter email address, they can decide for themselves, unless they choose to be interviewed, if and when to reply and how much to write, at a time that is most convenient to them. We will give them two weeks to reply to those questions, with a reminder message sent halfway through should a reply be pending.
Informed Nature of Participation
All participants will be able to view the Information Sheet for this project and will receive a link in the email directing them to this website, which has more details about the project. They can raise any question they may have about the project at any time prior to participation. The email account we use to communicate with participants will be monitored on a daily basis, and we endeavour to answer any question within 24 hours, if not sooner. At the beginning of each interview, we will summarise the key points about the research one more time and give participants another opportunity to raise any questions they may have.
Data Protection and Storage
All email and survey replies from participants are automatically recorded and archived on the University of Exeter's secure cloud server. The inbox is password protected. Any copies synced to laptops and desktops are also password protected. All interviews will be recorded on a password-protected iPhone first and then uploaded to a secure laptop for further analysis. The interview data will be backed up to the University's cloud server as well.
We will not use real names of participants that can be used to identify individuals and institutions in any publication, unless they choose to publicly address the target group of students and consent to the publication of their identifying information, such as names and images.
Digital recordings of interviews for the research component of the project will be deleted as soon as we have an authoritative transcript of each and every interview. However, anonymised interview data may be stored indefinitely.
Research assistants who transcribe the interviews will be briefed on the need to remove any identifying details of individuals and institutions. The data will be kept in the University’s secure cloud server for up to five years, as any publication that arises from the research component of the project might require the research data to validate any findings reported.
Assessment of Possible Harm
We do not anticipate any harm to participants. They could, however, feel uncomfortable about their views being recorded. We will reassure them that their privacy and anonymity will be respected and that they will not be identifiable in any subsequent publications. We will take measures to anonymise the institutions they are associated with as well. Nevertheless, the fact that their email replies, survey answers, and views expressed in interviews will be recorded and reviewed closely could distress some participants to a certain degree. This is why they will be given the opportunity to withdraw at any time, in which case any answers they have provided will not be analysed and their answers and interview data will be deleted.
For the outreach component of the project, before we make the course materials available online, we will remove any names that can be used to identify individuals, including those of the students who take the course with us in person, via distance learning, or both. At the end of the outreach component of the project, we may take photos of the group of students who come to campus for a one-day event, in order to justify the expenditure of the project's funding. We will ask for their written permission if we are to publish those photos online.
Throughout the teaching sessions, code demonstrations will be recorded as screencasts (no student images will be recorded) for educational purposes. Students will be made aware of this recording at the very beginning of the outreach component of the project, and they can opt out if they do not feel comfortable with it. We will also encourage students to use pseudonyms in all classroom conversations. In all teaching sessions, lecturers and teaching fellows will pause screencasts whenever they think the recording would cause serious distress or perceivable risk. As an extra layer of caution, we will edit each and every screencast before we publish it online as educational material for the wider youth community.