Data Ecosystem Infrastructure – BSc graduate assignment
Once again I have the opportunity to have some graduate students help me with my PhD. One of them is Eddy, but he will introduce himself in this guest blog:
Data Ecosystem Infrastructure LA4LD
(originally posted on: https://eddyvandenaker.com/post/Data-Ecosystem-Infrastructure-LA4LD)
Hi, my name is Eddy van den Aker and I’m currently doing my graduate internship. My project is part of Marcel’s PhD research project (https://2bejammed.org/2017/01/03/the-basics-of-my-phd-research/) about learning analytics, learning dashboards, and learning design.
One of the problems faced by faculties, course designers, and teachers is the lack of insight into the student experience. Faculties are rated based on two factors: the time it takes for students to get their degree and the student experience. The first factor is obvious and easy to measure, but student experience is harder.
Currently, the faculty of ICT within Zuyd University of Applied Sciences has two ways of measuring student experience. The first is the Nationale Studenten Enquête (NSE), a national questionnaire filled in by students from all universities and universities of applied sciences. The second is a faculty-specific questionnaire at the end of every course.
The results of the NSE are not linked to a specific course, and the course questionnaires are filled in after the course has ended, so those results also arrive after the fact. Feedback to students on what is done with the results is also limited, which probably (based on anecdotal evidence) contributes to lower participation numbers. All in all, not enough data is available to improve student experience, and students are not seeing enough actionable feedback to be more engaged with the courses and the faculty.
To solve this problem, Marcel has suggested creating a data ecosystem in which students, teachers, and course designers participate to collect and make use of more, and more useful, data. Several projects have been completed or are currently under way to develop systems that collect data (for example the IoT projects https://2bejammed.org/2018/01/02/5-student-teams-working-on-classroom-iot/). Another project is looking at ways to present the gathered data in a collection of dashboards (LINK TO SANDER'S POST).
My project fits neatly between all projects mentioned before. I will be developing an open-source infrastructure that can catch, clean, structure and store all data gathered while also delivering the underlying services needed to present the data to the users through dashboards.
Because this system sits at the core of the data ecosystem and must be able to support many different kinds of systems, both current and in the future, it is vital to make the entire infrastructure modular. During my internship, a couple of modules will be developed.
The first module will be an endpoint for collecting information on student attendance. This could be an RFID reader on which students swipe their student card. Another module connects to the digital learning environment, in this case Moodle, and collects data on how students use the provided course material. A third module imports student results from a file. Finally, a last module will collect and store data from questionnaires.
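To make the idea of pluggable modules a bit more concrete, here is a minimal sketch of how collectors could share one interface. It is only an illustration under assumptions, not the project's actual code: the `Record` type, the `register` decorator, the CSV column names (`student_id`, `course`, `grade`), and the "student id plus timestamp" swipe format are all hypothetical.

```python
import csv
import io
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Record:
    """One cleaned data point, whichever module produced it (hypothetical type)."""
    student_id: str
    source: str  # name of the module that produced the record
    kind: str    # e.g. "attendance" or "result"
    value: str


# Registry mapping a module name to a function that turns raw text into records.
MODULES: Dict[str, Callable[[str], List[Record]]] = {}


def register(name: str):
    """Decorator that adds a collector function to the registry under `name`."""
    def wrapper(func):
        MODULES[name] = func
        return func
    return wrapper


@register("results_file")
def import_results(raw: str) -> List[Record]:
    """Parse CSV text with the assumed columns student_id, course, grade."""
    reader = csv.DictReader(io.StringIO(raw))
    return [
        Record(
            row["student_id"].strip(),
            "results_file",
            "result",
            f'{row["course"].strip()}:{row["grade"].strip()}',
        )
        for row in reader
    ]


@register("attendance")
def import_attendance(raw: str) -> List[Record]:
    """Parse one card swipe per line in the assumed form '<student_id> <timestamp>'."""
    records = []
    for line in raw.splitlines():
        if not line.strip():
            continue  # skip blank lines
        student_id, timestamp = line.split(maxsplit=1)
        records.append(
            Record(student_id, "attendance", "attendance", timestamp.strip())
        )
    return records


def collect(module: str, raw: str) -> List[Record]:
    """Dispatch raw input to the named module and return its cleaned records."""
    return MODULES[module](raw)
```

With a layout like this, a later module (say, classroom sensors) would only need to add another registered function; the code that stores records would not have to change.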
As said before, it has to be possible to develop more modules later down the line, for example to add environmental variables from the classroom or from a student’s study room at home. Another example would be tracking the gaze of students in the classroom: where they are looking on the slides and what draws their attention.
Any system that collects this amount of data, especially potentially sensitive private data, has to consider the privacy of its participants and thus the security of the system. A way has to be found to ensure that no one but the student is able to see that student’s own personalized data. Teachers and course designers will only see anonymised group data. The general security of the system also has to be considered.
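One possible way to picture the split between personal and group views is sketched below. It is a rough illustration under assumptions, not a design decision of the project: the row layout `(student_id, course, score)`, the function names, and the minimum group size of 5 are all hypothetical. Suppressing small groups is one common precaution, since an average over one or two students is not really anonymous.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Optional, Tuple

Row = Tuple[str, str, float]  # (student_id, course, score), an assumed layout

MIN_GROUP_SIZE = 5  # assumed threshold below which group data is hidden


def student_view(rows: Iterable[Row], student_id: str) -> List[Row]:
    """Return only the rows that belong to the requesting student."""
    return [row for row in rows if row[0] == student_id]


def group_view(
    rows: Iterable[Row], min_size: int = MIN_GROUP_SIZE
) -> Dict[str, Optional[float]]:
    """Average score per course with student ids removed.

    Courses with fewer than `min_size` distinct students map to None, so
    that no individual can be singled out from a tiny group.
    """
    scores = defaultdict(list)
    students = defaultdict(set)
    for student_id, course, score in rows:
        scores[course].append(score)
        students[course].add(student_id)
    return {
        course: round(mean(values), 2)
        if len(students[course]) >= min_size
        else None
        for course, values in scores.items()
    }
```

A teacher-facing dashboard would then only ever call `group_view`, while a student dashboard would call `student_view` with the id of the logged-in student.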
During the development of the system I will be using a couple of different methodologies.
Design based Research Process
This project will use the design-based research methodology. The first three phases (problem definition & motivation, objectives of solution, and design & development) will be completed during the project; the fourth phase (demonstration) will be started.
Systematic Mapping Review
At the start of the project, a systematic mapping review will be done to see in which fields data ecosystems have been suggested and maybe even deployed. It’s also interesting to know if any effect studies have been published in cases where data ecosystems have been deployed.
Scrum and GitHub
For managing the project I will be using a slightly modified version of Scrum. Slightly modified because I’m the only person in the development team. For tracking all Scrum-related information I will be using the issues, pull requests, projects, and wiki pages on the GitHub page for the project.
I wanted to figure out how to automate the entire Scrum workflow on GitHub. I have made some decent progress on it, good enough for this project, but I still have to move “to-do” items manually to “in progress” and after that to “in review”. If you read my post on converting exam questions to flashcards (https://eddyvandenaker.com/post/Converting-Exam-Questions-to-Flashcards/), you know I’m lazy (in a good way, I hope) and I will be looking to automate as much of the workflow as possible, so maybe I can find a solution to these two manual actions.
Test Driven Development
For the development of the system I’ll be using Test Driven Development (TDD). The basic idea of TDD is to make testing an integral part of the development cycle. You start by writing automated functional and/or integration tests, then smaller unit tests. At first these tests should fail (it would be weird if they didn’t). Only then do you write just enough code to get the tests to pass (or at least progress to the next step). Once you have some passing tests, you can refactor (improve) the code while using those tests to make sure the program does not regress. This process is often called Red, Green, Refactor.
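A tiny example of the cycle, using Python’s built-in `unittest` module, could look like the sketch below. The `attendance_rate` function is a made-up example and not taken from the project. In a real TDD session the test class would be written first and would fail (red), then the function would be added to make it pass (green), and only then cleaned up (refactor).

```python
import unittest


def attendance_rate(attended: int, scheduled: int) -> float:
    """Fraction of scheduled sessions a student attended.

    Hypothetical example function; in TDD it is written only after the
    tests below exist and fail.
    """
    if scheduled <= 0:
        raise ValueError("scheduled must be positive")
    return attended / scheduled


class AttendanceRateTest(unittest.TestCase):
    """Written first: these tests describe the behaviour we want."""

    def test_rate_is_fraction_of_sessions(self):
        self.assertEqual(attendance_rate(3, 4), 0.75)

    def test_zero_sessions_is_rejected(self):
        with self.assertRaises(ValueError):
            attendance_rate(0, 0)
```

Saved in a file whose name starts with `test_`, these tests can be run with `python -m unittest` from the same directory. After every refactoring step, running them again shows whether the behaviour still holds.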
My internship lasts half a year (20 school weeks). The first 3 weeks are spent on clearly defining the project, choosing the methodologies, and planning the phases of the project. Weeks 4 and 5 are used for requirements analysis and the systematic review. From week 6 until week 16 the system will be designed & developed in a couple of Scrum sprints. The last 4 weeks are used to prepare for the presentation at the end of the internship and to finish up the project in general.
The design & development phase consists of a number of sprints:
- Setup (software architecture & base functionality like logins, database connections, etc.) – 2 weeks
- Importing student results from file – 1 week
- Student attendance – 1 week
- Moodle/xAPI connection – 3 weeks
- MSLQ or other questionnaire connection – 1 week
- Admin panel – 2 weeks
- Wrapping up (extended testing, deployment considerations, etc.) – 2 weeks
In about 5 or 6 weeks I’ll be posting a status update on where I’m at with the project. Another 5 or 6 weeks after that I will present my results. Finally, when I’m (almost) done with my internship, I’ll write a post about my experiences.
The repository for this project can be found on https://github.com/eddyvdaker/Zuyd-LA4LD-Dataecosystem
Peffers, K.; Tuunanen, T. (February 2006). The Design Science Research Process. Retrieved from wrsc.org on 26 February 2018 via:
Kitchenham, B. (2007). Guidelines for Performing Systematic Literature Reviews in Software Engineering. Retrieved on 14 March 2018 via: