Monday, June 18, 2018

Implementing Moodle's Privacy API in a Moodle Plugin - Part 4

Continuing the series on implementing Moodle's Privacy API in my questionnaire plugin, I will now add code to handle all of the questionnaire data. So far I have only worked with a smaller test set of the data; now I need a fully working implementation.

To start with, I need to add all of the potential response data to my privacy data provider. For questionnaire, this includes multiple question and response tables. For the export, I also need to design an appropriate output structure. This post will work on fulfilling these functions.

I have a couple of options for an output structure. I can create one JSON structure with all of the responses, questions and answers, like this:
{
    "name": "Test Questionnaire",
    "intro": "A wonderful description of the questionnaire.",
    "responses": [
        {
            "complete": "Yes",
            "lastsaved": "Friday, 18 November 2016, 8:14 pm",
            "questions" : [
                {
                    "questionname": "Q1. Car ownership",
                    "questiontext": "Do you own a car?",
                    "answers": [
                        "No"
                    ]
                },
                {
                    "questionname": "Q2. Characters",
                    "questiontext": "Enter no more than 10 characters.",
                    "answers": [
                        "123456"
                    ]
                },
                {
                    "questionname": "Q3. Numbers",
                    "questiontext": "Check all that apply",
                    "answers": [
                        "1,3,5,Another number: 7"
                    ]
                },
                {
                    "questionname": "Q4. Rate course",
                    "questiontext": "Rate these",
                    "answers": [
                        "Formatting your course: Very easy to use",
                        "Laying out your course: Easy to use"
                    ]
                }
            ]
        }
    ]
}

Or I could use subcontexts, and create a directory structure. Something like this:


In this case, each response by the user would have its own directory, with a separate subdirectory for each question and its specific response. This would be done using subcontexts. I feel the first option is really the best choice for what I need. The second one seems like overkill.

Building the directory structure is done in the forum module. I'm not completely sure how it works, but if you walk through the forum's provider file, you can see that it is built through arrays, and exported through nested calls to export_data.
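To picture how nested export_data calls could produce that kind of layout, here is a minimal standalone sketch. It is not the forum's code: recording_writer is a hypothetical stand-in that only records each subcontext path, and the 'Responses' and 'Response N' path names are my own assumptions. In Moodle, the real content writer decides the actual file layout.

```php
<?php
// Hypothetical stand-in for Moodle's content writer. It only records what would be
// exported under each subcontext path, so the sketch can run without Moodle.
class recording_writer {
    /** @var array Exported data keyed by the joined subcontext path. */
    public $exports = [];

    public function export_data(array $subcontext, \stdClass $data): self {
        $this->exports[implode('/', $subcontext)] = $data;
        return $this;
    }
}

// One subcontext per response, and one nested subcontext per question in that response.
function export_responses_by_subcontext(recording_writer $writer, array $responses): void {
    foreach (array_values($responses) as $number => $response) {
        $responsepath = ['Responses', 'Response ' . ($number + 1)];
        $writer->export_data($responsepath, (object)[
            'complete' => $response['complete'],
            'lastsaved' => $response['lastsaved'],
        ]);
        foreach ($response['questions'] as $question) {
            $questionpath = array_merge($responsepath, [$question['questionname']]);
            $writer->export_data($questionpath, (object)$question);
        }
    }
}

$writer = new recording_writer();
export_responses_by_subcontext($writer, [[
    'complete' => 'Yes',
    'lastsaved' => 'Friday, 18 November 2016, 8:14 pm',
    'questions' => [
        ['questionname' => 'Q1. Car ownership', 'questiontext' => 'Do you own a car?', 'answers' => ['No']],
    ],
]]);
print_r(array_keys($writer->exports));
```

Each recorded path corresponds to one directory in the export, which is why this approach multiplies quickly for a questionnaire with many questions.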

In any case, building a JSON structure like the one I planned above is not too difficult. I already have a function in questionnaire that returns a structured data set, which I use when sending responses via email or other notification methods. It isn't exactly what I need, but it is close enough. So I'll modify it and make sure it still works for the code that currently uses it. The modified code is here, and I create another quick function to use in the privacy provider. The rewrite of my export function that provides all appropriate response data now looks like this.
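Separately from the plugin code, here is a rough standalone illustration of the idea: grouping flat answer rows into the nested responses, questions and answers structure planned above. The function name build_export_structure and the row field names are hypothetical, chosen only for this sketch; they are not the questionnaire schema or the plugin's actual function.

```php
<?php
// Hypothetical helper: groups flat rows (one per answer) into the planned structure.
// Field names in $rows are assumptions for illustration only.
function build_export_structure(string $name, string $intro, array $rows): array {
    $responses = [];
    foreach ($rows as $row) {
        $rid = $row['responseid'];
        if (!isset($responses[$rid])) {
            $responses[$rid] = [
                'complete' => $row['complete'] ? 'Yes' : 'No',
                'lastsaved' => $row['lastsaved'],
                'questions' => [],
            ];
        }
        $qname = $row['questionname'];
        if (!isset($responses[$rid]['questions'][$qname])) {
            $responses[$rid]['questions'][$qname] = [
                'questionname' => $qname,
                'questiontext' => $row['questiontext'],
                'answers' => [],
            ];
        }
        $responses[$rid]['questions'][$qname]['answers'][] = $row['answer'];
    }
    // Drop the lookup keys so json_encode emits JSON arrays rather than objects.
    foreach ($responses as &$response) {
        $response['questions'] = array_values($response['questions']);
    }
    unset($response);

    return ['name' => $name, 'intro' => $intro, 'responses' => array_values($responses)];
}

$common = ['responseid' => 66, 'complete' => true, 'lastsaved' => 'Friday, 18 November 2016, 8:14 pm'];
$rows = [
    $common + ['questionname' => 'Q1. Car ownership', 'questiontext' => 'Do you own a car?', 'answer' => 'No'],
    $common + ['questionname' => 'Q4. Rate course', 'questiontext' => 'Rate these',
        'answer' => 'Formatting your course: Very easy to use'],
    $common + ['questionname' => 'Q4. Rate course', 'questiontext' => 'Rate these',
        'answer' => 'Laying out your course: Easy to use'],
];
echo json_encode(build_export_structure('Test Questionnaire', 'A description.', $rows), JSON_PRETTY_PRINT), "\n";
```

In a real provider, the resulting array would be cast to an object and handed to the writer's export_data function instead of being printed.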

The last things I require are the full deletion functions. I do have some library functions that delete response data, but they also log events. The core plugins all seem to perform direct database deletions rather than using their deletion library functions. So, I'll do the same, creating one function that does most of the deletion work so that the two privacy API functions don't have to duplicate that code. The end result is here.
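As a sketch of that shared-helper idea only (not the linked code), the example below uses a small in-memory stand-in for $DB so it can run without Moodle. The stand-in mimics just the get_records and delete_records calls. The helper name delete_questionnaire_responses is hypothetical, and 'response_id' as the field linking the answer tables to a response is an assumption.

```php
<?php
// In-memory stand-in for Moodle's $DB, only so that this sketch is runnable on its own.
class fake_db {
    /** @var array Rows (as arrays) keyed by table name. */
    public $tables = [];

    private static function matches(array $row, array $conditions): bool {
        foreach ($conditions as $field => $value) {
            if (!array_key_exists($field, $row) || $row[$field] != $value) {
                return false;
            }
        }
        return true;
    }

    public function get_records(string $table, array $conditions): array {
        $found = [];
        foreach ($this->tables[$table] ?? [] as $row) {
            if (self::matches($row, $conditions)) {
                $found[] = (object)$row;
            }
        }
        return $found;
    }

    public function delete_records(string $table, array $conditions): void {
        $kept = [];
        foreach ($this->tables[$table] ?? [] as $row) {
            if (!self::matches($row, $conditions)) {
                $kept[] = $row;
            }
        }
        $this->tables[$table] = $kept;
    }
}

// Answer tables from the list in Part 2. The 'response_id' link field is an assumption.
const ANSWER_TABLES = [
    'questionnaire_response_bool', 'questionnaire_response_date', 'questionnaire_response_other',
    'questionnaire_response_rank', 'questionnaire_response_text',
    'questionnaire_resp_multiple', 'questionnaire_resp_single',
];

// Hypothetical shared helper: with a user id it removes that user's data for the
// questionnaire, without one it removes every user's data for the questionnaire.
function delete_questionnaire_responses($db, int $qid, ?int $userid = null): void {
    $conditions = ['qid' => $qid];
    if ($userid !== null) {
        $conditions['userid'] = $userid;
    }
    foreach ($db->get_records('questionnaire_attempts', $conditions) as $attempt) {
        foreach (ANSWER_TABLES as $table) {
            $db->delete_records($table, ['response_id' => $attempt->rid]);
        }
        $db->delete_records('questionnaire_response', ['id' => $attempt->rid]);
    }
    $db->delete_records('questionnaire_attempts', $conditions);
}
```

With a helper shaped like this, delete_data_for_all_users_in_context would pass only the questionnaire id, while delete_data_for_user would pass the questionnaire id and the user id for each approved context.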

I have tested all of this code with the test scripts Moodle provided. It all seemed to work fine. I'll see if I can add some automated tests to the module's testing code as well, and do more testing before I release this.

If you have any questions about this work, please ask here or in the forums on Moodle.org.

Wednesday, June 6, 2018

Implementing Moodle's Privacy API in a Moodle Plugin - Part 3

Continuing the series on implementing Moodle's Privacy API in my questionnaire plugin, I will add code to handle deletion of user data.

The documentation indicates that there are two functions to implement. The delete_data_for_all_users_in_context handles deleting all users' data for a provided context when a defined retention period has expired. The retention period is part of the new privacy settings. The delete_data_for_user handles deleting user data for the provided contexts, when a user has requested to be forgotten.

Looking at the examples in the documentation and in the two modules I have been referring to, forum and choice, these functions determine the data records that need to be deleted and then delete them directly from the database. Doing it this way, instead of using a specific module's API, seems odd to me. I would have thought using the module API would be safer. Deleting directly also means that the data is removed without leaving any information about why and how it was deleted. Most APIs would log a deletion event in order to have accountability for the activity. Is it possible that logging this deletion would violate the GDPR's "forget me" policy? I will need to look into this.

For now, I'll follow the same strategy, and create the record deletion code in these functions.

Continuing with my simplified example, using only the attempts table, these functions are very straightforward. The delete_data_for_all_users_in_context function needs to delete all of the questionnaire_attempts records with the questionnaire id of the context passed into the function. So, the code looks like this:
public static function delete_data_for_all_users_in_context(\context $context) {
    global $DB;

    if (!($context instanceof \context_module)) {
        return;
    }

    if ($cm = get_coursemodule_from_id('questionnaire', $context->instanceid)) {
        $DB->delete_records('questionnaire_attempts', ['qid' => $cm->instance]);
    }
}
The delete_data_for_user function needs to delete all data for each provided context for the specified user. The parameter passed in is a new structure, \core_privacy\local\request\approved_contextlist, which contains the user and the context information we need, and provides methods to get both. Knowing that, the code becomes very similar to the previous function, except that it will delete all of the attempt records with the contexts' questionnaire ids and the specified user id. The code looks like this:
public static function delete_data_for_user(\core_privacy\local\request\approved_contextlist $contextlist) {
    global $DB;

    if (empty($contextlist->count())) {
        return;
    }

    $userid = $contextlist->get_user()->id;
    foreach ($contextlist->get_contexts() as $context) {
        if (!($context instanceof \context_module)) {
            continue;
        }
        if ($cm = get_coursemodule_from_id('questionnaire', $context->instanceid)) {
            $DB->delete_records('questionnaire_attempts', ['qid' => $cm->instance, 'userid' => $userid]);
        }
    }
}
To test these functions, I use the script provided in the Privacy API Utilities. Executing this script allows me to specify a username whose data will be removed. Before I execute this on my test site, I back up a copy of the database. My functions are not complete at the moment and will only delete the "attempts" records, leaving other data intact. If my functions work, I can restore the database afterward.

Executing the test script shows a lot of output. Searching through that output, I find:
Processing mod_questionnaire (42/515) (Monday, 4 June 2018, 8:44 pm)
which is good. And, when I check the questionnaire_attempts data table, I see that the records for that user have indeed been deleted. Looks like this part of the API is working.

Now that I have the basic version working, I'll go back and make sure I do the complete job.

Looking ahead, I may need to learn about subcontexts, which are used in the forum provider. You can see them referred to on the API documentation page. I believe they are the key concept in creating the directory-like structure of an export, as shown in the image below:



Stay tuned for Part 4, where I will determine if this is needed, and figure out how to do it.

Monday, June 4, 2018

Implementing Moodle's Privacy API in a Moodle Plugin - Part 2


In part 1, I began implementing Moodle's Privacy API in my questionnaire plugin, in order to meet the requirements of the GDPR. In this post, I will add the specific code to do this.

I have a skeleton file in place that includes all of the class and function specifications that I need. Next, I need to describe each data table that includes user data. The questionnaire has several tables that do this, namely:
  • questionnaire_attempts
  • questionnaire_response
  • questionnaire_response_bool
  • questionnaire_response_date
  • questionnaire_response_other
  • questionnaire_response_rank
  • questionnaire_response_text
  • questionnaire_resp_multiple
  • questionnaire_resp_single
It looks like I need to add each of these to the $collection variable. And, each table and relevant field will require a language string, as shown in the documentation example. To start with, I'll implement just the questionnaire_attempts table.

Adding this table to the get_metadata function means defining the relevant fields. In this case, this table stores the user id, the questionnaire id, the response id and the time stamp of when the latest submission for this attempt occurred. Each of these fields can be considered private data, although the questionnaire id points to the actual questionnaire, which really only provides context for a specific response. I'll err on the side of providing too much information rather than too little and include it. My function now looks like:
public static function get_metadata(collection $collection) : collection {

    // Add all of the relevant tables and fields to the collection.
    $collection->add_database_table('questionnaire_attempts', [
            'userid' => 'privacy:metadata:questionnaire_attempts:userid',
            'rid' => 'privacy:metadata:questionnaire_attempts:rid',
            'qid' => 'privacy:metadata:questionnaire_attempts:qid',
            'timemodified' => 'privacy:metadata:questionnaire_attempts:timemodified',
        ], 'privacy:metadata:questionnaire_attempts');

    return $collection;
}
And, I add each of the privacy strings to the language file as:
$string['privacy:metadata:questionnaire_attempts'] = 'Details about each submission of a questionnaire by a user.';
$string['privacy:metadata:questionnaire_attempts:userid'] = 'The ID of the user for this attempt.';
$string['privacy:metadata:questionnaire_attempts:rid'] = 'The ID of the user\'s response record for this attempt.';
$string['privacy:metadata:questionnaire_attempts:qid'] = 'The ID of the questionnaire record for this attempt.';
$string['privacy:metadata:questionnaire_attempts:timemodified'] = 'The timestamp for the latest submission of this attempt.';
Now that I have added the metadata, I should be able to see them at the "Plugin privacy registry" page of the site. Navigating to that page, and opening the section on questionnaire, I do indeed see the definitions I just added:



Next, I need to provide a way to retrieve and return the list of contexts for which my plugin stores user data. For my plugin, the only context is CONTEXT_MODULE. And I can determine the context module id for each questionnaire a user has responded to by the qid field in the questionnaire_attempts table and joining tables back through the course_modules table to the context table using SQL. My function looks like this:
public static function get_contexts_for_userid(int $userid): \core_privacy\local\request\contextlist {
    $contextlist = new \core_privacy\local\request\contextlist();

    $sql = "SELECT c.id
             FROM {context} c
       INNER JOIN {course_modules} cm ON cm.id = c.instanceid AND c.contextlevel = :contextlevel
       INNER JOIN {modules} m ON m.id = cm.module AND m.name = :modname
       INNER JOIN {questionnaire} q ON q.id = cm.instance
       INNER JOIN {questionnaire_attempts} qa ON qa.qid = q.id
            WHERE qa.userid = :attemptuserid
    ";

    $params = [
        'modname' => 'questionnaire',
        'contextlevel' => CONTEXT_MODULE,
        'attemptuserid' => $userid,
    ];

    $contextlist->add_from_sql($sql, $params);

    return $contextlist;
}
Next, I need to provide a way to export user data. The documentation doesn't provide an example, but I can find examples in the core code.

The documentation mentions a number of data types that must be exported, but questionnaire only needs to worry about the "data" part. The documentation section also describes using \core_privacy\local\request\content_writer, but the code examples in the documentation use \core_privacy\local\request\writer. Looking at the /privacy/classes/local/request/content_writer.php file, I can see that it is an interface, while /privacy/classes/local/request/writer.php is a class described as a "factory class used to fetch and work with the content_writer". So I think the "writer" class has been provided as a shortcut.

Looking at the exporter code for choice and forum, it appears that there is no specific format for the output of a module. The data is structured as JSON, but the elements seem to be up to the plugin. This makes sense, since any plugin can have very different data.

For example, a choice activity export looks like this:
{
    "name": "Choice One",
    "intro": "",
    "completion": {
        "state": 0    
    },
    "answer": [
        "Choice 2"    
    ],
    "timemodified": "Wednesday, 3 May 2017, 6:28 pm"
}
While a forum post looks like this:
{
    "subject": "My New Post",
    "created": "Friday, 1 June 2018, 3:15 pm",
    "modified": "Friday, 1 June 2018, 3:15 pm",
    "author_was_you": "Yes",
    "message": "<p>Hi. This is my new post. I hope you like it.</p>"
}
Before I implement an exporter, I will need to decide what the data should look like. I'll stick with my simple attempts data for now. Since any questionnaire instance can have multiple attempts by a user, it makes sense to create a structure organized by the instance; in this case the course module id. So my structure should look like this:
{
    "name": "Questionnaire name",
    "intro": "Complete this questionnaire",
    "completion": {
        "state": 0    
    },
    "attempts": [
        {
            "responseid": "rid1",
            "timemodified": "Wednesday, 3 May 2017, 6:28 pm"
        },
        {
            "responseid": "rid2",
            "timemodified": "Thursday, 4 May 2017, 9:31 am"
        }
    ]
}
Looking at the choice activity code for the exporter, I create a function to create the JSON structure I am aiming for. You can see the code here. This code uses several functions provided by the API that are not documented in the wiki. The documentation is really in the class files themselves.

The following line displays the time and date in a readable form:
'timemodified' => \core_privacy\local\request\transform::datetime($attempt->timemodified),
You can find the datetime function in the /privacy/classes/local/request/transform.php file.

The following line gets a structure containing general data for the activity and user that can be merged with the data more specific to the activity:
$contextdata = \core_privacy\local\request\helper::get_context_data($context, $user);
This function is contained in the file /privacy/classes/local/request/helper.php. Following through that code, it creates the part of the JSON structure I need, prior to the 'attempts' array.

The following lines merge the specific data I want to export with the general data, and then pass the merged data to the export function, which writes it out as JSON:
$contextdata = (object)array_merge((array)$contextdata, $attemptdata);
\core_privacy\local\request\writer::with_context($context)->export_data([], $contextdata);
The with_context function is contained in the file /privacy/classes/local/request/writer.php, and returns the writer on which the export_data function is called. That function is ultimately located in the /privacy/classes/local/request/moodle_content_writer.php file.
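To see what the merge step does on its own, here is a small standalone sketch in plain PHP. The $contextdata object below is a hand-built stand-in for what get_context_data returns (in Moodle the real object comes from the helper class), and merge_export_data is a hypothetical name used only for this illustration.

```php
<?php
// Hypothetical wrapper around the same cast-and-merge used in the provider code.
function merge_export_data(\stdClass $contextdata, array $specific): \stdClass {
    // Cast the object to an array, append the activity-specific keys, cast back.
    return (object)array_merge((array)$contextdata, $specific);
}

// Hand-built stand-in for the general context data.
$contextdata = (object)[
    'name' => 'Questionnaire name',
    'intro' => 'Complete this questionnaire',
    'completion' => (object)['state' => 0],
];

// The activity-specific part: one entry per attempt.
$attemptdata = [
    'attempts' => [
        ['responseid' => 'rid1', 'timemodified' => 'Wednesday, 3 May 2017, 6:28 pm'],
        ['responseid' => 'rid2', 'timemodified' => 'Thursday, 4 May 2017, 9:31 am'],
    ],
];

$merged = merge_export_data($contextdata, $attemptdata);
echo json_encode($merged, JSON_PRETTY_PRINT), "\n";
```

Because array_merge appends string keys in order, the general fields come first and the 'attempts' array last. If the specific data reused a key such as 'name', it would overwrite the general value, so the activity-specific keys need to be distinct.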

The end result of this is an exported structure in JSON form.

Now, to test this, Moodle has provided some scripts that can be created and executed from the CLI. The one I want to use is the "Test of exporting user data" script, provided on that page. So, I create that script on my test site, and execute it. When I execute it, there is a lot of output. Scanning through the output, I see:
"Processing mod_questionnaire (4/15) (Friday, 1 June 2018, 8:37 pm)"
which is positive.

And the last line says:
"== File export was uncompressed to /moodledevsite/moodledata/temp/privacy/3d5750c5-4d5b-4e96-9e86-663cbc9ed177".

This means that there is data located in my moodledata directory, that should contain the exported data. A visual structure of that area looks like this:


I have opened it to the questionnaire I am testing. The "data.json" file will contain the data I exported. When I open the JSON file, I see:

{
    "name": "Test Questionnaire",
    "intro": "<div><p>A wonderful description of the questionnaire.<\/p><\/div>",
    "completion": {
        "state": "1"    
    },
    "attempts": [
        {
            "responseid": "66",
            "timemodified": "Friday, 18 November 2016, 8:14 pm"        
        },
        {
            "responseid": "88",
            "timemodified": "Tuesday, 11 April 2017, 8:50 pm"        
        },
        {
            "responseid": "89",
            "timemodified": "Tuesday, 11 April 2017, 8:54 pm"        
        }
    ]
}
Which appears to match what I wanted.

That's some good progress. In Part 3, I'll add the delete data portion of the API.