in SoSci Survey (English) by s108637 (205 points)

I am running a study which is both a research experiment and a class assignment. Students who choose to participate in the experiment are entered into a lottery for compensation. Those who don't will get a summary of the class results, but their data will then be deleted and they are not entered in the lottery.

The study is a multi-wave survey, though there are just two waves. If students do the assignment but decline to participate in the research I need to find the stored survey data that matches their opt-in data so that I can delete it. I do this by looking at the personal id stored with the Opt-in variable in the first wave survey data. If the student declined to participate in the research, I find the record in the second-wave survey data that has the same personal ID in the serial number variable and delete it from my downloaded records. I will also delete it from the server tomorrow, when the study closes for good.
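The matching step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names `OPTIN_ID` and `CONSENT`, and the value `"2"` for "declined", are hypothetical placeholders for whatever the downloaded data files actually contain, and the rows are made-up sample data.

```python
def wave2_records_to_delete(wave1_rows, wave2_rows,
                            id_col="OPTIN_ID",      # hypothetical: personal ID stored with the opt-in variable
                            consent_col="CONSENT",  # hypothetical: answer to the consent question
                            serial_col="SERIAL",
                            declined_value="2"):    # hypothetical code for "declined"
    """Return the second-wave rows whose serial matches a first-wave decliner."""
    declined_ids = {
        row[id_col] for row in wave1_rows if row[consent_col] == declined_value
    }
    return [row for row in wave2_rows if row[serial_col] in declined_ids]


# Made-up sample rows standing in for the two downloaded data sets.
wave1 = [
    {"CASE": "101", "OPTIN_ID": "AAA111", "CONSENT": "1"},
    {"CASE": "102", "OPTIN_ID": "BBB222", "CONSENT": "2"},
]
wave2 = [
    {"CASE": "201", "SERIAL": "AAA111"},
    {"CASE": "202", "SERIAL": "BBB222"},
]

to_delete = wave2_records_to_delete(wave1, wave2)
print([row["CASE"] for row in to_delete])  # second-wave cases to remove
```

With real exports, the rows could be read with `csv.DictReader` instead of being typed in by hand.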

Unfortunately, I have found four records where the personal ID stored in the first-wave data is the same as the personal ID stored in another record (i.e. four duplicates). In two cases the second opt-in survey was done immediately after the first, but in two cases it was not. Fortunately, I send each participant a final email which differs depending on whether they participated in the experiment or not, so I have been able to find three of the cases and delete them, but I cannot find the fourth. Not every participant has forwarded me the final email they received.
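For reference, duplicates of this kind can be listed automatically rather than by eye. The following is a minimal sketch, assuming the first-wave data has been loaded as a list of dictionaries; `OPTIN_ID` is a hypothetical column name for the personal ID stored with the opt-in variable, and the rows are invented examples.

```python
from collections import defaultdict


def duplicated_ids(wave1_rows, id_col="OPTIN_ID", case_col="CASE"):
    """Map each personal ID that occurs more than once to its case numbers."""
    cases_by_id = defaultdict(list)
    for row in wave1_rows:
        cases_by_id[row[id_col]].append(row[case_col])
    return {pid: cases for pid, cases in cases_by_id.items() if len(cases) > 1}


# Invented first-wave rows: cases 101 and 103 share one personal ID.
rows = [
    {"CASE": "101", "OPTIN_ID": "AAA111"},
    {"CASE": "102", "OPTIN_ID": "BBB222"},
    {"CASE": "103", "OPTIN_ID": "AAA111"},
]

dupes = duplicated_ids(rows)
for personal_id, cases in dupes.items():
    print(personal_id, "appears in cases", ", ".join(cases))
```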

I need to determine which data record belongs to the one student who declined to participate in the experiment and whose records I have not been able to identify. If I can find the participant ID based on the case number of the opt-in survey, I will be able to do this. Is this possible? If not, is there some other way I can connect the opt-in survey data with the follow-on data?

by SoSci Survey (306k points)
> If students do the assignment but decline to participate in the research I need to find the stored survey data that matches their opt-in data so that I can delete it. I do this by looking at the personal id stored with the Opt-in variable in the first wave survey data.

At what point do students indicate whether they participate in the experiment? Do they select that in the questionnaire?
by s108637 (205 points)
Yes they select this in the initial opt-in questionnaire. A specific consent question is asked and the answer is stored in the database associated with the personal ID (using dbset). I refer to the personal ID as the study ID, and it also matches the contents of the serial field. In the second wave, this consent value is looked up based on the serial variable to find out what the user's preferences were and to alter the way the survey runs (i.e., no mention of compensation and a different form of acknowledgment). This process seems to be working. People who have sent me the non-participant form of acknowledgment (which includes the study ID) also have first-wave records showing that they declined to participate. It may even have worked for the four records in question despite the fact that they have the wrong value stored with the opt-in variable in the collected data. As mentioned, there were three non-participant vouchers with study IDs that did not surface in the collected data. I can't say for certain, but I assume that these were among the four with the wrong personal ID in the downloaded data. This suggests that the underlying database is correct but that the downloaded data is somehow wrong.

1 Answer

by SoSci Survey (306k points)

> Yes they select this in the initial opt-in questionnaire.

You may consider making a copy of the opt-in question and displaying one of the two copies, depending on the choice. Then you have the option to assign different subgroups to the address entries, which makes it easier to identify the cases to be deleted.

> In the second wave, this consent value is looked up based on the serial variable

If you had the information in the address entry, you might be able to use panelData() to simplify data transfer.

> It may even have worked for the four records in question despite the fact that they have the wrong value stored with the opt-in variable in the collected data.

There is a known issue with the opt-in question: If a person enters an email address A, then repeats the page (due to a missing answer or a back button), then enters another email address B, but then confirms the first (email A) opt-in link, then you have one code A in the address entry, but another code B in the opt-in variable.
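The sequence of events can be made concrete with a toy model. This is not SoSci Survey's actual implementation; the class, its attributes, and the code format are all invented for illustration, and the only thing the sketch assumes is what is described above: each email submission produces its own code, the opt-in variable keeps the code from the latest submission, and the address entry gets the code belonging to the link that is actually confirmed.

```python
from itertools import count


class OptInCase:
    """Toy model of one opt-in questionnaire case (hypothetical, not SoSci Survey code)."""

    _code_numbers = count(1)  # invented code generator, for illustration only

    def __init__(self):
        self.optin_variable = None  # code stored in the case's data record
        self.pending = {}           # email -> code waiting for confirmation

    def submit_email(self, email):
        code = f"CODE{next(self._code_numbers):04d}"
        self.pending[email] = code
        self.optin_variable = code  # a repeated page overwrites the earlier code
        return code

    def confirm(self, email):
        # The confirmed link decides which code ends up in the address entry.
        return self.pending[email]


case = OptInCase()
case.submit_email("a@example.org")   # first attempt: email A
case.submit_email("b@example.org")   # page repeated: email B
address_entry_code = case.confirm("a@example.org")  # but link A is confirmed

print("address entry:", address_entry_code)
print("opt-in variable:", case.optin_variable)
print("codes match:", address_entry_code == case.optin_variable)
```

In this model the two codes differ, which is the mismatch described above.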

by s108637 (205 points)
This is helpful information, especially if I do this again, but unfortunately it does not solve my problem. I am also not sure that I have run into this bug, as all of the opt-in surveys I am discussing have been marked as finished. I would assume that if someone did run into the bug you describe, at least one of the two opt-in surveys would not be completed.

More importantly, I have to find the records to delete in the existing data. Recoding the survey can't help me, as it has now ended. So I still need an answer to the question as to how I can find the correct underlying serial number from a case so that I can identify the second-wave data to be deleted. If that can't be done, the next question is whether you can identify the second-wave data connected with the first wave so that you can delete the record for me.
by SoSci Survey (306k points)
> at least one of the two opt-in surveys would not be completed.

If you continue normally after changing the email address, this will still result in a FINISHED=1 case.

> So I still need an answer to the question as to how I can find the correct underlying serial number from a case

Technically, the case belongs to a different address entry than the one that was eventually registered in the opt-in questionnaire. Therefore, it may be impossible to find out which case belongs to the undefined entries, especially if the opt-in was some time ago.

I can go into the database and log files and see what I can do, but I cannot promise that it will be successful. For that, I would need an email to info@soscisurvey.de with all the data you have about the undefined records (SERIAL, time stamp, when a mailing was probably sent, etc.).
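To collect that information, one option is to list the second-wave records whose SERIAL does not appear as a personal ID anywhere in the first-wave data. The sketch below only illustrates the idea: `OPTIN_ID` is a hypothetical column name for the opt-in variable, `STARTED` is assumed to be the time stamp column of the export, and the rows are invented.

```python
def unmatched_wave2(wave1_rows, wave2_rows,
                    id_col="OPTIN_ID",   # hypothetical name of the opt-in variable
                    serial_col="SERIAL"):
    """Return second-wave rows whose serial has no counterpart in wave 1."""
    known_ids = {row[id_col] for row in wave1_rows}
    return [row for row in wave2_rows if row[serial_col] not in known_ids]


# Invented sample rows: serial CCC333 never shows up in the first-wave data.
wave1 = [
    {"CASE": "101", "OPTIN_ID": "AAA111"},
    {"CASE": "102", "OPTIN_ID": "AAA111"},
]
wave2 = [
    {"CASE": "201", "SERIAL": "AAA111", "STARTED": "2024-01-10 09:15:00"},
    {"CASE": "202", "SERIAL": "CCC333", "STARTED": "2024-01-10 10:40:00"},
]

undefined = unmatched_wave2(wave1, wave2)
for row in undefined:
    # Details worth including in the support email.
    print(row["CASE"], row["SERIAL"], row["STARTED"])
```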

The good thing is: if we cannot restore the connection between address entry and data, you at least do not have issues with the GDPR, because the data is then reliably anonymous. Of course, ethical restrictions are something else. Regarding those, you might be forced to delete all 4 cases for which you do not have reliable consent.
