GDPR - we think we're ready, let me know of any issues

David A. Wheeler
 

Enforcement of the EU General Data Protection Regulation (GDPR) officially begins on 2018-05-25, which is just 11 days away.

As far as I know, we don't have any GDPR issues - but if you think we do, PLEASE let me know.

Below is a quick set of highlights of why we think we're okay from a GDPR viewpoint. This isn't a complete rationale for why we think we meet the GDPR, but hopefully it gives you a sense of the situation.

Now, a caveat. I'm a US citizen who works for a US company, and I am *not* a lawyer. European law is *way* outside my field of expertise. What's more, the GDPR is intentionally worded in a very high-level, aspirational way, making it a little hard for a non-lawyer to be sure we've addressed absolutely everything.

That said, I can say that we've honestly tried to meet, and in many places exceed, the GDPR requirements. We want the BadgeApp to respect user privacy, regardless of where the user lives. As always, please let us know if there's a problem.

Thank you!

--- David A. Wheeler

=====================================================================

There are many reasons we think we don't have any GDPR issues. From the very beginning, we have always considered user privacy very important. For example:
* We *never* give user data to anyone else unless we're legally required to do so. We don't sell (or display) ads. We don't sell tracking info or perform services for others who want users tracked.
* We only use personal data to perform badge-related functions, for example, to authenticate users, to determine if users are authorized to make changes, to log which user modified data, to communicate with users (e.g., via email) about badge-related issues (including reminder emails and password resets), to help users grant edit rights to others, to help users ensure that they are granting additional rights to the correct user, to display to others who "owns" the project entry, and to display to others which users are allowed to make modifications.
* We don't collect/store much. The main private data we store is user email addresses. Email addresses are *only* used for badging-related activities. We do send reminders to projects that don't have a passing badge, but those are focused emails to specific users who have already told us that they want to actively pursue a badge and yet have not made any edits for a long time. If a user keeps pursuing a badge (via edits), or the project gets a passing badge, that user will never see a reminder message. Reminder emails are NOT sent as part of a mailing list.
* Users can delete their accounts at any time if they want to (though we hope they won't want to). I think that meets the "right to erasure" aka "right to be forgotten".
* Unlike many web sites, we *intentionally* host files (like jQuery) directly ourselves, and our links to social networks (like Facebook) do *NOT* provide any tracking data unless the user actively clicks on a link to that social network.
* We have a really good security story. See: https://github.com/coreinfrastructure/best-practices-badge/blob/master/doc/security.md

The big issue we dealt with months ago was user "data portability" - a GDPR requirement that users be able to get data about themselves in some standard format. It's not clear how *useful* this is, because we don't store much information about users. That said, we don't need to apologize for "not storing much information about users". In any case, I think we've completely met that GDPR requirement - a while ago we added the ability for users to get information about themselves in JSON format.
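
As a rough illustration of what such an export can look like, here is a minimal Python sketch. It is not the BadgeApp's actual code: the `UserRecord` type, the `export_user_data` helper, and the field names (`id`, `name`, `email`, `project_ids`) are assumptions made up for this example. It only shows the idea of handing a user the small amount of data held about them in a standard, machine-readable format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class UserRecord:
    """Hypothetical shape of the data stored about one user."""
    id: int
    name: str
    email: str
    project_ids: List[int] = field(default_factory=list)


def export_user_data(user: UserRecord) -> str:
    """Return everything stored about `user` as a JSON document."""
    return json.dumps(asdict(user), indent=2, sort_keys=True)


if __name__ == "__main__":
    example = UserRecord(id=42, name="Example User",
                         email="user@example.org", project_ids=[1, 7])
    print(export_user_data(example))
```

Because the record is so small, the export is little more than a serialization of one row; the sketch includes only the user's own data and nothing about other users.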

The system does store activity logs for all requests to the website. These logs are necessary to detect and fix erroneous behavior, as well as to detect and counter malicious behavior. For logging to meet these requirements, it is necessary and important to record a variety of information, including the specific request, a summary of what action was performed on the request, the IP address of the requester, and also the user id of a logged-in user where relevant. Therefore, our logs (like most logs) record this data (IP addresses and user id numbers). We believe that being able to fix erroneous behaviors of the website, and counter malicious behaviors directed against this website, is a legitimate interest. We do not use the logs for profiling users for marketing or anything like that; we use the logs to help ensure that the site continues to work in spite of errors or network attack. We do not provide log data to external users, as that could breach others' privacy. We believe this is fine under the GDPR; the GDPR requires "data portability" where consent is granted or the data is provided in performance of a contract, but log data is recorded to support a legitimate interest (and thus is not subject to data portability requirements).

Georg Link
 

Thanks David,

It might be helpful to additionally document how long activity logs are kept and when they are either anonymized or deleted, because the goal "to detect and fix erroneous behavior, as well as to detect and counter malicious behavior" might not require keeping the data for eternity.

Best,
Georg



David A. Wheeler
 

Georg Link:
> It might be helpful to additionally document how long activity logs are kept and when they are either anonymized or deleted, because the goal "to detect and fix erroneous behavior, as well as to detect and counter malicious behavior" might not require keeping the data for eternity.

Fair enough.

 

The activity log records requests to the system and related activity.  Logs are rotated daily, and log data is archived for 1 year.  After that, it’s gone.

 

Some bugs are intermittent, and some attackers use “low and slow” kinds of attacks.  Thus, we need to log things for a period of time to deal with those cases.  A year seems like a reasonable period of time.
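
The retention rule described above (daily rotation, one year of archives, then deletion) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the way the site actually manages its logs: the `RETENTION` constant, the `expired_logs` helper, the log names, and the idea that each archived log is identified by the day it covers are all made up for this example.

```python
from datetime import date, timedelta
from typing import Dict, List

# Assumption for this sketch: "1 year" is approximated as 365 days.
RETENTION = timedelta(days=365)


def expired_logs(archived: Dict[str, date], today: date) -> List[str]:
    """Return the names of archived daily logs that are past retention.

    `archived` maps a (hypothetical) log name to the day that log covers.
    A log is expired once the day it covers is more than RETENTION ago.
    """
    cutoff = today - RETENTION
    return sorted(name for name, day in archived.items() if day < cutoff)


if __name__ == "__main__":
    archive = {
        "log-2017-05-01": date(2017, 5, 1),
        "log-2017-12-31": date(2017, 12, 31),
        "log-2018-05-13": date(2018, 5, 13),
    }
    for name in expired_logs(archive, today=date(2018, 5, 14)):
        print("delete", name)
```

Running a check like this once a day, right after rotation, would keep the archive bounded at roughly one year of daily logs.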

 

Does that help?

--- David A. Wheeler

Georg Link
 

Sounds reasonable, thanks David.



David A. Wheeler
 

I just realized that I should also add a weird special case: temporarily retained backups of logs or databases, which can make our theoretical maximum retention time 18 months (1.5 years).  Here’s the issue.  We don’t normally do this, but it’s *possible* to make backup copies of logs, and we occasionally make copies of databases.  In all cases, the purpose is to detect defects and/or attacks – we don’t analyze individual user behavior (unless you consider “attacking our site” a valid user behavior).  We don’t retain these copies for more than 6 months beyond the data’s normal expiration (and that’d be an unusual case).  Of course, errors can happen, but that limit is what we actively try to enforce.  So in an *outside* case (1 year of normal retention plus 6 months in a backup), deleted private data can stick around internally for 18 months.  It’s not likely, but it’s *possible*.

 

Overall, I think we have a good story regarding privacy.  We do not share personal data with any third parties.  We do not have advertisements of any kind.  We do not process payments of any kind.  We do not use external tracking tools like Google Analytics.  We self-host our JavaScript, fonts, and images, so users do not trigger downloads from external third-party sites when they request our web pages.  We also set our cookies to “SameSite lax” which further mitigates the risk of cross-origin information leakage to third parties.  We do not allow users to set up loading of external images in the markup text that they provide (images are a common way to insert trackers). As a result, we believe there is no opportunity for a third party to track users (such as by using “third party cookies”), because we don’t load them.  We do use a cloud service (Heroku/Amazon) and content delivery network (CDN) (Fastly) to implement the site, but they simply provide the computation and network delivery service.
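
For readers unfamiliar with the “SameSite lax” setting mentioned above, here is a minimal Python sketch of what such a cookie looks like on the wire. It is an illustration only: the cookie name `session` and the `session_cookie_header` helper are hypothetical, and a real site would normally set this attribute through its web framework's configuration rather than by hand.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id: str) -> str:
    """Build the value of a Set-Cookie header that uses SameSite=Lax.

    The "samesite" attribute is supported by http.cookies in Python 3.8+.
    """
    cookie = SimpleCookie()
    cookie["session"] = session_id         # hypothetical cookie name
    cookie["session"]["path"] = "/"
    cookie["session"]["samesite"] = "Lax"
    return cookie["session"].OutputString()


if __name__ == "__main__":
    print("Set-Cookie:", session_cookie_header("abc123"))
```

With `SameSite=Lax`, browsers that honor the attribute generally leave the cookie out of cross-site subrequests (such as images or frames embedded on another site) while still sending it when the user navigates to the site directly.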

 

The BadgeApp front page does have hypertext links to well-known social media sites (including Twitter, Reddit, and Facebook).  However, these links are carefully designed so that viewing the BadgeApp front page does not notify the external sites that the user is viewing the BadgeApp front page, and the BadgeApp never shares personal data with those other sites.  Users must expressly click on those links to go to those other sites, and even in those cases we simply transfer generic information about the badging site; we do not provide any personal information about the user to those external sites.
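
The design described above amounts to using plain hyperlinks rather than embedded widgets or scripts served by the social media sites. A minimal Python sketch of that idea follows. Everything named in it is a placeholder invented for illustration (the `share_link` helper, the `SITE_URL` and `SITE_BLURB` values, and the `https://social.example/share` endpoint with its `url` and `text` parameters); it is not the BadgeApp's code and does not describe any real site's sharing endpoint.

```python
from html import escape
from urllib.parse import urlencode

# Placeholder values: generic information about the site, nothing per-user.
SITE_URL = "https://badge-site.example/"
SITE_BLURB = "Best practices badge"


def share_link(endpoint: str, label: str) -> str:
    """Return a plain HTML link to a (hypothetical) sharing endpoint.

    The link is static markup: the browser contacts the external site
    only if the user clicks it, and the query string carries only
    generic site information.
    """
    query = urlencode({"url": SITE_URL, "text": SITE_BLURB})
    href = f"{endpoint}?{query}"
    return f'<a href="{escape(href)}">{escape(label)}</a>'


if __name__ == "__main__":
    print(share_link("https://social.example/share", "Share"))
```

Because the result is static markup with no external script, image, or frame, merely viewing a page that contains it sends nothing to the external site.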

 

I think we meet the other requirements too.  We don’t store a lot of private information about users, and it isn’t THAT sensitive - their email address is the most sensitive we get (which is not in the “most sensitive” category).  Users can see what we store, and can delete that information, whenever they want to.

 

Again, I’m not a lawyer, but I *think* we’re okay.  Of course, if someone sees a problem, PLEASE let us know.  We *want* to give everyone privacy.

 

--- David A. Wheeler