Most-missed criteria for projects
A number of projects are close to a badge, but don’t (yet) quite have it. Perhaps the criteria are too hard. Perhaps the criteria are fine, and we’re simply asking people to make reasonable improvements (and thus making things better by pressuring for change). Or maybe some criteria are not adequately clear and need clarification. We can at least look at the dataset and see what’s going on.
So I did an analysis just now of all projects at 90% or more that have *not* achieved a badge, and extracted out *only* the criteria that *MUST* be met. I present it below.
I’d like feedback. My hope is that by doing this analysis we’ll get a better idea of what the issues are.
Anyway, the most-missed ones (those with “Unmet” or “?” answers), in order starting with the most unmet or unknown, are as follows: ["tests_are_added_status", 13, 5, 2, 0], ["sites_https_status", 14, 5, 1, 0], ["crypto_certificate_verification_status", 4, 0, 5, 11], ["vulnerability_report_process_status", 16, 2, 2, 0], ["test_policy_status", 16, 3, 1, 0], ["know_secure_design_status", 17, 0, 3, 0], ["know_common_errors_status", 17, 0, 3, 0], ["delivery_unsigned_status", 17, 1, 2, 0], ["static_analysis_status", 15, 2, 1, 2] …
The four numbers after each criterion are the counts of Met, Unmet, unknown (“?”), and N/A, in that order; for example, tests_are_added_status is Met by 13 of these projects, Unmet by 5, unknown for 2, and N/A for 0.
Clearly the biggest issues are adding tests for new functionality, HTTPS for sites, crypto certificate verification, and documenting the vulnerability report process. I don’t think those are unreasonable criteria, but clearly they’re the most common hurdles. There are many possible actions. We can keep the criteria as they are, make adjustments to the criteria, or perhaps find ways to help projects meet those criteria. So, let’s think about it – with the end goal of improving FLOSS projects (especially their security).
Below is the full list, and the code to create it.
--- David A. Wheeler
=== FULL LIST (sorted MUSTs, totals for projects >=90% and <100%) ===
[["tests_are_added_status", 13, 5, 2, 0], ["sites_https_status", 14, 5, 1, 0], ["crypto_certificate_verification_status", 4, 0, 5, 11], ["vulnerability_report_process_status", 16, 2, 2, 0], ["test_policy_status", 16, 3, 1, 0], ["know_secure_design_status", 17, 0, 3, 0], ["know_common_errors_status", 17, 0, 3, 0], ["delivery_unsigned_status", 17, 1, 2, 0], ["static_analysis_status", 15, 2, 1, 2], ["report_responses_status", 18, 1, 1, 0], ["vulnerability_report_private_status", 15, 0, 2, 3], ["vulnerability_report_response_status", 18, 1, 1, 0], ["crypto_working_status", 10, 2, 0, 8], ["crypto_password_storage_status", 3, 1, 1, 15], ["crypto_random_status", 7, 1, 1, 11], ["dynamic_analysis_fixed_status", 18, 2, 0, 0], ["documentation_interface_status", 19, 1, 0, 0], ["warnings_status", 16, 1, 0, 3], ["warnings_fixed_status", 17, 1, 0, 2], ["crypto_published_status", 11, 1, 0, 8], ["crypto_keylength_status", 7, 0, 1, 12], ["vulnerabilities_fixed_60_days_status", 19, 1, 0, 0], ["static_analysis_fixed_status", 16, 0, 1, 3], ["description_good_status", 20, 0, 0, 0], ["interact_status", 20, 0, 0, 0], ["contribution_status", 20, 0, 0, 0], ["floss_license_status", 20, 0, 0, 0], ["license_location_status", 20, 0, 0, 0], ["documentation_basics_status", 20, 0, 0, 0], ["discussion_status", 20, 0, 0, 0], ["repo_public_status", 20, 0, 0, 0], ["repo_track_status", 20, 0, 0, 0], ["repo_interim_status", 20, 0, 0, 0], ["version_unique_status", 20, 0, 0, 0], ["release_notes_status", 20, 0, 0, 0], ["release_notes_vulns_status", 20, 0, 0, 0], ["report_process_status", 20, 0, 0, 0], ["report_archive_status", 20, 0, 0, 0], ["build_status", 16, 0, 0, 4], ["test_status", 20, 0, 0, 0], ["crypto_floss_status", 12, 0, 0, 8], ["delivery_mitm_status", 20, 0, 0, 0], ["no_leaked_credentials_status", 20, 0, 0, 0]]
=== CODE ===
$ cat get-criteria-info
#!/bin/sh
rails console <<CONSOLE_END
results = []
# my_criteria = Project::ALL_CRITERIA_STATUS
my_criteria = Criteria.select { |c| c.must? }
puts my_criteria
my_criteria = my_criteria.map { |c| c.name.to_s + '_status' }
puts my_criteria
my_criteria.each do |criterion|
  data = Project.where('badge_percentage >= 90 AND badge_percentage < 100').select(criterion.to_s).group(criterion.to_s).unscope(:order).count
  results.append([criterion.to_s, data.fetch('Met', 0), data.fetch('Unmet', 0), data.fetch('?', 0), data.fetch('N/A', 0)])
end
results.sort! { |x, y| -(x[2] + x[3]) <=> -(y[2] + y[3]) }
puts 'criterion,Met,Unmet,?,N/A'
results.each do |row|
  puts row.join(',')
end
CONSOLE_END
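In case anyone wants to reproduce this, a hedged usage note: the sketch below assumes the text above is saved as "get-criteria-info" in the BadgeApp application root with a working Rails environment; the path and output filename are just examples, not something the app produces today.

$ cd /path/to/badgeapp                      # your BadgeApp checkout (path is an assumption)
$ sh get-criteria-info > criteria-stats.txt # raw rails console output; the CSV lines appear near the end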
Daniel Stenberg
On Sun, 24 Jul 2016, Wheeler, David A wrote:
> I'd like feedback. My hope is that by doing this analysis we'll get a better idea of what the issues are.

This is awesome data and I'd *love* it if we could have this information generated on a regular basis and have it shown on the site. As we're slowly increasing the number of projects, I figure it'll also get clearer exactly which criteria many of us struggle to meet.

--
/ daniel.haxx.se
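For what it's worth, here is a minimal sketch of one way to do the "regular basis" part using only what is already in this thread. Everything in it is an assumption for illustration (the install path, the output location, and publishing a raw text file rather than a proper page in the app); it is not an existing BadgeApp feature.

#!/bin/sh
# Sketch only: re-run the criteria statistics script from this thread and
# drop its output where the web server can serve it.
# Schedule this script via cron (e.g., weekly) to keep the published data current.
cd /srv/badgeapp || exit 1                      # assumed install location
sh get-criteria-info > public/criteria-stats.txt

A nicer long-term answer would presumably be a page or rake task inside the Rails app itself, so the site can render the data directly, but that is beyond this sketch.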
Kevin W. Wall
On Sun, Jul 24, 2016 at 11:55 PM, Wheeler, David A <dwheeler@ida.org> wrote:
> A number of projects are close to a badge, but don’t (yet) quite have it.

David,

Sorry I'm a bit late with this feedback. I hope it is the type of feedback that you are looking for.

I can't speak for others, but the ones that I'm "struggling" with the most (translate: "are the biggest time sink to date") are the [warnings] and corresponding [warnings_fixed] criteria under Quality.

Now, I could easily "cheat" and abide by the LETTER of the badging criteria... for [warnings] it says: "The project MUST enable one or more compiler warning flags ...". Technically, we could get by with just one -Xlint flag... but *I* feel that is missing the SPIRIT of the criteria. I prefer to take the high ground and use "javac -Xlint:all", but as you can imagine with any Java project that dates back to at least JDK 1.4, if you haven't been doing that, there's a lot of squawking and much of it is noise. My preference is to actually examine each notification and not just use @SuppressWarnings to shut them all up. (Not that I am against _judiciously_ applying that.) Last time I checked, I think -Xlint:all generated about 1300 warnings, and I do not have enough resources to address that many warnings in any realistic short period of time.

So, if I set the bar high, like I think I should, and use -Xlint:all rather than something simple that I'm 99% sure would pass, such as -Xlint:divzero, I reach a point where it is extremely difficult to meet the [warnings_fixed] criterion. Somehow, that just doesn't seem right. Rather than saying "MUST enable one or more compiler warning flags ...", I would prefer that your team research which flags are the critical ones that we should really care about. If I could get the # of warnings down to some reasonable # (say, less than 100), I could begin to methodically work on them, but with 1300 or so (or however many there were), I have just lost all motivation.

So, in one way, I feel as though I have let the "perfect become the enemy of the good", but on the other hand, I feel that if I settle for anything short of "-Xlint:all" (at least if *I* am choosing the -Xlint flag(s)), I am trying to game the system. So that's the show-stopper issue for me at the moment.

-kevin
--
Blog: http://off-the-wall-security.blogspot.com/ | Twitter: @KevinWWall
NSA: All your crypto bit are belong to us.
Wheeler, David A
Kevin W. Wall [mailto:kevin.w.wall@gmail.com] wrote:
> Sorry I'm a bit late with this feedback. I hope it is the type of feedback that you are looking for.

Well, we can discuss here what the criteria *should* say, but as written, those criteria specifically did NOT require "all warning flags". That was intentional, because of the problem you just noted - you just *can't* do that in a reasonable time on nontrivial projects if those flags aren't already enabled. Those criteria simply say "one or more", not all, by intent.

No need to be discouraged. Identify a few of the most important warning flags, enable those, fix the problems they report, and shout hooray - you met those criteria. Now, you're right that "-Xlint:all" would be even better. Hey, I won't stop you from dreaming big :-). But it's not required for this "passing" level. If you're starting a "green field" project, I think you *should* turn on every imaginable warning flag from the beginning. But with existing projects you *have* to start small and grow, or the number of warnings will kill you.

The test criteria may be instructive. We don't have a test coverage criterion - instead, you have to have automated tests, & then keep adding to them. There's no argument that good test coverage is better. Heck, the BadgeApp itself has tons of warnings enabled and 98% test coverage. The "best practices" badge doesn't say that your project is done - instead, it shows that your project is doing the right kinds of things to keep improving. We're trying to create a baseline - once these criteria are met (e.g., having a test framework and enabling some warning flags), it's a lot easier for a project to improve.

As always, comments welcome.

--- David A. Wheeler
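To make the "enable a few important flags" approach above concrete, here is a minimal, hedged illustration. The particular -Xlint categories and the use of -Werror are just example choices, not anything the criteria require, and the paths are hypothetical.

# Enable a handful of specific javac lint categories instead of -Xlint:all,
# and fail the build on those selected warnings so they stay fixed once addressed.
mkdir -p build
javac -Xlint:divzero,fallthrough,finally,overrides -Werror -d build src/*.java

Once those are clean, additional categories (and eventually -Xlint:all) can be turned on incrementally, which matches the "start small and grow" approach described above.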