[CII-badges] Ranking criteria
Sebastian Benthall
Hello! Thanks for inviting me to participate in this project. At Selection Pressure, we are looking at ways to incorporate project risk measurements into one of our products. The CII Census looks like a great start on this!

I'm wondering what your plans are moving forward, especially with regard to the Risk Index. I see from the Wheeler and Khakimov paper that a lot of research went into possible metrics, and that the initial Risk Index score is a reflection of that. What sort of process do you anticipate using for including new features in that calculation, and scoring them? Do you have a plan for assessing empirically to what extent the Risk Index correlates with software risk?

Thanks!

Sebastian Benthall
PhD Candidate / UC Berkeley School of Information
Data Scientist / Selection Pressure

On Thu, Jan 14, 2016 at 2:27 PM, Dan Kohn <dankohn@...> wrote:
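One concrete shape the empirical assessment asked about above could take is to rank-correlate the published Risk Index against a later, observable proxy for risk, such as per-project CVE counts. The sketch below assumes hypothetical CSV exports ("census_results.csv", "cve_counts.csv") and column names; none of these are part of the actual census tooling.

    # Rough sketch: does a higher Risk Index go with more observed vulnerabilities?
    # File names and column names are hypothetical placeholders.
    import csv
    from scipy.stats import spearmanr

    def load_column(path, column):
        """Read one numeric column from a CSV file keyed by project name."""
        with open(path, newline="") as f:
            return {row["project"]: float(row[column]) for row in csv.DictReader(f)}

    risk = load_column("census_results.csv", "risk_index")       # assumed export of census scores
    cves = load_column("cve_counts.csv", "cves_next_12_months")  # assumed after-the-fact risk proxy

    projects = sorted(set(risk) & set(cves))
    rho, p_value = spearmanr([risk[p] for p in projects],
                             [cves[p] for p in projects])
    print(f"Spearman rho={rho:.2f} (p={p_value:.3f}) over {len(projects)} projects")

A rank correlation is used here rather than a linear one because the Risk Index is an ordinal, heuristic score rather than a calibrated probability.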
Sebastian Benthall:
> Thanks for inviting me to participate in this project.

Thanks!

> I'm wondering what your plans are moving forward, especially with regard to the Risk Index. I see from the Wheeler and Khakimov paper that a lot of research went into possible metrics, and that the initial Risk Index score is a reflection of that.

We run this as an open source software project - if you have an idea for an improvement, please propose it via pull request, issue tracker, or mailing list.

A serious challenge for this project (and others like it) is a lack of 'ground truth'. If we knew ahead-of-time what the right answers were, we'd just use them :-). If we knew what the right answers were for a large data set, we could use that as a training set for statistical analysis and/or a learning algorithm.

Since we lack ground truth, we did what was documented in the paper. Here's a quick summary. We surveyed past efforts, selected a plausible set of metrics based on that, and heuristically developed a way to combine the metrics. We then had experts (hi!) look at the results (and WHY they were the results), look for anomalies, and adjust the algorithm until the results appeared reasonable. We also published everything as OSS, so others could propose improvements. We presume that humans will review the final results, and that helps too.

We're busy getting the CII badging program up-and-running (it's the same people), so we haven't spent as much time on the census recently. But this is definitely not an ignored project. You'll notice I already merged your pull request :-).

--- David A. Wheeler
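In practice, "heuristically developed a way to combine the metrics" tends to mean a hand-weighted sum over normalized metric values, with the weights revisited whenever expert review turns up anomalies. The sketch below only illustrates that pattern; the metric names and weights are assumptions, not the census's actual ones.

    # Illustrative only: a hand-tuned weighted sum over normalized (0..1) metrics.
    # Metric names and weights are assumptions, not the actual census values.
    EXAMPLE_WEIGHTS = {
        "exposure_to_network": 3.0,
        "recent_cve_count": 2.0,
        "low_contributor_count": 2.0,   # rough "bus factor" signal
        "written_in_c_or_cpp": 1.0,
    }

    def risk_score(metrics, weights=EXAMPLE_WEIGHTS):
        """Combine normalized metric values (each in 0..1) into one heuristic score."""
        return sum(weight * metrics.get(name, 0.0) for name, weight in weights.items())

    # A network-facing C project with few contributors scores higher than
    # a widely maintained library with no network exposure:
    print(risk_score({"exposure_to_network": 1.0, "written_in_c_or_cpp": 1.0,
                      "low_contributor_count": 1.0}))
    print(risk_score({"written_in_c_or_cpp": 1.0}))

Under this kind of scheme, the expert review described above amounts to inspecting the highest-scoring projects, asking why they scored that way, and adjusting weights or metrics until the ranking looks defensible.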
Sebastian Benthall
> We run this as an open source software project - if you have an idea for an improvement, please propose it via pull request, issue tracker, or mailing list.

Glad to!

> A serious challenge for this project (and others like it) is a lack of 'ground truth'. If we knew ahead-of-time what the right answers were, we'd just use them :-). If we knew what the right answers were for a large data set, we could use that as a training set for statistical analysis and/or a learning algorithm.

I see. That makes sense. One thing I'm trying to get a sense of (and I still need to read the paper very thoroughly to find out) is what exactly the "risk" you are measuring is risk of. That would make it easier to identify ground truth or proxies for it in existing data. For example, 'having a vulnerability to SQL injection' is a very different kind of risk from 'having a low bus factor'. Identifying when projects have died because of bus factor issues might be possible from observational data of open source communities.

> Since we lack ground truth, we did what was documented in the paper. Here's a quick summary. We surveyed past efforts, selected a plausible set of metrics based on that, and heuristically developed a way to combine the metrics. We then had experts (hi!) look at the results (and WHY they were the results), look for anomalies, and adjust the algorithm until the results appeared reasonable.

This is great. Is there a record of the anomalies and the adjustments? Is there any sort of formal procedure for further expert review? I would be interested in designing such a procedure if there isn't one.

> We also published everything as OSS, so others could propose improvements. We presume that humans will review the final results, and that helps too.

Thanks! and understood :)
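The kind of observational proxy mentioned above, spotting projects whose survival hinges on one or two people, can be approximated directly from version-control history. The sketch below uses an assumed, minimal definition of "bus factor" (the smallest set of authors accounting for at least half the commits) against any locally checked-out repository; it is not part of the census tooling.

    # Minimal sketch of a "bus factor" proxy from git history: the smallest
    # number of authors who together account for at least half of all commits.
    # Run from inside (or pointed at) a checked-out repository.
    import subprocess
    from collections import Counter

    def bus_factor(repo_path=".", threshold=0.5):
        emails = subprocess.run(
            ["git", "-C", repo_path, "log", "--format=%ae"],
            capture_output=True, text=True, check=True).stdout.splitlines()
        counts = Counter(emails)
        total = sum(counts.values())
        covered, authors = 0, 0
        for _, n in counts.most_common():
            covered += n
            authors += 1
            if covered / total >= threshold:
                break
        return authors

    print("bus factor estimate:", bus_factor())

An estimate of 1 or 2 on a widely depended-on project is the sort of signal described above, though commit counts alone miss reviewers, documenters, and other forms of support.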
Sebastian Benthall:
> One thing I'm trying to get a sense of (and I still need to read the paper very thoroughly to find out) is what exactly the "risk" you are measuring is risk of. That would make it easier to identify ground truth or proxies for it in existing data.

The title of the supporting paper gives that away: "Open Source Software Projects Needing Security Investments". The CII project was started, in part, as a response to the Heartbleed vulnerability of OpenSSL. We're trying to determine which projects are more likely to have serious vulnerabilities and where investment is needed.

> Is there a record of the anomalies and the adjustments?

A high-level discussion is in the paper. See the git log for a record of many of the actual adjustments (the commit text should give you at least a brief reason as to *why* they were adjusted). I don't think all of the adjustments we tried are recorded in the git log, since we weren't particularly trying to do that (sorry). But I think you'll find lots of useful information.

> Is there any sort of formal procedure for further expert review?

No, there's no formal procedure. You can propose one. That said, we're happy to take good ideas from anyone, even if they're not perceived as experts.

--- David A. Wheeler
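Pulling up the adjustment history pointed to above is a one-liner against a clone of the census repository. In the sketch below the script name is an assumption (check the repository for the file that actually computes the Risk Index); the point is simply to filter the log to commits that touched the scoring logic.

    # Sketch: list the commits that changed the scoring logic, with their messages.
    # SCORING_FILE is an assumed name - substitute the actual scoring script's path.
    import subprocess

    SCORING_FILE = "oss_package_analysis.py"   # assumption, not verified

    history = subprocess.run(
        ["git", "log", "--oneline", "--follow", "--", SCORING_FILE],
        capture_output=True, text=True, check=True).stdout
    print(history)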