[CII-badges] Ranking criteria


Sebastian Benthall
 

Hello!

Thanks for inviting me to participate in this project.

At Selection Pressure, we are looking at ways to incorporate project risk measurements into one of our products.

The CII Census looks like a great start on this!

I'm wondering what your plans are moving forward, especially with regard to the Risk Index. I see from the Wheeler and Khakimov paper that a lot of research went into possible metrics, and that the initial Risk Index score is a reflection of that.

What sort of process do you anticipate using for incorporating new features into that calculation, and for scoring them?

Do you have a plan for assessing empirically to what extent that Risk Index correlates with software risk?

Thanks!

Sebastian Benthall
PhD Candidate / UC Berkeley School of Information
Data Scientist / Selection Pressure

On Thu, Jan 14, 2016 at 2:27 PM, Dan Kohn <dankohn@...> wrote:
The mailing list is at http://lists.coreinfrastructure.org/mailman/listinfo/cii-census, but specific suggestions for improving the project are probably best made through the issue tracker.

We encourage you to fork the project and suggest improvements with a pull request.

--
Dan Kohn <dankohn@...>
Senior Advisor, Core Infrastructure Initiative
+1-415-233-1000

On Thu, Jan 14, 2016 at 5:18 PM, Sebastian Benthall <sbenthall@...> wrote:
I do not see a mailing list listed on the cii-census GitHub page.
Is there one?
Or should general discussion about that project take place on the issue tracker?

Thanks,
Seb

On Thu, Jan 14, 2016 at 7:43 AM, Sebastian Benthall <sbenthall@...> wrote:

Will do. Thanks for referring me to that!

On Jan 14, 2016 6:10 AM, "Wheeler, David A" <dwheeler@...> wrote:
On Wed, Jan 13, 2016 at 9:32 PM, Sebastian Benthall <sbenthall@...> wrote:
> I'm a grad student studying quantitative metrics on open source software projects, an OSS developer and former project manager, and a contracting data scientist at Selection Pressure.

Dan Kohn:
> Sebastian, I think you'll be interested in our sister project, the CII Census. https://github.com/linuxfoundation/cii-census

I agree, please take a look at the census project.  The census project itself computes quantitative measures, and on that site you'll also find a paper that points to other related work (you'll find that useful if you're trying to do scholarly work on the topic).

--- David A. Wheeler






David A. Wheeler
 

Sebastian Benthall:
Thanks for inviting me to participate in this project.
At Selection Pressure, we are looking at ways to incorporate project risk measurements into one of our products.
The CII Census looks like a great start on this!
Thanks!

I'm wondering what your plans are moving forward, especially with regard to the Risk Index. I see from the Wheeler and Khakimov paper that a lot of research went into possible metrics, and that the initial Risk Index score is a reflection of that.
What sort of process do you anticipate using for incorporating new features into that calculation, and for scoring them?
Do you have a plan for assessing empirically to what extent that Risk Index correlates with software risk?
We run this as an open source software project - if you have an idea for an improvement, please propose it via pull request, issue tracker, or mailing list.

A serious challenge for this project (and others like it) is a lack of 'ground truth'. If we knew ahead-of-time what the right answers were, we'd just use them :-). If we knew what the right answers were for a large data set, we could use that as a training set for statistical analysis and/or a learning algorithm.
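
To make that concrete: if such a labeled data set did exist, it could be used along these lines. This is only a toy sketch with invented metric names, values, and labels, not anything that has actually been built:

    # Toy sketch only: metric names, values, and labels here are invented.
    # If a labeled data set existed, a simple model could be fit to per-project
    # metrics and its coefficients inspected to see which signals carry weight.
    from sklearn.linear_model import LogisticRegression

    # Each row: [contributor_count, has_website, released_recently, cve_count]
    metrics = [
        [1, 0, 0, 3],   # tiny team, stale, several CVEs
        [40, 1, 1, 0],  # large, active project
        [2, 0, 1, 1],
        [15, 1, 0, 0],
    ]
    labels = [1, 0, 1, 0]  # 1 = "turned out to be risky" -- the missing ground truth

    model = LogisticRegression().fit(metrics, labels)
    print(model.coef_)  # which metrics the (tiny, fake) data says matter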

Since we lack ground truth, we did what was documented in the paper. Here's a quick summary. We surveyed past efforts, selected a plausible set of metrics based on that, and heuristically developed a way to combine the metrics. We then had experts (hi!) look at the results (and WHY they were the results), look for anomalies, and adjust the algorithm until the results appeared reasonable. We also published everything as OSS, so others could propose improvements. We presume that humans will review the final results, and that helps too.
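
The general shape of such a heuristic combination looks something like the sketch below. The point values and metric names are made up for illustration; this is not the actual Risk Index formula (see the paper and the code for that):

    # Illustrative only: a made-up point-based combination in the spirit of the
    # approach described above, NOT the actual census Risk Index formula.
    def toy_risk_score(project):
        score = 0
        if project.get("contributor_count", 0) < 3:
            score += 2                                # few contributors
        if not project.get("has_website", False):
            score += 1                                # no project website
        score += min(project.get("cve_count", 0), 3)  # cap the CVE contribution
        if project.get("exposed_to_network", False):
            score += 2                                # network-facing code
        return score

    print(toy_risk_score({"contributor_count": 1, "cve_count": 4,
                          "exposed_to_network": True}))  # prints 8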

We're busy getting the CII badging program up-and-running (it's the same people), so we haven't spent as much time on the census recently. But this is definitely not an ignored project. You'll notice I already merged your pull request :-).

--- David A. Wheeler


Sebastian Benthall
 

We run this as an open source software project - if you have an idea for an improvement, please propose it via pull request, issue tracker, or mailing list.

Glad to!
 
A serious challenge for this project (and others like it) is a lack of 'ground truth'.  If we knew ahead-of-time what the right answers were, we'd just use them :-).  If we knew what the right answers were for a large data set, we could use that as a training set for statistical analysis and/or a learning algorithm.

I see. That makes sense.

One thing I'm trying to get a sense of (and I still need to read the paper very thoroughly to find out) is what exactly the "risk" you are measuring is a risk of. That would make it easier to identify ground truth or proxies for it in existing data.

For example, 'having a vulnerability to SQL injection' is a very different kind of risk from 'having a low bus factor'.

Identifying when projects have died because of bus factor issues might be possible from observational data of open source communities.
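
For instance, a crude bus-factor estimate can be read straight off a repository's commit history. A rough sketch (not claiming this is how the census does or should compute it):

    # Rough sketch: estimate a project's "bus factor" as the smallest number of
    # authors accounting for at least half of the commits.
    import subprocess
    from collections import Counter

    def bus_factor(repo_path, threshold=0.5):
        authors_log = subprocess.run(
            ["git", "-C", repo_path, "log", "--format=%ae"],
            capture_output=True, text=True, check=True,
        ).stdout.splitlines()
        counts = Counter(authors_log)
        total = sum(counts.values())
        covered, authors = 0, 0
        for _, n in counts.most_common():
            covered += n
            authors += 1
            if covered >= threshold * total:
                break
        return authors

    print(bus_factor("."))  # run inside any git checkout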
 
Since we lack ground truth, we did what was documented in the paper.  Here's a quick summary.  We surveyed past efforts, selected a plausible set of metrics based on that, and heuristically developed a way to combine the metrics.  We then had experts (hi!) look at the results (and WHY they were the results), look for anomalies, and adjust the algorithm until the results appeared reasonable. 

This is great.

Is there a record of the anomalies and the adjustments?

Is there any sort of formal procedure for further expert review?

I would be interested in designing such a procedure if there isn't one.
 
We also published everything as OSS, so others could propose improvements.  We presume that humans will review the final results, and that helps too.

We're busy getting the CII badging program up-and-running (it's the same people), so we haven't spent as much time on the census recently.  But this is definitely not an ignored project.  You'll notice I already merged your pull request :-).

Thanks! and understood :) 


David A. Wheeler
 

Sebastian Benthall:
One thing I'm trying to get a sense of (and I still need to read the paper very thoroughly to find out) is what exactly the "risk" you are measuring is a risk of. That would make it easier to identify ground truth or proxies for it in existing data.
The title of the supporting paper gives that away: "Open Source Software Projects Needing Security Investments". The CII project was started, in part, as a response to the Heartbleed vulnerability in OpenSSL. We're trying to determine which projects are more likely to have serious vulnerabilities and where investment is needed.


Is there a record of the anomalies and the adjustments?
A high-level discussion is in the paper. See the git log for a record of many of the actual adjustments (the commit text should give you at least a brief reason as to *why* they were adjusted). I don’t think all adjustments we tried are recorded in the git log, since we weren't particularly trying to do that (sorry). But I think you'll find lots of useful information.
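
For example, something along these lines would list the commits that touched the scoring code, with their messages. The file name below is only a guess, so check the repository for the actual layout:

    # Sketch: list the commits that touched the scoring code, with their messages.
    # The default file name is a guess -- check the repository for the actual layout.
    import subprocess

    def scoring_history(repo_path, scoring_file="oss_package_analysis.py"):
        out = subprocess.run(
            ["git", "-C", repo_path, "log", "--oneline", "--follow", "--", scoring_file],
            capture_output=True, text=True, check=True,
        ).stdout
        return out.splitlines()

    for commit in scoring_history("cii-census"):
        print(commit)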


Is there any sort of formal procedure for further expert review?
I would be interested in designing such a procedure if there isn't one.
No, there's no formal procedure. You can propose one.

That said, we're happy to take good ideas from anyone, even if they're not perceived as experts.

--- David A. Wheeler