David A. Wheeler
One thing I'm trying to get a sense of (and I still need to read the paper very thoroughly to find out) is what exactly the "risk" you a measuring is risk of. That would make it easier to identify ground truth or proxies for it in existing data.The title of the supporting paper gives that away: "Open Source Software Projects Needing Security Investments". The CII project was started, in part, as a response to the Heartbleed vulnerability of OpenSSL. We're trying to determine what projects are more likely to have serious vulnerabilities and investment is needed.
Is there a record of the anomalies and the adjustments?A high-level discussion is in the paper. See the git log for a record of many of the actual adjustments (the commit text should give you at least a brief reason as to *why* they were adjusted). I don’t think all adjustments we tried are recorded in the git log, since we weren't particularly trying to do that (sorry). But I think you'll find lots of useful information.
Is there any sort of formal procedure for further expert review?No, there's no formal procedure. You can propose one.
That said, we're happy to take good ideas from anyone, even if they're not perceived as experts.
--- David A. Wheeler