Re: repo_uri type is not in the data

Mark Rader


After seeing your statistics and looking at the 14 unknowns, it looks like a binary option might be better: git or other, with perhaps a comment field for other.

On Jul 30, 2016, at 4:57 PM, Hanno Böck wrote:

On Sat, 30 Jul 2016 17:05:21 -0400
"Wheeler, David A" wrote:

That's obviously possible. However, I'm trying to limit the amount
of information users have to provide - every question increases their
effort. It seems to me that automated analysis should be nearly
perfect, since as you note there are relatively few options. If it's
on GitHub it's always git; if it ends in ".git" it's git.

I'd rather think of an automated way to just check the repos; URL-based
guessing may be unreliable. I attached a quick and dirty script.
Alternatively you can have a dropdown box that defaults to git. Given
the monoculture of git almost nobody would have to change it :-)
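
To make the two approaches discussed above concrete, here is a minimal sketch in Python. It is not the attached script, which is not reproduced in this thread. The function names guess_vcs_from_url and probe_vcs, the PROBES table, and the choice of probe commands are assumptions made for illustration. The first function applies only the two URL rules mentioned above (GitHub host, ".git" suffix); the second asks the installed VCS clients whether the URL is a repository they can read, which needs network access and the respective tools.

```python
import os
import shutil
import subprocess
from urllib.parse import urlparse

# Assumed probe commands: each exits non-zero if the URL is not a
# repository that the client can read.
PROBES = {
    "git": ["git", "ls-remote", "--heads"],
    "svn": ["svn", "info", "--non-interactive"],
    "bzr": ["bzr", "info"],
}


def guess_vcs_from_url(repo_url):
    """Guess the VCS type from the URL text alone (no network access)."""
    if not repo_url:
        return "nourl"
    parsed = urlparse(repo_url)
    host = (parsed.hostname or "").lower()
    path = parsed.path.rstrip("/").lower()
    if host == "github.com" or host.endswith(".github.com"):
        return "git"
    if path.endswith(".git"):
        return "git"
    return "UNKNOWN"


def probe_vcs(repo_url, timeout=30):
    """Try each installed VCS client against the URL; return the first that succeeds."""
    # GIT_TERMINAL_PROMPT=0 keeps git from waiting for credentials.
    env = dict(os.environ, GIT_TERMINAL_PROMPT="0")
    for vcs, command in PROBES.items():
        if shutil.which(command[0]) is None:
            continue  # client not installed, skip this probe
        try:
            result = subprocess.run(
                command + [repo_url],
                env=env,
                stdin=subprocess.DEVNULL,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            continue
        if result.returncode == 0:
            return vcs
    return "UNKNOWN"
```

The URL rules are cheap but only cover the cases named above; the probe is slower and depends on the remote server being reachable, so a timeout or a missing client also ends up as UNKNOWN.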

Please identify the projects so we can fix them.

Output from my script grepped for UNKNOWN, which means no repo

json/112.json UNKNOWN
json/164.json UNKNOWN
json/197.json UNKNOWN
json/211.json UNKNOWN
json/212.json UNKNOWN
json/232.json UNKNOWN
json/234.json UNKNOWN
json/246.json UNKNOWN
json/26.json UNKNOWN
json/34.json UNKNOWN
json/54.json UNKNOWN
json/74.json UNKNOWN
json/98.json UNKNOWN

The cvs one is a challenge: it seems cvs lacks the concept of a repo.
container-tools has removed the repo.
vmware/phon seems to be a typo for vmware/photon.
The rest mostly reference git overview pages rather than repo URLs.

Also, this allows for some easy statistics:
2 bzr
3 nourl
4 svn
154 git

The git dominance is huge :-)
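
A tally like the one above can be produced from the script output with a few lines of Python. This is a sketch under the assumption that every output line has the form "path type", as in the UNKNOWN listing earlier in this message; the function name tally_repo_types is made up for illustration.

```python
from collections import Counter


def tally_repo_types(lines):
    """Count detected repo types from lines of the form '<path> <type>'."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        _path, repo_type = parts
        counts[repo_type] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "json/112.json UNKNOWN",
        "json/164.json UNKNOWN",
        "json/1.json git",
    ]
    # Print least common first, similar to `sort | uniq -c | sort -n`.
    for repo_type, count in sorted(tally_repo_types(sample).items(), key=lambda kv: kv[1]):
        print(count, repo_type)
```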

Hanno Böck

CII-badges mailing list
