Re: repo_uri type is not in the data


Mark Rader
 

Hmm,

After seeing your statistics and looking at the 14 unknowns, it looks like a binary option might be better. It could be git or other, with maybe a comment field for other.

On Jul 30, 2016, at 4:57 PM, Hanno Böck <hanno@hboeck.de> wrote:

On Sat, 30 Jul 2016 17:05:21 -0400
"Wheeler, David A" <dwheeler@ida.org> wrote:

That's obviously possible. However, I'm trying to limit the amount
of information users have to provide - every question increases their
effort. It seems to me that automated analysis should be nearly
perfect, since as you note there are relatively few options. If it's
on GitHub it's always git, if it ends in ".git" it's git.

I'd rather think of an automated way to just check the repos; URL-based
guessing may be unreliable. I attached a quick and dirty script.
Alternatively you can have a dropdown box that defaults to git. Given
the monoculture of git almost nobody would have to change it :-)
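
To make the two approaches concrete, here is a minimal sketch in Python. It is not the attached script; the names guess_by_url, detect_vcs, run_quietly and PROBES are made up for illustration, and it assumes the git, svn and bzr command-line clients are installed. The first function is the URL-only guess (GitHub host or ".git" suffix); the second actually asks each client whether the URL answers as a repository.

```python
import subprocess
from urllib.parse import urlparse

# Hypothetical probe table: each command is expected to exit with status 0
# only if the URL answers as a repository of that type.
PROBES = [
    ("git", ["git", "ls-remote", "--heads"]),
    ("svn", ["svn", "info", "--non-interactive"]),
    ("bzr", ["bzr", "info"]),
]


def guess_by_url(url):
    """URL-only heuristic: a GitHub host or a ".git" suffix means git."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "github.com" or host.endswith(".github.com"):
        return "git"
    if parsed.path.rstrip("/").endswith(".git"):
        return "git"
    return "UNKNOWN"


def run_quietly(cmd, timeout=30):
    """Run a command with all I/O discarded; True only on exit status 0."""
    try:
        result = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Client not installed, or the server never answered.
        return False
    return result.returncode == 0


def detect_vcs(url, run=run_quietly):
    """Probe the URL with each client in turn; `run` can be swapped for testing."""
    if not url:
        return "nourl"
    for name, cmd in PROBES:
        if run(cmd + [url]):
            return name
    return "UNKNOWN"
```

Note that guess_by_url labels any github.com URL as git, including an organization page such as https://github.com/openstack that is not itself a repository, which is the kind of case where probing and guessing can disagree.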

Please identify the projects so we can fix them.

Output from my script grepped for UNKNOWN, which means no repo was
identified:

json/112.json http://cvsweb.netbsd.org/bsdweb.cgi/pkgsrc/ UNKNOWN
json/114.json https://github.com/open-infrastructure/container-tools UNKNOWN
json/164.json https://git.opnfv.org/ UNKNOWN
json/197.json https://git.gnupg.org UNKNOWN
json/211.json https://www.tinc-vpn.org/git/ UNKNOWN
json/212.json https://github.com/vmware/phon UNKNOWN
json/232.json http://github.com/atom UNKNOWN
json/234.json https://git.videolan.org/ UNKNOWN
json/246.json https://github.com/openstack UNKNOWN
json/26.json http://trousers.sourceforge.net UNKNOWN
json/34.json https://git.kernel.org UNKNOWN
json/54.json https://git.openssl.org/ UNKNOWN
json/74.json https://gerrit.zephyrproject.org/ UNKNOWN
json/98.json http://kea.isc.org/wiki UNKNOWN

The CVS one is a challenge: it seems CVS lacks the concept of a repo
URL.
container-tools has removed the repo.
vmware/phon seems to be a typo that should be vmware/photon.
The rest are mostly referencing git overview pages and not repo URLs.

Also, this allows for some easy statistics:
2 bzr
3 nourl
4 svn
14 UNKNOWN
154 git

The git dominance is huge :-)
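
A tally like this can be produced with a few lines of Python. This is only a sketch: it assumes every output line ends with the detected label as its last whitespace-separated field, as in the UNKNOWN lines quoted earlier, and the names tally and format_counts are hypothetical.

```python
from collections import Counter


def tally(lines):
    """Count the last whitespace-separated field (the label) of each line."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[-1]] += 1
    return counts


def format_counts(counts):
    """Render counts in ascending order, one 'count label' pair per line."""
    ordered = sorted(counts.items(), key=lambda item: (item[1], item[0]))
    return [f"{count} {label}" for label, count in ordered]
```

Feeding the script output into tally and printing the result of format_counts gives the same kind of listing as above, smallest group first.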

--
Hanno Böck
https://hboeck.de/

mail/jabber: hanno@hboeck.de
GPG: BBB51E42
<badgeparse>
<badge-repo_url.txt.xz>
_______________________________________________
CII-badges mailing list
CII-badges@lists.coreinfrastructure.org
https://lists.coreinfrastructure.org/mailman/listinfo/cii-badges
