<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Life in Code</title>
	<atom:link href="http://code.ncultra.org/feed/" rel="self" type="application/rss+xml" />
	<link>http://code.ncultra.org</link>
	<description>Thoughts on technology from a veteran programmer.</description>
	<lastBuildDate>Wed, 10 Feb 2010 21:32:50 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Some KVM development community stats</title>
		<link>http://code.ncultra.org/2010/02/some-kvm-development-community-stats/</link>
		<comments>http://code.ncultra.org/2010/02/some-kvm-development-community-stats/#comments</comments>
		<pubDate>Wed, 10 Feb 2010 21:28:00 +0000</pubDate>
		<dc:creator>mdday</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[kvm]]></category>
		<category><![CDATA[linux]]></category>

		<guid isPermaLink="false">http://code.ncultra.org/?p=23</guid>
		<description><![CDATA[Today I made a presentation (pdf) on the Linux Kernel Virtual Machine to the Red Hat Cloud Computing Forum. I enjoyed the format. All the presentations were short (30 minutes including Q&#38;A) and technical. This is the type of forum I enjoy attending, so it&#8217;s easy to prepare and I am comfortable with the audience.
KVM [...]]]></description>
			<content:encoded><![CDATA[<p>Today I made a <a href="http://code.ncultra.org/wp-content/uploads/2010/02/kvm.pdf">presentation (pdf)</a> on <a href="http://www.linux-kvm.org/page/Main_Page">the Linux Kernel Virtual Machine</a> to the <a href="http://www.redhat.com/cloudcomputingforum/">Red Hat Cloud Computing Forum</a>. I enjoyed the format. All the presentations were short (30 minutes including Q&amp;A) and technical. This is the type of forum I enjoy attending, so it&#8217;s easy to prepare and I am comfortable with the audience.</p>
<h2>KVM Developer Participation</h2>
<p>One of the topics I covered in the presentation is the level of KVM development activity in 2009. To measure the depth and breadth of participation in KVM development, I used activity on the developer mailing lists for the three primary components of the KVM hypervisor: KVM, which provides the virtual machine monitor; <a href="http://wiki.qemu.org/Main_Page">Qemu</a>, which provides the virtual machine environment; and <a href="http://libvirt.org">libvirt</a>, which provides the low-level management interfaces.</p>
<p>I like to think that monitoring the traffic on an open-source project&#8217;s mailing list is a lot like gathering intelligence through <a href="http://en.wikipedia.org/wiki/Traffic_analysis">traffic analysis</a>. You can learn who is working on a project, what specific areas they are working in, and with whom they are working. The volume of traffic is also a good indicator of the weight behind a project and the overall development velocity. If you were really ambitious you could graph the relationships among various projects based on the participation of specific individuals.</p>
<p>Below is a summary of the raw statistics for 2009:</p>
<div id="attachment_30" class="wp-caption alignnone" style="width: 510px"><a href="http://code.ncultra.org/wp-content/uploads/2010/02/kvm-developers.png"><img class="size-full wp-image-30  " title="kvm-developers" src="http://code.ncultra.org/wp-content/uploads/2010/02/kvm-developers.png" alt="kvm development community statistics 2009" width="500" height="356" /></a><p class="wp-caption-text">kvm development community statistics 2009</p></div>
<h2>What can we learn from Raw Message Counts?</h2>
<p>These three mailing lists are dedicated to development activity. There are three types of messages included in the analysis:</p>
<ol>
<li>Source code. All source code changes are submitted as email messages. Per conventions for Linux kernel development, the subject line of these messages usually includes the tag &#8220;[PATCH].&#8221;</li>
<li>Source code review. When a developer submits some proposed changes, analysis and discussion of the source code generates replies to the original email. Developers use email clients that support message threading, which makes it easier to follow the discussion.</li>
<li>Bug reports.</li>
</ol>
<p>All three of the activities are part of what we typically consider the job description of a software engineer. There is another sub-category of messages that intermix design proposals with source code submissions. These messages usually include the tag &#8220;[RFC]&#8221; in the subject line.</p>
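<p>As a rough illustration of how messages could be sorted into these categories, here is a minimal Python sketch that looks only at the subject line. The category names, the rule that a reply to a tagged message counts as review, and the sample subjects are my assumptions for illustration. Bug reports carry no standard tag, so they land in "other" along with everything else.</p>

```python
import re
from collections import Counter

# Matches the text inside any bracketed subject tag, e.g. "[PATCH 1/3]".
BRACKET_TAG = re.compile(r"\[([^\[\]]*)\]")


def classify_subject(subject):
    """Return a rough category for one subject line (hypothetical labels)."""
    tags = " ".join(BRACKET_TAG.findall(subject)).upper()
    is_reply = subject.strip().lower().startswith("re:")
    if "PATCH" not in tags and "RFC" not in tags:
        return "other"   # untagged mail, including bug reports
    if is_reply:
        return "review"  # discussion threaded under a submission
    if "RFC" in tags:
        return "rfc"     # design proposal, possibly with code
    return "patch"       # plain source code submission


# Made-up subject lines, only to exercise the function.
subjects = [
    "[PATCH 1/3] KVM: fix foo",
    "Re: [PATCH 1/3] KVM: fix foo",
    "[RFC] new device model",
    "Guest crashes on boot",
]
summary = Counter(classify_subject(s) for s in subjects)
print(summary)
```

<p>A real classifier would also want the In-Reply-To header, since not every reply keeps the "Re:" prefix.</p>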
<p>Because these messages capture the core work of software engineering, I believe that message counts for mailing lists dedicated to software development provide a good indication of the health of a development community. For KVM, the statistics are impressive.</p>
<ul>
<li>Almost 400 organizations participated in KVM development, ranging from large corporations such as IBM, Intel, and Red Hat, to academic institutions and individual contributors.</li>
<li>Approximately 800 unique contributors. This is an extremely broad group of software developers.</li>
<li>A solid core of &#8220;super contributors,&#8221; developers who form the top tier of the project&#8217;s contributors.</li>
</ul>
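<p>This post does not spell out how organizations were identified. One plausible approximation, and it is only my assumption here, is to group sender addresses by email domain. The sketch below does that for a handful of rows taken from the KVM-Devel table later in this post. Note that free-mail domains such as web.de are not organizations, so a real count would need a manual pass.</p>

```python
from collections import Counter


def domain_totals(counts):
    """Sum message counts per email domain (a rough stand-in for organization)."""
    totals = Counter()
    for address, count in counts.items():
        domain = address.rpartition("@")[2].lower()
        totals[domain] += count
    return totals


# A few rows copied from the KVM-Devel table in this post.
sample = {
    "avi@redhat.com": 3810,
    "mst@redhat.com": 1261,
    "ghaskins@novell.com": 507,
    "agraf@suse.de": 394,
    "jan.kiszka@web.de": 223,
}
totals = domain_totals(sample)
print(len(totals), "domains;", totals["redhat.com"], "messages from redhat.com")
```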
<h2>Top Individual Contributors</h2>
<p>It&#8217;s also good to look at the top individual contributors. These are the folks who are generally 100% focused on the project and are the most prolific programmers.</p>
<h3>KVM-Devel</h3>
<pre>3810    avi@redhat.com
1261    mst@redhat.com
851     gleb@redhat.com
799     mtosatti@redhat.com
507     ghaskins@novell.com
453     anthony@codemonkey.ws
410     lmr@redhat.com
394     agraf@suse.de
362     sheng@linux.intel.com
357     jan.kiszka@siemens.com
356     glommer@redhat.com
336     mgoldish@redhat.com
226     amit.shah@redhat.com
223     jan.kiszka@web.de
209     markmc@redhat.com
208     alex.williamson@hp.com
197     joerg.roedel@amd.com
178     mhiramat@redhat.com</pre>
<h3>Qemu</h3>
<pre>1839    anthony@codemonkey.ws
1457    kraxel@redhat.com
1447    quintela@redhat.com
961     avi@redhat.com
819     aurelien@aurel32.net
805     lcapitulino@redhat.com
805     blauwirbel@gmail.com
745     mst@redhat.com
617     yamahata@valinux.co.jp
565     agraf@suse.de
558     aliguori@us.ibm.com
540     jan.kiszka@siemens.com
493     markmc@redhat.com
468     paul@codesourcery.com
425     gleb@redhat.com
418     jamie@shareable.org
407     av1474@comtv.ru
402     armbru@redhat.com
391     glommer@redhat.com
371     kwolf@redhat.com</pre>
<h2>Comments on Individual Developer Counts</h2>
<p>Looking at the individual contributor counts for KVM and Qemu, it is clear that the top contributor on each list is considerably more active than the next-highest contributor. (On the Qemu list, anthony@codemonkey.ws and aliguori@us.ibm.com are the same person posting under two different addresses, which is not uncommon. Combined, Anthony&#8217;s actual count is about 2,400 messages: 1839 + 558 = 2397.) Avi Kivity is the KVM maintainer, and Anthony Liguori is the Qemu maintainer. It&#8217;s the job of the maintainer to review and accept all code submissions, and to package and announce new releases of the code, so you would expect the maintainer of a project to post the most messages.</p>
<p>You can also see from these message counts that there is a large overlap of top contributors to the KVM and Qemu projects. In fact, Avi Kivity is a top contributor to Qemu, and Anthony Liguori is a top contributor to KVM.</p>
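<p>Folding two addresses into one person is easy to do mechanically once you know the alias. Here is a minimal Python sketch using a few rows from the Qemu table above. The alias map is hand-maintained, and the only pair in it is the one confirmed in this post. The function name is mine.</p>

```python
from collections import Counter

# Message counts copied from the Qemu table above (selected rows).
QEMU_COUNTS = {
    "anthony@codemonkey.ws": 1839,
    "kraxel@redhat.com": 1457,
    "quintela@redhat.com": 1447,
    "aliguori@us.ibm.com": 558,
}

# Secondary address mapped to the canonical one. Only this pair is
# confirmed by the post; any other entry would be a guess.
ALIASES = {"aliguori@us.ibm.com": "anthony@codemonkey.ws"}


def merge_aliases(counts, aliases):
    """Return counts with every aliased address folded into its canonical one."""
    merged = Counter()
    for address, count in counts.items():
        merged[aliases.get(address, address)] += count
    return merged


merged = merge_aliases(QEMU_COUNTS, ALIASES)
print(merged["anthony@codemonkey.ws"])  # 2397
```

<p>With the two addresses merged, the gap between the top poster (2397) and the next one (1457) on the Qemu list is even wider than the raw table suggests.</p>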
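<p>The overlap can be checked directly from the two tables above with a set intersection. This Python sketch uses the addresses exactly as listed, with no alias merging, so a person who posts from two addresses counts only where the same address appears on both lists.</p>

```python
# Addresses copied from the KVM-Devel table above.
KVM_TOP = {
    "avi@redhat.com", "mst@redhat.com", "gleb@redhat.com",
    "mtosatti@redhat.com", "ghaskins@novell.com", "anthony@codemonkey.ws",
    "lmr@redhat.com", "agraf@suse.de", "sheng@linux.intel.com",
    "jan.kiszka@siemens.com", "glommer@redhat.com", "mgoldish@redhat.com",
    "amit.shah@redhat.com", "jan.kiszka@web.de", "markmc@redhat.com",
    "alex.williamson@hp.com", "joerg.roedel@amd.com", "mhiramat@redhat.com",
}

# Addresses copied from the Qemu table above.
QEMU_TOP = {
    "anthony@codemonkey.ws", "kraxel@redhat.com", "quintela@redhat.com",
    "avi@redhat.com", "aurelien@aurel32.net", "lcapitulino@redhat.com",
    "blauwirbel@gmail.com", "mst@redhat.com", "yamahata@valinux.co.jp",
    "agraf@suse.de", "aliguori@us.ibm.com", "jan.kiszka@siemens.com",
    "markmc@redhat.com", "paul@codesourcery.com", "gleb@redhat.com",
    "jamie@shareable.org", "av1474@comtv.ru", "armbru@redhat.com",
    "glommer@redhat.com", "kwolf@redhat.com",
}

shared = sorted(KVM_TOP.intersection(QEMU_TOP))
print(len(shared), "addresses appear on both lists:")
for address in shared:
    print("   ", address)
```

<p>Eight of the eighteen addresses on the KVM-Devel list also appear on the Qemu list.</p>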
<h2>See for Yourself</h2>
<p>You can read the kvm-devel and Qemu mailing lists via the web using Gmane.</p>
<p><a href="http://news.gmane.org/gmane.comp.emulators.kvm.devel">kvm-devel</a></p>
<p><a href="http://news.gmane.org/gmane.comp.emulators.qemu">qemu</a></p>
<h2>Source code for Analysis Tools</h2>
<p>I wrote a couple of crude utilities to do this mailing list analysis. They are:</p>
<p><a href="http://code.ncultra.org/string_search.c">string_search.c</a>: a utility that understands email address strings and can count instances of specific addresses in a file.</p>
<p><a href="http://code.ncultra.org/mbox-filter.py">mbox-filter.py</a>: a Python utility that filters an mbox-formatted email file in a number of different ways. I use it, for example, to collect all messages that fall within a certain range of dates.</p>
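<p>For readers who want to try the same kind of analysis without the utilities above, here is a minimal sketch using only the Python standard library. It is not the code from string_search.c or mbox-filter.py. It combines the two steps they cover: keep messages whose Date header falls in a given range, then count messages per sender address. The function name and the choice to treat dates without a timezone as UTC are my assumptions.</p>

```python
import mailbox
from collections import Counter
from datetime import datetime, timezone
from email.utils import parseaddr, parsedate_to_datetime


def count_senders(mbox_path, start, end):
    """Count messages per sender address where start <= Date < end.

    start and end must be timezone-aware datetime objects.
    """
    counts = Counter()
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            try:
                sent = parsedate_to_datetime(message["Date"])
            except (TypeError, ValueError):
                continue  # missing or malformed Date header
            if sent.tzinfo is None:
                # Assumption: treat dates without a zone as UTC.
                sent = sent.replace(tzinfo=timezone.utc)
            if not (start <= sent < end):
                continue
            address = parseaddr(message.get("From", ""))[1].lower()
            if address:
                counts[address] += 1
    finally:
        box.close()
    return counts


def top_senders_2009(mbox_path, limit=20):
    """Return the most active senders for calendar year 2009."""
    start = datetime(2009, 1, 1, tzinfo=timezone.utc)
    end = datetime(2010, 1, 1, tzinfo=timezone.utc)
    return count_senders(mbox_path, start, end).most_common(limit)
```

<p>Calling top_senders_2009 on a locally saved mbox archive of one of the lists would produce a table in the same shape as the ones above, although the exact numbers depend on how the archive was collected.</p>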
<p>Perhaps in a future post I&#8217;ll document these utilities and enhance them. They are very crude at this point. Once I got the info I needed out of the mbox files I stopped working on them.</p>
]]></content:encoded>
			<wfw:commentRss>http://code.ncultra.org/2010/02/some-kvm-development-community-stats/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
