Detours on my Journey to Open Source

Last week, I reflected on my journey to open source. This week, I reflect on how that journey continued into my academic career. The focus of today’s story is on the detours I took that led me to where I needed to go.

By the end of my high school education, I had experience in open source, specifically the OpenOffice.org and Drupal communities. I was convinced that open source was a great licensing model and I valued the collaboration it enabled. I was building websites as a freelancer.

I had a noticeable passion for software development. Many years later, a high school friend told me how hard it had been to find time to meet with me because I often preferred getting into the flow of writing software.

In high school, everyone believed I would pursue a career in software and become the next Bill Gates, only with open source. This is not how it played out. I did end up in the open source software world, but I took several detours on the way.

Detour: B.A. in Business, Economics, and Banking

As I was considering career choices for after high school, I had all options open to me. I was very fortunate to be able to pursue any line of work that interested me because I had supportive parents, the necessary mental capacity and education, as well as the social and economic status. I considered studying computer science and becoming better at what I had a passion for.

However, a piece of eye-opening advice I received pointed out that all software exists to solve a problem and if I wanted to create impactful software, I would need to understand the problem domain including the business and economics side of it.

For three years, I put writing software and participating in open source communities on the back burner as I followed the advice to build out my business and economics understanding. I joined a dual-studies program that had two components.

The first component was an apprenticeship at Bankhaus C. L. Seeliger, a local bank in Wolfenbüttel, Germany. The apprenticeship concluded with a certification by the Chamber of Commerce that would allow me to practice banking in Germany.

The second component was a 3-year bachelor program in business at the WelfenAkademie, a college in Braunschweig, Germany. The program concluded with a Bachelor of Arts degree in business with a concentration in banking.

The two components were well-coordinated and I took turns spending several weeks at the bank and attending classes.

During this time, I continued to read news about what was happening in the IT industry and I was especially interested in news about open source projects. I would also choose technology and software topics for my independent studies and research reports.

In the bank, I most enjoyed my time in the IT department. I enjoyed helping with PC issues. The only thing I enjoyed more was helping with the rollout of a new document management system.

In fact, I wrote my bachelor thesis on the subject of change management for the rollout of this system.

Detour: Overcoming a Career Blocker

Towards the end of the apprenticeship and bachelor’s program, I was eager to get back into IT. The bank did not have a job opening in the IT department, so I looked elsewhere.

Fun fact: Out of a love of traveling, I applied at Lufthansa as a flight attendant. Lufthansa rejected my application after the assessment center with kind words that I interpreted as: “Don’t waste your talents.” I redoubled my focus on a career in IT.

I approached the Technical University Braunschweig (TUBS) about joining their computer science master’s program.

I had falsely believed that the newly introduced bachelor’s and master’s programs allowed for mobility between degrees. It turned out that I did not meet the enrollment criteria for the computer science master’s program.

I was lucky to have talked with the program coordinator of the management information systems (MIS) degree. He showed me that I was only a few credit points shy of joining the MIS master’s program and suggested that I enroll as a bachelor student in the degree to earn the necessary credit points.

Within one semester, I enrolled in all foundational computer science and MIS classes. I felt like I just had to prove that I already had the skills necessary for pursuing the MIS master’s, and so I did.

Detour: Studying Abroad and Falling in Love

While enrolling in the MIS master’s program, I found on the program website an unassuming link that promised information about an exchange program.

The PDF document I found there described a 1-year exchange program with the University of Nebraska at Omaha (UNO). I would transfer credit points between the TUBS and UNO, double-dipping on my course work. The promise was to earn the master’s in MIS from TUBS and an MBA from UNO without losing any time.

I was intrigued, asked how to enroll, and then learned that accepted students also received a scholarship to cover the expenses of the exchange program.

Long story short: I started the MBA program at UNO, which set the wheels of destiny in motion.

I fell in love and started considering living in the USA and Omaha, Nebraska specifically.

Looking at options to stay in Omaha, I explored the Ph.D. in IT program at UNO. I liked the makeup of the program. It was very interdisciplinary and open to me joining with a background in business and MIS. I felt that the Ph.D. in IT would finally complete my move to a career in IT after my detour with a bachelor’s in business and banking.

To get into the Ph.D. program, I asked a professor if he would take me in as his Ph.D. student. I collaborated with him on my master’s thesis project and demonstrated my ability to engage in research. The research area we were working on was collaboration science. At the time, I was not even considering open source as a research option but in hindsight, it is stunning how closely related the topics are.

End of Detours: Coming to Research into Open Source

The professor I was planning to work with accepted a job at another university while I was still applying to join the Ph.D. program at UNO. He invited me to follow him there, but I was set on living in Omaha, Nebraska, starting a family here, and getting married.

This left me without a mentor when I joined the Ph.D. program. I scheduled meetings with faculty who had interesting research topics.

When I met with Matt Germonprez, I learned that it was possible to do research into open source. I was immediately hooked.

My experience in the OpenOffice.org and Drupal community came back to me. I had never considered that open source could be a research option but now it was.

I was hesitant at first to work with Matt because he specializes in qualitative research. I was afraid that with English as a second language, I would not be equipped to do qualitative analysis where an in-depth understanding of language was necessary. Matt promised to train me, gave me the confidence that I could learn the skills, and so I gave it a chance.

The research approach I learned was built on engaged fieldwork. This means participating in open source projects, fully embedding myself in open source communities, and talking as much as possible to professionals in the space to learn their language and viewpoints.

Throughout the four years of my Ph.D., I got to know many people in the open source ecosystem and participate in different open source projects.

This is where my detours ended and I arrived in open source again. I had what was needed to dive in and become a contributing member of the open source ecosystem. There is plenty of fodder for more blog posts. I have already shared some of my stories and can recommend as further reading:

Today, my main focus in open source is on metrics, the Linux Foundation CHAOSS project that I cofounded, and Bitergia.

Let me close by saying this: It is my mission to help the open source ecosystem become more professional with how we use metrics. I do this through (1) my work in the CHAOSS project and (2) by helping organizations hire Bitergia to receive professional services for their metric needs.

My Journey to Open Source

I just realized that I have known about open source software for more than half of my life. Today, I look back to using open source software for 17 years and joining my first open source community 14 years ago. This blog post is about how I got started in open source and some early lessons learned.

Georg holding up the OpenOffice.org DVD case that he designed
One of my early open source contributions was designing this DVD for the OpenOffice.org Conference 2006 in Lyon, France. I couldn’t go and cherish this copy sent to me via mail.

I learned about open source software after I bought my first computer at the age of 13. I had made a deal with my parents that I would get money for a computer if I successfully spent one year in the USA. I went from not understanding English to having A’s and B’s in all subjects. Back in Germany, I found a computer I liked on eBay and bought a 19” CRT monitor and Diablo II as my first video game. Oh, the memories.

Learning to develop software

At the time, I had a friend in school who was a few years ahead of me and already knew Delphi 6, a programming language taught in our school. I was amazed by the things he could make his computer do and my interest in software was born. I got a copy of Delphi 6 and started combining recipes from an online cookbook to build software that was fun and entertaining. I still light up at the thought of an endless loop opening and closing the DVD drive, only today I don’t have a DVD drive anymore.

I found other friends who were interested in computers and software. One of them showed me web technologies and PHP specifically. On a youth group retreat, I devoured a PHP+MySQL book I had bought with my own money. I still remember creating one of my class project reports as a PHP application and having to explain XAMPP to my teacher. While learning more about PHP and web technologies, I was also learning about open source.

Joining my first open source community

I don’t recall reading The Cathedral and the Bazaar but I knew of the text and how it described the collaborative way that open source software was developed. As a high school student who was writing software for fun and sometimes to annoy my siblings, I was intrigued by the possibility of joining a community that was developing software together.

I followed a recommendation to join an open source community of a software package that I was already using. This is how OpenOffice.org became my first open source community. I started by lurking on mailing lists. I learned that there were separate groups in the community. As a German boy, I decided to help with testing the German release of OpenOffice.org. I downloaded the release candidate, installed it, and reported issues when I found them.

Contributing non-code contributions

Testing a release candidate was my first contribution. The sense of being part of something bigger was exhilarating. I never contributed a single line of code to OpenOffice.org but found other ways that I could help. This first experience impressed on me the importance of having an open source community in which all types of contributions are welcomed and valued, not just code contributions.

For example, I vividly remember rethinking mnemonics. What are mnemonics? They are the underlined letters in a menu that allow you to quickly select a menu entry by typing that letter on the keyboard. Over the years, menu entries had been added and in the German user interface, there was no apparent logic to the mnemonics. I volunteered to fix this situation. I printed all menus and while on vacation in Vermont, USA, I sat in the backyard and mulled over which letters would best be used for mnemonics. I was not the one implementing it but I provided a first draft that was accepted with few changes.

Learning from and experiencing an open source community

I’ll mention one last contribution to the OpenOffice.org community because I have a physical artifact and it taught me three important lessons. For the OpenOffice.org conference in Lyon in 2006, I designed the DVD label and cover.

The first lesson I learned from creating the DVD label and cover is the importance of attributing the work of others. I had taken a scalable vector graphic (SVG) image created by someone else and added specific information about the event. When I submitted my draft, I was called out for changing the “author” field in the SVG, because I had not created the original version but only modified it.

The second lesson I learned from creating the DVD label and cover is the value of iterating with the community. I went through several iterations of the design and always got good feedback from other community members. To be honest, I had never designed a DVD label and cover before and so I learned a lot in the process. To this day, I think of that same feedback when I am designing anything graphical.

The third lesson I learned from creating the DVD label is the power of thanking community members. I still know the name of the person who asked me for my address to send me a copy of the DVD and I still cherish this little trinket.

Moving on to other open source communities

A traumatic event occurred in the OpenOffice.org community when Oracle acquired Sun and with it the trademark and the employed maintainers of the OpenOffice.org project. The fallout that followed, which led to the founding of The Document Foundation and LibreOffice, has impressed on me the power that a vibrant community has, even over a large corporation that basically “owned” an open source project. I experienced firsthand and came to understand the implications of forking, a core feature resulting from the open source licensing model.

I will wrap up my journey into open source by highlighting other communities that I have been part of and what they have impressed on me.

I joined the Drupal community because I was developing web pages as a freelancer and I found Drupal to be a flexible platform that suited my needs. Drupal was the first open source project that I contributed code to and I was super excited to be involved in the development of Drupal 7.

Toolkit for YNAB is an unofficial browser plugin that adds features to the You Need A Budget (YNAB) web service. I had contacted the YNAB support about features I wanted and ways in which YNAB was not intuitive for me. The YNAB support kindly referred me to the open source community that was implementing those features in Toolkit for YNAB. The features are not part of the web service and break with new releases. I am amazed by the ingenuity of this project.

The Core Infrastructure Initiative (CII) Best Practices Badge Program is a self-certification that open source projects can obtain to signal that they have established best practices. I joined the conversation during the development of the Silver and Gold badges. Then, I started translating the CII Best Practices Badge web app to German. Funny story: When a friend of mine visited from Germany and asked for something to do, I recruited him to help with the translation. He ended up translating more than me. To this day, I maintain the translation and update it when the web app gets updated.

At the Open Source Summit North America 2017 in Los Angeles, I spoke with Don Marti and learned about Bugmark. I was intrigued by the idea of a futures market for open source software issues as a way to provide sustainable funding to open source software contributors. I helped think through and shape the core elements of the futures marketplace. You can read more about it in the Journal of Cybersecurity and HMD Praxis der Wirtschaftsinformatik (German).

I joined the SustainOSS community after attending the summit in 2018 in London. This is not a typical open source community because it does not develop software but establishes a network of professionals around an important issue, the sustainability of open source communities. I volunteer as a forum moderator.

Lastly, I am a co-founder and co-lead of the Linux Foundation CHAOSS project. This is a much larger story for a different blog post, one part already shared on The New Stack and mentioned here on my blog.

In closing, I am very fortunate to have learned early in my life important lessons about how open source software is developed and how open source communities work. Today, I am happy to have my professional home among the wonderful people who work in open source.

Open Leaders Interview About CHAOSS Project

This fall, I am participating in the Mozilla Open Leaders 6 program, which helps community leaders adopt more inclusive and open work practices. I joined the program through my engagement with the CHAOSS project. Chad Sansing from Open Leaders interviewed me about my experience.

You can read the full interview here: https://medium.com/@MozOpenLeaders/the-chaoss-project-open-leaders-6-37c2799c2dd3

How to measure the impact of your open source project

We published this article originally on Opensource.com.

This article was co-authored by Vinod Ahuja, Don Marti, Georg Link, Matt Germonprez, and Sean Goggins.

Conventional metrics of open source projects lack the power to predict their impact. The bad news is, there is no significant correlation between open source activity metrics and project impact. The good news? There are paths forward.

Let’s start with some questions: How do you measure the impact of your open source project? What value does your project provide to other projects? How is your project important within an open source ecosystem? Can you predict your project’s impact using open source metrics that you can follow day to day?

If these questions resonate, chances are you care about measuring the impact of your open source project. On Opensource.com, we have already learned about measuring the project’s health, the community manager’s performance, the tools available for measuring, and the right metrics to use—and we understand that not all metrics are to be trusted.

While all these factors are critical in building a comprehensive picture of open source project health, there is more to the story. Indeed, many metrics fail to provide the information we need in a timely fashion. We want to use predictive metrics on a daily basis—metrics that are correlated with, and that act as predictors of, the outcomes and impact metrics that we care about.

Most open source project metrics focus on project metadata, such as contributor and commit counts, without addressing whether the project impacts a broader open source ecosystem. Unfortunately, a project that has a great number of contributors and an active flow of contributions may not be, and might never be, relevant to other projects in an open source ecosystem. To better understand the impact of a project, it is important to consider the broader context of an open source ecosystem. This article introduces the V-index as a measure of impact (see Regression Analysis of Open Source Project Impact: Relationships with Activity and Rewards).

Who cares about project impact?

Sponsors of open source projects care about their impact. A foundation that’s hosting an open source project likely wants it to be widely used, for example, or an organization that’s paying developers to work on a project will want to ensure that their efforts are making a difference. Consequently, software developers or project managers may need to use metrics to make the case that the time and effort spent on an open source project is creating real value for their employer.

Open source project members also care about the impact of their project. High-impact projects can be a source of pride and motivation for developers. Within the open source ecosystem, it means that people are interested in new development and ready to report bugs. High impact means that projects need the code base to be maintained and vulnerabilities to be addressed, which is an incentive to support project members.

Open source project impact

An effective way to understand an open source project’s impact is through its software libraries. A software library certainly impacts the projects in which it is used, and popular libraries have also changed the way software is developed by providing functionality across a variety of software projects.

For example, the Bootstrap library revolutionized website interfaces and has become a de facto standard. But Bootstrap depends on another widely used library: jQuery. jQuery simplifies the use of JavaScript in website development. The impact of jQuery on Bootstrap, and on web development as a whole, cannot be overstated, and this impact is evident in the library dependency relationship between the two.

The jQuery/Bootstrap example demonstrates how software libraries can have an impact. Within the open source ecosystem, jQuery is an upstream project to Bootstrap, which itself is an upstream project to many websites and web frameworks, as shown below:

downstream dependency depiction for jQuery and Bootstrap

Figure 1: An open source project dependency within an open source ecosystem: The jQuery project is the upstream project to Bootstrap and many other projects, which themselves may be upstream to more projects. (Graphic by Kevin M. Lumbard, licensed CC-BY-SA-3.0. River delta background by Messer Woland, licensed CC-BY-SA-3.0. Logos are property of respective right owners.)

Measuring impact

Many metrics are being developed to measure the impact of an open source project. These include the number of users, downloads, installs, mentions in media (e.g., blogs, news, YouTube videos, and job postings), the availability of commercial offerings, and the number of add-on products. But such metrics isolate impact within that specific project and don’t fully demonstrate the impact of a software library within an open source ecosystem.

To measure the impact of an open source project within the open source ecosystem, let’s borrow a metric from academia: the h-index. This determines the impact of an author through the relationship between how many publications he or she has produced and how many other authors have cited these publications. We propose, therefore, that a project’s impact in an open source ecosystem can be determined by its downstream dependencies (i.e., how many downstream open source projects use it and how often those downstream projects are themselves used).

V-index

A downstream dependency exists when a software library is used within another piece of software. The V-index, which encapsulates our proposed measure of impact, is the largest number v such that v first-order downstream dependencies each have at least v second-order downstream dependencies. The first-order downstream dependencies are the open source projects that use the library. The second-order downstream dependencies of a first-order dependent project are the open source projects that in turn use that project.
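Written out as a formula, the definition reads as follows. The symbols are our own shorthand for this post and are not taken from the whitepaper: D_1(p) is assumed to be the set of first-order downstream dependencies of project p, and s(d) the number of second-order downstream dependencies of a first-order dependency d.

```latex
V(p) \;=\; \max \Bigl\{\, v \in \mathbb{N}_0 \;:\; \bigl|\{\, d \in D_1(p) : s(d) \ge v \,\}\bigr| \ge v \,\Bigr\}
```

The set always contains v = 0, so the V-index is defined even for a project without any downstream dependencies.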

The V-index is elaborated in three different scenarios:

Scenario A

  First-order dependencies    Second-order dependencies
  Dependency 1                0
  Dependency 2                0
  Dependency 3                0
  Dependency 4                0

Scenario B

  First-order dependencies    Second-order dependencies
  Dependency 1                4
  Dependency 2                4
  Dependency 3                4
  Dependency 4                4

Scenario C

  First-order dependencies    Second-order dependencies
  Dependency 1                40

Project A has a V-index of 0.

The project has four projects that depend on it. No other project depends on these projects. The V-index of Project A is 0 because zero first-order dependencies have any second-order dependencies.

Project B has a V-index of 4.

The project has four projects that depend on it. Each of these projects has four projects that depend on them. The V-index of Project B is 4 because each of the four first-order dependencies has at least four second-order dependencies.

Project C has a V-index of 1.

The project has one project that depends on it. This project has 40 projects that depend on it. The V-index of Project C is 1 because it has one first-order dependency that has at least one second-order dependency.
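The three scenarios can be checked with a few lines of Python. This is a minimal sketch, not the code used in the study: the function name `v_index` and its input format (one second-order count per first-order dependency) are our assumptions. The computation is the same one used for the h-index, applied to dependency counts instead of citation counts.

```python
from typing import Iterable


def v_index(second_order_counts: Iterable[int]) -> int:
    """Return the V-index of a project.

    `second_order_counts` holds one number per first-order downstream
    dependency: how many projects depend on that dependency in turn.
    (Hypothetical input format, chosen for illustration.)
    """
    # Rank the first-order dependencies from most used to least used.
    ranked = sorted(second_order_counts, reverse=True)
    v = 0
    for rank, count in enumerate(ranked, start=1):
        # The dependency at position `rank` must itself have at least
        # `rank` downstream dependencies for the index to reach `rank`.
        if count >= rank:
            v = rank
        else:
            break
    return v


# The three scenarios from the tables above.
scenario_a = [0, 0, 0, 0]
scenario_b = [4, 4, 4, 4]
scenario_c = [40]

print(v_index(scenario_a))  # 0
print(v_index(scenario_b))  # 4
print(v_index(scenario_c))  # 1
```

Scenario C shows why the V-index rewards breadth: a single heavily used dependent project cannot raise the index above 1.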

Looking at a practical example, jQuery has a V-index of 98. It has 13,848 first-order dependencies, of which Bootstrap is one, with 5,005 second-order dependencies. Of the 13,848, only 98 first-order dependencies have 98 or more second-order dependencies, as shown below:

V-Index graphical depiction

Figure 2: V-index of jQuery: The x-axis represents the downstream open source projects (first-order downstream dependencies) sorted by the number of their own downstream dependencies. The y-axis represents the number of downstream dependencies of each first-order open source project on the x-axis (second-order downstream dependencies). The V-index is the number of first-order downstream dependencies that have at least the same number of second-order downstream dependencies. (Graphic by Kevin M. Lumbard, licensed CC-BY-SA-3.0. Logos are property of respective right owners.)

Increase impact with new metrics

How do you increase your open source project’s impact? Well, you need to convince other projects to use your project. Unfortunately, there is no single activity that will make this happen. However, there are steps you can take to make a project impactful, and there are ways to measure how well you do them. Let’s look at which of these measures are correlated with impact.

We summarize the findings below based on previous correlation analysis. The correlation analysis used a sample of metrics for three kinds of open source metrics:

  1. Activity metrics measure metadata such as contributor or commit counts. Project contributors can increase these metrics by doing more work on the project and getting more people involved.
  2. Reward metrics measure how well the project is meeting contributors’ expectations. They may improve with faster acceptance of contributions.
  3. Impact metrics measure the impact on users and other projects.

The V-index was developed as an impact metric. The correlation was tested for 604 projects that were started in 2014 or 2015, that used the Rust programming language, that were listed in GHTorrent and Libraries.io (the data sources), and that had at least one downstream dependency.

The findings show that none of the conventional open source activity metrics correlate with impact. This lack of predictive activity metrics means that we have no good predictors to manage our open source projects.
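To make the idea of testing an activity metric against impact concrete, here is a small sketch of such a check. This post does not state which correlation coefficient the analysis used, so the choice of Spearman rank correlation is our assumption, and the numbers in the example are made up for illustration. They are not data from the study.

```python
from statistics import mean


def average_ranks(values):
    """Assign 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical toy values, one entry per project (NOT data from the study).
commit_counts = [120, 15, 300, 42, 8]
v_indexes = [1, 3, 0, 2, 2]

print(f"Spearman correlation: {spearman(commit_counts, v_indexes):.2f}")
```

A coefficient near zero would mean that knowing a project's commit count says little about its V-index. A real analysis would also need significance testing and a far larger sample.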

Does this mean all is lost? We think not. Several open source projects are building next-generation metrics that project sponsors, maintainers, and downstream users might be able to rely on in the future. Here are four paths to finding the predictive metrics we need to boost the impact of our open source projects:

1. Add software quality metrics

The first idea is to combine open source activity metrics with conventional software engineering metrics, such as code coverage. Conventional open source activity metrics focus heavily on the development dynamics within the project. The focus on activity metrics excludes software quality factors, which might be more important for people choosing a software library. Conventional open source activity metrics make it difficult to distinguish productive activity from unproductive activity. Combining a software engineering metric with an open source activity metric could make the latter more valuable.

2. Understand the user community

The second idea involves using natural language processing to determine the sentiment within an open source project, especially where users of the software participate. Conventional open source activity metrics rely only on metadata. Knowing the number of interactions does not help us understand the quality and substance of community. FOSS Heartbeat, while currently not maintained, offers a solution.

3. Market mechanisms

The third idea is to draw a connection between impact and the value of a software library. Existing valuation methods focus on the project itself (i.e., development costs) rather than the value others derive from it. A problem that open source faces is the absence of price signals that can inform the value users receive from a software library. To draw a connection between impact and value, we need new market mechanisms, like the ones proposed by Bugmark.

4. Shared understanding of metrics

The fourth idea is to build more knowledge in the open source ecosystem about how metrics can help us understand the impact and health of open source projects. The Linux Foundation initiated the CHAOSS (Community Health Analytics Open Source Software) project to bring open source projects and other stakeholders together to build a shared understanding of metrics and of the software tools to capture and analyze said metrics. This blog post is based on research conducted as part of the CHAOSS project.

Acknowledgments

This article is based on the whitepaper Regression Analysis of Open Source Project Impact: Relationships with Activity and Rewards by Vinod K. Ahuja. Graphics were prepared by Kevin M. Lumbard. This work is supported by Mozilla and the Alfred P. Sloan Foundation.