ANZLIC's investment in GeoNetwork has the potential to be a world class showcase for the successful adoption and sponsorship of open source solutions, provided stakeholders address current development issues. By engaging key sponsors, tightening project management, applying resources appropriately and further engaging the GeoNetwork community, ANZLIC will improve its long term return on investment.
ANZLIC’s backing of free, Open Source Software reduces commercial barriers to sharing data both internally and externally, and achieves ANZLIC’s goal of increased data access to facilitate effective decision making.
Open Source offers many advantages, including free licensing and engagement of an international pool of developers, but it does require appropriate investment and management to capitalise on Open Source’s offerings effectively.
Australia and New Zealand have many agencies interested in contributing to a robust Spatial Data Infrastructure, some with suitable funding to address long term core infrastructure issues, others focused on localised customisation and infrastructure deployments. We have suitable funding, know-how and developers to build a great success story.
ANZLIC and its member organisations agreed that GeoNetwork addressed ANZLIC’s functional requirements assuming some minor issues were addressed. Eight months later, Bruce Bannerman vocalised community feeling:
"I'm concerned at how ANZLIC's adoption of the GeoNetwork open source application as our spatial metadata tool is being handled and also at a perceived lack of progress in getting a production version of this application out."
Bruce proceeded to engage key stakeholders in a constructive email discussion to funnel the extensive good will and capabilities of our community into making ANZLIC’s backing of GeoNetwork an exemplary success story.
Open Source Sponsorship
Open Source Spatial Data Infrastructures
A key challenge faced by Spatial Data Infrastructures is that the organisations who gain value from the data are different to the organisations serving the data.
The value of a Spatial Data Infrastructure is measured by the quantity of usable data it contains or, more specifically, by how much data from other organisations I can get my hands on.
A Spatial Data Infrastructure becomes valuable to me when everyone else puts their data online so that I can use it. It costs me money to put my data online and I don’t gain anything because I have my data already.
Spatial Data Infrastructures (SDI) are deployed to allow organisations to properly store, document, index, distribute and analyse spatial data. Most data is collected to support decision making. These decisions range from the simple, “How do I get from A to B?” to the complex, “Which management strategy is most effective at preserving a particular environment?” To support these decisions, government departments work to a triple bottom line: Financial, Community and Environmental. While balancing these factors is critical to making decisions, the decisions are only as good as the data they are based on.
Improving the triple bottom line of national and international SDI programs is dependent on a number of factors:
- Data Quantity and Integration
A national dataset, aggregated from local and state datasets, is significantly more valuable than the constituent datasets.
- Data Quality
Inaccurate or imprecise data can lead to incorrect conclusions and ultimately poor decisions.
- Data Currency
Some data changes rapidly, such as traffic congestion or transit demand. Other datasets, such as geologic zones, will theoretically never change, though changes in surveying methods or resolution can add to the accuracy and extent of the dataset. Making decisions based on last year’s data serves only to solve last year’s problems a year too late.
- Efficient Data Collection and Maintenance
The costs and timelines of collecting data and maintaining existing datasets must be managed in order to achieve acceptable results in the previous three categories.
- Data Availability and Licensing
Simply storing the data is not enough. Data must be accessible when and where it is needed, formatted in such a manner that it can be used with the tools at hand, and it must be licensed in such a manner to permit the analysis and publication of results or derived products as required.
SDI programs often seek to address these factors by developing a centralized SDI to service a network of related departments and organisations, ranging from a number of departments within a ministry, to integration of datasets across the country. As Paul Ramsey explains:
“… a key funding challenge faced by SDI programs is that while sharing data in a distributed SDI reduces the overall cost for everyone, not everyone is equally better off”.
For data custodians, publishing data is a cost centre and doesn’t provide a substantial business benefit.
Many, including Ben Searle from the Australian Government Office of Spatial Data Management, realize that:
“… an effective way to increase access to other agencies’ data is to sponsor free, Open Source tools which will reduce the cost barrier to sharing data.”
Open Source offers many opportunities, which can significantly enhance the investment of organisations prepared to capitalize on them.
Opportunity Management is the inverse of Risk Management. With risk management you quantify what can go wrong then identify mitigation strategies to avoid or reduce the impact of the risks. With opportunity management you list potential windfalls and deploy strategies to enable and benefit from the windfalls. The table below shows an example opportunity management matrix.
| Opportunity | Strategies |
| Use data from external agencies. | Agencies are given access to open source tools to reduce their barrier to sharing data. Use Open Standards for tools to facilitate communication. Use Open Standards for data schemas so data can be integrated. |
| External agencies extend our toolset. | Use and share our tools as Open Source Software so that others can use and extend them. Support the Open Source development processes to reduce the barrier of entry to potential development sponsors. |
After selecting Open Source sponsorship to achieve cost effective data access, agencies are faced with a relatively new business model. Agencies need to align purchasing policies, which are based upon deliverables and milestones, with Open Source community development.
Under a proprietary business model, a company builds and markets a product. Multiple customer sales cover the cost of development, supporting infrastructure, marketing, support, future enhancements and hopefully include a profit. While Open Source business models incur the same costs as the proprietary models they generally distribute the costs to the end users differently, charging for the implementation of specific functional or usability improvements.
Initial investment in communities, infrastructure, and marketing for an Open Source project is often the most effective way to ensure a long term return on investment as these areas are commonly neglected in favour of feature enhancements. Proper promotion and infrastructure support, instead of a sole focus on missing features, will encourage project growth and ultimately lead to open source Nirvana: hundreds of developers building your application using someone else’s budget.
There are a number of key elements that a potential sponsor should consider when evaluating an open source project in order to ensure maximum return on investment. These include:
- Solves a specific need effectively.
- Has an active, diverse and inclusive community.
- Enjoys support from multiple sponsors.
- Has established development processes, including:
  - Issue tracking
  - Communication channels like email lists and IRC
  - Quality control
- Has clear and comprehensive documentation and marketing material.
The OpenLayers project is a good example of a commercial entity driving the creation of a thriving open source project. OpenLayers is an open source, browser based web-mapping client which provides a front end to various proprietary and open data sources like Google and Yahoo Maps, WMS and WFS. In three years OpenLayers has grown from nothing to be the dominant open web-mapping client, attracting the majority of the users and developers in this space.
OpenLayers was initially sponsored by MetaCarta who needed a browser based application to support their mapping services. Rather than focusing on features, MetaCarta focused much of their investment on infrastructure and community support. In particular their effort was spent answering developer and user questions on email and IRC, monitoring the quality of code contributions, and setting up automated testing. Many of MetaCarta’s engineers have developed a personal interest in OpenLayers which MetaCarta encourages by allowing the engineers to spend some work time on the project.
Today, OpenLayers has an incredibly active developer community that requires minimal support from MetaCarta and has provided functionality significantly greater than MetaCarta’s original scope. Key to the success of OpenLayers has been the long running, dedicated community support provided by Chris Schmidt from MetaCarta. GeoServer, another Open Source project, has recently introduced a similar community liaison role, dedicated to community support and marketing.
The role of Community Liaison has always been key to Open Source and is often filled by volunteer enthusiasts. However, commercial deployments of Open Source create a workload volunteers can’t maintain, and hence industry hires these volunteers instead. Ensuring that the community is supported in this fashion promotes the uptake of the project and increases the user base, which in turn attracts more sponsors and more developers. This leads to the situation where many developers are employed by a variety of sponsors to create new features and improve the performance and stability of the project.
There are a number of tasks and roles that need to be addressed in order to ensure a successful open source project. These are described below:
- Community Support
A person or team is required to answer user and developer questions, review submitted code from external developers to ensure quality control and ensure that all submissions meet the project requirements in terms of test coverage and documentation. This is one of the most effective investments in a project.
- General Project Processes
All projects should invest in tools and processes such as automated build systems, issue trackers, concurrent versioning systems as well as ensuring that releases are performed smoothly and regularly.
- Documentation
Good, current design and implementation documentation lowers the learning curve for developers supporting and extending software and greatly increases productivity. Good user documentation engenders confidence in project reviewers which in turn will lead to greater adoption.
- Marketing
While Open Source benefits significantly from community generated promotion, it is enhanced by prudent investment in web pages and presentations for targeted conferences.
- Commercial Support
One of the main reasons given for avoiding Open Source is not being able to call someone to fix problems. Offering commercial support for a project you use will go far in encouraging adoption by other organizations.
- Integrate and bundle with related software
Microsoft Office has been especially successful because it integrates a suite of related products and bundles them all together in one easy install. Open Source products improve their attractiveness in the same way.
- Open Standards
Due to the release-early/release-often approach of most open source projects, they are often leveraged to develop, test and extend open standards. This makes open source projects among the earliest adopters of emerging standards, encourages the uptake of open standards and makes the projects attractive to those interested in sharing data between agencies.
- Project Management
Just like proprietary software, a sponsor’s software development should be managed using standard software development processes. These include estimation; planning of resources, work activities, schedules, budgets and deliverables; and monitoring of schedule, quality, risk, issues, contractors and configuration management.
Measurement is a key tool used during proprietary Project Management, as good metrics enable good management decisions. Good measures highlight whether specific business goals are being met and enable management to alter their strategy early if issues arise.
Metrics are under-utilized in many open source projects as developers usually drive their own agendas, are self motivated, and spend less time on Project Management. However, metrics based decision making can be equally effective for Open Source projects, especially for sponsors who will need to answer to commercial milestones and targets.
Standard software development metrics should be complemented by measures to monitor the health of an Open Source community. The Community MapBuilder project tracks many of these metrics, which can be viewed at http://communitymapbuilder.org/display/MAP/Strategic+Direction#StrategicDirection-Metrics. There are now a number of dedicated tools which automate many of the common software metrics. Typical measures are discussed in the following sections.
Earned Value Management (EVM) provides an effective way to track progress over time and make adjustments to scope, schedule or resources as required. When employed, EVM measures three values: the planned value of the project as estimated in the original schedule, the earned value calculated from the percentage completion of all tasks at a given time, and the actual cost of the project at a given time as derived from time logs.
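As a rough illustration of how these three EVM values relate, the following minimal Python sketch computes them for a small task list, along with the variances and indices commonly derived from them. The `Task` fields, the task names and all figures are hypothetical and chosen only for illustration; they are not taken from the GeoNetwork project.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str                # hypothetical task label
    budget: float            # planned cost of the whole task
    planned_fraction: float  # fraction scheduled to be complete at the status date (0..1)
    actual_fraction: float   # fraction actually complete at the status date (0..1)
    actual_cost: float       # cost logged so far, e.g. from time logs


def earned_value_report(tasks):
    """Summarise planned value, earned value and actual cost for a task list."""
    pv = sum(t.budget * t.planned_fraction for t in tasks)  # planned value
    ev = sum(t.budget * t.actual_fraction for t in tasks)   # earned value
    ac = sum(t.actual_cost for t in tasks)                  # actual cost
    return {
        "planned_value": pv,
        "earned_value": ev,
        "actual_cost": ac,
        "schedule_variance": ev - pv,    # negative means behind schedule
        "cost_variance": ev - ac,        # negative means over budget
        "spi": ev / pv if pv else None,  # schedule performance index
        "cpi": ev / ac if ac else None,  # cost performance index
    }


# Illustrative figures only.
tasks = [
    Task("Task A", budget=100.0, planned_fraction=1.0, actual_fraction=1.0, actual_cost=120.0),
    Task("Task B", budget=200.0, planned_fraction=0.5, actual_fraction=0.25, actual_cost=60.0),
]
report = earned_value_report(tasks)
```

With these sample figures the planned value is 200, the earned value is 150 and the actual cost is 180, so the project is both behind schedule (schedule variance of -50) and over budget (cost variance of -30). A sponsor reviewing such a report at each milestone can adjust scope, schedule or resources before the shortfall grows.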
Milestones provide an easy means of determining whether deliverables are on time or not. By decomposing a project into a number of smaller milestones, stakeholders can monitor these deliverables to determine if the schedule is likely to be met and adjust their planning accordingly, before the expected completion of the project.
By using an issue tracking system, such as Trac or JIRA, to record, track and report on the progress of features and improvements, management is able to determine the progress of the project at a finer resolution than would be provided through milestones alone. This process also assists in the planning and prioritisation of features and encourages a flexible and agile development methodology.
The frequency of commits to the project's source code repository is a strong indicator of the activity experienced by the community. While a high level of activity could indicate anything from a pending code freeze prior to release, to the discovery of a large and pervasive security vulnerability, it does show that the community is responsive and the project is undergoing active development. Commits to documentation repositories, or changes to the project wiki, provide similar indications of community activity.
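One simple way to turn commit history into a trend is to count commits per week. The sketch below assumes the commit dates have already been exported from the repository log into a list; the function name and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date


def commits_per_week(commit_dates):
    """Count commits per ISO (year, week), sorted chronologically."""
    counts = Counter()
    for d in commit_dates:
        iso = d.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        counts[(iso[0], iso[1])] += 1
    return dict(sorted(counts.items()))


# Illustrative dates only, not real project history.
sample_commits = [
    date(2008, 6, 2),
    date(2008, 6, 3),
    date(2008, 6, 10),
    date(2008, 6, 12),
    date(2008, 6, 13),
]
weekly = commits_per_week(sample_commits)
```

Plotting or tabulating the weekly counts over several months makes it easier to distinguish a sustained level of activity from a short burst before a release.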
Contrary to intuition, a large number of bug reports usually indicates a healthy project. It is an indication that the community is actively identifying bugs and endeavouring to fix them. Many issue tracking applications can report on bugs opened and fixed over time, or relating to specific releases. Many bugs indicate a strong user community that is testing and reporting issues or feature requests to the project.
Many projects have a minimum requirement for the percentage of source code covered by automated tests that must be met before a new feature may be added to the project. The test coverage of a project or a specific module of the project is a strong indicator of the quality and stability of the source code. Projects with minimum requirements are indicating to the community that code quality and stability are more important than a long list of features that may or may not be stable.
Code reviews are audits of newly written or modified source code performed by a developer or developers other than those that are responsible for the code. The presence and availability of code reviews is indicative of a commitment of the project community to following good development processes. This is another indicator of the quality of the project.
The activity of the project email lists, IRC channels, forums and other public means of communication is the best indication of the health of the community. In order to promote the use of the project, this activity should be balanced between discussions on the direction of the project, questions from new users or developers and in particular answers from knowledgeable members of the community.
The number of downloads of binary releases, developer kits or source code provides an indication of the size of the user community. A large user community provides a large pool of people that may be interested in sponsoring additional development on the project, thus sharing the costs of the project.
Indicators such as web page hits and related blog entries provide a means of estimating the interest in the project. While downloads provide a good indicator of the current size of the community, these web metrics are more of an indicator of growing interest and awareness in the project, and provide a means of forecasting medium-term growth in the project.
The number of sponsors is a strong indicator of the size of the sponsored community. While this may sound obvious, open source projects suffer in this regard, since many sponsors will not advertise themselves as such. Instead they simply offer code patches, or hire existing developers within the community without informing the community at large. Unlike proprietary software projects, there is no centralized authority to track the number of licenses or authorised vendors/developers associated with the project, so the advertised number of sponsors will generally be a subset of the actual sponsors.
Assessing GeoNetwork Investment
Assessing ANZLIC’s Investment
Using the criteria described above, the following sections discuss the effectiveness of ANZLIC’s investment in GeoNetwork. Most of this section draws upon an email thread where suggested improvements to ANZLIC’s investment in GeoNetwork were discussed.
The issues discussed include:
- GeoNetwork provides most of the functionality required by the stakeholders, making it a good basis to start from.
- Software developers have noted the design and software are fair, but a number of improvements to the design, documentation and testing regime would greatly improve the extensibility and maintainability of the code-base.
- There is concern over the disconnect between sponsors’ and developers’ knowledge. The cost of feature development is understood by the developers for specific parts of the code but has not been communicated to the sponsors, limiting their ability to make effective decisions.
- The standard infrastructure and liaison costs associated with the project are being incurred by the developers and are neither visible nor acknowledged by the sponsors.
- To date there have been long delays in expected deliverables.
- There is concern that multiple forks of the GeoNetwork code base are being maintained, which will ultimately increase the costs of managing the project and keeping the forks up to date.
- The current release is still being labelled as a beta release, indicating that it is currently not ready for a production environment despite assurances that it is.
- Development progress is currently not being monitored against any schedule.
- The requirements and scope of ANZLIC’s investment are unclear, and no milestones have been established.
While this list is long and varied, these concerns can be largely addressed with three changes to the process:
- Employ software development project management techniques. Management of software development is a refined art with established processes which extend the usual management processes already established in government purchasing processes.
- Accurately assess the scope of the project and resource accordingly. The scope should include GeoNetwork infrastructure development and community support to ensure the long-term health and opportunity management of ANZLIC’s investment.
- Monitor the software development progress using techniques like Earned Value Management.
Australia and New Zealand already have a strong community of GeoNetwork developers, both within government agencies and commercially, that we can draw upon. These developers can be pooled together under a common project and project manager while remaining answerable to their respective organisations’ goals. The project will answer to the Aust/NZ steering committee.
There are two identified funding structures readily available for GeoNetwork, described in the following sections.
Programs like the National Collaborative Research Infrastructure Strategy (NCRIS) managed by Australia’s Commonwealth Scientific and Industrial Research Organisation (CSIRO) have long term funding to build a robust spatial data infrastructure and have substantial funding to apply to infrastructure – automated build and test suites, core design changes, documentation, community building etc.
Many of the other stakeholders have project specific requirements related to collecting and managing metadata within their organisation. These projects will typically focus on configuration and integration with existing systems.
A number of key roles have been identified that, once filled, would contribute significantly to the success of the project.
A role is created to manage ANZLIC’s software development in accordance with standard software development processes. These include accurately assessing and prioritising scope, monitoring progress (using standard techniques like EVM), reporting progress, liaising with stakeholders (including ANZLIC members, greater GeoNetwork community and standards bodies), and managing risks and opportunities.
Funding a community liaison officer to sit on email lists and IRC and answer developer and user questions is a very effective investment in engaging future developers and sponsors. This role is usually someone who has been involved in the project for a while and has a good understanding of the technology and people involved in the project.
One conclusion of community discussions was that ANZLIC’s investment in GeoNetwork is insufficiently resourced. Increasing the number of developers available to work on core infrastructure is essential in ensuring the project will remain stable and extendable as well as speeding the incorporation of missing functionality as needed. This has the added benefit of increasing the pool of developers available to perform or assist in large-scale deployments as the project nears completion.
The following tasks need to be addressed:
- Identify sponsors and their business drivers.
- Provide list of desired features.
- Review the state of the software, design, and infrastructure and recommend updates.
- Provide cost/benefit analysis
GeoNetwork Stakeholders
This section provides details of currently identified stakeholders. Please update or add the details of your organisation below.
Point of Contact: Kate Roberts
Bluenet aims to provide a virtual data centre to support long term curation and management of data for Australia's marine science researchers.
Bluenet has taken a lead role in extending GeoNetwork by sponsoring Simon Pigot to build the MEST extension to GeoNetwork.
Metadata Entry and Search Tool (MEST)
- Extends the latest GeoNetwork release (2.2)
- Adds ANZLIC profile
- Is labelled Beta, but is of a stable quality
- Managed by Bluenet
The BRS have hired LISAsoft to improve the User Interface for entering data to be user centric – putting the most important fields for the user first.
Point of Contact: Rob Woodcock
Business Drivers as described by Rob Woodcock
… For a number of years my team has been working with others towards the creation of an open standards based interoperable geoscience infrastructure for Australia. Collaboration with both Australian and International organisations resulted in the formation of the SEE Grid community, a number of testbeds (e.g. CGI interoperability experiments with GeoSciML, Minerals Council of Australia and Geological Surveys Geochemistry, ebXML registry and repository) and various information models and tools (e.g. ANZLIC ISO metadata profile, GeoSciML, OGC Observations and Measurements, GeoServer community schema support, Fullmoon and Hollow World GML application schema modelling tools). Most of these outcomes have completed their “testbed” phase and some are moving to ISO standardisation or broader uptake.
The reason I say this GeoNetwork discussion is timely is NCRIS has provided an opportunity to make the step change from testbeds and demonstrators to production grade services. To date many of the activities have been, as Cameron noted, “for the work being done, …under-resourced”. This is particularly true as a move from testbeds to production grade services requires considerable investment and appropriate staff to achieve quality assurance, branch management, help desk support, deployment, and so forth. It is a credit to the NCRIS process and the Auscope board and AeRIC, that this investment is actually being made (to the tune of nearly $10 million by mid 2011) and the strategic objective, in an open standards/source way, is to achieve production grade infrastructure for geospatial & geoscience information.
To this end, the NCRIS activities I am involved with (Auscope and SISS) are:
- Seeking feedback and engagement with the broader community on where best to target the available resources to achieve the production grade services infrastructure – fill in the gaps to production services and complement/support the existing activities. Flexibility and cooperation is a key ingredient
- Establishing a quality assurance framework around the Spatial Information Services stack including – packaging/installation, regression testing/unit-test suites
- Performing development on core open source technologies in the stack so they are interoperable, in sync with the open source community developments
- Establishing a maintenance and support environment including help desk, priority bug fixes in the Australian and New Zealand context, deployment assistance, training, sample deployments
- Developing features necessary to support the Australian and New Zealand geospatial communities – in particular those areas represented in NCRIS noting that is a very large group of Government and non-government organisations already.
- Seeking to facilitate/assist organisations and communities that might be able to sustain the stack beyond the lifetime of the NCRIS investments so that the organisations that deploy have a sustainable technology base – with my CSIRO hat on success is defined as my not having a job at the end of the activity!
On a more technical note, the SISS is currently based on the following open source technologies:
- GeoServer - with community schema extensions
- THREDDS, Hyrax
- Web Portals and Desktop clients – various samples are being made available particularly for training and regression testing purposes (e.g. Googlemap portal, uDig, sample java desktop clients)
- OGC standards
- GeoSciML standards for geoscience information
Due to our previous work we already have reasonably good links with the open source communities involved and broadly the Australian and New Zealand activities around GeoServer, Geospatial and Geoscience information standards and the Web Portal and Desktop clients. We are less well connected with the GeoNetwork community (something we are actively seeking to improve) though we have a strong involvement in registries, metadata standards and the ANZLIC profile.
Whilst I believe the strategic intent of these activities, our collaborations, and the investment level are capable of contributing to the broadly desired outcomes Bruce mentioned in his initial e-mail, the move to production services and actually having a large investment does create some additional challenges both in project management and the, more important, social interaction side of the community.
Flexibility and communication are clearly keys to achieving our shared objectives and I welcome any feedback or suggestions on how the activities and resources represented by the Auscope and SISS investments could serve the ongoing development of GeoNetwork, GeoServer and more broadly the spatial information services stack. We do have a plan to keep things moving but it is not set in stone and there is flexibility in the resourcing to “grease the wheels” so to speak to ensure the necessary gaps can be filled – you may just find we change the plan to resource the need.
Point of Contact: Peter Woodgate
The CRC-SI wishes to support the development of a robust Australian Spatial Data Infrastructure. This should be preceded by a National Strategy paper developed under the guidance of ANZLIC.
- By end of 2008
- National Strategy Policy
- Funding provided for a SDI
Point of Contact: Jeroen Ticheler, Geocat
Through a project with the European Space Agency the ebRIM model will be implemented in GeoNetwork. One of the prime goals of this is to improve the internal handling of metadata and make sure other interfaces and GeoNetwork as a whole benefit from some of the ebRIM advantages. The project runs until the end of 2008 and should be stable by the end of January 2009.
Organisations often use quantitative measures to review employee effectiveness. But beware the Heisenberg Uncertainty Principle of Human Measurement: “The act of measuring a human affects the quality of the metrics being collected”.
- January 2009
- Stable code included in GeoNetwork
Australian Spatial Data Directory (ASDD)
GA are migrating the ASDD over to using GeoNetwork and have resources allocated to this.
Point of Contact: Byron Cochrane.
Byron is using the trunk version of GeoNetwork and is developing an automated Spatial Metadata Extraction Tool (SMET) to be used for harvesting and validating metadata. As yet, he is unclear how to incorporate SMET into the GeoNetwork trunk.
There are a number of others in the GeoNetwork community investigating this problem.
Point of Contact: Jim McLeod
New Zealand mini SDI pilot
A consortium of New Zealand regional councils aims to set up a mini Spatial Data Infrastructure pilot to facilitate sharing of data.
- August 2008
- Councils meet to determine key requirements
Point of Contact: Ben Searle
OSDM have been taking a facilitating role for Australian GeoNetwork development, coordinating sponsors’ involvement. In particular, OSDM is sponsoring the migration of the Australian Spatial Data Directory (ASDD) to GeoNetwork. ASDD provides search interfaces to discover geospatial dataset descriptions (metadata) throughout Australia.
Point of Contact: George Percivall
The annual Open Web Services (OWS) testbeds provide international, practical testing of current and upcoming OGC standards, and cover many of the strategic objectives of Australian/New Zealand geospatial programs.
By aligning with OGC testbeds we gain:
- Development aligned with existing and future OGC standards, increasing the longevity of our solutions.
- Alignment with similar international programs
- In kind and/or financial contributions toward our projects.
- Access to world developments in this area.
The 2008 OWS6 testbed themes are:
- Sensor Web Enablement (SWE)
- Aviation Information
- Geoprocessing Workflow (GPW)
- Geo Decision-support Services (GDS)
- Compliance Testing (CITE)
- June 2008
- Release RFQ
- August 2008
- RFQ responses
- September 2008
- March 2009
Point of Contact: Cameron Shorter
OSGeo supports the development of the highest-quality open source geospatial software. The foundation's goal is to encourage the use and collaborative development of community-led projects.
The Australian/New Zealand Chapter of OSGeo will host the international conference for OSGeo, FOSS4G, in 2009 and the GeoNetwork success story will be an ideal showcase study.
- November 2009
- International conference for Open Source Geospatial Software, FOSS4G, in Sydney.
GeoNetwork Open Source community
Point of Contact: Jeroen Ticheler, GeoCat
GeoCat are strongly engaged with the European communities and are a good point for engaging and coordinating with projects like INSPIRE, as well as aligning with future roadmaps for GeoNetwork.