Thursday, 3 December 2020
State of The Good Docs Project
Wednesday, 2 December 2020
Halfway status - glossary pilot
This interim status report outlines achievements, early findings and outstanding tasks for our cross-organizational glossary pilot project.
Pilot Overview
Glossaries are easy to set up for simple examples but extremely hard to scale - especially when a project wants to inherit terms from other organizations. This pilot has been set up to test cross-domain management of glossaries. We started in August 2020, and plan to have tested pilot goals by March 2021.
We are testing glossary software, standards, and processes, and applying them to cross-organizational use cases within the geospatial mapping domain.
For more details, refer to our manifesto.
Pilot contributors: Cameron Shorter, Alyssa Rock, Ankita Tripathi, Naini Khajanchi, Ronald Tse, Reese Plews, Rob Atkinson, Nikos Lambrinos, Erik Stubkjær, Brandon Whitehead, Ilie Codrina Maria, Vaclav Petras
Current status
Our interim status as of the start of December 2020 is as follows:
Task | % Complete |
Define glossary goals | 90% |
Establish implementation Plan | 80% |
Establish a healthy community | 70% |
Implement/adopt software platform | 70% |
Establish schemas for terms | 70% |
Define sentence structure for terms | 70% |
Connect external glossaries | 10% |
Collate and clean Open Source Geospatial (OSGeo) terminology | 60% |
Document template governance processes | 10% |
Goals
Task: Define glossary goals.
Understanding the problem is the first step needed to then address it.
Figure: Connected glossaries, source
Status:
- We feel we have articulated the problem and associated use cases reasonably well. Refer to the Glossary Pilot Manifesto and accompanying presentations.
Implementation plan
Task: Establish implementation Plan
Status:
- Within the Glossary Pilot Manifesto we articulated steps for standing up a cross-organizational glossary pilot. We have been steadily working against this plan.
Community
Task: Establish a healthy community:
Apache, one of the leading open source foundations, prioritizes “community over code”. A strong community will solve any technical challenges faced.
Status:
- We have attracted a motivated, competent, and cross-functional team of 5 to 10 people (depending on how you count), who are steadily working through our backlog of tasks. Collectively we have decades of experience with glossaries, tech writing, standards, software, and the geospatial domain we are initially focusing on.
- We have a weekly status meeting, sometimes complemented by additional meetings, along with a Slack channel and an email list.
Outstanding:
- We have only attracted one of the many OSGeo open source projects to sign up as a pilot. This is likely because we haven’t made the signup process easy enough yet, and our tools and processes need improving.
- After completing the pilot and releasing an alpha version, we’d want to scale our community into other domains (beyond our current spatial domain focus).
Glossarist platform
Task: Implement/adopt software platform
Status:
- We’ve adopted the open source Glossarist software to manage terminology. It provides management of terms, and publishing of terms via a standards-based web service.
- Ribose, which develops this software, has been working with us to update the software to address the use cases and feedback we are finding during testing.
- Extra functionality is expected to be included during the remainder of this pilot.
Schema
Task: Establish schemas for terms
Figure: Glossary schema, sourcing from upstream glossaries, source
Status:
- We’ve adopted the Simple Knowledge Organisation System (SKOS) standard for selecting the fields to use when describing terminology.
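As a rough illustration of what a SKOS-aligned term record could look like, the sketch below models one entry and serializes it as Turtle. The `Concept` class, the `to_turtle` helper, and the `example.org` URI are hypothetical names chosen for this example, not the pilot's actual schema or the Glossarist data model. Only the SKOS property names (`prefLabel`, `altLabel`, `definition`, `exactMatch`) come from the standard.

```python
from dataclasses import dataclass, field

SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


@dataclass
class Concept:
    """One glossary entry, with fields named after SKOS properties."""
    uri: str                      # hypothetical identifier for the concept
    pref_label: str               # skos:prefLabel, the preferred term
    definition: str               # skos:definition
    alt_labels: list[str] = field(default_factory=list)   # skos:altLabel
    exact_match: list[str] = field(default_factory=list)  # skos:exactMatch (upstream concepts)


def to_turtle(concept: Concept) -> str:
    """Serialize a concept as a small Turtle document.

    Assumes labels and definitions contain no double quotes or backslashes;
    a real exporter would need to escape them.
    """
    lines = [
        SKOS_PREFIX,
        "",
        f"<{concept.uri}> a skos:Concept ;",
        f'    skos:prefLabel "{concept.pref_label}"@en ;',
    ]
    for label in concept.alt_labels:
        lines.append(f'    skos:altLabel "{label}"@en ;')
    for target in concept.exact_match:
        lines.append(f"    skos:exactMatch <{target}> ;")
    lines.append(f'    skos:definition "{concept.definition}"@en .')
    return "\n".join(lines)


die = Concept(
    uri="https://example.org/glossary/die",  # placeholder URI
    pref_label="die",
    definition="metal block with a shaped orifice through which plastic material is extruded",
)
print(to_turtle(die))
```

Keeping the record close to SKOS property names means an export like this could be read by other SKOS-aware tools, which is the interoperability the schema task is aiming for.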
Sentence structure
Task: Define sentence structure for terms.
Status:
- For sentence structure, we’ve adopted 16.5.6 Definitions, ISO/IEC Directives, Part 2, Principles and rules for the structure and drafting of ISO and IEC documents - “The definition shall be written in such a form that it can replace the term in its context. …”
- Example: 2.1.17 die: metal block with a shaped orifice through which plastic material is extruded
- While aligning with uses in specific spatial ISO settings, many glossaries use alternative sentence structures, and this is something I’m expecting we’ll need to revisit.
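The substitution rule can be checked mechanically. The sketch below swaps a term for its definition in a sentence so a reviewer can read the result and judge whether it still makes sense. The `substitute_definition` function and the sample sentence are invented for illustration; only the "die" definition comes from the example above.

```python
import re


def substitute_definition(sentence: str, term: str, definition: str) -> str:
    """Replace whole-word occurrences of `term` with its definition."""
    pattern = re.compile(rf"\b{re.escape(term)}\b")
    # A function is used as the replacement so that any backslashes in the
    # definition are inserted literally rather than treated as escapes.
    return pattern.sub(lambda _match: definition, sentence)


definition = "metal block with a shaped orifice through which plastic material is extruded"
sentence = "The die is cleaned after each run."  # hypothetical context sentence

print(substitute_definition(sentence, "die", definition))
```

A definition written as a full sentence ("A die is a metal block ...") would produce an ungrammatical result here, which is how this check surfaces entries that do not follow the adopted structure.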
Connect glossaries
Task: Connect external glossaries
Status:
- In theory, we are very close to connecting two glossaries. In practice, we still need to set this up, which is a focus for the rest of this pilot.
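To make the idea of connected glossaries concrete, here is a minimal sketch of a project glossary layered over an upstream one. It assumes local definitions take precedence over inherited ones, which is one possible policy rather than a decision the pilot has made. The glossary contents and the `source_of` helper are placeholders.

```python
from collections import ChainMap

# Placeholder glossaries; the definitions are stand-ins, not real entries.
upstream = {
    "layer": "upstream definition of layer",
    "feature": "upstream definition of feature",
}
project = {
    "layer": "project-specific definition of layer",
}

# ChainMap searches its maps in order, so project terms shadow upstream ones.
combined = ChainMap(project, upstream)


def source_of(term, layers):
    """Return the name of the first glossary that defines `term`, or None."""
    for name, glossary in layers:
        if term in glossary:
            return name
    return None


layers = [("project", project), ("upstream", upstream)]
print(combined["layer"], "<-", source_of("layer", layers))
print(combined["feature"], "<-", source_of("feature", layers))
```

Tracking which layer supplied a definition is what lets a reader or maintainer follow a term back to its upstream source.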
Governance
Task: Document template governance processes
Figure: Glossary governance, source
Status:
- While we have been discussing and making use of our own unwritten governance process, we are yet to write this down and provide it as template guidance.
Spatial use case
Task: Collate and clean Open Source Geospatial (OSGeo) terminology
Status:
- We’ve collated OSGeo terminology from around ten OSGeo glossaries along with OGC and ISO terms.
- These terms have been aligned with writing rules.
- We’ve just started looking at terms from the GRASS project, and plan to integrate these too.
Tuesday, 22 September 2020
Tech Writing Patterns and Anti-patterns
Friday, 18 September 2020
Awards for open source tech writers Jo and Jared
Jo Cook and Jared Morgan have been presented with awards for Google's Open Source Peer Bonus Program. The award is a recognition and thank you to people who go above and beyond in their contributions to open source. It also includes a token financial contribution - enough to take the family out for dinner at a nice restaurant.
Well done Jo and Jared, you really deserve it:
Jared Morgan, Write the Docs podcast host
Jared Morgan
Jared is a core contributor and community builder within The Good Docs Project. As an experienced technical writer, he has contributed to many of our initial set of writing templates and then helped absorb feedback from our community. He is well respected and well connected within the technical writing community, helping to inspire other thought leading technical writers to come and join us.
This year, 2020, he has signed up as a Season of Docs mentor for The Good Docs Project.
In a related activity, he has also helped spread knowledge within the technical writing community, by co-hosting the Write the Docs podcast.
Jo Cook, explaining docs at DevRel conference
Jo Cook
Jo is an enabler of open source communities. She commits large chunks of her volunteer time to working on the hard problems that others don't tackle. She is someone you can rely upon when needed, and she steps back when her skill-sets are more valuable elsewhere.
A few highlights of her volunteer activities over the last couple of decades include:
- Helping set up The Good Docs Project, doing much of the grunt work in setting up open source processes.
- Mentoring a Season of Docs tech writer and other community contributors for the GeoNetwork project.
- Presenting at numerous conferences on topics of open source, documentation, and geospatial.
- Playing lead roles in setting up conferences for the Open Source Geospatial foundation.
- Building the Portable GIS distribution of Open Source Geospatial software.
- Serving on the board of the international Open Source Geospatial foundation (OSGeo).
- and more ...
Monday, 17 August 2020
Cross-domain management of glossaries
Image by Chris Dlugosz
Glossaries are easy to set up for simple examples but very hard to scale - especially when you try to scale across use cases, across domains and across organisations.
We are kicking off a pilot project to address cross-domain management of glossaries and preferred word lists. The pilot will build processes and tools for the generic use case, developing and applying them within the geospatial mapping domain.
Communication about this document will be on the OSGeo Lexicon email list. (Check your spam for confirmation email after subscribing.)
Goals
We aim to achieve the following:
- Create best practices in cross-domain glossary management, for adoption by the technical writing community.
- Build a community who define and manage geospatial terms across multiple geospatial communities.
Use cases
- As a general document reader, I want to find definitions for the terms and acronyms in the document I am reading. There may be multiple definitions for a term, determined by context, or having multiple upstream source definitions.
- Ideally identified terms can be highlighted, and readers can hover over terms to get a popup with more information.
- As an advanced document reader or term maintainer I want to understand the inheritance path back to upstream source definitions.
- As a technical writer, I want to find the preferred spelling, capitalization and word choice for a term.
- As a technical writer, I want a tool which builds a glossary of the organisation specific terms used within my document, by searching my organisation’s glossary.
- As a minimum, this will search a document for acronyms, and match against provided glossaries.
- Missing or duplicate terms will be flagged.
- Existing terms will be highlighted according to [preferred, alternative, deprecated] status.
- As an organisation, I’d like a tool which seeds my organisation specific glossary by searching my organisation’s docs for terms in a superset of reference glossaries.
- As a document translator, I want glossary terms to be translated into my target languages, so I can consistently translate a source term to the same target term.
- As a project, I want a glossary which includes terms specific to my project, as well as terms sourced from multiple external glossaries.
- As a foundation, I want a glossary which sources terms from my many sub-communities.
- As a glossary owner, I want to ensure my glossary is continuously updated to align with updates in my source glossaries.
- As a glossary owner, I want a governance framework to help resolve terminology management conflicts between terminology sources and stakeholders.
- As a software developer, I want terms and relationships between glossaries in a machine-readable form so that I can integrate glossary functionality into software.
- As a data modeller, I want the terms described in my model to use existing term definitions, (from APIs, standards, etc), as defined within my domain, so that I can share my terms with others, and seamlessly integrate datasets from multiple sources.
- As an application using a glossary, I want terms defined with a consistent schema which facilitates machine readability and interoperability.
- As a researcher, I want to be able to find related information even if it uses different terms for the same concepts.
- As a content manager, I want to apply a preferred term as metadata to enable retrieval of digital content across disparate source repositories.
- As a search engine or software algorithm building knowledge graphs, I want to use glossaries to help extract meaning from textual information sources.
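The technical writer use case above (search a document for acronyms, match them against provided glossaries, and flag what is missing) can be sketched in a few lines. This is only an illustration under simplifying assumptions: an acronym is taken to be any run of two or more capital letters, and the glossary entries and their statuses are invented for the example.

```python
import re

# Hypothetical glossary: acronym -> (expansion, status).
# The statuses are chosen only to illustrate the three categories.
GLOSSARY = {
    "OGC": ("Open Geospatial Consortium", "preferred"),
    "CMS": ("Content Management System", "alternative"),
}

# ASSUMPTION: an acronym is a standalone run of two or more capital letters.
ACRONYM = re.compile(r"\b[A-Z]{2,}\b")


def scan(text, glossary):
    """Return (matched, missing) acronyms found in `text`.

    `matched` maps each known acronym to its glossary entry;
    `missing` lists acronyms that no glossary entry covers.
    """
    found = sorted(set(ACRONYM.findall(text)))
    matched = {a: glossary[a] for a in found if a in glossary}
    missing = [a for a in found if a not in glossary]
    return matched, missing


text = "OGC and ISO publish terms; a CMS can display them."
matched, missing = scan(text, GLOSSARY)
for acronym, (expansion, status) in matched.items():
    print(f"{acronym}: {expansion} [{status}]")
print("Missing from glossary:", missing)
```

A fuller tool would also need to handle multi-word terms, duplicate definitions, and mixed-case names such as OSGeo, which this pattern deliberately ignores.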
Scope
The pilot project will focus on the geospatial domain, which has an advanced ecosystem of stakeholders and technologies. However, the templates and processes will be developed to be broadly applicable for all software ecosystems managing glossaries.
Schedule
- August 2020: Kickoff
- December 2020: Soft launch at the Write the Docs Australia/India conference
- February 2021: Hard launch
Deliverables
The following deliverables are likely to be achieved within the pilot’s timeframe:
Schemas
- Schema for a preferred word list. For example: Word list | Google developer documentation style guide.
- Schema for a term definition. For example: ISO/IEC Directives, Part 2, Principles and rules for the structure and drafting of ISO/IEC documents - 16.5.5 Term and 16.5.6 Definitions.
- Schema for translation of terms between languages
Processes
- Process for a central glossary owner to accept, triage, and refine proposals for new terms from their community:
- As single terms.
- As a curated set such as a word-list from an authoritative source.
- Process for deciding the point of truth for a term within a glossary, be that from a central glossary or from a leaf glossary.
- Process for handling multiple definitions of a term, which may differ across contexts.
- Process for notifying updates to a glossary.
- Process for managing the lifecycle of terms.
- Version management of individual terms.
- Version management of schemas.
- Version management of a glossary as a whole.
- Version management of processes.
- Version management of tooling.
Guides
- How-to guide for a project to set up their own glossary by selecting a subset of master glossaries plus adding their own terms.
- How-to guide for setting up a master glossary.
- How to handle aggregation and overlaps between sets of terms and mappings between similar terms used in different contexts.
Tools
- Tool for managing glossary terms through a standards based lifecycle such as OGC’s statuses:
- Submitted
- Accepted
- Experimental
- Valid
- Superseded
- Deprecated
- ...
- Tools for publishing a glossary in human readable form,
- Including key relationship visualisation and navigation support.
- Tool for publishing a glossary in machine readable form,
- Facilitating features such as mouse-over popups.
- Tools for exporting from a master glossary into other publishing mediums, such as a page within another website or Content Management System (CMS).
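A status-based lifecycle like the one listed above could be enforced with a small state machine. In the sketch below, the status names come from the list, but the allowed transitions are our own illustrative guess and are not taken from OGC policy. The `Status` enum, `ALLOWED` table, and `advance` function are hypothetical names.

```python
from enum import Enum


class Status(Enum):
    SUBMITTED = "submitted"
    ACCEPTED = "accepted"
    EXPERIMENTAL = "experimental"
    VALID = "valid"
    SUPERSEDED = "superseded"
    DEPRECATED = "deprecated"


# ASSUMPTION: these transitions are illustrative only; a real tool would
# take them from the governing body's published rules.
ALLOWED = {
    Status.SUBMITTED: {Status.ACCEPTED},
    Status.ACCEPTED: {Status.EXPERIMENTAL, Status.VALID},
    Status.EXPERIMENTAL: {Status.VALID, Status.DEPRECATED},
    Status.VALID: {Status.SUPERSEDED, Status.DEPRECATED},
    Status.SUPERSEDED: set(),
    Status.DEPRECATED: set(),
}


def advance(current: Status, target: Status) -> Status:
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move a term from {current.value} to {target.value}")
    return target


status = Status.SUBMITTED
status = advance(status, Status.ACCEPTED)
status = advance(status, Status.VALID)
print(status.value)
```

Keeping the transition table as data rather than code means a glossary owner could adjust the lifecycle without touching the tool itself.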
Geospatial domain deliverables
- Term management committees:
- OSGeo Lexicon committee managing OSGeo terms.
- OGC Naming Authority committee managing policies and final publication for OGC geospatial standards terms.
- ISO TC211 committee managing ISO geospatial standard terms.
- [Optional] Specific software project committees managing their project’s terms.
- Definitions for OSGeo terms
- Translations of OSGeo terms
- Glossary Web Services:
- OSGeo Foundation Glossary Web Service.
- OGC Glossary Web Service.
- ISO TC211 Glossary Web Service.
- [Optional] Multiple OSGeo projects standing up their own glossary instance. For instance, set up one for QGIS and/or PostGIS.
- Publishing glossaries:
- Publishing of Glossary within Geoforall Newsletter
- Publishing of Glossary within OSGeoLive documentation
Why now? Why Geospatial?
As of August 2020, many things are lining up to enable us to collectively solve the tough challenges around the cross-domain management of glossaries.
Aligning activities include:
- The Good Docs Project has been making progress tackling technical writing problems. We have recently built a How to apply/customise a writing style guide for software projects. A next step is to explain how to apply word lists and glossaries. And we have volunteers willing to push this forward.
- The geospatial community is very advanced at trying to solve terminology management challenges:
- Through the OSGeo Foundation, we have relationships with ~ 50 geospatial open source projects who all need glossaries, and through the OSGeoLive project we have contact points with each of these projects as well as access to volunteer translators for OSGeo documentation. In the 2019 Season of Docs program we connected with all these communities and updated their quickstarts. We can do it again for glossaries.
- We have experienced volunteers from the OGC and ISO TC211 standards bodies keen to bring their expertise to advance this challenge. These volunteers are already working on this problem.
- From the ISO TC211 and OGC communities, we have access to open source software for term management and access to the people who wrote it.
- Through the Geolexicon working group, we have OSGeo volunteers who have been maintaining a glossary of terms. They will be able to apply these terms and add more.
- The Good Docs Project is starting a sprint of work, aligned with Google's Season of Docs. We are shooting for a soft launch in December 2020, hard launch around February 2021. This helps frame a sense of purpose, timing, and scope which we can tap into.
- There are other initiatives within The Good Docs Project which will complement this work and facilitate cross-pollination of ideas.
- ISO/TC 211/TMG is redeveloping its Multi-Lingual Glossary of Terms (MLGT) as an ISO SMART project for machine-readable/interpretable terminology, covering everything from lifecycle management to the usage of such content.
- The ISO/TC 211 MLGT SMART work is performed in partnership with Ribose, which supplies the Glossarist software and the Geolexica terminology web platform. Ribose is volunteering to support the OSGeo lexicon work and its workflow in both of these offerings.