from Hacker News

Ask HN: How do you organize software documentation at work?

by jilles on 2/14/24, 2:37 PM with 89 comments

Hi folks,

Recently I have ventured into technical writing. At the company I work for, documentation is scattered around ~4 different tools.

1. Google Docs
2. Confluence
3. GitHub (READMEs)
4. Slack

Each of those serves a purpose, of course: Google Docs are very collaborative, Confluence is our source of truth, GitHub is mostly for engineering, and finally Slack usually has some threads you can find if you run into certain issues.

I am not suggesting we should put all of this into a single tool, but I am wondering if there is a methodology for organizing documentation. I am aware of Diataxis, and want us to use this for certain services / products. What I am looking for in this Ask HN post, though, is an overarching methodology for organizing all documentation.

by softwaredoug on 2/14/24, 2:45 PM
I hate, hate software documentation as a concept. It gets out of date and is hard to use. It's a last resort, only for specialized cases.
I prefer two types of documentation:
1. Executable documentation - tests, asserts, even things like Jupyter notebooks that can be tested and executed
2. Timestamped documentation - documentation that has a clear date on it of when it was valid. So the reader has an expectation "This was true at X date, but may not be true now". This includes detailed pull requests and git commit messages.
https://softwaredoug.com/blog/2023/10/13/fight-undead-docume...
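A minimal sketch of how the "timestamped" idea could be checked mechanically, in Python. The `Last verified:` marker line is a hypothetical convention invented for this example, not something the comment or the linked post prescribes; any real setup would pick its own marker and threshold.

```python
import re
from datetime import date

# Hypothetical convention: each document carries a line such as
# "Last verified: 2024-02-14". The marker text is an assumption.
STAMP = re.compile(r"^Last verified:\s*(\d{4})-(\d{2})-(\d{2})\s*$", re.MULTILINE)


def verified_on(text):
    """Return the date in the 'Last verified' line, or None if there is none."""
    match = STAMP.search(text)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def age_in_days(text, today):
    """Days since the document was last verified, or None if it is unstamped."""
    stamped = verified_on(text)
    return None if stamped is None else (today - stamped).days


sample = "How to rotate the API key\nLast verified: 2023-10-13\n\nStep 1 ..."
print(age_in_days(sample, date(2024, 2, 14)))
```

A reader (or a CI job) can then treat anything unstamped or older than some agreed age as "may not be true now".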
by hiAndrewQuinn on 2/15/24, 8:33 AM
99% of people on the Internet are lurkers, and only 1% actually contribute anything ever. By defaulting to action, you can quickly end up wielding a disproportionate amount of influence on the resulting culture of your org. So, not a methodology, but an algorithm I often follow:
If I need to do a thing, and I don't know how to do it, I search Confluence for the most obvious sequence of words I can think of that is vaguely like my problem. I do this maybe 3 to 5 times.
If I find something, I open it in edit mode and start reading through it. The instant I hit upon anything not obvious to me, I add whatever obvious thing is missing.
If I don't find anything in there, I create a page in the Diataxis format (usually a HOWTO) and write it myself. I use short sentences, plenty of screenshots, and plenty of code blocks, to make it as copy-and-paste friendly as possible.
I never ask just how basic this thing actually is - most of my most viewed articles in any organization turn out to be the most basic ones. "How to make a network drive in Windows." "How to set up your Git credentials." These are very often much more popular than "How to build a custom VM image with QEMU and Ansible." I take my own confusion as an existence proof that this is obscure enough to confuse one generally competent but non-expert person, and take faith that most people in my org are not experts in most things.
I trust other people to be able to look at the timestamps and the history of the docs and to figure out whether what they're reading is too outdated to be useful. I pretend, despite evidence to the contrary, that other people will follow roughly the same algorithm as me, and read pages and make updates on the fly as they work. If they don't, well, that's them ceding their cultural power, which they probably don't want anyway (and that is entirely fair).
by coldpie on 2/14/24, 2:49 PM
Never use a wiki for anything. Wikis are the number one worst form of documentation. They are worse than no documentation. Wikis explicitly destroy the concept of ownership and responsibility, and without those, what you get is a big pile of outdated, unorganized trash that no one maintains. Destroy wikis.
by simonw on 2/14/24, 2:59 PM
At a previous employer people constantly complained about the lack of documentation. Once I started digging in I realized we actually had LOADS of documentation, but it was spread across (genuinely) 11 different systems!
I spun up a search engine that covered as many of those systems as possible (just SQLite FTS with Datasette, cron tasks that indexed various things and a simple custom search UI) and it helped a lot, because people at least had a fighting chance of finding stuff.
I believe there are off-the-shelf solutions for this kind of thing now, though I don't have experience with any of them myself.
I've since recreated aspects of the search system I built there as https://github.com/dogsheep/beta - you can see a working example of that system on the Datasette site here: https://datasette.io/-/beta?q=geojson
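For readers who have not used SQLite full-text search, here is a rough sketch of the core idea using only Python's standard `sqlite3` module. It is not the code behind the system described above: the table layout, column names and sample rows are all made up for illustration, and it assumes the local SQLite build includes the FTS5 extension (most Python distributions do, but it is not guaranteed).

```python
import sqlite3

# In-memory database for the sketch; a real index would live in a file
# that scheduled jobs refresh.
conn = sqlite3.connect(":memory:")

# FTS5 virtual table. "source" records which system a document came from.
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(source, title, body)")

# Hypothetical rows standing in for content pulled from several systems.
conn.executemany(
    "INSERT INTO docs (source, title, body) VALUES (?, ?, ?)",
    [
        ("confluence", "Deploy runbook", "Steps to deploy the billing service"),
        ("github", "README", "How to build and test the billing service locally"),
        ("gdocs", "Offsite notes", "Agenda and lunch options"),
    ],
)


def search(query):
    """Return (source, title) pairs for matching documents, best match first."""
    rows = conn.execute(
        "SELECT source, title FROM docs WHERE docs MATCH ? ORDER BY rank",
        (query,),
    )
    return rows.fetchall()


print(search("billing"))
```

The point is that one small index over many systems gives a single place to search, even when the documents themselves stay where they are.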
by YeahThisIsMe on 2/14/24, 4:47 PM
We add a new system of doing it every couple of months to years and then don't migrate everything over from the old system, so they all still see changes.
Documents in a file system, Confluence, a Wiki, docs in project repositories and a special documentation repo.
by PeterisP on 2/14/24, 3:00 PM
Our projects have a certain need for documentation that describes the exact meaning of various data fields in files/APIs/DB entries, etc.
We have decided that the best place for the "single source of truth" for that is right next to the appropriate code in git, with the various build/deployment scripts ensuring that copies (explicitly unmaintained, unmaintainable, read-only) of that get packaged with the actual systems, with the packaged libraries, linked in their web backends, etc. We don't care much about the format of the document, whatever fits the particular needs best - e.g. sometimes it's markdown, sometimes it's Excel.
The key factors here are to ensure that (a) there's a single source of truth; (b) you can have the same atomic commit/pullrequest/whatever altering both the system and the documentation at the same time; (c) every artifact carries the appropriate version of the documentation, so instead of going to some internal site or document which might have a different, newer version, you know what is supposed to be true for the release that is actually on this particular server.
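As an illustration of point (c), a build step along these lines could copy the docs into each artifact and stamp the copy with the release it belongs to. This is only a sketch in Python: the function name, the `SNAPSHOT.txt` file and the directory names are hypothetical, not the scripts described above.

```python
import shutil
import tempfile
from pathlib import Path


def package_docs(docs_dir, artifact_dir, version):
    """Copy docs into the artifact and stamp the copy with the release version."""
    target = Path(artifact_dir) / "docs"
    shutil.copytree(docs_dir, target)
    # Mark the copy as a read-only snapshot so nobody edits it in place.
    (target / "SNAPSHOT.txt").write_text(
        f"Documentation snapshot for release {version}.\n"
        "Do not edit: the source of truth lives next to the code in git.\n"
    )
    return target


# Demo in a throwaway directory so the sketch can be run as-is.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src_docs"
    src.mkdir()
    (src / "fields.md").write_text("# Field meanings\n")
    out = Path(tmp) / "build"
    out.mkdir()
    packaged = package_docs(src, out, "1.4.2")
    print(sorted(p.name for p in packaged.iterdir()))
```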
by nwsm on 2/14/24, 3:11 PM
We use Confluence and markdown files in GitHub. I think we are moving a lot of our docs to Backstage [0] soon.
One process that ends up being really valuable for documentation purposes is our "Architecture Review Documents". This is a standard document that team leads fill out before starting work on a new Saga/Epic/Feature/whatever. It includes the scope and business value of a new feature or large block of work, high level technical architecture of implementation, the impact on existing database schemas and service APIs, etc. This document is presented in a meeting with technical leadership in our organization who deep dive on the topic and explore potential pitfalls in the plan.
The document and recording of that meeting live on forever, and this information is very useful when getting acquainted with a certain part of our product/codebase. You are able to read and hear clearly the intention of a certain service or module, and you can identify several relevant points of contact to ask questions to.
[0] https://backstage.io/
by mrmb on 2/14/24, 3:27 PM
Location: Separate Documentation repository in a GitLab group project. That way you have history, notes, membership, ownership, and processes (templates, merge permissions, reviewers)! This can also be included as a submodule in your software repo. And can be used to read/edit by devs in their chosen IDE, or R/W on GitLab via html, or can generate static html documentation via pipelines. Oh, and you can build pipelines to generate html outputs for different use cases, and perform checks, run scripts, etc.
Framework: Choose a framework, like Arc42 for general layout as a good starting point. Remember you have company, quality, project, program, product, process, user, internal how-to's, etc. documentation types not just... user-guides and systems-architecture..so this will be dependent on your org/product. Go for MVP and 80-20.
Plan: Write a plan as part of the documentation that details all of these facets and rules so all contributors understand it.
Formats: drawio.svg for complex diagrams, mermaid for simple diagrams or if important to change manage like code, asciidoc for complex documents, markdown for most/simpler docs. Tables in csv or asciidoc. Images in svg, or png. Everything aforementioned renders on GitLab.
Other rules: Automate everything possible to reduce [manual] documentation. Use text/code vs proprietary formats.
You can get more and more complex with the tech writers, dev, ops, systems, all under one roof and coordinating documentation and pipeline scripting.
by bmitc on 2/14/24, 3:22 PM
I generally like using a combination of the following:
* Wikis for general information, environment setups that are not project specific, etc.
* Repositories to host code and system specific information, usually in Markdown documents.
* Google Docs or Microsoft 365 for working documents that need to be collaborated on, commented on, and shared without the rigmarole of pull requests and the more static nature of wikis.
* Slack is for ephemeral information. If it contains documentation, specification, FAQ, debug steps, process explanations, etc., those should be captured and moved to the appropriate documentation location.
The one thing I really struggle with is diagrams. Cloud-based diagram tools like Visio and Lucidchart are great, but they are tough to save in a good location outside of the cloud environment. It requires exporting the file and/or a PDF export. Then, these fit rather poorly into source-code control. There is the concept of "diagrams as code", but all of those systems are generally terrible at layout. There really is no good solution, as there are major trade-offs to both.
by perrygeo on 2/14/24, 5:55 PM
For technical docs, I always advocate for documenting in source-controlled markdown. For all the same reasons we source-control our code. It's the bare minimum requirement for quality control of professional software work; no one would take a programmer seriously if they refused to put their code into source control and insisted on pasting code snippets around in dozens of various tools and live-patching prod! Yet we do this with docs all the time. It's no surprise that we struggle with doc quality since it's treated as second class to code.
If you want quality docs, we have obvious tools for that. Treat it like code. If you want to keep pasting random thoughts around and calling it "documentation", don't act surprised about the dismal state of your wikis.
by dewey on 2/14/24, 3:00 PM
Code comments + Slack. GitHub and Slack search are great if you know how to use them and I have yet to find something I couldn’t answer with these tools.
People like to say “Slack isn’t documentation” but in reality it’s a better documentation than some outdated Wiki nobody is touching.
by fuzzfactor on 2/14/24, 3:47 PM
Organize? Software? Documentation? At work?
Don't make me laugh.
by torblerone on 2/14/24, 2:53 PM
I have yet to see a good documentation strategy. We have quite some awareness about the problem but I don't see an easy exit.
Currently, we have a similar dumpsterfire running. Project / guideline / bla documentation hanging around in Confluence, technical documentation snippets in git repositories "near" the code they belong, some folks scourging themselves with Sharepoint and there's no solution in sight.
Me and some colleagues have developed quite some tendencies against natural language documentation because it basically becomes stale as soon as you publish it into your org.
by wduquette on 2/14/24, 4:03 PM
I'm responsible for a number of Java products. I try to provide high-quality Javadoc for all public library interfaces, library user's guides where appropriate, and development guides for applications. The latter two take the form of MDBook documents (https://rust-lang.github.io/mdBook/), with the document source living in the GitHub repo so that it's tied to the particular software release in a natural way.
by cosmic_quanta on 2/14/24, 2:58 PM
My workplace uses Confluence.
I hate it for a very simple reason: the code (in BitBucket) and the documentation are disconnected.
I want my code and documentation to be coherent with each other. For small open-source projects (e.g. https://github.com/LaurentRDC/javelin), I love using doctests which ensure some level of coherence between documentation and code.
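The linked project is not Python, but the same idea can be sketched with Python's standard `doctest` module: the examples in the docstring are the documentation, and running them as tests catches drift between the two. The `slugify` function here is a made-up example, not taken from any project mentioned in the thread.

```python
import doctest


def slugify(title):
    """Turn a page title into a lowercase, hyphen-separated slug.

    >>> slugify("How to set up your Git credentials")
    'how-to-set-up-your-git-credentials'
    >>> slugify("  Extra   spaces  ")
    'extra-spaces'
    """
    return "-".join(title.lower().split())


if __name__ == "__main__":
    # Reports a failure if the examples in the docstring drift from the code.
    doctest.testmod()
```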
by gravypod on 2/14/24, 3:05 PM
(Opinions are my own)
I use (and work on) this: https://www.usenix.org/conference/srecon16europe/program/pre...
Basically docs live next to code or in a team-owned folder if there is no code. Code review happens whenever you change docs.
by paddy_m on 2/14/24, 3:53 PM
Who is the audience for your documentation?
If it's nontechnical internal, I'd lean towards Confluence more. If it's technical external, target Read the Docs or the JS equivalent. If you have a venture-funded startup, the polish expectation is higher, so maybe some type of built website. All of this should run through CI.
Google docs are too loose for my taste to serve as documentation, they are an 80% effort, good for collaboration with non-technical stakeholders, good for live writing but that shouldn't be the end artifact. Slack is also no place for documentation.
--
Here is what I'm targeting for my open source project targeted at technical users
Tutorials are used to walk users through using a project. I frequently use Jupyter notebooks for this and record a video walking through the notebook. The markdown portions are rough talking notes for my narration. The video ends up as a dead artifact, but some people learn better that way. The video is also a lower effort way for people to check out your project. [2] I try not to let perfect be the enemy of good for the videos especially.
I try to incorporate documentation into the development process. Many times I will start documenting a feature and realize it includes too many caveats, then I will redesign the feature so it's easier to document. Often this means that the tutorial comes first and is the only part built.
For API documentation ideally I will have a gallery that renders well, with executable examples that walk through options. Hardcoded small examples are key (avoid faker libraries and excessive scaffolding). React-edit-list has one of the best examples of this I have ever seen [1]
I like to write narrative documentation and sometimes link to the related PRs. The PRs should include the "Why" of the design decisions in their description. Narrative documentation should connect the "what" of API docs. Narrative documentation should also highlight recommended usage patterns.
[1] https://mmomtchev.github.io/react-edit-list/#/simple
[2] https://www.youtube.com/watch?v=GPl6_9n31NE A walk through of how to extend a Jupyter notebook widget I wrote.
by stcroixx on 2/14/24, 2:53 PM
HTML excels at this. Also has the advantage of not depending on a third party with all the compromises that entails.
by Jeremy1026 on 2/14/24, 6:08 PM
Your company sounds very similar to everything I've ever dealt with. A mixture of public (API docs) and private (google docs, confluence, floating in random slack threads that you'll never find unless you materially participated in them) sources of documentation.
by saccharose on 2/14/24, 3:17 PM
arc42 [1], rendered into whatever format you prefer for reading. In our case we write asciidoc (sometimes markdown) and render it to HTML for each of the releases, so that a version of the documentation is delivered with the release. The authors of arc42 also encourage users to set documentation under version control to ensure one can keep the project and its documentation in sync [2].
[1] https://arc42.org/overview [2] https://faq.arc42.org/questions/G-1/
by Zigurd on 2/14/24, 2:55 PM
Depending on the complexity of the project, short text documents about architecture on Github are OK. Still you have to be careful to document only the most important things that are worth keeping up to date, and then actually doing that.
by epirogov on 2/14/24, 2:48 PM
I write the important steps for every task to a text file: commands, examples, good code snippets. Then it is simple to recall already-researched information for the often-similar improvements to the project codebase.
by therealfiona on 2/14/24, 3:25 PM
Everything in Markdown with the code.
Other parts of the company use Confluence, but the docs with the code are what I pour my heart and soul into.
Slack is always a good resource, but I'd hardly call it "documentation".
by hadas-a on 2/15/24, 6:32 PM
Documentation sucks. Try Swimm.io - it keeps your docs connected to your code (so they automatically update as the code changes) and also lives in the IDE
by Octabrain on 2/14/24, 2:52 PM
Where I work, we use Confluence and Backstage. Confluence sucks and Backstage, although I conceptually consider it appealing, sucks too.
by aWidebrant on 2/14/24, 3:57 PM
Documentation of interfaces between components that are owned by different organizations gets my full attention and care.
Everything else is best effort.
by euroderf on 2/14/24, 8:40 PM
So far no mention of Docbook and just one of DITA. Revealing.
by troyvit on 2/14/24, 3:32 PM
Horribly. We use Confluence for much of our documentation, but we can't afford the license to give everybody access to the documentation who needs it, so often we'll be copying data out of Confluence and into google docs. There's an export for that in Confluence but it's buggy on larger documents and often it's just faster to do it manually.
Meanwhile over in google docs it's a trash fire. There's no organization, just documents. At least sharing is possible, and the collaboration is clutch, but documents are copied, those copies edited, then not shared with the originals. It just goes on and on.
Then we have an intranet based documentation system called Papyrs. At least it's a wiki, but nobody maintains it, and search is best described as enabling users to rule out what they're looking for rather than find what they are.
Whatever you do, don't do what we did :)
EDIT: mentioned collaborative nature of google docs
by joewrong on 2/14/24, 3:00 PM
we use Notion for anything that doesn't fit in one of the code repos, like process docs and platform architecture notes.
by quectophoton on 2/14/24, 2:55 PM
5. Google Meet or Zoom call.
by MilStdJunkie on 2/14/24, 3:51 PM
I'm pretty new to software docs, having spent most of my career with physical stuff, but my $.02.
A preliminary word about tooling. If you have reviewers and approvers using source control in the day to day, then Docs-As-Code (DaC) is all you need. If you have complex print requirements, or a need for transclusion or conditionals, I'd advocate Asciidoc over Markdown, but if you have a Python-heavy environment ReStructuredText is a heavy hitter once Sphinx is up. This whole paragraph is superseded by reviewer needs - jump down a few paras.
DocToolChain has a fairly well-integrated template for the Arc42 architecture template, with a focus on handling the whole thing Docs-As-Code (DaC) in Asciidoc on generic version control. However, I'm assuming you're talking about user[1]-facing docs, and Arc42 will be of extremely limited use there - although Arc42 could simplify the feeding of your architecture into that of the user facing docs. On that note . .
Is there a general methodology for software documentation? No. That's a DITA trap: thinking that there is a reified "information typing" system that applies to all knowledge. I emphatically disagree with that premise, with every fibre of my being.
Practically, the architecture of your doc setup will depend on a few things. I want to hit on the nuts and bolts without going into domain knowledge. I'm probably failing at that, but that's the intent.
First: your reviewers - what are they most likely to review the docs in? Because review churn is going to be 80% of your time, and doing formal reviews in PDF, while writing in Arbortext, and then making Word track changes out of the PDFs, is one of the more common and more stupid workflows I've had the misfortune to be a part of. Organize it so you're working as close to the review format as possible. Ideally, it's DaC using whatever (.md, .rst, etc), and if you have complex print and component content (CCS) requirements, using Asciidoc. But if your reviewers only touch things in Word, then seriously consider a Sharepoint pipeline. It'll hurt a lot less than using your Special Favorite Tool but having to pipe the edits back and forth for the rest of time. And if they want the Dead Tree Simulator (PDF), well, maybe open up your wallet and go for Framemaker/Adobe Experience Manager. It's going to cost a bundle, but have you ever tried setting up shared PDF reviews on a homebrew CMS with Windows authentication? While also working full time as tech writer? Yeah, it sucks.
To re-iterate: use what the org's using. Whatever efficiency gains come from using Golden Solution X will be completely lost if the rest of the business ignores it.
Second, how do requirements work? Are you just wireframing, pushing it out, then taking the issues that come back in and slapping them in milestones? If that's the case, there's probably not a whole bunch of analysis going on. On the other hand, if someone is really looking at requirements, figuring out which pieces of the codebase can get re-used, all that stuff, you'll be well-served mimicking the architecture your requirements analysis team is working up. Either way, a pretty good architecture is to make some directories in your doc project that broadly cover the bases.
```
  000000_ReservedForPublicationsInternalUse;
  001000_LegalSnips
  001100_UnicodeDocAttributes 
  050000_BookMapsThatAssembleDeliverables; 
  051001_DefaultProductManual
  100000_Environment; 
  200000_Hardware; 
  300000_Installation; 
  400000_UserInterfaceDescription;
  401000_IndexScreenDescription
  401001_Login 
  500000_GeneralTasks;
  501000_AirTravelModule
  500101_AirportToAirportSegment
  500102_TripBuild
  etc. 
```
Figure out a useful filename convention and police it with githooks / actions. No one commits "newfile01.adoc" to the root directory. ANGRY BUZZER SOUND. Actually, while you're at it, you can hook up pre-commit vs a bunch of automated QA: grammar, cspell, link checkers, all that stuff.
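A hedged sketch of what such a filename check might look like in Python. The pattern is only a guess modelled on the directory listing above (six digits, an underscore, an alphanumeric name, optionally ending in `.adoc`); the real convention, and how the hook receives its file list, would depend on your setup.

```python
import re

# Assumed convention, modelled on the listing above: six digits, an underscore,
# then an alphanumeric name, with an optional .adoc extension.
NAME = re.compile(r"^\d{6}_[A-Za-z0-9]+(\.adoc)?$")


def bad_names(paths):
    """Return the paths whose final component breaks the naming convention."""
    return [p for p in paths if not NAME.match(p.rsplit("/", 1)[-1])]


def main(argv):
    """Print each offending path and return a shell-style exit code."""
    offenders = bad_names(argv)
    for path in offenders:
        print(f"rejected: {path}")
    return 1 if offenders else 0


# A hook script would end with: sys.exit(main(sys.argv[1:]))
print(bad_names(["newfile01.adoc", "401001_Login.adoc"]))
```

A non-zero exit code from a pre-commit hook is what stops the commit, which is the ANGRY BUZZER SOUND in script form.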
If the requirements end up sharing a lot of material, consider chunking up the docs so that the parts can be re-used. Asciidoc transclusion and conditionals are in vanilla Asciidoc and they work well. But think very carefully before you go down the re-use road. SERIOUSLY. It's really not worth it unless your content deliverables are duplicating 60-80% of their content, and sometimes not even then, and if it gets borked up you end up with a system THAT MAKES NONSENSE. Please believe what I'm telling you here. You really do have to have the filename thing under control for this to work, and make sure your common repositories (glossary, warnings, legal, etc) are not getting worked by five different people. If you re-use stuff, make the directory structure ONE LEVEL DEEP, so relative paths are the same for both the "chunks" and the "books" that call (include) the chunks. Sure, you can do stuff with `:includedir:`, but it'll be a lot easier to just have a flat directory structure.
[1] "User" as in the sense of "audience consuming docs", not necessarily Joe User.
by pwb25 on 2/14/24, 5:07 PM
organize? We have a 3 year old outdated confluence page
by fuzzfactor on 2/14/24, 5:15 PM
Here's an example of how Microsoft sets an example, just a random discovery from just yesterday.
The REAgentC.EXE command is the configuration agent for the Windows Recovery Environment.
"Complete", "comprehensive", reference documentation is here:
REAgentC command-line options:
https://learn.microsoft.com/en-us/windows-hardware/manufactu...
Where the detailed syntax and command-line switches are each "fully" documented in "expanded" webform by default, but the short "header" alone is "unexpanded" and shows only pointers to the first 3 CLI switches:
>In this article
> REAgentC syntax
> /setreimage
> /enable
> /disable
> Show 5 more
until you click "Show 5 more" and then you get the "entire" list:
> REAgentC syntax
> /setreimage
> /enable
> /disable
> /boottore
> /setosimage
> /info
> /setbootshelllink
> Related topics
> Show less
Helpfully this page is dated from 2022 AUG 18, and there is a table of contents in the left-hand frame linking to other pages from the series, but this is the one page known and designated as "REAgentC command-line options" so you've got to figure that this page is core and at least mentions all the options even if further pages would be necessary to fully explain their implementation.
At the bottom of the page after all the /switches have been documented, for further info there is a link to "Related Topics; Windows RE Troubleshooting Features" which is from 2021 DEC 15. Good information there, but nothing more about the switches.
For that you need to click on "Add an update package to Windows RE" from the table of contents to the left:
https://learn.microsoft.com/en-us/windows-hardware/manufactu...
This how-to article appears earlier in the Recovery Environment documentation series, quite a bit before the command-line options are summarized on the final command-line options "reference" page. However this tutorial page is from 2024 FEB 09, so about as current as can be. This is key.
And that's how we can learn about the underdocumented Reagentc switches that can be used to mount and unmount recovery images on a running PC rather than an "offline" image.
The example mentions "ReAgentC.exe /mountre /path c:\mount", and "ReAgentC.exe /unmountre /path c:\mount /commit".
So at least two more reagentc switches are functional now, but not included in the above reference list:
/mountre
/unmountre
Maybe someday these apparently new features will be documented on the main options page like you would expect from a company that is supposed to value keeping up-to-date.
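The comparison above is a set difference, and it is the kind of check that could be automated. A small Python sketch using only the switch names and example commands quoted in this comment; treating the first switch after the executable name as the operation (and the rest as its arguments) is an assumption made for this example.

```python
# Switches listed on the reference page, as quoted above.
documented = {
    "/setreimage", "/enable", "/disable", "/boottore",
    "/setosimage", "/info", "/setbootshelllink",
}

# Example commands quoted from the how-to article.
examples = [
    r"ReAgentC.exe /mountre /path c:\mount",
    r"ReAgentC.exe /unmountre /path c:\mount /commit",
]


def operation(command):
    """Return the first switch after the executable name, lowercased."""
    return command.split()[1].lower()


# Operations used in the examples but missing from the reference list.
undocumented = sorted({operation(c) for c in examples} - documented)
print(undocumented)
```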
by cdchn on 2/14/24, 3:30 PM
Buddy, I'm in _EXACTLY_ the same situation (sub GitLab for GitHub though), so if you see any light at the end of the tunnel, I'd LOVE to hear about it.
by foxandmouse on 2/14/24, 3:02 PM
I've been loving craft, sadly it's another piece of native mac software keeping me from switching.
by ramses0 on 2/14/24, 3:26 PM
I forget the terminology, but there's a good "grid" breakdown of documentation types (I think this one: https://documentation.divio.com ) that I've simplified a bit for the internal documentation I'm involved with.
* README, HOWTO, INFO, PROJECT, DESIGN, NOTES, FAQ
When I pull down a `git` repo, I read the `README.md` (of course). I make my own `NOTES.md` (eg: `.gitignore`'d) of commands, environment variables, useful blog posts, search results, whatever. Rarely do I share or encourage sharing of `NOTES.md` wholesale, but it's helpful to be able to pull out a few snippets or re-orient myself when coming back to that software/project.
Then, other documents get prefixed with "HOWTO-Do-Some-Specific-Thing.md", or "INFO-Some-Particular-Component.md".
"PROJECT-...", and "DESIGN-..." are "dangerous" ones in that they can quickly fall out of date, but they can be very useful while they're being actively managed. I guess personally I've started making sure to include dates or "eras" in the title, eg: "PROJECT-[2024-Feb]-Add-Foo-Support.md" or "DESIGN-[2024-02-14]-...". Stuff that's outlived its usefulness can probably be moved to an `ARCHIVE/...` in case you need it later, but keep it out of the way from confusing newcomers 1-3 years from now.
"FAQ-..." almost never comes into play (hopefully) b/c it should mostly get absorbed into "HOWTO-..." or product improvements, and few products seem to rise to the level of needing FREQUENTLY asked questions. Ideally FAQ's would "go away" with work on the product or other documentation, but I've had some success with it as like sales-oriented (and ideally: sales-managed) FAQ / Canned Customer Response learnings.
Putting it all together you get something like:
```
  * README.md
  * HOWTO-Backup-to-S3.md
  * HOWTO-Backup-to-BackBlaze.md
  * HOWTO-Manage-Existing-Backups.md
  * HOWTO-Exclude-Frequently-Changing-Files.md
  * INFO-Supported-Backup-Systems.md
  * PROJECT-[2024-Feb]-Backup-9000.md
  * DESIGN-[2024-Jan]-Auto-Backup-Detection-and-Failover.md
  * NOTES.md (private!)
  * FAQ-Residential-Customers.md
  * FAQ-Business-Customers.md
  * FAQ-Backup-Recovery-Issues.md
```
Generally, they wouldn't all be "git-adjacent", but `README.md` should link to the other sources, `HOWTO-...` and `INFO-...` is generally good for your wiki/confluence/"published" documentation. PROJECT, DESIGN, and FAQ are all best as "loose" shared docs. Multiplayer by default with low barrier to edit/contribute. Sometimes DESIGN could be INFO-Design-..., or DESIGN might even be a DIAGRAM. You'll know it when you see it.
Prefixing the documentation with the TYPE has been super-critical in adoption. It clarifies that it's not "DOCUMENTATION-About-Some-Thing.md", but instead "HOWTO...(you're gonna do something, goal-oriented)", or "INFO...(you're gonna learn something, no specific outcome)".
If you START introducing new prefixes, then you'll hopefully see them propagate (as appropriate), but ideally whatever vocabulary your business/team ends up using is small (~5-10 documentation types), covers a good 80-90% of your use cases, and is brain-dead simple enough that it's clear the categorization is useful.
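To make the prefix scheme concrete, here is a rough Python sketch that reads the type from a file name and flags dated PROJECT/DESIGN docs that may be ready for `ARCHIVE/`. The helper names and the 365-day threshold are assumptions for illustration; the era formats handled are the two shown above (`[2024-Feb]` and `[2024-02-14]`).

```python
import re
from datetime import date

KNOWN_TYPES = {"README", "HOWTO", "INFO", "PROJECT", "DESIGN", "NOTES", "FAQ"}
MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

# Matches an era tag such as "[2024-Feb]" or "[2024-02-14]".
ERA = re.compile(r"\[(\d{4})-([A-Za-z]{3}|\d{2})(?:-\d{2})?\]")


def doc_type(filename):
    """Return the type prefix of a file name, or None if it is not a known type."""
    prefix = re.split(r"[-.]", filename, maxsplit=1)[0]
    return prefix if prefix in KNOWN_TYPES else None


def era_start(filename):
    """Return the first day of the month named in the era tag, or None."""
    match = ERA.search(filename)
    if match is None:
        return None
    year, month = match.groups()
    month_number = int(month) if month.isdigit() else MONTHS.get(month)
    return date(int(year), month_number, 1) if month_number else None


def archive_candidates(filenames, today, max_age_days=365):
    """List PROJECT/DESIGN docs whose era tag is older than max_age_days."""
    stale = []
    for name in filenames:
        started = era_start(name)
        if doc_type(name) in {"PROJECT", "DESIGN"} and started is not None:
            if (today - started).days > max_age_days:
                stale.append(name)
    return stale


files = ["README.md", "HOWTO-Backup-to-S3.md", "PROJECT-[2024-Feb]-Backup-9000.md"]
print([doc_type(name) for name in files])
```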