from Hacker News

If it is worth keeping, save it in Markdown

by stared on 2/22/25, 9:52 AM with 273 comments

  • by tengwar2 on 2/26/25, 12:38 AM

    The other major alternative to consider is RTF. I standardised on that about 10y ago, planning for a 30y horizon. It is a more complex format than Markdown, still text-based, but biased towards WYSIWYG presentation and editing, while Markdown is usually not WYSIWYG in the editor. Both formats suffer from a lack of standardisation, though Markdown seems to have more problems in practice - I've never had an issue caused by RTF incompatibility. Both formats are very widely supported and it can reasonably be expected that this will continue.

    I prefer RTF for two main reasons:

    * I can't express simple formatting such as "make this text red" in Markdown. No, I don't mean "accentuate this text and leave the decision on how it looks to someone else", I really do mean "make this text red". I do a lot of public speaking, and I want to keep to certain conventions which are easy to read fast.

    * Most of the time I am writing text, not reading a version after it goes through a formatter, so I prefer to see it formatted on screen. That's really a limitation of Markdown editors, but it's almost universal so from my point of view, it counts.
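
The "make this text red" case can be sketched in a few lines. RTF expresses colour through a colour table (`\colortbl`) and the `\cf` control word that selects an entry from it. The helper names `rtf_escape` and `rtf_red` are assumptions made for this illustration, and the escaping only covers backslashes and braces (non-ASCII text would need extra handling).

```python
def rtf_escape(text: str) -> str:
    """Escape the three characters RTF treats as syntax."""
    # Backslashes first, so the escapes added for braces are not doubled.
    return text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")


def rtf_red(before: str, red: str, after: str) -> str:
    """Build a tiny RTF document in which only `red` is coloured red."""
    # Colour table: entry 0 is the default colour, entry 1 is pure red.
    header = r"{\rtf1\ansi{\colortbl;\red255\green0\blue0;}"
    # The inner group limits \cf1 (use colour 1) to the wrapped text.
    body = rtf_escape(before) + r"{\cf1 " + rtf_escape(red) + "}" + rtf_escape(after)
    return header + body + "}"


if __name__ == "__main__":
    print(rtf_red("Pause here, then ", "slow down", " for the next slide."))
```

Saving the printed string to a `.rtf` file should give a document that common word processors open with the middle phrase in red.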

  • by Evidlo on 2/25/25, 11:36 PM

    Just a note that the most common Markdown flavor (CommonMark) doesn't actually support frontmatter. The author is presumably using Obsidian-flavored Markdown (which is a mixture of CommonMark, GH-flavored Markdown, and LaTeX).

    For file-tagging, I would consider TMSU [0] instead of writing bespoke tools. (ideally we would just use xattrs, but the world isn't ready for that)

    [0]: https://tmsu.org/
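
Because frontmatter is not part of CommonMark, tools usually peel off a leading `---` delimited block before passing the rest to a Markdown parser. A minimal sketch of that step follows; the function name `split_frontmatter` is hypothetical, and it assumes flat `key: value` lines rather than full YAML.

```python
from __future__ import annotations


def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Return (metadata, body); metadata is empty if there is no frontmatter."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Look for the closing delimiter after the opening one.
    for end, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            break
    else:
        return {}, text  # opening delimiter never closed: treat as plain body
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep and key.strip():
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:])
    return meta, body
```

Anything beyond flat keys (nested values, lists, multi-line strings) would call for a real YAML parser instead of this line splitting.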

  • by hartator on 2/25/25, 11:54 PM

    The only drawback of Markdown is images.

    GitHub-flavored Markdown is so popular because you can really easily inline them. You don't have to worry about storing them, linking them correctly, and you can even paste to the Markdown field.

    There is no elegant solution like this in actual Markdown.

  • by tqwhite on 2/26/25, 1:50 PM

    Also, as an old person, I will tell you that 1) I got my first personal computer in 1979 and have been trying to keep my bon mots archived ever since. I have tried a million things and have learned one key lesson: It's not really worth it.

    I literally have a footlocker filled with old disk drives (remember, since 1979!) and I have never, ever gone back more than a few years, hell, more than a year.

    Now that disks are big, I keep a lot of old stuff. I have, eg, screenshots dating back to 2015. Email before then. And so so much more.

    I have never gone back more than a few years.

    I will continue to archive because I must but, Old Person to Young People... Don't put too much effort into long term availability. It's not a good investment.

  • by kreelman on 2/26/25, 12:56 AM

    Hmmm. I see the use in this...

    For me, everything swirls in an enjoyable vortex towards org-mode.

    - Literate Programming, tangle/weave

    - Export to DocX, PDF, HTML

    - Org-Roam

    - Time Management.

    Several things mentioned above are day to day. I think spectacular things are often made up of collections of useful everyday things.

  • by jon_richards on 2/26/25, 12:24 AM

    The killer app for markdown would be a collaborative editor that displays the raw markdown and formatted markdown side-by-side and makes both sides editable. Tech people can use `#` and `*` on one side for formatting, product people can use normal text-editor buttons like "header1", "italics", etc.
  • by geokon on 2/26/25, 7:05 AM

    It feels like a long term solution would be to use a markup that is both easy to write (not RTF or XHTML), but has a defined grammar in some standard format (e.g. EBNF). Most platforms/languages will have a parser and so you can whip up a "renderer" or converter trivially at any point.

    The only markup I'm finding with a grammar is MediaWiki (sort of..)

    https://www.mediawiki.org/wiki/Markup_spec

    Even Djot doesn't seem to have one. Weird..
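
To make the "defined grammar, trivial renderer" idea concrete, here is a small sketch. The grammar is invented for this example (it is not Markdown, Djot, or MediaWiki markup), and the renderer is written by hand to follow it rather than generated from the EBNF by a parser generator.

```python
import html
import re

# Grammar (EBNF), invented for this example:
#   document  = { heading | paragraph | blank } ;
#   heading   = "#" , { "#" } , " " , inline ;   (* at most six "#" *)
#   paragraph = inline ;
#   inline    = { text | "*" , text , "*" } ;

HEADING = re.compile(r"(#{1,6}) (.+)")
EMPHASIS = re.compile(r"\*([^*]+)\*")


def render_inline(text: str) -> str:
    """Escape HTML first, then turn *text* into <em>text</em>."""
    return EMPHASIS.sub(r"<em>\1</em>", html.escape(text))


def render(source: str) -> str:
    """Render one block per non-blank line, following the grammar above."""
    out = []
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines only separate blocks
        match = HEADING.fullmatch(line)
        if match:
            level = len(match.group(1))
            out.append(f"<h{level}>{render_inline(match.group(2))}</h{level}>")
        else:
            out.append(f"<p>{render_inline(line)}</p>")
    return "\n".join(out)
```

With a grammar this small, a converter to another target is mostly a matter of swapping the two f-strings in `render`.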

  • by asielen on 2/25/25, 11:48 PM

    100% agree. I've been using markdown for a few years after moving away from proprietary note taking apps. Although this has led to me developing my own shorthand for many things in my notes. And I have been looking at a way to integrate a to-do list with my notes using some Python scripts.

    So while my notes may rely on some personal scripts to get the most value out of them, I strongly value that they are still plain text and I can always move them into a new workflow if I need to.

  • by normalaccess on 2/25/25, 11:30 PM

    I love markdown and use it for all my notes, however it really needs a native way to underline. I have been converting some older books and lectures to markdown and underline is used all the time.

    If anyone has a good solution I'm all ears.

  • by ambivalence on 2/25/25, 11:33 PM

    I wholeheartedly agree with this post. I also keep my notes in Markdown, I also have plenty of Python scripting around them, including automatic publishing of my website.

    I use FSNotes today on macOS and iOS. Both apps are open source, both use well-structured .textbundle directories that separate Markdown content from JSON metadata and binary attachments. Synchronization happens through Git. It's a very powerful combination.

    Ironically, I wrote a blog post some 8 years ago about this very subject. That blog post is now offline.

  • by xenodium on 2/26/25, 1:17 AM

    I built a Markdown-to-web, drag-and-drop solution: https://lmno.lol

    Here's a demo https://www.youtube.com/watch?v=SykbiVweYH8

  • by roxolotl on 2/25/25, 11:41 PM

    I’ve been self hosting linkding[0] and it has archiving capabilities. Saves in html not markdown but that’s basically the same thing. It’s been very useful and then I back the folder up to R2 for free. I enjoy knowing that if I find something I want to remember it won’t go away. Plus it works great for recipe sites because I don’t have to deal with ads.

    [0]: https://github.com/sissbruecker/linkding

  • by jazz9k on 2/26/25, 12:43 AM

    Obsidian is the killer app for this. I spent a month converting around 3 years of security notes to markdown and now use obsidian to search/archive everything.
  • by headcanon on 2/25/25, 11:25 PM

    I've been doing this recently with every URL I've bookmarked over the last 15 years or so since I signed up for pinboard.in. http://spider.cloud has been really nice for crawling sites and saving the results as markdown. I plan on expanding it to transcribing youtube videos I've saved, github repos I've starred, HN posts, etc.

    Ultimately I'm trying to index my "window" to the web as embedded content in a vector store. Not sure exactly what I'm going to do with it yet but I imagine it will be a component of some kind of personal agent system I can use to reference old info and help as a writing tool or as an "idea generator" of some kind. I'll likely end up not using most of it but you never know.

    I've scraped about 10k markdown files which has created a ~10gb chromadb instance so far. Eventually I'll probably create separate collections based on domain, and filter down items that I care about more.

  • by hamsterbase on 2/26/25, 2:58 AM

    When it comes to web archiving, I've found that Markdown has some real limitations. Sure, it's great for basic text, but it struggles with things like embedded content and non-standard layouts. Try archiving a Twitter thread or an app-style webpage in Markdown, and you'll see what I mean. It just doesn't capture the full picture.

    That's why I've come to prefer formats like webarchive, mhtml, or single HTML files for archiving. They're incredibly faithful to the original content - you get almost perfect rendering of the original page, complete with styling and layout. Plus, they can capture stuff behind paywalls or on logged-in pages, which is a huge plus.

    The real challenge, though, isn't just about saving the content. It's about making that saved content useful. These archive formats are great for preservation, but they can quickly become a mess of unorganized files that are hard to search through or make sense of.

    I think the key is finding ways to organize and interact with these archives more effectively. Things like full-text search across all your saved pages, the ability to add notes or highlights directly on the archived content, and smart tagging systems could go a long way. And it'd be really powerful if we could integrate these archives with other knowledge management tools we use.

    I develop a tool called HamsterBase that seems to address a lot of these issues we've been discussing. It's a local-first app. That means all your data stays on your own device - no need to worry about your personal archives being stored on someone else's servers. There's no sign-up or registration required, which is refreshing in today's cloud-centric world.

  • by misterspaceman on 2/26/25, 12:54 AM

    I've landed on a workflow that I like a lot, and have shown to several people on my team. I use Google Drive for Desktop, which maps the G:\ drive to Google Drive. From there, I use VS Code for Markdown editing.

    Google Docs now supports Markdown files, so if I need to convert the Markdown file to Word or PDF, I just open it in Docs and download it in the format I need. (Pandoc also works for this, as the author mentions). Converting HTML to Markdown can also be done in Docs: copy and paste the web page text into Google Docs, and download the file as Markdown.

    For mobile, I use the DriveSync app to download my notes (Markdown) folder to my phone. Then I use Obsidian to open and edit the files.

  • by xyst on 2/26/25, 12:58 AM

    I adopted obsidian recently to replace notion and it’s been a refreshing change. In its basic state (no plugins), it’s just a bunch of markdown files.

    Very easy to search notes and even have a dedicated folder for diary entries.

  • by huqedato on 2/26/25, 12:18 AM

    My pain is that I couldn't find a decent md viewer for Windows: free, fast, simple, no distractions. Imagine notepad. I have to open my md files with VSCode or Notepad++ (nasty view).
  • by downut on 2/25/25, 11:40 PM

    The underlying purpose of org-mode is to manage this issue (the text part). It doesn't solve it, instead it is a tool for managing the steadily increasing archive organizational complexity within an ever evolving timeline. You reconfigure your archive's implicit schema? Well, now you're in a world of heavy editing. That's life. If you don't have a solid backup strategy, you are going to lose stuff. That's also life. Big binary blobs are a different, equally important problem.

    Sure, keep your archive text in markdown (which one? a dumb person asks). But I'd recommend managing it with org-mode, it doesn't really care what format your text is in.

    (Yeah I saw the footnote mentioning org-mode but that reads to me that org-mode's reference there is entirely about the markup flavor.)

  • by aosaigh on 2/25/25, 10:37 PM

    I'm not surprised this post opens with a link to /r/DataHoarder. Hot take ... I understand the sentiment that you can't trust content on the web to be there forever, but there is also the other side of the argument which is: compulsively saving data is a waste of time and it introduces a cognitive overhead that you'd be better off without.
  • by kreelman on 2/26/25, 12:40 AM

    Hmmm. I see the use in this...

    For me, everything swirls in a lovely vortex towards org-mode.

    - Literate Programming, tangle/weave

    - Export to DocX, PDF, HTML

    - Org-Roam

    - Time Management.

  • by oneeyedpigeon on 2/26/25, 10:25 AM

    Markdown is a wonderful format (I use it all the time) but it's very narrow and I don't think it's appropriate for storing general 'things we might publish'. You lose a lot of semantics just replacing html with markdown. For a general purpose markup language, I don't think we can beat XML.
  • by jbd0 on 2/26/25, 2:58 PM

    The Markdownload browser extension is super useful for saving webpages as Markdown: https://addons.mozilla.org/en-US/firefox/addon/markdownload/
  • by drivingmenuts on 2/26/25, 2:46 AM

    Keep it in the format appropriate to the information. If just the text is important, Markdown is probably fine. If the structure is important, keep it in HTML. If the layout is important, PDF. You wouldn't store a Gutenberg bible in Markdown, would you?

    (Don't answer that - there's always one asshole who would)

  • by gerdesj on 2/26/25, 1:12 AM

    Mediawiki. Let's balance durability against functionality.

    MW gets you a massively scalable doc store that does not need much room. Most MW instances are MySQL/MariaDB backed and the schema etc is very well described.

    Keep it plain text for "notes" but a MW will be easily discoverable for quite some time from now.

  • by kjs3 on 2/26/25, 12:36 AM

    We never should have stopped using troff.
  • by whatever1 on 2/26/25, 9:24 AM

    Unless it’s math in which case you are screwed.

    Can we have a damn math keyboard and proper character encoding instead of doing shenanigans with latex / office equation editor?

    Why in this exact text box I cannot type a differential equation?

  • by runevault on 2/26/25, 12:52 AM

    I actually use a VS Code plugin for this called Dendron. It is in the same vein as Obsidian or Notion, markdown based, and just runs in VSC. Very handy, and since it's plain text it works wonderfully in a git repository.
  • by galkk on 2/26/25, 8:51 AM

    Markdown is great, but not a panacea.

    Tables, in particular, just suck, especially if you want to have even slight formatting inside of the cells.

    Unfortunately, it’s either plain-text-readable or rich representation. Pick your poison.

  • by ThinkBeat on 2/26/25, 12:23 AM

    When it comes to text (though I do include Word, PDF, text files, markdown, tex) I like to burn them to dvd.

    I have one of those big dvd "catalogs" that takes 4 discs per side of a page.

    Keep one at home and one at my parents' place.

    I trust them more than usb-sticks. Though that may be irrational.

    But the time for burning files to dvd seems almost over. It is hard/impossible to buy a computer with a dvd drive.

    That is no problem for me since I have a collection of externals as well as internals, and life is good now since blank dvd media is cheap.

    But again, you need a dvd reader, and in the future, that may become difficult.

  • by t_mann on 2/26/25, 12:46 AM

    Can relate to that sentiment. What I'm still looking for is a simple solution that lets me use simple local files (eg plaintext/markdown; csv or single-page HTML would also be fine) as a backend for a web app (with login, obviously). Basically, I want to have something like a todo.txt that lives on my machine (in the folder that syncs to my cloud storage) but that I can also edit when I'm on my phone. Like using Google sheets as a backend but with a local file.
  • by Beijinger on 2/26/25, 3:50 AM

    Hm. Great.

    I save everything interesting. I have a data folder with letters a-z in it. Something interesting might be saved in HTML or PDF under data/a/ai/programming

    Folders have a problem because the same thing could be saved under data/p/programming/ai

    But it is a start. For everything else, there is recoll. https://www.recoll.org/

  • by kmarc on 2/26/25, 6:06 AM

    Indeed, I also realized that bookmarks are worthless in the long run. When choosing a note taking / knowledge management app, the main decision point was if it has a Firefox extension that can capture a web page into markdown and automatically save into my notes.

    I used to use Joplin, lately switched to Obsidian. Both offer this functionality.

  • by Cilvic on 2/26/25, 10:44 AM

    @OP super inspiring. I'm working on a universal capture SDK, a bit like rewind.ai that would make it easy to grab information from screen and then store as Markdown etc. Have you ever wished for something like that?
  • by abetancort on 2/26/25, 3:13 AM

    PDF/A... It was not that difficult. Don't reinvent the wheel, guys.
  • by mirawelner on 2/27/25, 11:02 PM

    Beware for if you continue down this road you will end up sitting in class taking notes in markdown… yes I did do this… I am afraid I am beyond salvation
  • by paulryanrogers on 2/26/25, 12:10 AM

    My favorite is WikiCreole, with (subset of) HTML as a close second. MD is alright, but too restrictive as a general purpose format for knowledge bases and such.
  • by TheMode on 2/26/25, 1:26 AM

    I have personally started to archive pages I find interesting through a browser extension. It's html/css not markdown but good enough for my needs.
  • by Sincere6066 on 2/26/25, 7:03 AM

    But I hate Markdown.
  • by todotask on 2/26/25, 5:41 AM

    My thought on custom Astro components is that they provide a flexible format that can be converted into MD, HTML, JSON and other formats.
  • by scubbo on 2/26/25, 1:20 AM

    > Even self-hosting isn't foolproof - your content can vanish when you forget to pay for hosting

    I know what they mean - "running applications that you maintain and deploy yourself, on hardware/platforms that you don't" - but this is strange, to my eyes. If it's running on someone else's hardware (whatever it is), then it's not self-*hosted*, surely? It's self-owned, but not self-hosted?

  • by hamsterbase on 2/26/25, 2:57 AM

    When it comes to web archiving, I've found that Markdown has some real limitations. Sure, it's great for basic text, but it struggles with things like embedded content and non-standard layouts. Try archiving a Twitter thread or an app-style webpage in Markdown, and you'll see what I mean. It just doesn't capture the full picture.

    That's why I've come to prefer formats like webarchive, mhtml, or single HTML files for archiving. They're incredibly faithful to the original content - you get almost perfect rendering of the original page, complete with styling and layout. Plus, they can capture stuff behind paywalls or on logged-in pages, which is a huge plus.

    The real challenge, though, isn't just about saving the content. It's about making that saved content useful. These archive formats are great for preservation, but they can quickly become a mess of unorganized files that are hard to search through or make sense of.

    I think the key is finding ways to organize and interact with these archives more effectively. Things like full-text search across all your saved pages, the ability to add notes or highlights directly on the archived content, and smart tagging systems could go a long way. And it'd be really powerful if we could integrate these archives with other knowledge management tools we use.

    It's an interesting problem space, and I think there's a lot of room for innovation in how we approach personal web archiving and knowledge management.

  • by gnuser on 2/26/25, 1:46 PM

    I suggest emacs org mode or asciidoc
  • by k__ on 2/26/25, 12:27 AM

    AsciiDoc is basically DocBook-Markdown, which makes it a medium-independent format.
  • by profsummergig on 2/26/25, 1:38 AM

    Windows here.

    I use VSCode for markdown.

    Obsidian's been coming up on the radar often.

    This post finally made me try it out.

    I like it a lot.

    But there's one reason I won't be using it as my main driver for markdown files: I can't open files that are not in a vault. I have markdown files everywhere on my drive. And I don't want to make the entire drive a vault (for various reasons).

    Obsidian configurable as...

    1) my default file handler for markdown files

    2) capable of opening and saving markdown files in any location on my PC

    ...would be sweet. (From my research, it can't do these currently.)

  • by yawnxyz on 2/26/25, 5:16 AM

    well, I wish I could have saved all my old Flash sound design and game experiments to Markdown, and still be able to play them
  • by croes on 2/26/25, 1:02 AM

    Why not Asciidoc instead of Markdown?
  • by briandear on 2/26/25, 6:15 AM

    I don’t like Markdown because I don’t want to remember a syntax. Most normal people I know have no idea what Markdown even is. The idea that I can’t see my formatting when I’m writing is annoying. What’s the point? It’s like MD is writing code and to “see” the document, you have to run it. In other words what you see is not what you get — you only see what you get when “previewing.”
  • by eviks on 2/26/25, 2:39 AM

    > The format deliberately avoids precise control over display details like font selection. Following the rule of least power, I consider this limitation a feature. For contrast, consider PDF - a format so powerful that it can run Doom.

    Just pick a more relevant format for contrast to see that this is no feature! It's not like PDF is the only alternative

  • by g8oz on 2/26/25, 4:36 AM

    Markdown is great... But you know what else is great? OPML. We need more tooling around OPML. It's not being used nearly as much as it should be for Personal Knowledge Management.
  • by zackham on 2/26/25, 1:38 AM

    I've used or built more personal knowledge/task/project management tools than I care to list over the years, and adopted various methods along the way. I've ended up in a place where I know what I need day to day: A place to dump my ideas, plans, reflections, and tasks, along with methods of processing and accessing all this data. It's hard to compete with plain text files, a notebook, and structured daily/weekly rituals that process these notes into actionable tasks, meeting agendas, and project docs. It's not that time consuming, it's super effective, and most importantly, it's infinitely and freely customizable because instead of software, you just have checklists and processes to manually follow. You can execute GTD without touching a computer: https://gettingthingsdone.com/wp-content/uploads/2014/10/Wee...

    I can get by just fine with that system, but a handful of months back I started wanting software again. Reminders, task wrangling, workflows around taking meeting notes, taking and processing transcripts of talking through ideas, automated daily and weekly checkins with summaries, project work logs, managing lists of things to talk about with people, the list goes on....

    Same reasons I have always reached for software, and the same reasons I wrote my own system a few times over. But this time I had some new thoughts:

    - I want this to have a chance at being my last system. For that, I must be able to read/edit the data without special software. I settled on committing to building software that interfaces with folders of Markdown files exclusively. I could use Obsidian to cover any gaps and get work done immediately–I don't need my software to do it all right away.

    - I want to own as much of my recorded activity/thoughts as possible, so I can drop it into new AI models, giving them a ton of context about me and what I'm up to, and avoid getting vendor locked to OpenAI.

    - I want ubiquitous access to the system, which means it's gotta be easily used from a phone.

    7k LOC later and I've got a Telegram bot with a plugin architecture and a pile of plugins that implement everything I've described and more. The plugin arch means there's a defined interface and every new piece of functionality never ends up with more than 1k LOC in a file. My objective was to structure the project specifically so I could avoid the pitfalls of AI generated code as projects get large. Everything isolated with well defined integration points.
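
The general shape of such a plugin architecture over a folder of Markdown files can be sketched as below. This is not the commenter's code: every name here (`PLUGINS`, `plugin`, `dispatch`, the `tasks.md` and `inbox.md` files) is hypothetical, and the Telegram side is left out entirely so that only the registry and the single integration point are shown.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable

# A handler receives the notes folder and the message text, and returns a reply.
Handler = Callable[[Path, str], str]
PLUGINS: dict[str, Handler] = {}


def plugin(command: str) -> Callable[[Handler], Handler]:
    """Register a handler under a command name (the single integration point)."""
    def register(func: Handler) -> Handler:
        PLUGINS[command] = func
        return func
    return register


@plugin("task")
def add_task(vault: Path, text: str) -> str:
    """Append an unchecked task-list item to tasks.md."""
    with (vault / "tasks.md").open("a", encoding="utf-8") as fh:
        fh.write(f"- [ ] {text}\n")
    return "task added"


@plugin("note")
def add_note(vault: Path, text: str) -> str:
    """Append a free-form note to inbox.md, followed by a blank line."""
    with (vault / "inbox.md").open("a", encoding="utf-8") as fh:
        fh.write(f"{text}\n\n")
    return "note saved"


def dispatch(vault: Path, message: str) -> str:
    """Route '/command rest of message' to the matching plugin."""
    command, _, rest = message.lstrip("/").partition(" ")
    handler = PLUGINS.get(command)
    if handler is None:
        return f"unknown command: {command}"
    return handler(vault, rest.strip())
```

Since each plugin only touches plain Markdown files, whatever front end calls `dispatch` could be replaced without migrating any data.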

    I chose Telegram because they have a great API, supporting custom keyboards for quick actions, audio input for taking voice memos that my system transcribes, and reaching out to me with reminders/requests on whatever device I'm on.

    The result is thousands of messages that have translated into a nicely organized Obsidian vault. Couldn't be happier and think there's a chance I'll live with this thing for the foreseeable future–and I can always swap out the interface away from Telegram, build a proper frontend, or drop it altogether and be left with my Markdown files.

    If anyone is interested I'd be happy to share what I've got. Just my private project that I'm reaping a lot of benefit from.

    Here's a quick dump of some of my plugin commands to get a flavor of what I'm talking about: https://gist.github.com/zackham/3c2d061e6dd0127958c913329aa0...

  • by smm11 on 2/26/25, 1:57 AM

    WTF?

    text.txt

    Readable in everything, since forever.