by broken_broken_ on 2/29/24, 2:11 PM with 347 comments
by eschneider on 2/29/24, 8:48 PM
After inheriting quite a few giant C++ projects over the years, there are a few obvious big wins to start with:
* Reproducible builds. Pro-tip: wrap your build environment with docker (or your favorite packager) so that your tooling and dependencies become both explicit and reproducible. The sanity you save will be your own.
* Get the code to build clean with -Wall. This is for a couple of reasons. a) You'll turn up some amount of bad code/undefined behavior/bugs this way. Fix them and make the warnings go away. It's ok to #pragma away some warnings once you've determined you understand what's happening and it's "ok" in your situation (a sketch follows this list), but that should be rare. b) Once the build is clean, you'll get obvious warnings when YOU do something sketchy and you can fix that shit immediately. Again, the sanity you save will be your own.
* Do some early testing with something like valgrind and investigate any read/write errors it turns up. This is an easy win from a bugfix/stability point of view.
* At least initially, keep refactorings localized. If you work on a section and learn what it's doing, it's fine to clean it up and make it better, but rearchitecting the world before you have a good grasp on what's going on globally is just asking for pain and agony.
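A sketch of the "#pragma away an understood warning" point above, scoped to one audited call site rather than disabling the warning project-wide (GCC/Clang pragma syntax; the function and the specific warning are illustrative):

    #include <cstring>

    // Audited call site: truncation is intended, and we re-terminate below.
    void copy_field(char* dst, std::size_t dst_size, const char* src) {
    #pragma GCC diagnostic push
    #pragma GCC diagnostic ignored "-Wstringop-truncation"
        std::strncpy(dst, src, dst_size - 1);
    #pragma GCC diagnostic pop
        dst[dst_size - 1] = '\0';  // always null-terminated
    }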
by Jtsummers on 2/29/24, 4:33 PM
Things that get flagged by a static analysis tool (today) will often be areas where you can tear out entire functions and maybe even classes and files, because they'll be re-creations of STL concepts. Like homegrown iterator libraries (with subtle problems) that can be replaced with the STL algorithms library, or homegrown smart pointers that can just be replaced with actual smart pointers, or replacing the C string functions with C++'s own string class (and related classes) and functions/methods.
But you won't see that as easily until you start scanning the code. And you won't be able to evaluate the consequences until you push towards more rapid test builds (at least) if not deployment.
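For the smart-pointer case, the tear-out is often mechanical. A hedged before/after sketch ("ScopedPtr" is an invented stand-in for the homegrown class):

    #include <memory>

    struct Widget { void draw() {} };

    int main() {
        // Before: ScopedPtr<Widget> w(new Widget);  // homegrown, subtly buggy
        auto w = std::make_unique<Widget>();          // after: standard, well-tested
        w->draw();
    }   // Widget freed here automatically, same as before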
by vijucat on 2/29/24, 10:51 PM
I used to use a tool called Source Navigator (written in Tcl/Tk!) that was great at indexing code bases. You could then check the call hierarchy of the current method, for example, then use that to make UML sequence diagrams. A similar tool, Source Insight, is shown below [1].
And oh, notes. Writing as if you're teaching someone is key.
Over the years, I got quite good at comprehending code, even code written by an entire team over years. For a brief period, I was the only person actively supporting and developing an algorithmic trading code base in Java that traded ~$200m per day on 4 or 5 exchanges. I had 35 MB of documentation on that, lol. Loved the responsibility (ignoring the key man risk :|). Honestly, there's a lot of overengineering and redundancy in most large code bases.
[1] References in "Source Insight" https://d4.alternativeto.net/6S4rr6_0rutCUWnpHNhVq7HMs8GTBs6...
by Night_Thastus on 2/29/24, 4:38 PM
I'm pretty sure my stomach did somersaults on that.
But as for the advice:
>Get out the chainsaw and rip out everything that’s not absolutely required to provide the features your company/open source project is advertising and selling
I hear you, but this is incredibly dangerous. Might as well take that chainsaw to yourself if you want to try this.
It's dangerous for multiple reasons. Mainly it's a case of Chesterton's fence. Unless you fully understand why X was in the software and fully understand how the current version of the software is used, you cannot remove it. A worst case scenario would be that maybe a month or so later you make a release and the users find out an important feature is subtly broken. You'll spend days trying to track down exactly how it broke.
>Make the project enter the 21st century by adding CI, linters, fuzzing, auto-formatting, etc
It's a nice idea, but it's hard to do. One person is using Vim, another is using Emacs, another is using Qt Creator, another primarily edits in VS Code... Trying to get everyone on the same page about all this is very, very hard.
If it's an optional step that requires that they install something new (like commit hook) it's just not going to happen.
Linters also won't do you any good when you open the project and 2000+ warnings appear.
by keepamovin on 2/29/24, 4:04 PM
0. You reach out to the previous maintainers, visit them, buy them tea/beer and chat (eventually) about the codebase. Learned Wizards will teach you much.
But I didn't see that anywhere. I think the rest of the suggestions (like get it running cross-platform, get tests passing) are useful stress tests likely to lead you to robustness and understanding, however. But I'd def be going for that sweet, sweet low-hangin' fruit of just talking to the ol' folks who came that way before. Haha :)
by mk_chan on 2/29/24, 8:09 PM
If it’s a big enough change, export whatever you need out of the legacy code (by calling an external function/introducing a network layer/pulling the exact same code out into a library/other assorted ways of separating code) and do the rest in a fresh environment.
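A minimal sketch of the library-extraction route, with invented names throughout; the point is the single narrow entry point, not the code:

    #include <iostream>
    #include <string>

    namespace legacy {
    // Stand-in for the existing tangled implementation.
    double computePrice(const std::string& id) { return id.size() * 1.5; }
    }

    namespace facade {
    // The only function new code may call; swap the body out later.
    double price_for(const std::string& instrument_id) {
        return legacy::computePrice(instrument_id);
    }
    }

    int main() { std::cout << facade::price_for("ACME-2024") << "\n"; }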
I wouldn’t try to do any major refactor unless several people are going to work on the code in the future and the code needs to have certain assumptions and standards so it is easy for the group to work together on it.
by summerlight on 3/1/24, 1:57 AM
* You can find out the most relevant files/functions for your upcoming work. If some functions/files have been frequently changed, they're going to be the hot spots for your work. Focus on them to improve your quality of life. Want to introduce unit tests? Focus on the hot spots. Suffering from lots of merge conflicts? Same. (A minimal hot-spot counter is sketched after this list.)
* You can also figure out correlations among the project's source files. Some seemingly distant files are frequently changed together? That might suggest an implicit structure that might not be clear from the code itself. This kind of information from external contexts can be useful for the bird's eye view.
* Real ownership models of each module can be inferred from the history. Having a clear ownership model helps, especially if you want to introduce some form of code review. If some code/data/module seems to have unclear ownership, that might be a signal of refactoring needs.
* Specific to C/C++ contexts, build time improvements can be focused on important modules, in a data-driven way. Incremental build time matters a lot: break down frequently changed modules rather than blindly removing dependencies on random files. You can even combine this with header dependencies to score each module by its real build time impact.
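The hot-spot sketch mentioned above: feed it `git log --name-only --pretty=format:` on stdin and it prints the 20 most frequently changed files (illustrative code, not a real tool):

    #include <algorithm>
    #include <cstddef>
    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    int main() {
        std::map<std::string, int> counts;
        std::string line;
        while (std::getline(std::cin, line))
            if (!line.empty()) ++counts[line];   // one line per touched file
        std::vector<std::pair<std::string, int>> byFreq(counts.begin(), counts.end());
        std::sort(byFreq.begin(), byFreq.end(),
                  [](const auto& a, const auto& b) { return a.second > b.second; });
        for (std::size_t i = 0; i < byFreq.size() && i < 20; ++i)
            std::cout << byFreq[i].second << "\t" << byFreq[i].first << "\n";
    }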
There could be so many other things if you can integrate other development tools with VCS. In the era of LLMs, I guess we can even try to feed the project history and metadata to a model and ask for some interesting insights, though I haven't tried this. It might need some dedicated model engineering to do this without a huge context window, but my gut tells me it should be worth trying.
by bArray on 2/29/24, 5:09 PM
I would break this down:
a) CI - Ensure not just you can build this, but it can be built elsewhere too. This should prevent compile-based regressions.
b) Compiler warnings and static analysers - They are likely both smarter than you. When it says "warning, you're doing weird things with a pointer and it scares me", it's a good indication you should go check it out.
c) Unit testing - Set up a series of tests for important parts of the code to ensure it performs precisely the task you expect it to, all the way down to the low level. There's a really good chance it doesn't, and you need to understand why. Fixing something could cause something else to blow up, because it was written around the bugged code. You also end up with a series of regression tests for the most important code (a bare-bones sketch follows this list).
n) Auto-formatting - Not a priority. You should adopt the same style as the original maintainer.
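For c), even framework-free tests pay off. A bare-bones sketch, where parse_order_id is an invented stand-in for some small legacy function whose current behavior you want pinned down:

    #include <cassert>
    #include <string>

    // Stand-in for a small legacy function you are characterizing.
    int parse_order_id(const std::string& s) { return std::stoi(s.substr(3)); }

    int main() {
        assert(parse_order_id("ORD42") == 42);
        assert(parse_order_id("ORD007") == 7);  // pin down quirks, even ugly ones
        return 0;
    }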
> 5. If you can, contemplate rewrite some parts in a memory safe language
The last step of an inherited C++ codebase is to rewrite it in a memory safe language? A few reasons why this probably won't work:
1. Getting resources to do additional work on something that isn't broken can be difficult.
2. Rather than just needing knowledge in C++, you now also need knowledge in an additional language too.
3. Your testing potentially becomes more complex.
4. Your project likely won't lend itself to being written in multiple languages, due to memory/performance constraints. The problem must be hard enough that you couldn't just have written it from scratch yourself.
5. You have chosen to inherit a legacy codebase rather than write something from scratch. It's an admission that you lack some resource (time/money/knowledge/etc) to do so.
by delta_p_delta_x on 2/29/24, 10:37 PM
It is strange that the author complains so much about automating BOMs, package versioning, dependency sources, etc, and then proceeds to suggest git submodules as superior to package managers.
The author needs to try vcpkg before making these criticisms; almost all of them are straightforwardly addressed by vcpkg, barring a few sharp edges (updating dependencies is a little harder than with git submodules, but that's IMO a feature, not a bug). Dependencies are built in individual sandboxes and then installed to a specified directory. vcpkg can set internal repositories as the registry instead of the official one, thus maintaining the 'vendored in' aspect; it can chainload toolchains to compile everything with a fixed set of flags, and it allows users to specify per-port customisations.
These are useful abstractions and it's why package managers are so popular, rather than having everyone deal with veritable bedsheets' worth of strings containing compile flags, macros, warnings, etc.
by tehnub on 2/29/24, 4:54 PM
by Kon-Peki on 2/29/24, 10:04 PM
The approach I've taken is, when you do work on a function and find that it uses a global variable, try to add the GV as a function parameter (and update the calling sites). Even if it's just a pointer to the global variable, you now have another function that is more easily testable. Eventually you can get to the point where the GV can be trivially changed to a local variable somewhere appropriate.
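A hedged before/after of that same trick (Config, g_config, and apply_discount are invented names):

    #include <cassert>

    struct Config { double discount; };
    Config g_config{0.25};  // the legacy global

    // Before: the body read g_config directly, so it was untestable.
    // After: the global is a parameter; production call sites pass
    // g_config for now, and tests pass a local Config.
    double apply_discount(double price, const Config& cfg) {
        return price * (1.0 - cfg.discount);
    }

    int main() {
        assert(apply_discount(100.0, Config{0.5}) == 50.0);  // now testable
        assert(apply_discount(100.0, g_config) == 75.0);     // behavior unchanged
    }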
by bluetomcat on 2/29/24, 4:46 PM
by VyseofArcadia on 2/29/24, 8:19 PM
Except every legacy C++ codebase I've worked on is decades old. Just enumerating the different "features" is a fool's errand. Because of reshuffling and process changes, even marketing doesn't have a complete list of our "features". And even if there were a complete list of features, we have too many customers that rely on spacebar heating[0] to just remove code that we think doesn't map to a feature.
That's if we can even tease apart which bits of code map to a feature. It's not like we only added brand new code for each feature. We also relied on and modified existing code. The only code that's "safe" to remove is dead code, and sometimes that's not as dead as you might think.
Even if we had a list of features and even if code mapped cleanly to features, the idea of removing all code not related to "features your company is advertising or selling" is absurd. Sometimes a feature is so widely used that you don't advertise it anymore. It's just there. Should Microsoft remove boldface text from Word because they're not actively advertising it?
The only way this makes sense is if the author and I have wildly different ideas about what "legacy" means.
by ecshafer on 2/29/24, 4:34 PM
by grandinj on 2/29/24, 5:42 PM
We built our own find-dead-code tool, because the extant ones were imprecise, and boy oh boy did they find lots of dead stuff. And more dead stuff. And more dead stuff. Like peeling an onion, it went on for quite a while. But totally worth it in the end, made various improvements much easier.
by pvarangot on 2/29/24, 11:10 PM
by ilitirit on 3/1/24, 6:02 AM
Most of the time I spend with C++ code revolves around figuring out compile/link errors. Heaven forbid you need to deal with non-portable `make` files that for some reason work on the old box, but not yours... Oh, and I hope you have a ton of spare space, because for some reason building a 500k exe takes 4GB.
Keep in mind, this advice only applies to inherited C++ code bases. If you've written your own or are working on an actively maintained project these are non-issues. Sort-of.
by sjc02060 on 2/29/24, 4:25 PM
by hgs3 on 2/29/24, 11:23 PM
[1] https://www.joelonsoftware.com/2000/04/06/things-you-should-...
by sk11001 on 2/29/24, 4:49 PM
by jcarrano on 2/29/24, 9:58 PM
This is an extremely crucial step that you must do first: familiarize yourself with the system, its uses and the reasons it works like it does. Most things will be there for a reason, even if not written to the highest standard. Other parts might at first sight seem very problematic yet be only minor issues.
Be careful with number 4 and 5. Do not rush to fix or rewrite things just because they look like they can be improved. If it is not causing issues and it is not central to the system, better spend your resources somewhere else.
Get the team to adopt good practices, both in the actual code and in the process. Observe the team and how they work and address the worst issues first, but do not overwhelm them. They may not even be aware of their inefficiencies (e.g. they might consider complete rebuilds as something normal to do).
by happyweasel on 3/1/24, 12:34 PM
by bobnamob on 2/29/24, 10:04 PM
So much talk about change and large LoC deltas without capturing the expected behavior of the system first
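One cheap way to capture that behavior first is a golden-master check. A sketch with invented names (render_report, report.golden), where the golden file is recorded once from the old binary:

    #include <fstream>
    #include <iostream>
    #include <sstream>
    #include <string>

    // Stand-in for any deterministic output path through the legacy code.
    std::string render_report() { return "total=3\nitems=2\n"; }

    int main() {
        std::ifstream golden("report.golden");  // recorded from the old binary
        std::stringstream expected;
        expected << golden.rdbuf();
        if (render_report() != expected.str()) {
            std::cerr << "output differs from golden master\n";
            return 1;
        }
    }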
by rurban on 3/1/24, 7:01 AM
Their shell build script was called makefile, I kid you not. So first create proper dependency management: a GNUmakefile. A BSD makefile would have been prettier, but not many are used to those. Then dos2unix, chmod -x `find . -name '*.c' -o -name '*.h'`, clang-format -i. All in separate commits.
Turns out one .h file was not a header at all, but some custom list of algorithm states, and the formatter broke it. Don't do that. Either keep it a real header file, or rename it to .lst or such.
Fix all the warnings, hundreds. Check with sanitizers. Check the tests, disable broken algorithms, and mark them as such.
Improve the codebase. There are lots of hints of thought about features. write them. Simplify the state handling. Improve the tests.
Add make check lint. Check all the linter warnings.
Add a CI. Starting with Linux, Windows mingw, macos and aarch64. Turns out the code is Linux x64 only, ha. Make it compat with sse checks, windows quirks.
Waiting for GH Actions sucks, so write Dockerfiles and qemu drivers into your makefile. Maybe automake would have been a better idea after all. Or even proper autoconf.
Find the missing algorithms described elsewhere. Add them. Check their limitations.
Reproducible builds? Not for this one, sorry. This is luxury. Rather check clang-tidy, and add fuzzing.
by rayiner on 3/1/24, 3:18 PM
I observed from afar when the Gwydion Dylan folks (the Dylan successor to the CMU CL compiler) inherited Harlequin’s Dylan compiler and IDE and decided to switch to that going forward: https://opendylan.org. The work (done out in the open in public mailing lists and IRC) is a very nicely done case study in taking a large existing code base developed by someone else, studying it, and refactoring it to bring it incrementally into the present. They started with retooling the build system and documenting the internals. Then over time they addressed major pain points, like creating an LLVM backend to avoid the need to maintain custom code generators.
by t43562 on 3/1/24, 9:41 AM
We could have left customers with old operating systems on the older versions of the product. A lot of them never upgraded anyhow. We absolutely destroyed our productivity by not making this kind of decision. We also really hurt ourselves by supporting Windows - as soon as there are 2 or more completely different compilers things turn to **t. I'm not even sure we made much money from it.
Given the ability to use new tools (clang, gcc and others) that are only available on newer operating systems we could have done amazing things. All those address sanitizers etc would have been wonderful and I would like to have done some automated refactoring which I know clang has tools for.
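For anyone who hasn't seen them: this is the class of bug the address sanitizer pinpoints with exact file/line stack traces (an illustrative snippet, not from that codebase; build with g++ -g -fsanitize=address):

    #include <vector>

    int main() {
        std::vector<int> v{1, 2, 3};
        int* p = &v[0];
        v.push_back(4);  // likely reallocates, invalidating p
        return *p;       // ASan flags this as heap-use-after-free
    }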
Most of the problems were just with understanding the minds of the developers. They were doing something difficult, at a level of complexity that somewhat overmatched the problem most of the time, but the complexity was there to handle the edge cases. I wanted to go around adding comments to the files and classes as I understood bits of it. I was working with one of the original developers, who was of course not at all interested in anyone understanding it or making it clearer, and this kind of effort tended to get shot down.
If you don't have good tests you're dead in the water. I have twice inherited python projects without tests at all and those were a complete nightmare until I added some. One was a long running build process in which unit tests were only partially helpful. Until I came up with a fake android source tree that could build in under a minute I was extremely handicapped. Once I had that everything started to get much better.
My favorite game ... is an open source C++ thing called warzone2100 - no tests. It's not easy to make changes with confidence. I imagine to myself that one day my contribution will be to add some. The problem is that I cannot imagine the current developers taking all that kindly to it. Some people get to competence in a codebase and leave it at that.
by myrmidon on 2/29/24, 5:04 PM
Something that's kinda implied that I would really stress: Establish a "single source of truth" for any release/binary that reaches production/customers, before even touching ANY code (Ideally CI. And ideally builds are reproducible).
If you build from different machines/environments/toolchains, it's only a matter of time before that in itself breaks something, and those kinds of problems can be really "interesting" to find (an obscure race condition that only occurs when using a newer compiler, etc.)
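One small trick in that spirit (a sketch, not the parent's setup): have the single blessed build environment stamp every binary it produces, so stray local builds identify themselves. BUILD_ID is an invented define:

    // Build in CI with e.g.:  g++ -DBUILD_ID="\"ci-abc123\"" main.cc
    #include <iostream>

    #ifndef BUILD_ID
    #define BUILD_ID "unofficial-local-build"  // anything built elsewhere says so
    #endif

    int main() { std::cout << "build: " << BUILD_ID << "\n"; }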
by sega_sai on 2/29/24, 8:53 PM
by cljacoby on 3/1/24, 3:34 PM
I recently started on a new Rust project, and despite not having to worry about things like sanitizers as much, I followed a similar approach of getting it to compile locally, getting it compile in a docker container, setup automated CI/CD against all PRs.
Although I would order the steps as 1, 3, 4, 2. Don't get out the chainsaw until you have CI/CD tests evaluating your code changes as you go.
by joshmarinacci on 2/29/24, 10:26 PM
docs:
https://docs.trunk.io/check/configuration/configuring-existi...
by jxramos on 3/1/24, 8:45 AM
by cesaref on 3/1/24, 2:36 PM
So buy the book, read it, apply the ideas.
by w10-1 on 3/1/24, 1:55 AM
Structure101 is the best way to grok the architecture (er, tangles) of a large code base. They have a trial period that would give you the overview, but their refactoring support is fantastic (in Java at least).
by FpUser on 2/29/24, 11:00 PM
So no. I would not call C++ anything special in this regard.
by AtlasBarfed on 3/2/24, 2:25 AM
2) Ask how long it would take to write from scratch? Only ask if the answer to #1 is "salaried well" btw.
Dumping an unsupported codebase on a new employee is a major dick move, and it requires hazard pay. They are doing that because they are desperate.
by Jean-Papoulos on 3/1/24, 7:23 AM
by rwmj on 2/29/24, 4:07 PM
by dureuill on 2/29/24, 10:28 PM
Look for another job
> You’d be amazed at how many C++ codebase in the wild that are a core part of a successful product earning millions and they basically do not compile.
Wow, I really hope this is hyperbole. I feel like I was lucky to work on a codebase that had CI to test on multiple computers with -Werror.
by JoeAltmaier on 3/1/24, 2:33 PM
Make notes about mysterious things. Check them off once you've figured them out.
Try to find the 'business case' database. You will fail, nobody has one. Make one then, while you're reading the code.
by victorronin on 3/1/24, 8:13 PM
by Kapura on 2/29/24, 4:57 PM
Great advice! People do not often think about the value of de-cluttering the codebase, especially _before_ a refactor.
by geoelectric on 2/29/24, 11:43 PM
by codelobe on 2/29/24, 9:43 PM
#0: Replace the custom/proprietary Hashmap implementation with the STL version.
Once upon a time, C++ academics browbeat the lot of us into accepting the red-black tree as the only Map implementation, arguing (in good faith, yet from ignorance) that the "Big O" (an orgasm joke, among others) worst case scenario (oops, pregnancy) categorized the hash map as O(n) on insert, etc., because naive implementations frequently place hash-colliding keys in a bucket via linked list, or else iterate to "adjacent" buckets. Point being: the One True Objective Standard of "benchmark or die" was not considered, even though the average case is obviously the better deciding factor -- or, as Spock simply logic'd it, "The needs of the many outweigh the needs of the few." Thus it came to pass that the STL shipped without a hash map, even though the worst case is typically trivial (or a non-issue) to avoid, e.g. by iterative re-mapping of the hash map. So many "legacy" codebases built their own hash map implementations to get at that (academically forbidden) effective/average-case insert/access/etc. sweet spot of constant-time "O(1)" [emphasis on the scare quotes: benchmark it and see -- there is no real measure of the algo otherwise, riiight?]. Therefore the afore-prophesied fracturing of the collections APIs, via the STL's failure to fill the niche that a hash map would inevitably have to occupy, came to pass -- who could have foreseen this?!
What is done is done. The upshot is: one can typically familiarize oneself with a legacy codebase, while paying lip service to "future maintainability", by replacing (albeit usually needlessly) custom hash map implementations with the one the C++ standards body eventually accepted, despite the initial "academic" protesting-too-much via "Big O" notation (which is demonstrably a sex-humor-based system, of little use in the practical/average-case world we live in). Yes, once again the apprentice has been made the butt of the joke.
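For reference, the eventually-blessed replacement; average-case O(1), as the homegrown versions were chasing (a trivial sketch):

    #include <iostream>
    #include <string>
    #include <unordered_map>

    int main() {
        std::unordered_map<std::string, int> counts;
        ++counts["inserts"];          // average O(1) insert/update
        counts.rehash(1024);          // the "iterative re-mapping" knob, exposed
        std::cout << counts.at("inserts") << "\n";
    }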
by asah on 3/1/24, 11:07 AM
...but grandpa, how did you know your code worked? Son, we didn't. <silence>
(wait until they hear that source control wasn't used...)
by davidw on 2/29/24, 4:53 PM
by Scubabear68 on 2/29/24, 5:53 PM
Make it compile, automate what you can, try not to poke the bear as much as you can, pray you can start strangling it by porting pieces to something else over time.
by girafffe_i on 3/1/24, 2:48 AM
by bfrog on 3/1/24, 3:57 PM
by m_a_g on 3/1/24, 12:24 PM
by xchip on 3/1/24, 10:18 AM
Even better: change jobs to do something more interesting.
by huqedato on 2/29/24, 4:56 PM
by Cloudef on 3/2/24, 2:28 AM
by jeffrallen on 2/29/24, 8:06 PM
Fun times. Don't work there anymore. Life is good. :)
by elzbardico on 2/29/24, 9:30 PM
by btbuildem on 3/1/24, 2:44 PM
Nope. Make a portable/virtualized env, and make it build there. That way it'll build on your machine, in the CI pipeline, on your co-worker's machine, etc etc.
> linters, fuzzing, auto-formatting
No, for at least two reasons:
1) Too risky, you change some "cosmetic" things and something will quietly shift in the depths, only to surface at the worst possible time
2) Stylistic refactors will bury the subtle footprints of the past contributors - these you will need as you walk through the tunnels on your forensic missions to understand Wat The Fuk and Why
Generally, touch as little as possible, focus on what adds value (actual value, like, sales dollars). The teetering jenga tower of legacy code is best not approached lightly.
Work with PMs to understand the roadmap, talk to sales and support to understand what features are most used and valuable. The code is just an artifact, a record of unhappy accidents and teeth-grinding compromises. Perhaps there's a way to walk away from it altogether.
by cratermoon on 2/29/24, 5:01 PM
by cloudhan on 3/1/24, 2:40 AM
by Merik on 2/29/24, 9:08 PM
by Dowwie on 3/1/24, 11:37 AM
by leecarraher on 2/29/24, 4:13 PM
by professorTuring on 2/29/24, 8:20 PM
by sealeck on 2/29/24, 4:38 PM
Problem solved
by throw_m239339 on 3/1/24, 4:10 AM
by nottorp on 3/1/24, 11:06 AM
... and goes on against using system packages.
Well if you're not using what your OS provides, why don't you statically link?
After all, it's the customer who pays for the extra storage and ram requirements, not you.
by bun_terminator on 2/29/24, 4:36 PM
like c++11 and later?
by dirkc on 3/1/24, 8:08 AM
1. Source control
2. Reproducible builds
3. Learn the functionality
4. ...
If you don't understand what the code does, you're probably going to regret any changes you make before step 3!
by EvgeniyZh on 3/1/24, 1:20 AM
by hesdeadjim on 2/29/24, 4:34 PM
by girafffe_i on 3/1/24, 3:05 AM
by BlueTemplar on 3/1/24, 6:23 AM
Ah yes, nothing like a veiled insult towards the people who might need that advice the most, blaming them for not using a developer paradigm that you fail to even name and for which you only give a hard-to-search two-letter acronym! (/s)
(CI stands for Continuous Integration: https://en.wikipedia.org/wiki/Continuous_integration)
by beanjuiceII on 2/29/24, 8:05 PM
by legacybob on 3/1/24, 9:59 AM
I then sit on my legacy chair to browse the Internet and read about brand new programming things (through a legacy monitor).
by jujube3 on 2/29/24, 9:11 PM
But you should rewrite it in Rust.