from Hacker News

Did missing/corrupt dates in COBOL default to 1875-05-20?

by SeenNotHeard on 2/16/25, 11:56 PM with 499 comments

  • by Aloisius on 2/17/25, 3:24 AM

    The SSA's master records, the Numerical Identification (NUMIDENT) files, store dates in text as either CCYYMMDD or MMDDCCYY strings according to their archived versions.

    https://www.openicpsr.org/openicpsr/project/207202/

    I have a hard time believing the DB2 systems would convert it to days/seconds/whatever since 1875. It's not impossible, but I think whoever came up with the 1875 thing was simply wrong.

    That's not to say there are 150 year olds collecting social security either. Dates of birth are sometimes missing or entered wrong and sometimes death records don't get entered. It's also clear DOGE didn't understand that social security numbers can't be used as unique identifiers (nor why it's unnecessary), which can lead to all sorts of issues when processing.

    Edit: It also seems the SSA presumes anyone over 115 has died and halts payments, which makes it even more unlikely there are 150 year old beneficiaries: https://secure.ssa.gov/poms.nsf/lnx/0202602578
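
    To illustrate what "dates stored as text" implies, here is a minimal Python sketch that validates an 8-character date string in either of the two layouts mentioned above. The function name, the layout keys, and the choice to return None for missing or impossible dates are assumptions for illustration, not anything taken from the NUMIDENT documentation.

```python
from datetime import date, datetime
from typing import Optional

# Layout names follow the comment above; the mapping itself is illustrative.
FORMATS = {"CCYYMMDD": "%Y%m%d", "MMDDCCYY": "%m%d%Y"}


def parse_text_date(raw: str, layout: str) -> Optional[date]:
    """Return a date, or None if the field is blank, non-numeric,
    or not a real calendar date (hypothetical helper)."""
    text = raw.strip()
    if len(text) != 8 or not (text.isascii() and text.isdigit()):
        return None
    try:
        return datetime.strptime(text, FORMATS[layout]).date()
    except ValueError:
        # e.g. month 13 or February 30
        return None
```

    The point of the sketch is that a text field has no built-in "zero" that turns into 1875: a blank or garbage value simply fails to parse, and whatever happens next is an application decision.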

  • by bagels on 2/17/25, 9:24 PM

    Social security office knows that there are records in that database without confirmed deaths.

    They use multiple techniques and data sources to determine who to send benefits to.

    This is not news to the SSA.

    https://oig.ssa.gov/assets/uploads/a-06-21-51022.pdf

    "AGENCY COMMENTS SSA disagreed with our recommendations. Agency officials stated that most of the records discussed in the report involve numberholders who do not currently receive SSA payment"

    So, they can do better, but sure, they are sending some relatively small number of checks out to dead people. That doesn't mean Musk needs to lie about the program as an excuse to cut the whole thing, which is actually what we see playing out.

  • by cwbriscoe on 2/17/25, 1:52 AM

    I have been working on COBOL systems for quite a while now. Currently, and for most of my career, we have almost always used DB2 compatible dates ("CCYY-MM-DD").

    Pre-Y2k a lot of dates were created with:

    ACCEPT WS-DATE FROM DATE.

    The above WS-DATE was in YYMMDD format, which is why there was a Y2K issue and needed to be resolved with windowing code. However, windowing code wouldn't work for somebody that was over 100 years old...
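
    A rough Python sketch of fixed-window Y2K code, to show why it breaks for centenarians. The pivot value of 30 and the function name are hypothetical; real systems chose their own pivots.

```python
def window_year(yy: int, pivot: int = 30) -> int:
    """Expand a two-digit year using a fixed window.

    Years below the pivot are treated as 20yy, the rest as 19yy.
    The pivot of 30 is an arbitrary illustrative choice.
    """
    if not 0 <= yy <= 99:
        raise ValueError("two-digit year expected")
    return 2000 + yy if yy < pivot else 1900 + yy
```

    With this window, a stored year of 75 becomes 1975, but someone born in 1920 (stored as 20) comes back as 2020. A two-digit year can only distinguish 100 distinct years, so no choice of pivot fixes that.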

    Doing a little research there is also (which I have never used since we just use DB2 and date parameter input files for CCYYMMDD dates):

    ACCEPT WS-CENTURY-DATE FROM CENTURY-DATE.

    This date is in CCYYMMDD format. According to google, the epoch for this date is January 1st, 1601.
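
    For what a 1601 epoch would mean in practice, here is a small Python sketch of a day-count conversion. It assumes the convention that 1601-01-01 is day 1 (so 1600-12-31 is day 0); the function names are hypothetical and this is not a claim about how any particular COBOL runtime implements it.

```python
from datetime import date, timedelta

# Assumption: day 1 is 1601-01-01, so day 0 is the day before.
DAY_ZERO = date(1600, 12, 31)


def integer_of_date(d: date) -> int:
    """Days elapsed since DAY_ZERO (hypothetical helper)."""
    return (d - DAY_ZERO).days


def date_of_integer(n: int) -> date:
    """Inverse of integer_of_date."""
    return DAY_ZERO + timedelta(days=n)
```

    Under this convention a zeroed day count maps to the end of 1600, not to 1875.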

  • by hypeatei on 2/17/25, 1:10 AM

    It was posted in the comments that ISO 8601, at one point, mentioned 1875-05-20 as a reference date. According to Wikipedia, it was later omitted[0]. I guess it's possible that the social security system (EDIT: that we know today) was initially designed with that date as a sort of epoch. Either way, it seems nuanced and no one has the full story (including Elon)

    [0]: https://en.wikipedia.org/wiki/ISO_8601

  • by Devasta on 2/17/25, 9:32 AM

    Whether or not Musk is right about 150 year olds getting Social Security doesn't matter, he just wants to destroy the administrative state. So long as that happens, they'll be perfectly happy with the outcome even if they get proven wrong on some technicalities later.

    Prove Musk wrong on this and he'll just go about his day as normal; 20 minutes later there'll be someone tweeting an unhinged screed about how the US government is spending 10 trillion this year changing the name of the Department of Homeland Security to the Department of Homeland Inclusivity to which he'll quote tweet "Interesting" and then he'll set his little band of freaks to cause mayhem somewhere else.

  • by spullara on 2/17/25, 7:14 AM

    Y'all are trying to be very specific about this 150 year old thing when there are a vast number of people with ages above 100 that are in the database:

    https://x.com/elonmusk/status/1891350795452654076

       ...
       100-109 4,734,407
       110-119 3,627,007
       120-129 3,472,849
       130-139 3,936,311
       140-149 3,542,044
       150-159 1,345,083
       160-169 121,807
       170-179 6,087
       180-189 695
       190-199 448
       200-209 879
       210-219 866
       220-229 1,039
       240-249 1
       360-369 1
  • by totallynothoney on 2/17/25, 4:04 AM

    It's gonna be hilarious if: this is a data entry error (1957?), the payments are to a surviving widow because the dates just fit [0][1], simply the query was wrong, or more probably Musk just lied.

    [0]. https://en.wikipedia.org/wiki/American_Civil_War_widows_who_... [1]. https://en.wikipedia.org/wiki/Ida_May_Fuller

  • by layer8 on 2/17/25, 2:10 AM

    This is the best debunk I could find: https://iter.ca/post/1875-epoch/
  • by jiggawatts on 2/17/25, 3:49 AM

    I like this comment:

    ----

    There was a time when data structures were made to fit purpose, not compilers. Having a look at the subject shows clearly the constraints for valid dates:

        Social Security was introduced in 1935.
        To be eligible for benefits one had to
            pay in at least 40 quarters, that's 10 years
            be at least 65 years old
    
    This means the first regular beneficiaries of social security payments were 65 in 1945, aka of the 1880 cohort. Virtually no one participating in this system can be born before 1880. Anyone older will most likely not be a beneficiary, and anyone younger (aka still paying in) will be, well, younger.

    So add another 5 years for wiggle room and we end at a nice round 1875 as earliest year for any birthday to be recorded.

    A perfectly rational basis for a date entry, isn't it?

    ----

    The "COBOL doesn't work like that" comments are missing the forest for the trees: This is a very old system with bespoke coding to match legislation, not legislation to match compiler default behaviour.

    Fundamentally, unless a government employee that has worked directly on this codebase speaks up, we're all just guessing.

  • by roshin on 2/17/25, 5:25 AM

    This is why I like hn. Many times I read about all of the horrible things that the current administration does. Due to so many cases where I know the news is wrong I stopped trusting anything. However, I feel like here I can let my guard down a bit and be more certain that a specific criticism is true.
  • by noobermin on 2/17/25, 5:59 AM

    I love how everyone is talking technical details whilst ignoring where it came from. This eliding of context and focusing on a technical question is a great way for people trying to cope with either stress if you're against it or with criticism if you're for it.
  • by bobnick on 2/17/25, 2:15 PM

    I worked with COBOL long before DB2. I also worked on the COBOL compiler in the 70s. Unless things changed drastically, if a value wasn't initialized the compiler left garbage in the variable when the program was loaded. If you were lucky this caused the program to ABEND when you tried to use it; if there was no ABEND you got strange results. It was up to the programmer to set a default date if one was required by the application. Many applications started with VSAM, which did not care if a date in a record was invalid. This caused many systems to set default dates when converting to DB2. DB2 does not allow garbage to be loaded into a date field. COBOL does not initialize data unless instructed to do so.
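
    A minimal Python sketch of the kind of conversion step described above: a fixed-width date field is validated and, if it holds garbage, replaced with a default before loading into a database that rejects invalid dates. The placeholder value, the function name, and the assumption of ASCII (rather than EBCDIC) input are all hypothetical.

```python
from datetime import date, datetime

# Hypothetical placeholder chosen for this sketch; each application
# would have picked its own default during conversion.
PLACEHOLDER = date(1900, 1, 1)


def to_db_date(raw: bytes) -> date:
    """Convert an 8-byte CCYYMMDD field to a date, substituting
    PLACEHOLDER when the field is uninitialized or invalid."""
    try:
        text = raw.decode("ascii")  # UnicodeDecodeError is a ValueError
        if len(text) != 8 or not text.isdigit():
            raise ValueError("not an 8-digit field")
        return datetime.strptime(text, "%Y%m%d").date()
    except ValueError:
        return PLACEHOLDER
```

    Whatever default such a step picks is a programmer's decision baked into that one application, which fits the point that it is not something the COBOL compiler does on its own.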
  • by paulsutter on 2/17/25, 5:32 AM

    Here’s a tweet from Elon with a table of ages of all people marked not deceased who are collecting social security

    (hint: there is a smooth trend of people of all ages up to 199, so the 1875 thing was pure misdirection)

    https://x.com/elonmusk/status/1891350795452654076?s=46&t=NN3...

  • by ZeroGravitas on 2/17/25, 9:48 AM

    Related document for how we used to handle this before the "oligarchs tweeting slander" approach gained support:

    https://oig.ssa.gov/assets/uploads/a-06-21-51022.pdf

    Numberholders Age 100 or Older Who Did Not Have Death Information on the Numident

    > The attached final report presents the results of the Office of Audit’s review. The objective was to determine whether the Social Security Administration had effective controls to annotate death information on the Numident records of numberholders who exceeded maximum reasonable life expectancies. Please provide within 60 days a corrective action plan that addresses each recommendation. If you wish to discuss the final report, please call me or have your staff contact Michelle L. Anderson, Assistant Inspector General for Audit

    They decided not to do anything about it (e.g. add a "presumed dead" field) because they thought it would be a waste of money!

    > In response to our 2015 report, SSA considered multiple options, including adding presumed death information to these Numident records. SSA ultimately decided not to proceed because the “. . . options would be costly to implement, would be of little benefit to the agency, would largely duplicate information already available to data exchange consumers and would create cost for the states and other data exchange partners.”16 SSA also believed a regulation would be required to allow it to add death information to these records, and adding presumed death information to the Numident would increase the risk of inadvertent release of living individuals’ personal information in the DMF.

    Submitted here in case anyone wants to discuss the SHOCKINGLY boring REVELATIONS contained within:

    https://news.ycombinator.com/item?id=43077199

  • by Sniffnoy on 2/17/25, 3:49 AM

    However, the MUMPS programming language (which is still commonly used in various medical stuff) does use December 31, 1840 as its epoch. (It doesn't have a separate date type, but it does have date-handling functions which operate on numbers and use this as the epoch.)
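
    For comparison, a short Python sketch of an epoch-based day count using the December 31, 1840 epoch mentioned above. It assumes day 0 is the epoch date itself; the function names are hypothetical and this is not MUMPS code.

```python
from datetime import date, timedelta

MUMPS_EPOCH = date(1840, 12, 31)  # assumption: this date is day 0


def day_count_to_date(days: int) -> date:
    """Map a day count relative to MUMPS_EPOCH to a calendar date."""
    return MUMPS_EPOCH + timedelta(days=days)


def date_to_day_count(d: date) -> int:
    """Inverse of day_count_to_date."""
    return (d - MUMPS_EPOCH).days
```

    In a scheme like this, a zeroed field really does decode to the epoch date, which is the behaviour people were attributing to COBOL.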
  • by hans_castorp on 2/17/25, 6:11 AM

    I worked on a COBOL system in the early 90s that stored a one-digit year :)

    However, the records were never stored for more than 4 years, so this was never a problem.

  • by blindriver on 2/17/25, 1:03 AM

    I keep getting gobsmacked by how much misinformation and straight up lies there are on the internet these days. And what's worse is that I keep falling for it like everyone else, even though I pride myself on being so skeptical about everything. I remember reading that last week and thinking "oh, interesting" and now I'm angry at myself for not questioning that more, especially since I worked at a bank.

    With so much manipulated information, AI-generated content, and straight up lying, I really can't tell what's real and fake anymore.

    I distinctly remember finally not being able to tell the difference between fake and real info during the Allen Texas shopping mall shooting. I went on Twitter to get more info and I couldn't tell what was real and what was fake for the first time because everything was so convincing. That feels like ages ago now because things are so much more sophisticated.

  • by blame-troi on 2/17/25, 1:15 PM

    Ignorant. When I learned COBOL, which would have been contemporary to many of the original systems, data types were numbers of various formats, characters, primitive fixed length strings, and bits. There was no data type for dates. It would have been roll your own. Pre UNIX being mainstream we used '7-4-5' dates in assembly based from 1900 (this was a financial business but not the IRS).

    This isn't a COBOL issue (if it's even an issue at all), it's a data design issue. As many have pointed out, there are reasons for this possible origin date.

  • by jmclnx on 2/17/25, 1:32 AM

    Not on the System (Wang VS) I worked on in the 80s. Plus in many cases dates used a 2 digit year. So 1875 would be seen as 1975.

    Also I never heard of this default.

  • by SandyAndyPerth on 2/26/25, 6:38 AM

    As I just posted in a thread https://dev.to/mdchaney/cobol-dates-may-20-1875-and-disinfor...

    Nobody in this HN thread has used the word "sentinel" - see another HN about the concept https://news.ycombinator.com/item?id=36195425

    People got hung up on:
    - "COBOL defaults to..." rather than "banking practices are..."
    - epoch start dates
    - many pointing out COBOL didn't use epochs or counts, just much-damned YYDDD or YYMMDD actual strings.

    Also, Elon loves to stir with partial misinfo hence his tweet https://x.com/elonmusk/status/1891350795452654076 with the breakdowns by age bracket. "Death set to FALSE" means "Death date not known" but that's not clickbaity enough.

    That long tail looks awfully like data entered from historical records lacking death dates - there have been a few discussions of the cost of finding death dates and the decision to avoid spending $millions on it, as this is not data used to make payments.

    You would expect, in a system that's pulling data from many sources, to see historical jumps in data cleanup like this. Imagine a few large states finally get around to digital records of deaths, so their data is easily aggregated - you get a sudden flushing of people who would previously have been left on the list. However, this will only apply from a certain age onwards as those sources in turn don't have the time/budget/interest to digitise really old records.

  • by Calzifer on 2/19/25, 3:26 PM

    I'm late to the party but want to add that "reference date 1875-05-20 defaults to 150 years" explanation makes no sense to me.

    Assume the reference date is correct and an unknown birth date is stored as timestamp 0. Since we are before 2025-05-20 at the moment, the reported age has to be 149 years, not 150. What am I missing? It would be very unusual to round up an age.
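
    The arithmetic can be checked with a few lines of Python. This is the usual "completed years" age calculation; the function name is just for illustration.

```python
from datetime import date


def age_in_years(born: date, today: date) -> int:
    """Completed years between born and today (no rounding up)."""
    years = today.year - born.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (born.month, born.day):
        years -= 1
    return years
```

    Calling age_in_years(date(1875, 5, 20), date(2025, 2, 19)) gives 149; the result only becomes 150 on or after 2025-05-20.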

  • by n0denine on 2/20/25, 6:47 PM

    Perhaps there is a default date of 1875 if the date of birth isn’t known. When slavery ended just before 1875, a large number of them never received an actual birth certificate while enslaved, and likely didn’t know their date of birth… so it would go with the system default.
  • by throw0101d on 2/17/25, 1:48 PM

  • by MrCOBOL on 2/17/25, 3:17 PM

    No! It does not - Someone coded it to default or entered the data that way, and no edit prevented it - and it sounds like no regular maintenance (even a reporting process) is present to indicate ages > 100 (which may raise a flag to be looked into).
  • by hbarka on 2/17/25, 1:48 AM

    The question of unknown and how to default this value in databases. Since null is not usually possible, there have been hacks on how to do it. 1-1-9999 anyone?
  • by dashundchen on 2/17/25, 4:30 AM

    Why are we taking Musk at his word when his current MO is to cast doubt and mistrust on government spending?

    Musk, Trump and the admin have already been pushing so many outright lies via their propaganda channels. Despite being debunked the lies are repeated nonstop.

    The lie that USAID spent $50 million on condoms in Gaza (the money was for running hospitals) for example.

    Or the lie about funding an opera about a transgender woman (the money was for a university in Colombia, unrelated to a performance put on at the school).

    They are spewing lies left and right. Lies go twice around the block before the truth has put its pants on.

    Why should we believe him on this?

  • by Modulius on 2/18/25, 6:53 PM

    Probably will go unnoticed or downvoted because of twitter link but here is the main goal of spewing such a bullshit:

    https://x.com/KariLake/status/1891841704703013067

    Look at the comments. They know that COBOL defaults to 1875, point is that propaganda pundits look for any reason to spew toxic misinformation and rile 99.999% of uninformed sheeple that voted for orange felon.

  • by jedwards1211 on 2/19/25, 7:02 AM

    It's horrifying to see even the Associated Press repeating this claim as truth. AP links to the Wired article, which links to the same old ExTwitter post that originated the claim. Neither did any fact checking ffs
  • by UltraSane on 2/17/25, 8:43 AM

    Elon Musk is a liar with no credibility whatsoever. Don't believe anything he says.
  • by jamesrom on 2/17/25, 6:54 AM

    Musk said: “Crazy things like just cursory examination of Social Security and we’ve got people in there that are 150 years old.”

    He qualified this claim as a “cursory examination”. It’s clearly a comment about the quality of the data and systems. That this is the kind of thing that would be prone to fraud.

    Before you hit downvote, please provide evidence that you didn’t hallucinate Musk’s claims here.

  • by NoPicklez on 2/17/25, 3:24 AM

    The part of all of this which I have a problem with from the outside, is that it seems extremely irresponsible to address the White House and the public to say you have found people that are 150 years old receiving social security payments in such a provocative way. As if you have so simply stumbled across this error that has gone unnoticed.

    There’s been no formal inquiry, there was no mention of any of the checks and balances that may have been occurring, there’s been no nuance to the argument from him at all. Okay maybe there are people that look to be 150 years old, is there a reason why? Were they actually being paid the social security or was there a legitimate exception? Maybe people were but it was so few and far between and was an internal controls issue which all governments and companies have globally or was none of it true and your team simply assumed they were being paid.

    After reading an article where Gov security experts were worried because they were having to give Elon’s team access to PuTTY and SQL tools, it seems like people are going through this data and making inferences that may not be completely true or vetted.

  • by laurent_du on 2/17/25, 11:30 AM

    So the bottom line is that a left-wing activist put out some lies in order to discredit Musk, and every single reader of the original misinformation piece swallowed it without any attempt to analyze it critically. And they say that right-wingers are the ones who are more amenable to propaganda?

    Yet everyone's conclusion is, as usual, Musk bad. Nobody cares about left-wing fake news if it is useful to the agenda being pushed forward.

  • by msie on 2/17/25, 10:43 PM

    Imagine all the wasted cycles dealing with Trump's/Elon's/DOGE misadventures.
  • by cjbgkagh on 2/17/25, 3:45 PM

    I assume this was in response to Elon's claim of finding a number of people aged 150 years old, and the response that this was due to a COBOL default date and was an example of the people doing the data mining being incompetent.

    Someone would have to be really incompetent to find a spike at precisely 150 years old and not investigate it further. Elon tweeted ~ 10 hours ago the age breakdown and there does not appear to be a spike at 150 so if that information is correct then this is no longer evidence of incompetence.

  • by ggm on 2/17/25, 1:11 AM

    I find the stack exchange interesting, but I still think it's post-hoc reasoning. I would prefer one of the people who worked on these systems to say it, than have a plausible, but nonetheless third hand take published as "the answer"