by Asparagirl on 1/13/25, 4:22 AM with 72 comments
Back in September 2017, our organization made a Freedom of Information Act (FOIA) request to the US Department of Veterans Affairs (the VA) asking for a copy of a database they maintain called "BIRLS", which stands for the Beneficiary Identification Records Locator Subsystem. While it's not exactly an index of every single post-Civil-War veteran of every branch of the US military, it's possibly the closest thing that exists to it.
BIRLS is a database that indexes all the known-to-the-VA-in-or-after-the-1970s *veterans' benefits claims files*, also called C-Files or sometimes XC-Files. Older veterans' claims files have been moved to the National Archives (NARA), such as the famous Civil War pension files. But 95% of the later benefits claim files, from the late nineteenth century up to today, are still held at the VA, in their warehouses, and still haven't been sent to NARA.
And even if you know these files exist, the VA really doesn't make it easy to get them. The Veterans Benefits Administration (VBA) group within the VA only seems to accept FOIA requests for copies of C-Files by fax (!) and also seems to have made up a whole new rule whereby you have to have an actual wet ink signature on your FOIA request, not just a typed letter.
Well, seven years and one very successful FOIA lawsuit in SDNY against the VA later, we at Reclaim The Records are very proud to announce the acquisition and first-ever free public release of the BIRLS database, AND that we built a new website to make the data freely and easily searchable AND that we even built a free FOIA-by-FAX-API system (with a signature widget, to get around the dumb new not-FOIA rules!) built into our website's search results, that makes it much, much easier for people to finally get these files out of the VA warehouses and into your mailbox. :-)
We also added the ability to do searches through the data for soundalike names, abbreviated names, common nicknames, wildcards, searches by date of birth or death, or ranges of birth and death years, or search by SSN, or by branch(es) of service, or by gender...
For a lot more information about our FOIA lawsuit against the VA for the database, including copies of our court papers and the SDNY judge's order:
https://mailchi.mp/reclaimtherecords/the-birls-database-goes...
As for the tech stuff, actually building the website, the search engine, and its FOIAing capability...well, it has been a pretty fun project to build.
The BIRLS dataset was eventually provided to us by the VA (several years after we originally asked for it...) as a large zip file which, when decompressed via the command line, yielded the hilarious file name of *Redacted_Full.csv*. I then loaded the cleaned CSV data into a MySQL database, and then used a modified version of the Apache Solr search engine to index the data, so that it could become searchable by soundalike names (using Beider-Morse Phonetic Matching), nicknames (using Solr's synonyms feature), partial names (using wildcards), with dates converted to ISO 8601 format to enable both exact date and date range searches, and various other search criteria.
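To make the date step a bit more concrete, here is a minimal Python sketch of normalizing dates to ISO 8601 and building a year-range clause. It is only an illustration, not the site's actual code: the input formats in `ASSUMED_FORMATS`, the helper names, and the `birth_date` field name are all assumptions, since the layout of the VA's CSV and the Solr schema aren't described here. The bracketed `[start TO end]` form is the usual Solr range query syntax.

```python
from datetime import datetime

# Hypothetical input layouts; the real CSV's date format is not described in the post.
ASSUMED_FORMATS = ("%m/%d/%Y", "%Y%m%d", "%Y-%m-%d")


def to_iso_date(raw):
    """Return the date as YYYY-MM-DD, or None if no assumed format matches."""
    text = raw.strip()
    for fmt in ASSUMED_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next assumed layout
    return None


def year_range_query(field, start_year, end_year):
    """Build an inclusive range clause covering whole calendar years."""
    return (
        f"{field}:[{start_year}-01-01T00:00:00Z "
        f"TO {end_year}-12-31T23:59:59Z]"
    )
```

With the dates stored in one consistent format, an exact-date search and a range search become the same kind of comparison, e.g. `year_range_query("birth_date", 1920, 1929)` for everyone born in the 1920s.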
The front-end of the website is built with Nuxt and hosted on Digital Ocean's App Platform, with backups of the FOIA request data on the cloud storage service Wasabi. The fax interface for submitting FOIA requests is powered by the Notifyre API. We use Mailchimp to send e-mail newsletters, and their product Mandrill for programmatic e-mail sending. We use Sentry for error monitoring, Better Stack for server logging, and TinyBird to collect FOIA submission analytics.
Enjoy!
by ldoughty on 1/13/25, 12:48 PM
1) May want to auto-magically handle input for things like apostrophes. E.g. "O'Hare"... It looks like somewhere in the process this data was not preserved/saved/sent, but people will probably try to search with it. Might also want to handle the accent marks and whatnot too.
2) In the terms & conditions for Step 3, the checkbox at the bottom doesn't have enough contrast when checked. I do not have a disability, and I still found it very faint. Someone with a disability would likely have a lot of trouble (not to mention, it requires scrolling to the bottom to check it in the first place, which isn't awesome for accessibility)
3) I appreciate the warning on the terms and conditions about seeing things you might not want to see. A good reminder for those that might not want to tarnish a memory of someone... Reminds me of the DNA tests for Christmas, or learning about Punnett Squares and genetics, sometimes you might not want to go looking :-)
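For point 1, a rough sketch of the kind of input normalization I mean, in Python. This assumes the indexed names are stored upper-case without punctuation or accents, which I haven't verified; `normalize_name` is a made-up helper, not anything from the site.

```python
import unicodedata

# Straight, curly, and backtick apostrophes that users might type.
APOSTROPHES = ("'", "\u2019", "\u2018", "`")


def normalize_name(name):
    """Strip accents and apostrophes, collapse whitespace, and upper-case."""
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    for mark in APOSTROPHES:
        stripped = stripped.replace(mark, "")
    return " ".join(stripped.upper().split())
```

So a search for "O'Hare" would be looked up as "OHARE", and "José  Núñez" as "JOSE NUNEZ". Running the same function over the indexed names would keep both sides consistent.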
by wtfssn38 on 1/13/25, 12:43 PM
by patwolf on 1/13/25, 1:57 PM
On one hand, if this works then I'll be happy to have the information I otherwise wouldn't have. But on the other hand, all these processes, no matter how convoluted, exist for a reason. It feels weird bypassing those.
by mattw2121 on 1/13/25, 1:26 PM
by necovek on 1/13/25, 1:41 PM
OTOH, if you have really successfully worked to make this database public domain and do publish it somewhere (and you did, as I can see at https://archive.org/details/BIRLS_database), this wouldn't be of much help against any malicious actors out there.
But really, it seems the burden is on VA if there are non-deceased persons in the database since they have done a bad job of maintaining the data, and they would be liable for any leakage of information (unless Reclaim the Records was aware of any in particular). Even so, RTR might have put themselves out on the fence for some lawsuits against them too.
by fergbrain on 1/13/25, 1:10 PM
Reminds me a bit of muckrock.com as well.
by ungreased0675 on 1/13/25, 1:01 PM
by archerjax on 1/13/25, 12:52 PM
by Bjartr on 1/13/25, 1:15 PM
> these materials were largely unknown and inaccessible to historians, journalists, and genealogists
I think it would be worthwhile to lead with that and include a little more detail too.
If there isn't a clear motivation, people will assume the worst.
by neilv on 1/13/25, 2:00 PM
* Intent is to sell the data, or otherwise "monetize" it, in the techbro sense.
* "Shell" effort of a specific company that wants the data.
* Shell effort of an organized crime group.
* Shell effort of a foreign intelligence agency, or terrorist group.
A while ago, there was a different project, which had the effect of making different US records, which were already reasonably accessible to US citizens and journalists, easily available to foreign adversaries, such as for espionage profiling and blackmail. When that project was promoted on HN, I caught the promoter seeming to use a sockpuppet account in the comments (accidentally using the wrong account to respond to themself), which I found additionally suspicious.
Even when a project is fully honest and with good intentions, we also have to consider the risks of likely other consumers of the data, which include all the possibilities above.
by redeux on 1/13/25, 12:28 PM
Veterans aren’t politicians, and they don’t deserve to have their lives put on display like this without their permission. Some vets signed up because they wanted to serve their country, some because they were running from people or poverty, but they were all just ordinary people trying to eke out a living.
I believe people, good people just trying to do their thing, will be hurt by this information, and that’s unfair. It’s just another example of people using veterans as pawns to achieve their ends.
What are the ends in this case? I couldn’t tell you. I do believe this will have a chilling effect on veterans seeking help from the VA at a time when they need it more and more.
by flippyhead on 1/13/25, 2:00 PM
by tivert on 1/13/25, 3:48 PM
by greentxt on 1/13/25, 4:59 PM
by tantalor on 1/13/25, 4:27 PM
Ha, the only device I have with DVD is a PS5, this should be fun.
by asacrowflies on 1/14/25, 12:52 AM
Only jarheads seem to think the parental tone of "you don't know what freedom is" actually works.... Maybe because they have been thru boot camp idk.