from Hacker News

I record myself on audio 24x7 and use an AI to process the information

by roberdam on 11/15/22, 12:43 PM with 400 comments

  • by twobitshifter on 11/15/22, 3:24 PM

    This is known as life logging with adjacency to sousveillance and it’s a fascinating topic.

    https://en.m.wikipedia.org/wiki/Lifelog https://en.wikipedia.org/wiki/Sousveillance

    We in general don’t want to be watched by others, but a managed record of our own activities can be extremely valuable, and even more so if you find yourself wrongly accused. Further, it can be used to shine a light on corrupt officials; one example is the nycplacards exposés on Twitter.

  • by roberdam on 11/15/22, 3:39 PM

    Since everyone is interested in the hardware:

    https://www.aliexpress.us/item/3256803349510543.html

    https://www.aliexpress.us/item/3256803085687061.html

    One was chosen for its battery and the other for its size. Both are generic and come with the same software and BIOS, and are sold by several vendors. If I could buy something better, I would look for one that accepts a lavalier microphone.

  • by Void_ on 11/15/22, 2:09 PM

    Not as hardcore as OP, but after Whisper came out, I quickly built an app that allows me to record from lock screen: https://whispermemos.com/
  • by jconley on 11/15/22, 7:19 PM

    This is a cool project. One of my pet ideas that I haven't done is to build a home assistant where all data is stored and processed by a home "server". The biggest benefit I see is that it could truly be omnipresent. There in the background, answering questions, jumping into your conversations without prompt. And it's much less creepy if all that data isn't going to someone else's computer.

    Also piping in and processing the data from my mobile would be cool, but I wouldn't want to invade other people's privacy if I'm in public.

  • by roberdam on 11/15/22, 10:09 PM

    MORE INFO ON THE DEVICES:

    https://www.aliexpress.us/item/3256803349510543.html

    https://www.aliexpress.us/item/3256803085687061.html

    Both recorders use the same generic BIOS. There is a .txt file called FACTORY.TXT, and you configure the device by changing the values in that file. This is the content of the file:

    ---------------

    TYP:1 (0:WAV 1:MP3)

    VOR:0 (0:voice-activated off 1-7:voice-activated sensitivity,higher means record less)

    BIT RATE:2 (0:32Kbit 1:64Kbit 2:128Kbit 3:192Kbit 4:Translate ON 5:512Kbit 6:768Kbit 7:1024Kbit 8:1536Kbit 9:3072Kbit)

    GAIN:5 (0-7 record sensitivity 8 grades)

    SECTION:(30) (1-999 record time exceed this,file will auto save,uint minutes)

    DATE:2022-10-15 (year-month-day)

    TIME:08:36:24 (hour:minute:second)

    TIMER:1 (timer record 1:on 0:off)

    START:08:39:32 (timer record start time)

    TIMELONG:(120) (1-720,timer record length,uint is minute)

    CYCLE:(030) (1-999,how many dyas,0:everyday)

    --------------------------
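
    A minimal sketch of reading these settings programmatically, assuming every line follows the `KEY:VALUE (comment)` shape shown above. The function name `parse_factory_txt` is hypothetical, and the device's own parsing rules are not documented in this thread, so treat this as illustrative only:

```python
def parse_factory_txt(text):
    """Parse 'KEY:VALUE (comment)' lines into a dict of strings."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and the dashed separator lines.
        if not line or set(line) == {"-"}:
            continue
        # Split on the first colon only, so "08:36:24" stays intact.
        key, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue
        # The value is the first whitespace-delimited token; some values
        # are wrapped in parentheses, e.g. "(30)".
        value = rest.split()[0].strip("()")
        settings[key.strip()] = value
    return settings


sample = """\
TYP:1 (0:WAV 1:MP3)
BIT RATE:2 (0:32Kbit 1:64Kbit 2:128Kbit 3:192Kbit)
SECTION:(30) (1-999 record time exceed this,file will auto save,uint minutes)
TIME:08:36:24 (hour:minute:second)
"""

settings = parse_factory_txt(sample)
print(settings)
```

    All values are kept as strings here; converting them to numbers or times is left out because the valid ranges are only known from the inline comments in the file.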

    I got the 32gb version of the bigger one and the 16gb version of the smaller one.

    I configured the device to save a file every 30 minutes. Each 30-minute MP3 file takes about 28 MB, so around 56 MB per hour at 128 kbps.
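
    The storage figures can be checked with a short sketch, assuming a constant bitrate and ignoring MP3 headers and metadata (the function name is illustrative):

```python
def mp3_size_bytes(bitrate_kbps, minutes):
    """Size of a constant-bitrate stream: bits per second x seconds / 8."""
    return bitrate_kbps * 1000 * minutes * 60 // 8


per_file = mp3_size_bytes(128, 30)       # one 30-minute file
per_hour = mp3_size_bytes(128, 60)
per_day = mp3_size_bytes(128, 24 * 60)

print(per_file / 1e6, "MB per 30-minute file")
print(per_hour / 1e6, "MB per hour")
print(per_day / 1e9, "GB per day")
```

    In decimal units this works out to about 28.8 MB per 30-minute file, 57.6 MB per hour and roughly 1.4 GB per day, so a 32 GB card would hold a little over three weeks of continuous recording.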

  • by justinlloyd on 11/15/22, 4:55 PM

    Interesting work, glad to see I am not the only crazy one left in the life logging scene after all these years. Have been lifelogging since 2004-ish, and built a few custom bits of software and hardware to support it. I don't record 24x7 anymore, but I used to. Now my recordings are limited mostly to my office environment, and when I am out and about using a Sensecam-like device with custom firmware. When in my office I capture video, audio and depth data from multiple view points, along with images of the desktop of whatever computer I am on, and process most of it on a Jetson.

    How's the audio quality on those devices you link to in other comments? I find I pick up a lot of ambient noise when outside of the office, and always struggled to come up with a viable algorithm and model to differentiate "background chatter" from the main conversation, and it is a problem I've never really managed to solve so I am interested in your experiences on the subject.

  • by bane on 11/16/22, 5:52 AM

    This is really interesting, and many of the comments here go into the utility of this. However, verify that you aren't recording somebody else without their consent; in many places it is illegal to record conversations without the other party's consent.

    I only ran across this problem years ago when, due to a serious potential workplace issue I suggested somebody basically "wear a wire" and record their workday to catch some HR problems. We found out that the state this was occurring in had a two-party consent law and violating it was not a great idea.

  • by AndrewKemendo on 11/15/22, 4:17 PM

    Expanding on the structure the OP created, this is how I see us getting to human level AI:

    1. Record video, sound, etc. (trajectories) egocentrically

    2. Analyze the data and assign reward labels (more/good, less/bad) to state and transitions actions

    3. Use the reward feedback and trajectories to build the policy for some set of actions in certain environments

    This is why I'm bullish on anything sousveillance - so AR cameras on your head, always on mics etc...

    The challenge is doing this democratically, without it being intermediated by a giant for-profit mega corp that doesn't care about you and wants to mess with your head

  • by cameronh90 on 11/15/22, 6:13 PM

    All of the rooms/corridors in my house except my bathrooms are covered by cameras. My initial motivation for installing them was to keep an eye on what my pets were doing when I'm not around, but I find in recent years that if I misplace something, I end up tracing back my history on the cameras and finding where I left it.

    It seems obvious that at some point, AI will be able to do that for me and I'll just be able to say "Alexa, where did I leave my glasses?", "Hey Google, where did I put my box of spare fuses?".

  • by habibur on 11/15/22, 3:18 PM

    Excellent idea. You can later search through your logs for reference, as it's all in text.

    Prior solutions posted on the net had these take-photo / record-audio 24/7 features, but then the data was stuck there. What next? What would anyone do with this data?

    But this Hi Jarvis styled recording of text on the go is a very useful feature.

    Another step ahead.

  • by unsupp0rted on 11/15/22, 1:08 PM

    I remember an Asimov short story in which scientists developed a machine that could see backward in time.

    If I recall correctly, the upshot was the government became terrified because any machine that can see 1000 years into the past can also see 1000 milliseconds into the past and therefore functionally be used to spy on anyone in real time.

  • by warrenm on 11/15/22, 2:51 PM

    Sounds very similar to the guy talked about in Albert-László Barabási's book (either Bursts, or Linked ... don't recall which atm) - he was photoing/videoing his whole life, but never of himself - ie, the camera was always facing outward (like a policeman's bodycam)
  • by tegiddrone on 11/15/22, 2:04 PM

    I did an experiment where I lived for a while with a Sony recorder/mic on me 24/7. It was nice to be able to refer back to conversations and events when I wanted them. The biggest issue was sorting through the data: timestamps and recorder bookmarks were OK, but I really needed full-text search on the audio. It would have been great to tag via `Robert, mark timestamp, end Robert`. AI seems to be required, especially when dealing with wind noise and other issues (like the mic twisting around and all of a sudden one channel is my heartbeat).

    The Sony voice recorders out there easily last 24 hrs on 1 AAA battery, dumping to MP3 on a large SD card.
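
    The tagging idea could be sketched as a search over a transcript, assuming the audio has already been converted to text and that commands are wrapped as `<wake word> ... end <wake word>`. The function name and the regular expression are illustrative, not the article's implementation:

```python
import re


def extract_commands(transcript, wake_word="Robert"):
    """Return the text found between '<wake_word>' and 'end <wake_word>'."""
    w = re.escape(wake_word)
    pattern = re.compile(
        rf"\b{w}\b[\s,]*(.+?)[\s,]*\bend\s+{w}\b",
        flags=re.IGNORECASE | re.DOTALL,
    )
    return [m.group(1).strip() for m in pattern.finditer(transcript)]


text = "so anyway Robert, mark timestamp, end Robert and then we left"
print(extract_commands(text))
```

    A plain pattern match like this will misfire whenever the wake word is also somebody's name in the conversation, so a real system would need extra checks.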

  • by specialist on 11/15/22, 4:00 PM

    Excellent. Just terrific.

    My future perfect system also logs my location and what I'm doing. And probably health metrics too, like heart and breathing rate.

    Instead of initiating my exercises, I just want to say "Robert, start jog". The "modal" nature of my Apple Watch's Activities really frustrates me.

    I don't want to take notes while I'm listening to a podcast. I'm generally doing something else at the time. I just want to say "Robert, bookmark". And magically a link will be made to whatever I'm listening to at the time. (Audio book, radio, stream, podcast, whatever.)

    Ditto identifying songs (Shazam!).

    I don't want to fart around with exchanging contact information. My hands are usually full or whatever. Just say "Robert, contact info" and then repeat out loud whatever I hear.

    I also want to rewind after the fact. When trying to recall a tidbit, I'll remember the song, where I was (eg while walking the dog), who I was with, what I was eating. So if I want to remember which podcast I was listening to while at the park, I'd just start with my location log and jump over to my podcast listening log.

    What could be more simple?

    FWIW, I'm still waiting for my "bicycle for the mind".

    PS- I've tried, half-heartedly, to use the voice recorder app, and notes with voice transcription. But then it quickly becomes a treasure hunt. And my attempts to do this stuff with Siri just leaves me more frustrated.

    Thanks for listening.

    Great project. Please keep us posted on updates.

  • by dsalzman on 11/15/22, 3:11 PM

    I've been experimenting with this recently as well, but with an app on my apple watch. Looking for a method/model to split different speakers into different tracks to only look at audio from myself and certain people.
  • by miguelrochefort on 11/15/22, 8:04 PM

    Here's a 24/7 background audio recorder app I made for Android. The impact on battery and storage is surprisingly reasonable.

    https://github.com/miguelrochefort/eardrum

  • by gajus on 11/15/22, 3:58 PM

    I like this. It vibes with a language learning app concept idea I recently shared out loud.

    https://twitter.com/kuizinas/status/1591867392220594183

  • by apienx on 11/15/22, 6:19 PM

    Well done!

    Got a similar PoC that uses Tasker to record sound on my phone, Whisper to convert it to text, and neatly organizes everything into Obsidian.md. The continuous recording kills the battery life on my phone so it's only usable if you don't mind going around with a powerbank. Would be great if a manufacturer would put in a separate low-energy chip with a good ADC.

    P.S. "Active functions" with custom home automation is easy as pie with joaoapps's suite. I use BusyBox to SSH into a Pi with a Tellstick Duo. And some RFID tags for the system to know where I am (e.g. bedtime routine gets triggered when I place my phone on the bedside table). But yeah...traffic goes thru Google.

  • by sorwin on 11/15/22, 1:17 PM

    How would this work with other voices, like in a coffee shop? Would it hear those simultaneously and interrupt a command?

    Also, how do you handle using OpenAI Whisper? It seems like they do 30-second intervals; would that be an issue if your command is cut off mid-word?
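
    One possible way to reduce the cut-off risk, assuming you split long recordings into 30-second pieces yourself before transcription, is to make consecutive pieces overlap so that a command straddling one boundary falls entirely inside the next piece. The function name and the 5-second overlap are assumptions made for illustration:

```python
def overlapping_windows(total_seconds, window=30.0, overlap=5.0):
    """Return (start, end) pairs that cover the audio with some overlap."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must be non-negative and smaller than window")
    step = window - overlap
    windows = []
    start = 0.0
    while start < total_seconds:
        windows.append((start, min(start + window, total_seconds)))
        # Stop once the current window reaches the end of the audio.
        if start + window >= total_seconds:
            break
        start += step
    return windows


print(overlapping_windows(70))
```

    The trade-off is that text in the overlapping region is transcribed twice, so the results need de-duplication before commands are executed.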

  • by frontman1988 on 11/15/22, 3:07 PM

    The future will definitely have devices which record your whole life visually and verbally. VR headsets are already able to record all your facial expressions. Google Glass-like gear which records all your life is pretty much possible in the near future. The future influencers won't have to carry a phone/camera to create vlogs; they will just look wherever they want and the glasses will record not only what they are seeing but also their expressions. Privacy will probably not be as big a thing as it is now, given that with each generation most people are becoming more and more comfortable sharing their whole lives online.
  • by vachina on 11/15/22, 3:32 PM

    Awesome idea. However people would find it weird that I talk to myself all day long.
  • by chatterhead on 11/15/22, 2:34 PM

    This is awesome! I've been recording myself (video/audio) for the last couple years on and off (thousands of hours) and have no efficient way of processing the info. Was not aware of Whisper and what he's done is exactly what I'm looking to pull off.

    The GPT-3 idea is scary and most certainly the future. I can't stand the world of never ending 'Moviefone' menus and chatbots, but when it's me that gets to be the machine response the future doesn't seem so annoying. Would be nice to have my own GPT-3 model that I can use to "get to a real person" when calling places.

  • by layer8 on 11/15/22, 2:52 PM

    The passive information would be useful if that would work with your inner voice.
  • by varispeed on 11/15/22, 4:14 PM

    I have recording turned on on my phone. Usually it records 6 hours at a time, so it is annoying that I have to manually restart recording. Another annoying thing is that it will pause the recording when I pick up a call.

    Why Google decided to block call recording is beyond me. In the past, when I was able to record calls, it saved me a lot of trouble. For instance, when an insurance company lied to me over the phone about a product, I could confront them about it and get my money back. I wish I could record calls with my relatives as well. Call recording is legal in my country.

  • by r_hoods_ghost on 11/16/22, 11:24 AM

    So the author is recording all their interactions with everyone they meet and then processing and analysing those interactions. How is this not a massive invasion of privacy?

    I see he is concerned about his own, but he doesn't seem to be at all concerned with anyone else's. Personally, if I discovered that someone I interacted with was doing this, I would insist that they delete any data they had gathered concerning me, and pursue whatever means were available to me if they refused.

  • by ogicar on 11/15/22, 2:59 PM

    Very interesting idea.

    Would you be willing to share more info on the tech used in the process?

    >I bought a couple of Chinese microphones

    Which exact microphones? How long does their battery last?

    As well as other parts of the process.

  • by ISL on 11/15/22, 1:42 PM

    What a bonanza for opposing counsel of any kind.

    (which is a bummer, as there are lots of interesting uses for digitizing our lives if the data could be guaranteed to remain private)

  • by adamgordonbell on 11/15/22, 1:25 PM

    An off the shelf solution for recording your whole life:

    I have a Sony recorder, ICD-UX570, and it has a setting where it turns on or off based on sound, and also adjusts the gain to best record. It takes a micro-SD card and has pretty solid battery.

    I think you could put it in a breast pocket and run it for several days on a single charge. Because it would just record when you are talking or making noises you could likely run it for a year on a big SD card recording in mp3.

    Change to a wifi SD card and suck the files off and process them and you might have something kind of cool.

  • by krzyk on 11/15/22, 3:05 PM

    A related question: is there a ready solution for constant recording using some Linux box (e.g. Raspberry Pi)?

    AFAIR I've seen someone recommend such software on HN but I can't find it right now; it was something for recording radio stations or similar.

    I would like to get some kind of sound monitoring of my house when I'm away or sleeping and besides using arecord I couldn't find anything useful.

  • by kettleballroll on 11/15/22, 1:23 PM

    As you'd be recording all of your conversations, this is illegal in some jurisdictions, unless all your conversation partners agree to being recorded and to their conversations being stored.
  • by hit8run on 11/15/22, 2:14 PM

    Interesting article. Thanks for posting this. I think when wearing something like Google Glass and recording everything, the potential is even bigger. The AI can extract so much more context: analyse faces, gestures, locations and more. Dystopian and yet so interesting.
  • by wartijn_ on 11/15/22, 2:41 PM

    Are you planning to combine it with other info? Your smartwatch already knows how long you've slept; getting that info directly into your database seems more efficient and less error-prone. The same goes for the amount of money you've spent: if your bank allows you to export that info, it'll save you a step. Your bank doesn't know what you've bought, only the total cost, time and shop, but if you scan and upload your receipts and use OCR you'll have a detailed record of that too.

    And you could also keep track of your location, so you know where you had a conversation or at what gas station you spent 250,000.

  • by PaulHoule on 11/15/22, 1:04 PM

    If you are going through that much trouble you might as well get a WiFi scale, wear a tracker that has an API, etc. I’ve definitely thought about taking speech-to-text notes at work, nice to see somebody did it.
  • by junwonapp on 11/16/22, 4:00 AM

    Interesting to see comments suggesting use of constant recording as a defense against invasion of privacy via constant surveillance.

    Interesting that the defense against the harms of technology is technology itself. When humankind unlocks a powerful new technology, and when it is possible for criminals to use that technology to harm us, our best course of action may be not to look away from it out of fear, but to spend more time understanding it and its implications, and to get there faster than adversaries do.

  • by moritonal on 11/15/22, 1:23 PM

    Handling the privacy of other people might be oddly easy. If you can detect the voice accurately enough the AI might be able to _drop_ the other participants.
  • by davide_v on 11/16/22, 9:07 AM

    This reminds me of Black Mirror's White Christmas episode, where they create a digital clone inside a white "cookie" and then use it for receiving tasks such as making toasts. This project is very similar actually. I found the part where you can track your food and automatically calculate the calories every day, without writing anything anywhere, very interesting.
  • by Player6225 on 11/15/22, 1:49 PM

    I'm curious if there was other work you were inspired by. I have also been a bit interested in using this style of "personal database/logger/journaling":

    task-agnostic input -> processing -> visualization/recall

    My assumption is you are just storing post-processed conclusions in a local db on your computer + raw audio for possible future re-processing, and not currently storing other media input (ala food pics)?

  • by luuuzeta on 11/15/22, 3:17 PM

    I'd be so self-conscious with my speech being recorded roughly 24/7 by myself. I'd probably get used to it but it'd take some time.
  • by alkonaut on 11/15/22, 2:02 PM

    People who take notes in life (the org mode people): oh cool. Everyone else: why would I want to know what I ate, weighed, or thought last week?
  • by hawski on 11/15/22, 4:37 PM

    Great idea. However always recording is a disadvantage for me.

    I thought about a device that could look a bit like Star Trek badge. It should react to pushing it slightly and it would have a microphone. It would connect to a phone with Bluetooth.

    Main use for me would be push-to-talk as I use Zello with my wife quite a lot. But all those reliable assistant/voice-notes uses would be also sweet.

  • by Ninjinka on 11/15/22, 5:19 PM

    The advent of Whisper gave me a similar idea, except instead of uploading the recording once a day, it worked by calling my computer from my phone and recording and processing in batches. Realized pretty quickly that most of my day is silent though, and would rather be able to trigger it on demand, which I haven't gotten around to.
  • by phendrenad2 on 11/16/22, 1:18 AM

    Pretty sure Bill Gates wrote about this idea in his book. It SEEMED like the future, but software/hardware innovation goes where the money is, and no one is interested in recording their own lives. Maybe once the AI to make use of it gets better, it'll find product-market fit.
  • by bitcurious on 11/15/22, 2:22 PM

    In some jurisdictions it’s illegal to record a conversation without all-party consent. Example:

    https://www.rcfp.org/reporters-recording-guide/massachusetts...

  • by pards on 11/15/22, 1:11 PM

    I think the passive part of this could be really interesting - starting with a simple "tag cloud" of keywords by frequency linking to audio snippets that mention them, it'd provide a way to index conversations during the day for future reference (or processing).
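
    A rough sketch of such an index, assuming the transcription step yields `(audio_file, start_seconds, text)` segments. That tuple layout, the file names, the tiny stop-word list and the function name are all assumptions made for illustration:

```python
import re
from collections import Counter, defaultdict

# Deliberately tiny stop-word list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "to", "of", "i", "we", "is", "it", "in"}


def build_keyword_index(segments):
    """Count keywords per segment and remember where each one was said."""
    counts = Counter()
    locations = defaultdict(list)
    for audio_file, start, text in segments:
        # A set, so a word repeated within one segment is counted once.
        for word in set(re.findall(r"[a-z']+", text.lower())):
            if word in STOPWORDS:
                continue
            counts[word] += 1
            locations[word].append((audio_file, start))
    return counts, locations


segments = [
    ("2022-11-15_0900.mp3", 12.0, "We need to renew the insurance"),
    ("2022-11-15_0930.mp3", 48.5, "I called the insurance company"),
]
counts, locations = build_keyword_index(segments)
print(counts.most_common(3))
print(locations["insurance"])
```

    The counts drive the size of each word in the cloud, while the stored `(file, offset)` pairs are what a click on a keyword would jump to.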
  • by skydoctor on 11/15/22, 4:41 PM

    Anyone familiar with rewind.ai which seems to be building a product on similar lines?
  • by radu_floricica on 11/15/22, 5:26 PM

    Has anybody made progress with using Google Assistant with arbitrary commands? I know there are a few integrators online that could, in theory, get commands and send them to a spreadsheet, but I couldn't get them to work.
  • by otikik on 11/15/22, 6:57 PM

    I liked this article and find it intriguing. That said, I would set the original sound data to expire relatively quickly, perhaps erasing everything every week or so. I like letting the past be the past.
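
    A minimal sketch of that kind of expiry, assuming the raw recordings are `.mp3` files in a single folder and that file modification time is a good enough proxy for recording time (both assumptions, as are the function names):

```python
import time
from pathlib import Path


def expired_recordings(folder, max_age_days=7, now=None):
    """Return .mp3 files in `folder` last modified before the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(
        p for p in Path(folder).glob("*.mp3") if p.stat().st_mtime < cutoff
    )


def purge(folder, max_age_days=7):
    """Delete expired recordings and return how many were removed."""
    removed = 0
    for path in expired_recordings(folder, max_age_days):
        path.unlink()
        removed += 1
    return removed
```

    Listing and deleting are kept separate so the selection can be inspected before anything is erased; the transcripts could be kept while only the audio expires.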
  • by AlexErrant on 11/15/22, 1:40 PM

    Would you mind linking/listing what microphones you're using?
  • by jes5199 on 11/15/22, 5:45 PM

    For a while I had my laptop set up to take a photograph and a screenshot every ten minutes. The information was completely useless, but I got some great candid photos of myself.
  • by c1sc0 on 11/15/22, 7:10 PM

    Are there any good (discreet) wireless throat mic patches out there that are sensitive enough to pick up subvocalization / whispering?
  • by macrolime on 11/15/22, 1:08 PM

    Are you planning to make it an open source project?
  • by riiri778 on 11/15/22, 2:10 PM

    I do that as well. I have had a few arguments with police patrols over driving tickets, and once a dog attack with a very aggressive dog owner.
  • by fnordpiglet on 11/15/22, 5:30 PM

    I thought this would be recording everything to generate an AI model that could sit in zoom meetings for you.
  • by colordrops on 11/15/22, 4:25 PM

    Did I miss something or is a description and/or link to the software used not in the article?
  • by sdze on 11/15/22, 2:37 PM

    That calorie tracking will be WAY off and useless if not dangerous. Nothing beats a kitchen scale.
  • by commitpizza on 11/15/22, 1:15 PM

    Very cool idea. The only question I have is how fast this drains the battery of the mics.
  • by bronzejaguar on 11/17/22, 12:40 AM

    This is the sort of thing Nietzsche alludes to as his "last man".
  • by karol on 11/15/22, 2:29 PM

    What are the limitations of numbers as descriptors of Being?
  • by j_mo on 11/15/22, 1:18 PM

    Doesn't this break wiretapping laws (depending on the user's geographic location) and possibly GDPR/NDAs(if left on while at work)?

    Ethically most people probably wouldn't be happy to find out that you recorded a conversation with them.

  • by RamblingCTO on 11/16/22, 12:19 PM

    Forgetting stuff is a feature, not a bug.
  • by thro_213r692s on 11/15/22, 3:50 PM

    Why is this being upvoted? There is no code.
  • by tezza on 11/15/22, 7:06 PM

    So where was the remote control ?
  • by Lapsa on 11/15/22, 1:52 PM

    yeah but there's 3 toasts
  • by mikro2nd on 11/15/22, 3:50 PM

    Why are you so self-obsessed?
  • by edw519 on 11/15/22, 2:37 PM

    Too bad Abbott and Costello aren't around to attend a standup with a bunch of people using your app.

      - Robert Robert what did you do yesterday End Robert
      - Robert I met with Robert in accounting to finish Jira 12392 story on the Robertson patches End Robert
      - Robert Then I took a break to have coffee and watch a Julia Roberts short with Robert in System Admin End Robert
      - Robert Robert what are you going to do today End Robert
      - Robert It depends if Robert Roberts has internet access End Robert
      - Robert If he doesn't, that'll be the end of Robert Roberts End Robert
      - Robert No impediments for either Robert in the Robert epic End Robert
      - Robert Robert's on Help Desk, Jean Robert's in Code Review, and Sam Robertson's running the Roberton's retrospective End Robert
      - Robert But then who is Robert Robertson? End Robert
      - Robert Oh Robert Robertson's our scrum master! End Robert
  • by bluenose69 on 11/15/22, 4:01 PM

    I'm a bit concerned about the calorie level I see here, 832/day. That is about 1/3 of the NHS recommendation [1] for males.

    1. https://www.nhs.uk/common-health-questions/food-and-diet/wha...

  • by yuvalkarmi on 11/15/22, 1:42 PM

    I think your last sentence summarizes the sentiment really well: “The difference between utopia or dystopia is who has access to that information”
  • by Waterluvian on 11/15/22, 1:01 PM

    > My biggest problem with “OK Google” is that I don’t know by heart what it can do interactively

    Maybe it’s just me but this feels unaddressed and that seems ridiculous.

    Why is it so hard for me to find a single, precise location on my phone with an enumerated list of every command Siri or Google can work with?

  • by alexmolas on 11/15/22, 1:14 PM

    (Slightly off topic) If you click on the "RoberDam.com" link that appears when you scroll a little bit, you get redirected to "http://localhost:8080/".

    It seems to happen only on the English page. In the Spanish version of the post the link works well.

  • by gruez on 11/15/22, 2:03 PM

    >RELATIONSHIP THERMOMETER

    >According to studies on couple relationships, it is possible to predict with an accuracy of up to 90% if the couple is going to divorce by studying the interactions, specifically the relationship between positive and negative interactions between the couple

    Apparently the studies that were used to reach that conclusion do no such thing and were hilariously flawed.

    https://slate.com/human-interest/2010/03/a-dissection-of-joh...

    >The upshot? What Gottman did wasn’t really a prediction of the future but a formula built after the couples’ outcomes were already known. [...] The fundamental problem is that no matter how many equations, even quite similar ones, Gottman generates, we have no real idea of his forecasting power because of the way he reports his data

  • by wallfacer120 on 11/16/22, 3:31 AM

    I'm in the middle of building literally the exact same thing for myself.

    Beyond privacy/security, the aspect of the app I worry about the most is giving oneself perfect memory and then never being able to escape the past. That last fight you had with your ex? Well, now it's recorded and you can listen to it, and dissect it, and wonder what you could have done differently, right up until you blow your brains out.

    But, as always, it's up to the user to use the technology in a healthy way. It would be, after all, a choice to remain mired in the past rather than taking healthy lessons from it to make your future better.

  • by artursapek on 11/15/22, 3:45 PM

    insert men with autism meme
  • by tomerbd on 11/15/22, 2:16 PM

    What a nice pair of feet!