by CobrastanJorji on 1/11/23, 7:13 AM
This title really undersells the absolute insanity of the described solution. This is a beautiful example of "if it's stupid, but it works, it's not stupid." The justification is very convincing.
One thing I'm curious about: how did you build your corpus of meme images and videos?
by merpkz on 1/11/23, 8:10 AM
Nice project. I wanted to build a meme search engine myself one day, but figured Tesseract would fail at most memes because of how stylized they have become. So I narrowed my meme source down to only /r/bertstrips, as those contain sane-looking text, and it's working quite all right. The project has no frontend yet; I search from the CLI and click links.
> Initial testing with the Postgres Full Text Search indexing functionality proved unusably slow at the scale of anything over a million images, even when allocated the appropriate hardware resources.
I can guarantee you that a correctly set up PostgreSQL text search will be faster than ES while needing far fewer hardware resources. It's just a matter of correctly creating a tsvector column and a GIN index on it (and of course writing queries that actually use the index). I can help you set up the Postgres schema and debug queries if you are interested, for testing purposes at least.
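A minimal sketch of the setup described above, not the author's actual schema. It assumes a hypothetical table `memes` with a text column `ocr_text`, and PostgreSQL 12 or later for stored generated columns. The helper names are made up for illustration; the functions only build SQL strings, so they can be inspected without a database.

```python
# Hypothetical names: table "memes", columns "ocr_text" and "ocr_tsv".
def fts_setup_statements(table="memes", text_col="ocr_text", tsv_col="ocr_tsv"):
    """Return DDL that adds a stored tsvector column and a GIN index on it."""
    return [
        # Keep the tsvector in its own column so it is computed once per row.
        f"ALTER TABLE {table} ADD COLUMN {tsv_col} tsvector "
        f"GENERATED ALWAYS AS (to_tsvector('english', {text_col})) STORED;",
        # The GIN index is what makes @@ lookups fast on large tables.
        f"CREATE INDEX {table}_{tsv_col}_idx ON {table} USING GIN ({tsv_col});",
    ]


def fts_query(table="memes", tsv_col="ocr_tsv"):
    """Return a parameterized query that matches against the indexed column."""
    # Query the tsvector column directly; wrapping the raw text column in
    # to_tsvector() with a different configuration would bypass the index.
    return (
        f"SELECT id FROM {table} "
        f"WHERE {tsv_col} @@ websearch_to_tsquery('english', %s) "
        f"LIMIT 50;"
    )


if __name__ == "__main__":
    for statement in fts_setup_statements():
        print(statement)
    print(fts_query())
```

Running `EXPLAIN` on the generated query is a quick way to check that the planner picks the GIN index rather than a sequential scan.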
by CiceroCiceronis on 1/11/23, 11:37 AM
This is really brilliant to see, and I've been surprised for quite a long time that nothing similar exists. I think it's a real shame that few people with interest in memes have interest in building solutions like this that help us engage with them.
People in the 21st century know a lot about the mistakes of the past century that led to much popular culture of the time being lost (especially terminally online people who've watched lots of Youtube documentaries about lost Dr. Who episodes and so on), so it surprises me how little we try and avoid those same mistakes with today's ephemeral pop culture in the form of memes. People like yourself who want to help make the internet's huge corpus of memes tractable are part of the solution in terms of meme archival and cultural memory.
(There's a good meme metadiscussion group on Discord, "The Philosopher's Meme," which you might be interested in joining. People there would be very keen to discuss what you've made.)
by iamflimflam1 on 1/11/23, 1:14 PM
by yakubin on 1/11/23, 7:22 AM
The photo under Upgrading the iPhone OCR Service Into An OCR Cluster. In the future, data centres are going to host racks of iPhones.
by aabajian on 1/11/23, 10:28 AM
I had a friend in med school who wrote a very early note-taking app for the iPad. Turns out that there was no way to render PowerPoint files when the iPad first came out. He realized that the iOS/Mac OS "quick preview" function could be used to take screenshots of each PowerPoint slide. For a brief time, his was the only app that could display PowerPoints (albeit, they were just screenshots!). There's a lot of hidden utility in Apple libraries.
by ksdme9 on 1/11/23, 7:30 AM
Love the inventiveness.
My question is about the image distribution costs. All the memes on the site seem to be coming straight off an object storage, all that bandwidth consumption has got to add up(?). Some sort of a CDN might help depending on the search patterns.
by memeatlas on 1/11/23, 2:42 PM
Although not as elegant a solution as this, I've also tried my hand at indexing and categorizing memes. I wanted to save a very specific type of meme though, since there are, in my mind, 2 main categories of memes. The first category is what I call "story" memes: they are standalone and typically what you see being shared on Facebook. They usually have text, are able to tell a story on their own with no additional context, and can be presented as a single post, story, etc. (think 4-panel comics). The second type is reaction memes. These are used to respond to people and usually convey a feeling towards a post or tweet. They can also be standalone, so they should probably be considered a subset of the "story" memes. I've gravitated towards the reaction memes as I see more utility in them and they can be used in a more universal way. My site, if anyone is interested (it's still a work in progress):
https://www.memeatlas.com
by lormayna on 1/11/23, 9:08 AM
by philsnow on 1/11/23, 8:31 AM
I sat down literally last night and started sketching out the scratch-my-own-itch solution to more or less exactly this problem, because I too have meme-aphasia where I know there exists a meme that fits perfectly in a conversation, but I have about 5 seconds to find it before the moment passes.
I'm so, so glad to see that I'm not the only person in the world with the same "problem". Well done, mandatory.
edit: holy crap you even index videos, nice
by csande17 on 1/11/23, 8:13 AM
I wonder how the performance of Vision.framework on desktop Mac hardware compares to a cluster of phones. (The author mentions that it was "fairly slow", but it sounds like they were running an iOS app in the simulator and not a macOS app.)
by Fabricio20 on 1/11/23, 3:47 PM
Does the Vision API call back to Apple servers in any fashion? Like how Google's on-device voice recognition APIs will call back to Google when you are online (unless you explicitly pass flags to force offline mode).
If so, is there any risk in getting your account suspended or ip range banned somehow because of this, for example?
by tomw1808 on 1/11/23, 9:27 AM
Absolutely amazing on the tech side!!!
Now, after reading the article, I gave your search engine a try. I was looking for that futurama its a trap meme (pretty much pops up on any image search here https://www.google.com/search?q=futurama+its+a+trap)
The problem is, the search engine you built is now very text-heavy, which seems to be usually very unconnected to the actual meme. So, searching for "its a trap" did not yield the results I was actually hoping for, but made total sense looking at how the search was implemented.
Are you planning to implement some sort of actual tagging of the content? Maybe a clustering of similar objects (like how the iPhone clusters similar people's faces in the gallery) and then tagging those clusters with keywords somehow?
by mseidl on 1/11/23, 10:12 AM
Do you use mongodb to make it web scale? You turn it on and it scales right up.
by Arbortheus on 1/11/23, 8:57 AM
This is great, I particularly like the part about using compute from old unwanted iPhones. Quite an inventive way to reuse/recycle otherwise obsolete hardware!
by permo-w on 1/11/23, 7:21 PM
I have absolutely no experience in this area and I'm curious:
is there really no open-source text recognition software that's on-par with or close to Apple's (presumably proprietary) implementation? the article mentions Tesseract. is that the current best open-source option?
by MrGilbert on 1/11/23, 6:53 AM
This is remarkable. I'd love to see that combined with some kind of sentiment analysis like Microsoft offers, just to see if something useful comes out of it.
Sometimes, I don’t know the exact words when looking for a meme, but once I see it, I know that’s the one.
by oefrha on 1/11/23, 8:14 AM
> My preliminary speed tests were fairly slow on my Macbook. However, once I deployed the app to an actual iPhone the speed of OCR was extremely promising (possibly due to the Vision framework using the GPU). I was then able to perform extremely accurate OCR on thousands of images in no time at all, even on the budget iPhone models like the 2nd gen SE.
I suppose that’s an old Intel MacBook? I’d be very surprised if the Vision framework performs better on a 2nd gen iPhone SE than even the first M1 MacBook Air.
by komali2 on 1/11/23, 7:11 AM
Would love to see that load balancer implementation, as I'm a scrub and this project fascinates me.
by andai on 1/11/23, 3:06 PM
I have a "hackish but works for me" meme database: I use my Telegram "self chat" to send memes I like to myself, and I tag them with the kind of words I'm likely to search for when looking for them later.
Works great for me.
It's kind of like trying to come up with a good Google search phrase, based on how other people must have phrased something, but relying on knowledge of how you phrase things instead.
by lysecret on 1/11/23, 8:23 AM
Wait what this is absolutely brilliant. Actually insane it works so well using a stack of iPhones as an ocr server. My deepest respect.
by Thorentis on 1/11/23, 8:13 AM
IaaS - iPhone as a Service, coming soon to AWS.
by marginalia_nu on 1/11/23, 9:26 AM
Now this is the sort of disgusting pile of jank I love to see.
by JustARandomGuy on 1/11/23, 8:02 AM
Very inventive. Admittedly when I read the first few paragraphs, I was thinking “he’s got to have $40K of iPhones doing image processing” but you made a good point about being able to use iPhones with screen and other damage.
What was your average price per iPhone, if you don’t mind disclosing?
by ekns on 1/11/23, 7:36 AM
Last time I looked into OCR stuff I came to a similar conclusion (though I didn't implement anything back then). It would be really nice to have "open source" models that had similar performance, without having to deal with the iPhone cluster hackery.
by nysv on 1/11/23, 10:06 AM
If only there was a way to filter out ifunny results, I absolutely detest that watermark.
by Freak_NL on 1/11/23, 8:03 AM
> Better yet, I don’t even want to use them as phones, so even iPhones that are IMEI banned or are locked to unpopular networks are perfectly fine for my use.
Fences worldwide will be overjoyed to hear of this novel application.
by suave_dude on 1/20/23, 3:53 PM
I have a question what do you guys think is the best back end for a video search engine app?
by dirtyid on 1/11/23, 8:15 AM
Outrageous effort! So far, Japanese and Mandarin are returning results as well.
Do you have a list of the sources memes are ingested from?
Would be nice to have some option to explore memes by category.
by looki on 1/11/23, 5:34 PM
Very cool project! I'll try to remember it the next time I'm looking for a specific image. I noticed that repeated appearances of the search term are ranked higher, which isn't necessarily productive. Also, some kind of duplicate detection would be nice. Searching for "SpongeBob" yields many copies of the same images that mention "SpongeBob" several times.
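A rough sketch of what both suggestions could look like, assuming search results arrive as `(image_id, ocr_text)` pairs. That shape and the function names are hypothetical, and this is not how the site actually ranks. Results with identical normalized OCR text are collapsed, and the score counts distinct query terms matched, so repeating a term does not help.

```python
import re


def normalize(text):
    """Lowercase the text and reduce it to space-separated alphanumeric words."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def dedupe_and_rank(results, query):
    """Drop results with repeated OCR text, then rank by distinct terms matched."""
    terms = set(normalize(query).split())
    seen = set()
    scored = []
    for image_id, ocr_text in results:
        key = normalize(ocr_text)
        if key in seen:
            continue  # same text as an earlier result: treat as a duplicate
        seen.add(key)
        # Count each query term at most once, however often it appears.
        score = len(terms & set(key.split()))
        scored.append((score, image_id))
    # list.sort is stable, so ties keep their original order.
    scored.sort(key=lambda pair: -pair[0])
    return [image_id for _, image_id in scored]


if __name__ == "__main__":
    hits = [
        ("a", "SpongeBob SpongeBob SpongeBob"),
        ("b", "spongebob   spongebob spongebob!"),
        ("c", "Mocking SpongeBob meme"),
    ]
    print(dedupe_and_rank(hits, "mocking spongebob"))
```

Text-based de-duplication only catches copies whose OCR output matches after normalization; re-encoded or cropped copies with different OCR output would need an image-level comparison instead.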
by baradhiren07 on 1/11/23, 8:32 AM
by petesergeant on 1/11/23, 9:11 AM
I was hoping this would help me find the Database Iceberg meme that shows different levels of database insanity. It didn’t. Anyone have a link?
by 2Gkashmiri on 1/11/23, 11:31 AM
Are you going to open source the "app" part of it?
I would love to replicate this setup for my own project....
I am thinking of load-balanced, multi-location, redundant "iOS machines" with 3-4 phones each, with power backup and an internet dongle.
We could use something like ZeroTier/Tailscale to reach the phones from outside the local network.
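A minimal sketch of the load-balancing idea, assuming each phone is reachable under some address and that something else decides when a phone is unhealthy. The class name, the worker labels, and the health tracking are all hypothetical; the article's actual load balancer may work differently.

```python
from itertools import cycle


class RoundRobinPool:
    """Hand out OCR workers in rotation, skipping any marked as down."""

    def __init__(self, workers):
        self.workers = list(workers)
        self._order = cycle(self.workers)
        self.down = set()

    def mark_down(self, worker):
        self.down.add(worker)

    def mark_up(self, worker):
        self.down.discard(worker)

    def next_worker(self):
        # Try each worker at most once per call so a dead pool fails fast.
        for _ in range(len(self.workers)):
            worker = next(self._order)
            if worker not in self.down:
                return worker
        raise RuntimeError("no healthy OCR workers available")


if __name__ == "__main__":
    pool = RoundRobinPool(["phone-a", "phone-b", "phone-c"])
    pool.mark_down("phone-b")  # e.g. after a failed request or health check
    print([pool.next_worker() for _ in range(4)])
```

With phones spread over several locations, the worker labels could be overlay-network addresses, and a failed request would call `mark_down` before retrying on the next phone.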
by ipsum2 on 1/11/23, 8:32 AM
This is amazing. Out of curiosity, why not try deep learning OCR software instead of Tesseract? PaddleOCR is popular.
by solarkraft on 1/11/23, 4:20 PM
That's a fun way to do OCR. Next up: Classifying memes by subjects and themes to build something like KnowYourMeme's gallery, but for every meme.
Bonus: Index from a lot of sources to help track a meme's origin.
This type of thing is on my long list of "can somebody else please do this already".
by schappim on 1/12/23, 7:35 AM
Pretty insane. If you don’t want to use iPhones, I made macOCR a while back. It uses the same vision APIs, with a very simple CLI interface. See:
https://github.com/schappim/macOCR
by petercooper on 1/11/23, 8:00 PM
You can do it on macOS as well, it has the same API for fast high quality OCR. I used it to create an OCR system to detect secrets or credentials in screencasts:
https://github.com/peterc/videocr
by kgbcia on 1/11/23, 12:27 PM
That's genius. I realized the cost advantage of text to speech on an old Android versus Google Cloud.
by sneak on 1/11/23, 9:55 AM
Don't you have to re-sign and re-deploy your iOS app every 7 days to keep it running on the iPhones?
by paulmd on 1/11/23, 7:23 PM
Is there a pgsync equivalent for Oracle? Spent some time building replication from a source-of-truth to a search engine at a previous job.
Wish we could have used postgres but the tools were dictated rather than letting the requirements drive the tooling.
by spuz on 1/11/23, 9:25 AM
I'm curious how well the iPhone OCR actually works. How do you deal with errors? Is the error rate low enough that you can accept the output from the iPhone OCR as is or do you also run it through a cleaning process (e.g. spell check)?
by yreg on 1/11/23, 9:29 AM
This is absolutely brilliant.
I believe I will actually use it a lot if you keep the site up.
Minor feedback for the blog post: It deserves a better meta description (for link previews). The first paragraph doesn't advertise how good the article is going to be.
by causality0 on 1/11/23, 1:11 PM
I tried a few memes. The results were quite poor, and vastly inferior to just using Google. In the case of text searches I had to scroll through dozens of results before finding the original meme images.
by nowahe on 1/11/23, 4:15 PM
Out of curiosity, how does your Image Similarity Search work? Are you also using some feature of Apple's Vision framework, or running some ML model on your Linode instance?
by Liquidor on 1/11/23, 9:54 AM
Brilliant! :-)
Maybe a dumb question, but could you use your data to train a new OCR model so you wouldn't have to rely on iOS?
I don't know much about ML/AI so maybe not feasible.
by gerdesj on 1/11/23, 12:04 PM
"It looked like it was time to bite the bullet and write an OCR iOS server in Swift."
Quite a large bullet required, one with plenty of chewing left in it.
by joshu on 1/11/23, 6:48 PM
Heh, this finds a bunch of copies of a video I made. If you are going to cache them and repost them, you probably need to have a DMCA process.
by kome on 1/11/23, 8:35 AM
you are a genius. also, the Search Engine works so well
by surume on 1/11/23, 11:06 AM
Thank you for building this! It's so much fun. I looked for memes that I saw years ago and found them in seconds. Excellent work!!
by Scaevolus on 1/11/23, 10:45 PM
Looks like you've solved the OCR problem, now to solve the duplication problem and use it as a ranking hint. :-)
by 1wsk on 1/11/23, 8:50 AM
Could you not extract the model and run it on a server?
It's probably not as easy, but I know it has been done with NeuralHash.
by nisegami on 1/11/23, 11:25 AM
Is this the person on /r/hardwareswap who's been looking for semi-functional, even IMEI banned, iPhones?
by the_arun on 1/11/23, 4:26 PM
Nice hack. But doesn't Google/(any public search engine) image search do this for us already?
by kevmo314 on 1/11/23, 8:54 AM
Could you expose your iPhone cluster as an OCR API? Seems like it would be competitive with the GCP API.
by francis-io on 1/11/23, 10:06 AM
Would be great if the images had a unique name so I could save them without having to rename them.
by mateuszbuda on 1/11/23, 8:21 AM
by 9dev on 1/11/23, 10:24 PM
This is the single bestest thing I have read in a long while. Absolute madness. Pure bliss.
by fuzzygroup on 1/11/23, 8:10 AM
This is utterly fantastic and you are to be commended for your Crazy Mad Scientist genius!
by julianeon on 1/11/23, 2:48 PM
This makes me wonder what other cool things I can do with an old iPhone.
by Tade0 on 1/11/23, 10:13 AM
I wonder how it fares against deep-fried "E" or Opossum memes?
by skizm on 1/11/23, 1:30 PM
Where did the original set of meme images, gifs, and videos come from?
by coayer on 1/11/23, 1:38 PM
I love the iPhone cluster so much!
by danbruder on 1/11/23, 6:30 PM
this is very clever. I wonder what other use cases could leverage this approach
by ariehkovler on 1/11/23, 7:27 PM
This is mad and I love it.
by geek_at on 1/11/23, 10:20 AM
really amazing! I love the solution and the project in general
by transitivebs on 1/11/23, 7:19 AM
My #1 recommendation for anyone thinking about the convoluted OCR solution: use a cheap OCR API and save yourself months of time / hassle / upkeep. Google's OCR API is a good place to start, but AWS has one too and dozens of others out there.