by deepshaswat on 8/26/24, 12:33 AM with 4 comments
I have 700k records of YouTube channels with descriptions and keywords. I also have a list of 776 categories and subcategories, each with a description.
Now I want to map each account to a relevant category and sub-category and not spend more than $50-100.
I considered using AWS Bedrock with a Titan model, but the cost comes to around $11k.
Any pointers will be really helpful
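For scale, here is the per-record arithmetic implied by the numbers above. This is only a sketch of the budget math; the function and constant names are made up for illustration, and it assumes every one of the 700k records costs the same.

```python
# Budget math using only the figures stated in the post.
RECORDS = 700_000          # YouTube channel records
BUDGET_LOW = 50.0          # USD, lower end of the target budget
BUDGET_HIGH = 100.0        # USD, upper end of the target budget
QUOTED_COST = 11_000.0     # USD, the Bedrock/Titan estimate from the post


def per_record(total_usd: float, records: int = RECORDS) -> float:
    """Return the cost per record in USD, assuming uniform cost per record."""
    return total_usd / records


# How many times the quoted cost exceeds the upper budget.
overshoot = QUOTED_COST / BUDGET_HIGH

if __name__ == "__main__":
    print(f"budget per record: ${per_record(BUDGET_LOW):.6f} to ${per_record(BUDGET_HIGH):.6f}")
    print(f"quoted per record: ${per_record(QUOTED_COST):.6f}")
    print(f"quoted cost is {overshoot:.0f}x the upper budget")
```

So whatever approach is used has to come in at roughly a hundredth of a cent per channel.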
by any_throw777 on 8/26/24, 1:56 AM
If you have local hardware (a video card with 16-24 GB of memory), you should be able to use a number of different ML approaches, including LLMs. You could also try a forest classifier or other approaches, but without specifics about your data, it's a little challenging to recommend one.
I would try OpenAI or Anthropic. You should be able to fit all your categories in a prompt, and then ask about one specific channel. Something like:
You are a classifier agent. Here is a list of categories and subcategories.
-item 1 -item 2 -... -item 776
The details of the channel are x, and here are a few descriptions.
--- The above could work if you can fit all the categories and descriptions in one prompt. The context windows of both OpenAI's models and Claude are pretty large.
The downside of this approach is that it's very hard to predict the classification with any reliability.
The upside is that it's fast and very likely to fit your budget. GL
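A minimal sketch of assembling that prompt in Python, plus a check that the model's answer is actually one of the listed categories (which helps a bit with the reliability concern). Nothing here calls a real API: `build_prompt`, `validate_reply`, the sample categories, and the final "Reply with..." instruction are all assumptions for illustration, not anything from the original post.

```python
from typing import Optional, Sequence


def build_prompt(categories: Sequence[str], channel: str, descriptions: Sequence[str]) -> str:
    """Assemble one classification prompt for a single channel."""
    category_lines = "\n".join(f"- {c}" for c in categories)
    description_lines = "\n".join(f"- {d}" for d in descriptions)
    return (
        "You are a classifier agent. Here is a list of categories and subcategories.\n"
        f"{category_lines}\n"
        f"The details of the channel are: {channel}. Here are a few descriptions.\n"
        f"{description_lines}\n"
        # Assumed extra instruction so the reply is easy to validate.
        "Reply with exactly one category from the list and nothing else."
    )


def validate_reply(reply: str, categories: Sequence[str]) -> Optional[str]:
    """Return the matching category (case-insensitive), or None if the reply is not in the list."""
    lookup = {c.strip().casefold(): c for c in categories}
    return lookup.get(reply.strip().casefold())


if __name__ == "__main__":
    # Hypothetical sample data; the real list would have all 776 entries.
    cats = ["Gaming > Speedruns", "Cooking > Baking"]
    print(build_prompt(cats, "Example Channel", ["Weekly sourdough tutorials"]))
    print(validate_reply(" cooking > baking ", cats))
    print(validate_reply("Travel", cats))
```

Anything that comes back as `None` can be retried or set aside for manual review instead of being stored as a bad label.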
by wizzerking on 8/26/24, 2:19 AM