by mach1ne on 5/10/23, 4:33 PM with 125 comments
by NumberWangMan on 5/10/23, 7:54 PM
I'm getting more and more on board with "shut it all down" being the only course of action, because it seems like humanity needs all the safety margin we can get to account for the ease with which anyone can deploy stuff like this. It's not clear that alignment of a super-intelligence is even a solvable problem.
by rahimnathwani on 5/10/23, 4:43 PM
In short:
- they've predefined a bunch of tools (e.g. image_generator)
- the agent is an LLM (e.g. GPT-*) which is prompted with the name and spec of each tool (the same each time) and the task(s) you want to perform
- the code generated by the agent is run by a Python interpreter that has access to these tools (see the sketch below)
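A minimal sketch of that flow, using the `HfAgent` API that shows up later in the thread; the inference endpoint URL and the task here are illustrative, not taken from the post:
```python
# Illustrative sketch of the two-tier pattern described above.
from transformers import HfAgent

# The agent wraps an LLM served behind an inference endpoint (URL is illustrative).
agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")

# run() builds a prompt containing the name and spec of each predefined tool
# (image_generator, etc.) plus your task, asks the LLM for Python code, and
# executes that code in an interpreter that has the tools in scope.
image = agent.run("Generate an image of a boat on a lake.")
```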
by samstave on 5/10/23, 4:45 PM
-
One of the very common things about martial arts books in the past was that you were presented with a series of pics, along with some descriptions of what was being done in them.
Sometimes these are really hard to interpolate between frames, unless you have a much larger repertoire of movements based on experience (i.e. a white belt vs. a higher belt... e.g. a green belt will have better context of movement than a white belt...)
--
So can this be used to interpolate frames and digest lists? (Lists are what many martial arts count as documentation for their various arts...)
Many of these have been passed down via scrolls with either textual transmissions, paintings and then finally pics before vids existed...
It would be really interesting to see if AI can interpret between images and/or scroll text to be able to create an animation of said movements.
---
For example, not only was Wally Jay one of my teachers, but as the inventor (re-discoverer) of Small Circle Jujitsu, his pics are hard to read -- it's difficult to infer what is happening, because there is a lot of nuanced feeling in each movement that is hard to convey via pics/text.
But if you can interpolate between frames and model the movements, it's game-changing, because through such interpolations one can imagine getting any angle of viewership -- and additionally, one could have precise positioning and a translucent display of bone/joint/muscle articulation, providing deeper insight into the kinematics behind each movement.
by senko on 5/10/23, 6:58 PM
We're trying too hard to have one model do it all. If we coordinate multiple models + other tools (a la the ReAct pattern), we could make the systems more resistant to prompt injection (and possibly other) attacks, and leverage their respective strengths while covering for their weaknesses.
I'm a bit wary of tool invocation via Python code instead of prompting the "reasoning" LLM to teach it about the special commands it can invoke. Python's a good crutch because LLMs know it reasonably well, so it's simpler to prompt them (I use a similar trick in my project, but I parse the resulting AST instead of running the untrusted code).
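As a rough illustration of that AST approach (not the commenter's actual code), one can parse the generated Python and only dispatch calls whose names are on a whitelist, with literal arguments only; apart from `image_generator`, the tool names below are hypothetical:
```python
# Sketch: inspect LLM-generated code without executing it.
import ast

# Whitelisted tools; image_generator is from the thread, the rest are placeholders.
ALLOWED_TOOLS = {"image_generator", "image_captioner", "translator"}

def extract_tool_calls(llm_code: str):
    """Return (tool_name, literal_args) pairs found in the generated code."""
    calls = []
    for node in ast.walk(ast.parse(llm_code)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in ALLOWED_TOOLS):
            # Accept only literal arguments; anything dynamic is skipped here
            # (a real implementation would reject or sandbox it).
            args = [ast.literal_eval(a) for a in node.args if isinstance(a, ast.Constant)]
            calls.append((node.func.id, args))
    return calls

print(extract_tool_calls('pic = image_generator("a boat on a lake")'))
# -> [('image_generator', ['a boat on a lake'])]
```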
In a few iterations I expect to see LLMs fine-tuned to know about the standard toolset at their disposal (e.g. the Hugging Face default tools) and further refinement of the two-tiered pattern.
by abidlabs on 5/10/23, 5:29 PM
by ed on 5/10/23, 10:20 PM
I asked it to extract some text from an image, which it dutifully tried to do. However the generated python kept throwing errors. There's no image -> text tool yet, so it was trying to use the image segmenter to generate a mask and somehow extract text from that.
It would be super helpful to:
1) Have a complete list of available tools (and/or a copy of the entire prompt given to the LLM responsible for generating Python). I used prompt injection to get a partial list of tools and checked the GitHub agent PR for the rest, but couldn't find `<<all_tools>>` since it gets generated at runtime (I think?).
2) Tell the LLM it's okay to fail. E.g.: "Extract the text from image `image`. If you are unable to do this using the tools provided, say so." This prompt let me know there's no tool for text extraction.
Update: per https://huggingface.co/docs/transformers/custom_tools you can output a full list of tools with `print(agent.toolbox)`
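Assuming the toolbox is a mapping of tool names to tool objects with a `description` attribute (as the linked doc suggests), listing it looks roughly like this; the endpoint URL is again illustrative:
```python
# Sketch: see which tools the agent advertises to the LLM.
from transformers import HfAgent

agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")

# agent.toolbox maps tool names to tool objects; their names/descriptions are
# presumably what the `<<all_tools>>` placeholder expands to at runtime.
for name, tool in agent.toolbox.items():
    print(f"{name}: {tool.description}")
```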
by syntaxing on 5/10/23, 11:06 PM
by PaulHoule on 5/10/23, 5:10 PM
by nico on 5/10/23, 6:05 PM
Might be good to try with CodeGPT, AutoGPT or BabyAGI
by minimaxir on 5/10/23, 5:48 PM
by anton5mith2 on 5/10/23, 10:58 PM
https://www.reddit.com/r/selfhosted/comments/12w4p2f/localai...
by bluepoint on 5/11/23, 7:19 AM
1. Sign up (https://huggingface.co/) to hugging face.
2. Setup access tokens (https://huggingface.co/settings/tokens)
3. Install or upgrade some dependencies: `pip install huggingface_hub transformers accelerate`
4. From the terminal run `jupyter lab`
5. Then, if I did not forget any other dependencies, you can just copy-paste:
```python
from huggingface_hub import login
from transformers import HfAgent
login("hf_YOUR_HUGGING_FACE_TOKEN")
agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcode...")
agent.run("Is the following `text` (in Spanish) positive or negative?", text="¡Este es un API muy agradable!")
```
by chaxor on 5/10/23, 11:56 PM
by og_kalu on 5/10/23, 4:57 PM
https://github.com/ogkalu2/Human-parity-on-machine-translati...
T5 seems to be the default, so I get why it's done here. Just an observation.
by IAmStoxe on 5/10/23, 5:06 PM
by sudoapps on 5/10/23, 7:55 PM
by macrolime on 5/10/23, 8:26 PM