by hazrmard on 11/1/23, 1:13 PM with 1 comment
by hazrmard on 11/1/23, 1:32 PM
The project started out with learning Bayesian networks to estimate the probabilities of maintenance actions on machines. But once we explored the data, we realized extensive pre-processing was needed: most of the information about actions was in free-form descriptions, not tabulated numerically.
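For a rough idea of the Bayesian-network side, a minimal sketch of learning a net from tabulated records could look like the following. This is an illustration only, not the project's code: pgmpy, the file name, and the column names ("symptom", "action") are all assumptions.

    import pandas as pd
    from pgmpy.estimators import HillClimbSearch, BicScore, BayesianEstimator
    from pgmpy.models import BayesianNetwork
    from pgmpy.inference import VariableElimination

    # Hypothetical tabulated maintenance records with categorical columns.
    data = pd.read_csv("maintenance_records.csv")

    # Learn the network structure, then fit the conditional probability tables.
    dag = HillClimbSearch(data).estimate(scoring_method=BicScore(data))
    model = BayesianNetwork(dag.edges())
    model.fit(data, estimator=BayesianEstimator)

    # Query: probability of each maintenance action given an observed symptom.
    posterior = VariableElimination(model).query(
        variables=["action"], evidence={"symptom": "overheating"}
    )
    print(posterior)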
So, the choice was between using a large language model (LLM), remotely or locally, and a more rustic NLP approach. I chose the latter for:
1. Explainability. Smaller models are easier to decompose and analyze.
2. Security. Using LLMs would likely have required cloud-based or off-premise services, which were an additional security hurdle in our case.
3. Speed. This was a very domain-specific dataset with relatively few training instances. Fine-tuning LLMs on this would take time (data collection + training).
4. Performance. The use-case was not text generation, but text retrieval for Bayesian reasoning. We could get away with simpler NLP models that just needed to infer similarity between texts adequately (see the sketch below).
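A minimal sketch of that kind of similarity matching, assuming scikit-learn's TF-IDF vectors and cosine similarity (the example strings are made up, and this is not necessarily the exact pipeline used):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Hypothetical free-form descriptions of past maintenance actions.
    known_actions = [
        "replaced worn bearing on pump motor",
        "tightened loose coupling and realigned shaft",
        "cleaned clogged air filter on compressor",
    ]

    vectorizer = TfidfVectorizer(stop_words="english")
    action_vecs = vectorizer.fit_transform(known_actions)

    # Match a new work-order note against the known actions.
    note = "pump bearing noisy, swapped it out"
    scores = cosine_similarity(vectorizer.transform([note]), action_vecs)[0]
    best = scores.argmax()
    print(known_actions[best], round(float(scores[best]), 2))

Something this small is easy to inspect end to end, which is what points 1 and 4 above were getting at.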