- Large-Scale AI Batch Inference: 9x Faster Embedding Generation
by covi on 3/31/25, 4:14 AM, with comments
- Abusing SQLite to Handle Concurrency
by covi on 3/7/25, 12:39 AM, with comments
- Using DeepSeek R1 for RAG: Do's and Don'ts
by covi on 2/27/25, 2:19 AM, with comments
- Building Large-Scale Image Search Using VectorDB and OpenAI Clip
by covi on 2/11/25, 5:28 PM, with comments
- Fungible cloud compute and storage: AI training on any cloud
by covi on 12/6/24, 1:55 AM, with comments
- Getting $1M cloud credits for AI startups – and using them wisely
by covi on 11/1/24, 4:37 PM, with comments
- You Ran the Operational Database on What? Testing Spot Instances
by covi on 9/22/24, 4:20 PM, with comments
- Can Multi-Modal LLMs "See" Images? A Deep Dive with ASCII Art
by covi on 9/17/24, 2:15 AM, with comments
- Embarrassingly parallel batch jobs on multiple clouds or regions
by covi on 8/29/24, 8:43 PM, with comments
- Llama 3.1 finetuning on your own data and infra, without vendor lock-in
by covi on 7/24/24, 3:26 PM, with comments
- Guide: Finetune Llama 3.1 on your infra
by covi on 7/24/24, 2:43 AM, with comments
- AI on Kubernetes Without the Pain
by covi on 7/16/24, 3:49 AM, with comments
- Dagster and SkyPilot: Orchestrating LLM training cost-effectively
by covi on 4/12/24, 6:44 PM, with comments
- Inflection CEO Left, Became CEO of Microsoft AI
by covi on 3/19/24, 4:28 PM, with comments
- SkyServe: A cost-efficient, multi-region/cloud library for serving GenAI models
by covi on 3/8/24, 10:59 PM, with comments
- SGLang: Fast and Expressive LLM Inference with RadixAttention for 5x Throughput
by covi on 2/21/24, 4:56 PM, with comments
- SkyServe: 50% Cheaper AI Serving on Any Cloud with High Availability
by covi on 2/21/24, 3:43 AM, with comments
- Tigris: Globally Distributed S3-Compatible Object Storage
by covi on 2/8/24, 8:21 PM, with comments
- Scaling Mixtral LLM Serving Across Clouds
by covi on 12/21/23, 6:46 PM, with comments
- Covariant: Moving from on-prem to the cloud for 4x faster AI development
by covi on 9/26/23, 4:30 PM, with comments
- ML experiments in the cloud with SkyPilot and DVC
by covi on 8/11/23, 4:21 PM, with comments
- Cookbook: Finetuning Llama 2 in your own cloud environment, privately
by covi on 8/2/23, 6:50 PM, with comments
- Run Llama2 in your cloud, completely privately
by covi on 7/19/23, 3:34 PM, with comments
- The Production Environment at Google, from the Viewpoint of an SRE
by covi on 7/1/23, 3:38 PM, with comments
- UC Berkeley's SkyPilot project: unprecedented GPU availability across new clouds
by covi on 5/31/23, 2:27 AM, with comments
- UC Berkeley's open-source Vicuna LLM chatbot released new improved model weights
by covi on 4/14/23, 4:45 PM, with comments
- Show HN: Run LLaMA LLM chatbots on any cloud with one click
by covi on 3/22/23, 3:14 AM, with comments
- Run LLaMA LLM chatbots on any cloud with one click
by covi on 3/21/23, 2:14 AM, with comments