If you’ve worked with local language models, you’ve probably run into the context window limit, especially when using smaller models on less powerful machines. While it’s an unavoidable constraint, techniques like context packing make it surprisingly manageable.

Hello, I’m Philippe, and I am a Principal Solutions Architect helping customers with their usage of Docker. In my previous blog post, I wrote about how to make a very small model useful by using RAG. I had limited the message history to 2 to keep the context length short.
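Limiting the message history works by keeping only the most recent turns when building the prompt. A minimal sketch of the idea (illustrative only; `trim_history` is a hypothetical helper, not code from the post):

```python
def trim_history(messages, max_messages=2):
    """Keep the system prompt (if any) plus the last `max_messages` messages.

    This bounds the prompt size so it fits in a small model's context window.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_messages:]


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "First question"},
    {"role": "assistant", "content": "First answer"},
    {"role": "user", "content": "Second question"},
]

# Only the system prompt and the two most recent messages survive.
trimmed = trim_history(history, max_messages=2)
```

The trade-off is that older turns are forgotten, which is why pairing this with retrieval (RAG) helps: relevant facts can be re-injected even after the conversation that introduced them has been trimmed away.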

Just published by Docker.