If you’ve worked with local language models, you’ve probably run
into the context window limit, especially when using smaller models
on less powerful machines. While it's an unavoidable constraint,
techniques like context packing make it surprisingly manageable.
Hello, I’m Philippe, and I am a Principal Solutions Architect
helping customers with their use of Docker. In my previous
blog post, I wrote about how to make a very small
model useful by using RAG. I had limited the message history to 2 to
keep the context length short.
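
To give a concrete picture of what that trimming looks like, here is a minimal sketch that keeps only the system prompt and the last two messages before each request. It assumes an OpenAI-style chat message format; the function name and sample conversation are illustrative, not taken from the original post.

```python
def pack_messages(system_prompt: str, history: list[dict], max_messages: int = 2) -> list[dict]:
    """Keep the system prompt plus only the most recent messages,
    so the prompt stays within a small context window."""
    return [{"role": "system", "content": system_prompt}] + history[-max_messages:]

# Illustrative conversation history (hypothetical content).
history = [
    {"role": "user", "content": "What is RAG?"},
    {"role": "assistant", "content": "Retrieval-Augmented Generation pairs retrieval with a language model."},
    {"role": "user", "content": "How does it help small models?"},
]

# Only the last 2 messages survive, capping the prompt size.
print(pack_messages("You are a helpful assistant.", history))
```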