# Why Your AI Gets Dumber After 10 Minutes

Tom Brewer

These notes are based on the YouTube video by Ray Fernando


Context Windows and AI Performance

The video discusses why AI performance degrades during long sessions in tools like Claude, Cursor, and ChatGPT. The cause is the context window: the amount of information the model can process at one time, acting as its “short-term memory”. As a conversation or session continues and the context window fills up, older information may be truncated or summarized, which weakens the model’s ability to reference earlier details accurately.

  • State-of-the-art models now have context windows of 1 million to 100 million tokens, such as Gemini 1.5 and Magic.dev’s LTM-2-Mini, allowing for more extensive conversations and tasks.
  • As the conversation continues, the context window’s limitations can lead to decreased accuracy and performance, although this is less of a constraint for most use cases than in the past, with workarounds like vector databases commonly used to extend effective memory.
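The truncation behavior described above can be sketched in a few lines. This is a hypothetical illustration, not any tool's actual implementation: real systems use proper tokenizers (counting whitespace-separated words here is a crude stand-in) and often summarize old turns rather than drop them.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: one word ~ one token."""
    return len(text.split())

def trim_to_budget(turns: list[str], budget: int) -> list[str]:
    """Keep the most recent turns whose combined size fits the token budget."""
    kept: list[str] = []
    used = 0
    for turn in reversed(turns):        # walk from newest to oldest
        cost = count_tokens(turn)
        if used + cost > budget:
            break                       # older turns fall out of the window
        kept.append(turn)
        used += cost
    return list(reversed(kept))         # restore chronological order

history = ["first question here", "a long detailed answer follows", "follow up"]
print(trim_to_budget(history, budget=7))
```

Note that the oldest turn is the first to go, which is exactly why the model "forgets" early details in long sessions.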

Sub-Agents and Context Windows

Sub-agents, or modular approaches, are an emerging technique to narrow the scope of the context window, allowing for more defined and accurate tasks.

  • Sub-agents can be used to break down tasks, enabling each agent to focus on a narrower context, which can help maintain accuracy and performance for specific subtasks.
  • While sub-agents are not yet a standard feature in most mainstream tools, some advanced platforms and research prototypes use this approach to improve performance and accuracy.
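A minimal sketch of the sub-agent pattern, assuming a placeholder `fake_model` function in place of a real LLM call: each subtask is dispatched with only its own slice of context, rather than the full conversation history.

```python
def fake_model(prompt: str, context: str) -> str:
    """Placeholder for an LLM call; reports how much context it received."""
    return f"{prompt} (using {len(context.split())} context tokens)"

def run_with_subagents(subtasks: dict[str, str]) -> list[str]:
    """Run each subtask against only its own narrow context slice."""
    return [fake_model(task, ctx) for task, ctx in subtasks.items()]

results = run_with_subagents({
    "summarize tests": "test_a passed test_b failed",
    "summarize docs": "readme describes install steps",
})
for r in results:
    print(r)
```

The point is structural: no single call ever sees the whole context, so each call stays well inside the window where accuracy holds up.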

Distractors and Accuracy

Distractors are irrelevant or off-topic information that can reduce AI accuracy, especially as the context window grows and the model must sift through more data to find relevant details.

  • The general principle that distractors reduce accuracy is supported by research, including the ChromaDB study referenced in the video.
  • As the number of distractors increases, the accuracy of the AI can decrease significantly, highlighting the importance of managing context and focusing on relevant information.
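One way to keep distractors out of the prompt is to score candidate snippets for relevance before they ever reach the model. The sketch below uses naive word overlap purely for illustration; real systems would typically use embedding similarity instead.

```python
def relevance(query: str, snippet: str) -> int:
    """Count words shared between the query and a candidate snippet."""
    return len(set(query.lower().split()) & set(snippet.lower().split()))

def drop_distractors(query: str, snippets: list[str], min_overlap: int = 1) -> list[str]:
    """Keep only snippets with at least `min_overlap` words in common with the query."""
    return [s for s in snippets if relevance(query, s) >= min_overlap]

query = "reset the database password"
snippets = [
    "steps to reset the database password safely",
    "office holiday schedule for december",   # distractor: zero overlap
    "database backup policy",
]
print(drop_distractors(query, snippets))
```

Filtering before prompt construction attacks the problem at its source: the model never has to sift through irrelevant material.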

Code Generation and Context Windows

The video demonstrates how code generation can be affected by the context window, using examples with Claude Code and other tools.

  • Code generation can be impacted by context window limitations: as more tokens are used, the model may lose track of earlier code or instructions, leading to reduced accuracy.
  • Using modular approaches and narrowing the scope of the context window can help improve code generation accuracy, although these techniques are still evolving and not yet universally available.
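The modular idea applied to code generation can be sketched as follows. The `repo` dict and file names are invented for illustration; a real tool would read files from disk and select the relevant ones automatically.

```python
# Toy stand-in for a repository: file name -> file contents.
repo = {
    "auth.py": "def login(user): ...",
    "billing.py": "def charge(card): ...",
    "utils.py": "def slugify(text): ...",
}

def build_codegen_prompt(task: str, relevant_files: list[str]) -> str:
    """Assemble a prompt from only the files named as relevant to the task."""
    sections = [f"# {name}\n{repo[name]}" for name in relevant_files]
    return f"Task: {task}\n\n" + "\n\n".join(sections)

# Only auth.py enters the context; billing.py and utils.py stay out.
prompt = build_codegen_prompt("add logout to auth", ["auth.py"])
print(prompt)
```

Scoping the prompt to one file keeps the context small and focused, which is the same principle the sub-agent approach applies at the task level.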

Tooling and Future Developments

The video touches on the current state of tooling for AI and the potential for future developments.

  • Current tools like Claude Code and Cursor are improving, especially in handling larger context windows and integrating with code workflows, although explicit support for sub-agents is not yet a standard feature in most commercial tools.
  • The future of AI development is expected to involve more intelligent tooling, allowing for easier and more accurate interactions with AI models, with ongoing research and advancements in context window management and modular architectures.

Summary

The key takeaways from the video are:

  • AI performance can be affected by context window limitations, although state-of-the-art models now handle millions of tokens, making this less of a constraint for most use cases.
  • Sub-agents or modular approaches can be used to narrow the scope of the context window, improving accuracy and performance, although this is still an emerging technique.
  • Distractors can significantly reduce AI accuracy, especially with large context windows, highlighting the importance of managing context and focusing on relevant information.
  • Code generation can be improved by using modular approaches and focusing the context window, although these techniques are still evolving.
  • Tooling for AI is rapidly evolving, with improvements in context window management and features to support more intelligent interactions, although explicit support for sub-agents is not yet standard in most commercial tools.

Thanks for reading my notes! Feel free to check out my other notes or contact me via the social links in the footer.

# Frequently Asked Questions

What does 200K context window mean?

The context window refers to the amount of information an AI can process at one time, acting as the model's 'short-term memory'. A 200K context window would mean the AI can process 200,000 tokens or units of information at a time. However, the video mentions that state-of-the-art models now have context windows of 1 million to 100 million tokens.

What is GPT 4 context window?

The video does not specifically mention the context window size of GPT 4. However, it does mention that state-of-the-art models, such as Gemini 1.5 and Magic.dev's LTM-2-Mini, have context windows of 1 million to 100 million tokens.

Which AI has the biggest context window?

According to the video, state-of-the-art models such as Gemini 1.5 and Magic.dev's LTM-2-Mini have some of the largest context windows, with sizes ranging from 1 million to 100 million tokens.
