Step 1 of 5

Copilot CLI

45 minutes

GitHub Copilot CLI brings AI-powered assistance directly to your terminal, helping you craft shell commands, explain what commands do, and navigate your development environment more efficiently.

What is Copilot CLI?

Copilot CLI is a GitHub CLI extension that lets you ask Copilot questions in plain language directly from your terminal. It supports two key sub-commands:

  • gh copilot suggest — Translate a natural-language task description into a shell command
  • gh copilot explain — Get a plain-language explanation of any shell command

Prerequisites

Before you begin, make sure you have:

Workshop Activity

This topic has a dedicated hands-on workshop page with step-by-step exercises covering installation, command suggestions, explanations, shell aliases, a real-world challenge, and tips & best practices.

Start the Copilot CLI Activity

Quick Reference

Install the extension and start using it in minutes:

# Install
gh extension install github/gh-copilot

# Suggest a command from plain English
gh copilot suggest "list all files modified in the last 7 days"

# Explain an unfamiliar command
gh copilot explain "git rebase -i HEAD~3"

# Set up short aliases (ghcs / ghce)
eval "$(gh copilot alias -- bash)"
Step 2 of 5

Copilot SDK

20 minutes

The GitHub Copilot SDK allows you to programmatically integrate Copilot's AI capabilities into your own applications and tools, enabling you to build custom AI-powered experiences that appear directly in Copilot Chat.

Introduction: Copilot SDK Overview

The Copilot SDK (also referred to as the Copilot Extensions API) provides the building blocks to create GitHub Apps that communicate with Copilot. These extensions appear as agents within Copilot Chat and can respond to user messages with context-aware AI output.

Extension Model

Copilot Extensions follow a client-server architecture:

  • Client — GitHub Copilot Chat (in VS Code, GitHub.com, or GitHub Mobile)
  • Server — Your custom agent endpoint that processes requests and generates responses
  • Communication — HTTPS POST requests and Server-Sent Events (SSE) for streaming responses
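
The exercises later in this step stream chunks shaped like `{"choices": [{"delta": ..., "finish_reason": ...}]}`. As a minimal sketch of that framing, the helpers below format such chunks as SSE frames. The names `sse_frame` and `stream_text` are illustrative assumptions, not part of any GitHub library.

```python
import json


def sse_frame(content=None, finish_reason=None):
    """Format one chunk as a Server-Sent Events frame (hypothetical helper)."""
    delta = {} if content is None else {"content": content}
    payload = {"choices": [{"delta": delta, "finish_reason": finish_reason}]}
    # An SSE frame is a "data: " line followed by a blank line.
    return "data: " + json.dumps(payload) + "\n\n"


def stream_text(text):
    """Yield one frame per word, then a final frame that signals completion."""
    for word in text.split(" "):
        yield sse_frame(word + " ")
    yield sse_frame(finish_reason="stop")
```

A generator like `stream_text("Hello world")` can be handed to a streaming HTTP response; it yields two content frames followed by one stop frame.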

Copilot Extensions vs. GitHub Apps

Copilot Extensions are a specialized type of GitHub App:

  • GitHub Apps — General-purpose integrations that can automate workflows, manage repositories, and interact with GitHub's API
  • Copilot Extensions — GitHub Apps with Copilot permissions that can be invoked in Copilot Chat to provide AI-powered assistance

Every Copilot Extension is a GitHub App, but not every GitHub App is a Copilot Extension. To create a Copilot Extension, you enable Copilot permissions on your GitHub App and implement an agent endpoint.

Key Concepts

  • Copilot Extensions — GitHub Apps that integrate with Copilot Chat via a defined API contract
  • Agent endpoint — An HTTPS endpoint your app exposes that Copilot calls when a user invokes your agent
  • Server-Sent Events (SSE) — The streaming protocol used to send responses back to Copilot Chat in real time
  • Payload verification — Signature verification to ensure requests originate from GitHub
  • Context references — Files, issues, PRs, and other resources that users can reference in their messages
Prerequisites

Before starting, ensure you have: (1) a GitHub account with Copilot access, (2) Node.js 18+ or Python 3.9+, and (3) ngrok or a similar tool for local HTTPS tunneling.

Exercise 1: Set Up a New Copilot Extension Project

Let's create your first Copilot Extension. You can choose either Node.js or Python.

Option A: Node.js Project

  1. Create a new directory and initialize a Node.js project:
    mkdir my-copilot-extension
    cd my-copilot-extension
    npm init -y
    npm install express
  2. Create server.js with a basic Express server:
    const express = require('express');
    const app = express();
    app.use(express.json());
    
    app.post('/agent', (req, res) => {
      res.setHeader('Content-Type', 'text/event-stream');
      res.write('data: ' + JSON.stringify({
        choices: [{delta: {content: 'Hello!'}, finish_reason: null}]
      }) + '\n\n');
      res.write('data: ' + JSON.stringify({
        choices: [{delta: {}, finish_reason: 'stop'}]
      }) + '\n\n');
      res.end();
    });
    
    app.listen(3000, () => console.log('Server running on port 3000'));
  3. Start the server: node server.js
  4. In another terminal, expose it with ngrok: ngrok http 3000

Option B: Python Project

  1. Create a new directory and set up a virtual environment:
    mkdir my-copilot-extension
    cd my-copilot-extension
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    pip install flask
  2. Create app.py with a basic Flask server:
    from flask import Flask, Response
    import json
    
    app = Flask(__name__)
    
    @app.route('/agent', methods=['POST'])
    def agent():
        def generate():
            yield 'data: ' + json.dumps({
                'choices': [{'delta': {'content': 'Hello!'}, 'finish_reason': None}]
            }) + '\n\n'
            yield 'data: ' + json.dumps({
                'choices': [{'delta': {}, 'finish_reason': 'stop'}]
            }) + '\n\n'
        return Response(generate(), mimetype='text/event-stream')
    
    if __name__ == '__main__':
        app.run(port=5000)
  3. Start the server: python app.py
  4. In another terminal, expose it with ngrok: ngrok http 5000
Complete Examples

Complete working examples with more features, in both Node.js and Python, are available in examples/copilot-sdk/.

Exercise 2: Send a Prompt to the Copilot Model and Stream a Response

Now let's enhance the extension to process user messages and stream responses word by word.

Node.js Implementation

app.post('/agent', async (req, res) => {
  const { messages } = req.body;
  const userMessage = messages[messages.length - 1].content;

  res.setHeader('Content-Type', 'text/event-stream');

  // Generate response based on user message
  const response = `You asked: "${userMessage}". Here's my response!`;
  const words = response.split(' ');

  // Stream word by word
  for (const word of words) {
    res.write('data: ' + JSON.stringify({
      choices: [{delta: {content: word + ' '}, finish_reason: null}]
    }) + '\n\n');
    await new Promise(r => setTimeout(r, 50)); // Delay for effect
  }

  // Send completion signal
  res.write('data: ' + JSON.stringify({
    choices: [{delta: {}, finish_reason: 'stop'}]
  }) + '\n\n');
  res.end();
});

Python Implementation

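import time

from flask import request  # needed to read the POST body

@app.route('/agent', methods=['POST'])
def agent():
    # Tolerate a missing or non-JSON body instead of raising
    data = request.get_json(silent=True) or {}
    messages = data.get('messages', [])
    # Fall back to an empty string if the request carries no messages
    user_message = messages[-1]['content'] if messages else ''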

    def stream_response():
        response = f'You asked: "{user_message}". Here\'s my response!'
        words = response.split(' ')

        # Stream word by word
        for word in words:
            chunk = {
                'choices': [{'delta': {'content': word + ' '}, 'finish_reason': None}]
            }
            yield f'data: {json.dumps(chunk)}\n\n'
            time.sleep(0.05)

        # Send completion signal
        final = {'choices': [{'delta': {}, 'finish_reason': 'stop'}]}
        yield f'data: {json.dumps(final)}\n\n'

    return Response(stream_response(), mimetype='text/event-stream')

Test your streaming implementation by asking different questions and watching the response appear word by word!
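
To check the stream without Copilot Chat, one option is to capture the raw response body with any HTTP client and reassemble it. The sketch below assumes the chunk shape used in this exercise; `collect_sse_text` is a hypothetical helper name.

```python
import json


def collect_sse_text(raw):
    """Reassemble the streamed text from a raw SSE response body."""
    parts = []
    # Frames are separated by a blank line.
    for frame in raw.split("\n\n"):
        frame = frame.strip()
        if not frame.startswith("data:"):
            continue  # skip empty trailing pieces and non-data lines
        chunk = json.loads(frame[len("data:"):])
        choice = chunk["choices"][0]
        parts.append(choice["delta"].get("content", ""))
        if choice["finish_reason"] == "stop":
            break  # the completion signal ends the stream
    return "".join(parts)
```

Comparing the reassembled text with the string the server intended to send is a quick way to catch missing spaces or a missing completion signal.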

Exercise 3: Add Context to Model Requests

Copilot can provide context about files, issues, and other resources the user references. Let's use this context in our responses.

Understanding Context References

When users mention files or resources, GitHub includes them in the request:

{
  "messages": [{"role": "user", "content": "Explain this file"}],
  "copilot_references": [
    {
      "type": "file",
      "id": "src/app.js",
      "data": {"content": "const express = require('express');..."}
    }
  ]
}

Using Context in Your Extension

// Node.js
app.post('/agent', (req, res) => {
  const { messages, copilot_references } = req.body;
  const userMessage = messages[messages.length - 1].content;

  res.setHeader('Content-Type', 'text/event-stream');

  let response = `Analyzing your request: "${userMessage}"`;

  if (copilot_references && copilot_references.length > 0) {
    response += '\n\nContext provided:';
    copilot_references.forEach(ref => {
      response += `\n- ${ref.type}: ${ref.id}`;
      if (ref.type === 'file' && ref.data && ref.data.content) {
        const lines = ref.data.content.split('\n').length;
        response += ` (${lines} lines)`;
      }
    });
  }

  // Stream the response...
});
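
For the Python variant of this exercise, the same logic might look like the sketch below. It assumes the request shape shown under "Understanding Context References"; the function name `describe_references` is illustrative.

```python
def describe_references(user_message, references):
    """Build a reply that acknowledges any context references in the request."""
    response = f'Analyzing your request: "{user_message}"'
    if references:
        response += "\n\nContext provided:"
        for ref in references:
            response += f"\n- {ref.get('type')}: {ref.get('id')}"
            # "data" may be absent, so fall back to an empty dict
            content = (ref.get("data") or {}).get("content")
            if ref.get("type") == "file" and content:
                response += f" ({len(content.splitlines())} lines)"
    return response
```

Inside the Flask handler, this could be called with the last message's content and `data.get('copilot_references')`, and the result streamed word by word as in Exercise 2.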

Try It Out

  1. Update your extension to handle context references
  2. Restart your server
  3. In Copilot Chat, open a file and ask: @your-extension explain this file
  4. Your extension should acknowledge the file context in its response

Exercise 4: Register the Extension in GitHub and Test

Now let's make your extension official by registering it as a GitHub App.

Create a GitHub App

  1. Go to Settings → Developer settings → GitHub Apps → New GitHub App
  2. Fill in the basic information:
    • Name: Choose a unique name (e.g., "My Copilot Helper")
    • Homepage URL: Can be your GitHub profile or repository
    • Webhook URL: Your ngrok HTTPS URL + /agent
  3. Under Permissions, enable:
    • Copilot chat — Read access
  4. Under the Copilot section:
    • Set App Type to "Agent"
    • Set URL to your ngrok HTTPS URL + /agent
    • Optionally add an inference description and model
  5. Click Create GitHub App

Install and Test

  1. Install the app to your account: Click Install App in the left sidebar
  2. Open GitHub Copilot Chat (in VS Code or GitHub.com)
  3. Invoke your extension: @your-app-name hello!
  4. You should see your extension respond!
Important

Your agent endpoint must use HTTPS. ngrok provides HTTPS URLs automatically. In production, deploy to a platform with HTTPS support (Heroku, Railway, AWS, etc.).

Challenge: Build a Domain-Specific Extension

Now that you understand the basics, build something more specialized! Here's a challenge to create a changelog generator extension.

Changelog Generator Extension

Build an extension that generates changelogs from git repositories:

  • Feature: User provides a repository and version range
  • Processing: Fetch commits between versions using the GitHub API
  • Output: Categorized changelog (features, fixes, other)

Implementation Steps

  1. Install a GitHub API client:
    • Node.js: npm install @octokit/rest
    • Python: pip install PyGithub
  2. Parse user messages to extract repo name and version tags
  3. Use the GitHub API to fetch commits between tags
  4. Categorize commits by keywords (feat, fix, docs, etc.)
  5. Stream formatted changelog back to Copilot Chat
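
Steps 2 and 4 can be prototyped without any API calls. The sketch below is one possible approach: the regular expression assumes requests phrased like the example request in this challenge, and the prefix-to-category mapping is an illustrative choice. Fetching the commits (step 3) is left out.

```python
import re

# Assumes phrasing like "... for owner/repo from <tag> to <tag>"
REQUEST_PATTERN = re.compile(
    r"for\s+(?P<repo>[\w.-]+/[\w.-]+)\s+from\s+(?P<base>\S+)\s+to\s+(?P<head>\S+)"
)

# Commit-subject prefixes mapped to changelog sections (illustrative mapping)
CATEGORIES = {
    "Features": ("feat",),
    "Bug Fixes": ("fix",),
}


def parse_request(message):
    """Extract repo and version tags, or return None if the message does not match."""
    match = REQUEST_PATTERN.search(message)
    return match.groupdict() if match else None


def categorize(commit_messages):
    """Group commit subjects into sections by their leading keyword."""
    groups = {"Features": [], "Bug Fixes": [], "Other Changes": []}
    for message in commit_messages:
        subject = message.splitlines()[0] if message else ""
        lowered = subject.lower()
        for section, prefixes in CATEGORIES.items():
            if lowered.startswith(prefixes):
                groups[section].append(subject)
                break
        else:
            # No known prefix matched
            groups["Other Changes"].append(subject)
    return groups
```

Only the first line of each commit message is used, so long commit bodies do not end up in the changelog.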

Example Request

@changelog-generator create changelog for microsoft/vscode from v1.85.0 to v1.86.0

Example Response

# Changelog: v1.85.0 → v1.86.0

## ✨ Features
- Add new editor theme support
- Implement improved search functionality

## 🐛 Bug Fixes
- Fix memory leak in extension host
- Resolve syntax highlighting issue

## 📝 Other Changes
- Update documentation
- Refactor test suite

**Total commits:** 147
Ready-Made Example

Find a complete changelog generator implementation in examples/copilot-sdk/changelog-generator/ with full source code and documentation!

Other Extension Ideas

  • Code Review Assistant — Analyze PRs and provide feedback
  • Documentation Generator — Create docs from code comments
  • Test Generator — Generate unit tests for functions
  • Issue Triager — Categorize and label GitHub issues
  • Deployment Helper — Check deployment status and logs
  • Security Scanner — Scan code for common vulnerabilities

SDK Libraries and Tools

GitHub and the community provide libraries to simplify extension development:

Best Practices

  • Verify signatures — Always verify request signatures in production to ensure requests come from GitHub
  • Handle errors gracefully — Stream error messages back to users rather than returning HTTP errors
  • Stream responses — Use SSE to provide real-time feedback as your extension processes requests
  • Add context awareness — Use copilot_references to access files, issues, and other resources
  • Keep responses concise — Users expect quick, actionable responses
  • Document your extension — Provide clear instructions on how to use your extension
  • Test thoroughly — Test with various inputs and edge cases
  • Monitor usage — Track errors and usage patterns to improve your extension
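
As a sketch of the "handle errors gracefully" practice, a wrapper can catch failures while generating output and report them inside the stream, using the same chunk shape as the exercises. The names `sse_frame` and `safe_stream` are hypothetical, and the error wording is a placeholder.

```python
import json
import logging


def sse_frame(content=None, finish_reason=None):
    """Format one chunk as an SSE frame (same shape as the exercises)."""
    delta = {} if content is None else {"content": content}
    payload = {"choices": [{"delta": delta, "finish_reason": finish_reason}]}
    return "data: " + json.dumps(payload) + "\n\n"


def safe_stream(produce_text):
    """Stream pieces from produce_text(); on failure, stream an error message."""
    try:
        for piece in produce_text():
            yield sse_frame(piece)
    except Exception:
        # Keep details in the server log and show the user a generic message
        logging.exception("agent failed while generating a response")
        yield sse_frame("Sorry, something went wrong while handling your request.")
    # Always finish with the completion signal so the chat does not hang
    yield sse_frame(finish_reason="stop")
```

Any text already streamed before the failure stays visible to the user, followed by the error message and the stop frame.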

Step 3 of 5

Agent Skills

20 minutes

Agent Skills in VS Code Copilot Chat allow you to define custom, reusable behaviors that guide how Copilot responds to specific types of requests. Skills are defined using markdown files that Copilot discovers and invokes automatically based on the context of your conversation.

What are Agent Skills?

Skills are instructions stored in .SKILL.md files that tell Copilot how to handle specific tasks or workflows. When you ask Copilot a question, it can detect which skills are relevant and use them to provide more tailored, domain-specific responses.

Think of skills as specialized tools in Copilot's toolkit — each one teaches Copilot a specific capability, such as:

  • Writing unit tests following your team's testing conventions
  • Reviewing pull requests with specific criteria in mind
  • Generating documentation in a particular format
  • Refactoring code according to established patterns
  • Creating boilerplate code for common scenarios

How Copilot Discovers Skills

Copilot looks for skill files in specific locations within your project:

  • .github/copilot/skills/ — Repository-level skills shared by all contributors
  • .vscode/copilot/skills/ — Project-specific skills (alternative location)

Skills are automatically loaded when you open a workspace, and Copilot will invoke them based on the conversation context and the skill's applyTo patterns.

Creating Your First Skill

Let's create a skill for writing unit tests. Create a file named write-tests.SKILL.md in your .github/copilot/skills/ directory:

---
name: write-tests
description: Write comprehensive unit tests following best practices
applyTo:
  - "**/*.test.js"
  - "**/*.test.ts"
  - "**/*.spec.js"
  - "**/*.spec.ts"
---

# Writing Unit Tests

When writing unit tests:

1. **Naming Convention**: Use descriptive test names that explain what is being tested and the expected outcome
   - Format: `it('should [expected behavior] when [condition]', ...)`

2. **AAA Pattern**: Structure tests using Arrange, Act, Assert
   - Arrange: Set up test data and dependencies
   - Act: Execute the code being tested
   - Assert: Verify the expected outcome

3. **Coverage**: Write tests for:
   - Happy path scenarios
   - Edge cases and boundary conditions
   - Error handling and exceptions

4. **Mocking**: Mock external dependencies to keep tests isolated and fast

5. **Assertions**: Use specific, clear assertions that make test failures easy to diagnose

## Example Test Structure

```javascript
describe('UserService', () => {
  it('should create a new user when valid data is provided', async () => {
    // Arrange
    const userData = { name: 'John Doe', email: 'john@example.com' };
    const mockDb = { save: jest.fn().mockResolvedValue(userData) };
    const service = new UserService(mockDb);

    // Act
    const result = await service.createUser(userData);

    // Assert
    expect(result).toEqual(userData);
    expect(mockDb.save).toHaveBeenCalledWith(userData);
  });
});
```
Tip

The front matter (between --- lines) defines metadata about the skill, including when it should be applied. The content provides instructions that guide Copilot's behavior.
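
To make the front matter structure concrete, the sketch below splits a skill file into metadata and body. It handles only the simple shape shown above (scalar values and one level of quoted list items) and is not how Copilot itself parses the file; a real YAML parser would be the sturdier choice. `split_front_matter` is an illustrative name.

```python
def split_front_matter(text):
    """Return (metadata, body) for a file that starts with a --- block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Find the closing --- line
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return {}, text

    metadata, current_key = {}, None
    for line in lines[1:end]:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("- ") and isinstance(metadata.get(current_key), list):
            # List item belonging to the most recent key
            metadata[current_key].append(stripped[2:].strip().strip('"'))
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            # A key with no inline value starts a list
            metadata[current_key] = value if value else []
    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body
```

Running this on the write-tests skill above would give a dictionary with `name`, `description`, and an `applyTo` list, plus the markdown instructions as the body.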

Scoping Skills with applyTo Patterns

The applyTo field accepts glob patterns that specify which files or contexts should trigger the skill:

---
applyTo:
  - "src/**/*.tsx"        # React/TypeScript files in src
  - "tests/**/*"          # All files in tests directory
  - "*.md"                # Markdown files in root
  - "!node_modules/**"    # Exclude node_modules
---

When you're working with a file that matches these patterns, Copilot will automatically consider this skill when generating responses.
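
To reason about which files a set of patterns would cover, it can help to evaluate them by hand. The sketch below approximates common glob semantics (`**/` spans zero or more directories, `*` stays within one path segment, a leading `!` excludes). Copilot's actual matcher may differ, so treat this as an assumption-laden model rather than its implementation.

```python
import re


def glob_to_regex(pattern):
    """Translate a glob into a regex under the assumed semantics."""
    out, i = [], 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")  # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")  # anything within one path segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def applies(path, patterns):
    """True if path matches a positive pattern and no '!' exclusion."""
    included = excluded = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        regex = glob_to_regex(pattern[1:] if negated else pattern)
        if regex.fullmatch(path):
            if negated:
                excluded = True
            else:
                included = True
    return included and not excluded
```

Under these assumptions, `"*.md"` covers `README.md` but not `docs/guide.md`, which matches the "Markdown files in root" comment in the example.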

Skill Best Practices

  • Be specific — Provide concrete examples and clear guidelines rather than vague instructions
  • Keep it focused — Each skill should address one specific task or workflow
  • Use code examples — Show Copilot the exact patterns and style you expect
  • Scope appropriately — Use applyTo patterns to ensure skills are invoked at the right time
  • Document conventions — Include your team's specific coding standards and patterns
  • Version control — Check skills into your repository so the whole team benefits

Exercise 1: Create a PR Review Skill

Create a skill that guides Copilot to review pull requests with specific criteria:

  1. Create .github/copilot/skills/pr-review.SKILL.md
  2. Define criteria such as:
    • Check for test coverage
    • Verify documentation is updated
    • Look for security issues
    • Ensure error handling is present
  3. Use applyTo: ["*.md"] so it activates when viewing PR descriptions
  4. Test by asking Copilot to review a pull request

Exercise 2: Create a Documentation Skill

Create a skill for generating API documentation in a specific format:

  1. Create .github/copilot/skills/api-docs.SKILL.md
  2. Specify a documentation template that includes:
    • Function signature
    • Description of purpose
    • Parameter descriptions with types
    • Return value description
    • Usage examples
    • Error cases
  3. Scope to your API files with appropriate applyTo patterns
  4. Test by asking Copilot to document a function in your codebase

Exercise 3: Test Your Skills

Put your skills to work in real scenarios:

  1. Open a test file that matches your unit test skill's applyTo pattern
  2. Ask Copilot: "Write tests for the UserService class"
  3. Observe how Copilot applies your skill's guidelines
  4. Refine the skill based on the results — add more examples or clarify instructions
  5. Try the same prompt without the skill to see the difference
Pro Tip

You can see which skills Copilot is using by checking the VS Code Copilot Chat panel. Skills that are active for the current context will be indicated in the response.

When to Use Skills vs. Instructions vs. Prompts

Understanding when to use each customization approach:

Use Skills when:

  • You want reusable, invokable behaviors for specific tasks
  • The guidance applies to specific file types or patterns
  • You need Copilot to automatically adapt based on context
  • You're documenting repeatable workflows (testing, documentation, review)

Use Instructions when:

  • You want universal guidelines that apply to all Copilot interactions
  • You're setting coding standards, style preferences, or constraints
  • The guidance is project-wide rather than task-specific

Use Prompt Files when:

  • You have complex, multi-step prompts you use frequently
  • You want to create custom slash commands for common requests
  • You need parameterized prompts with variables
Prerequisites

Agent Skills require VS Code with the GitHub Copilot Chat extension. Make sure you have the latest version installed for the best experience.

Challenge: Create a Real-World Skill

Design and implement a skill for an actual workflow in your team's project:

  1. Identify a repetitive task your team performs regularly
  2. Document the conventions, patterns, and standards for that task
  3. Create a skill file with clear instructions and examples
  4. Define appropriate applyTo patterns to scope the skill
  5. Share it with your team and gather feedback
  6. Iterate based on how well Copilot follows the skill in practice

Some ideas to get started:

  • Database migration file creation with your team's naming conventions
  • Component scaffolding following your framework's patterns
  • Error handling implementation with your logging standards
  • Security review checklist for authentication/authorization code
  • Performance optimization guidelines for database queries

Step 4 of 5

Sub-Agents

25 minutes

Sub-agents allow you to create specialized Copilot agents that work together in coordinated workflows. An orchestrator agent delegates tasks to specialized sub-agents, enabling modular, multi-step workflows that combine different AI capabilities.

What are Sub-Agents?

Sub-agents are custom agent definitions (.agent.md files) that are invoked by a parent agent rather than directly by the user. The parent agent acts as an orchestrator, deciding which sub-agent to invoke and how to combine their results into a unified response.

This differs from single-agent workflows where one agent handles all aspects of a task. With sub-agents, you can:

  • Separate concerns by having specialized agents for research, implementation, testing, etc.
  • Reuse agent definitions across different workflows
  • Maintain focused context within each specialized agent
  • Build complex multi-step workflows with clear task delegation

Agent Definition Files

Custom agents are defined in .agent.md files stored in your workspace. These files contain instructions that shape how the agent behaves:

# Research Agent

You are a specialized research agent focused on gathering information
and analyzing code.

## Your Responsibilities

- Search through the codebase to understand existing patterns
- Identify relevant files, functions, and modules
- Analyze documentation and comments
- Provide recommendations based on findings

## Response Format

Structure your research results with:
- Findings: Key files and patterns discovered
- Analysis: How the code is structured
- Recommendations: Implementation suggestions

Multi-Agent Architecture

A typical multi-agent workflow involves:

  1. Orchestrator agent — Receives user requests and decomposes them into sub-tasks
  2. Research sub-agent — Gathers information and analyzes the codebase
  3. Implementation sub-agent — Writes code and creates files based on research findings
  4. Testing sub-agent (optional) — Creates tests for the implemented code
Prerequisites

This exercise requires VS Code with the GitHub Copilot extension and Copilot Chat with agent mode enabled. You'll need a GitHub Copilot license to use these features.

Exercise 1: Review Multi-Agent Setup

We've created a complete multi-agent example for you to review. Download the example files:

Review these files to understand how agents are structured and how they reference each other.

Exercise 2: Create Your First Sub-Agent

  1. In your workspace, create a .github/agents/ directory
  2. Create a file named research.agent.md with instructions for a research agent (use the example above as a template)
  3. In VS Code Copilot Chat, test your agent by typing: @research analyze the authentication setup in this project
  4. Observe how the agent follows the instructions you provided

Exercise 3: Build an Orchestrator Workflow

  1. Create orchestrator.agent.md that can delegate to your research agent
  2. Create implementation.agent.md for writing code
  3. Add delegation instructions to the orchestrator:
    To invoke a sub-agent, use:
    @research [task description]
    @implementation [task description]
  4. Test the workflow: @orchestrator add a user profile feature to the app
  5. Observe how the orchestrator delegates tasks to research first, then implementation

Exercise 4: Context Passing Between Agents

Pay attention to how context flows between agents:

  1. When the orchestrator calls the research agent, it passes the user's request
  2. The research agent returns structured findings
  3. The orchestrator includes those findings when calling the implementation agent
  4. The final response combines insights from both agents

Experiment by modifying the orchestrator to pass different context or format the output differently.
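
The four-step flow above can be modeled in plain code to make the handoffs explicit. In the sketch below, ordinary Python functions stand in for the agents. Real Copilot agents are defined in .agent.md files, so every function name and return value here is a hypothetical placeholder.

```python
def research_agent(request):
    """Stand-in for the research sub-agent: returns structured findings."""
    return {
        "findings": [f"Files related to: {request}"],
        "analysis": "placeholder analysis",
        "recommendations": ["placeholder recommendation"],
    }


def implementation_agent(request, findings):
    """Stand-in for the implementation sub-agent: uses the research output."""
    count = len(findings["recommendations"])
    return f"Implementation for '{request}' based on {count} recommendation(s)"


def orchestrate(request):
    """Pass context explicitly from one agent to the next."""
    findings = research_agent(request)                         # steps 1 and 2
    implementation = implementation_agent(request, findings)   # step 3
    # Step 4: combine both results into one response
    return {
        "request": request,
        "research": findings,
        "implementation": implementation,
    }
```

The point of the model is that the implementation step receives the findings as an explicit argument, which mirrors the advice to pass context explicitly between agents.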

Challenge: 3-Agent Pipeline

Design a complete pipeline with three specialized agents:

  1. Research Agent: Analyzes the codebase and gathers requirements
  2. Plan Agent: Creates a detailed implementation plan based on research
  3. Implementation Agent: Executes the plan and writes the code

Create agent definitions for each, set up the orchestrator to call them in sequence, and test with a real feature request.

Trade-offs of Multi-Agent Architectures

Advantages

  • Separation of Concerns: Each agent has a clear, focused responsibility
  • Reusability: Sub-agents can be reused across different workflows
  • Better Context Management: Specialized agents maintain focused context
  • Modularity: Easy to add or modify specialized agents
  • Quality: Specialized agents often produce higher quality output in their domain

Disadvantages

  • Complexity: More moving parts to manage and maintain
  • Latency: Multiple agent calls increase overall response time
  • Cost: Each agent invocation uses tokens, increasing API costs
  • Coordination Overhead: Orchestrator must manage handoffs and context passing
  • Debugging: Harder to trace issues across multiple agent interactions

When to Use Sub-Agents

  • Complex tasks that benefit from specialized expertise
  • Tasks requiring distinct phases (research → plan → implement)
  • When you want to reuse agent definitions across projects
  • When quality matters more than speed

When to Use a Single Agent

  • Simple, straightforward tasks
  • When low latency is critical
  • When keeping costs low is a priority
  • For exploratory or ad-hoc queries
Pro Tip

Start with a single agent and only introduce sub-agents when you find that tasks naturally break into distinct phases or require different types of expertise. Premature decomposition can add unnecessary complexity.

Best Practices

  • Keep each agent's instructions clear and focused on a single responsibility
  • Document the expected input and output format for each agent
  • Pass context explicitly between agents rather than relying on implicit state
  • Test individual agents before integrating them into workflows
  • Monitor token usage and costs when using multi-agent workflows
  • Version control your agent definitions alongside your code

Step 5 of 5

Copilot Memory

15 minutes

Copilot Memory enables agents to persist and retrieve information across conversations, allowing for more personalized, context-aware interactions over time.

What is Copilot Memory?

By default, Copilot agents are stateless — each conversation starts fresh. Memory adds a persistence layer that lets agents store facts, preferences, and context that can be recalled in future sessions.

Types of Memory

  • Short-term memory — Context maintained within a single conversation (automatically handled via the conversation window)
  • Long-term memory — Persisted facts stored in an external store (database, vector store, etc.) and retrieved on demand
  • Episodic memory — Summaries of past conversations stored and retrieved to inform future interactions

Implementing Long-term Memory

A common approach uses a vector database to store and semantically search memories:

# Example: storing and retrieving memories (Python)
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

def store_memory(text, memory_store):
    # Embed the text and keep it alongside its vector
    embedding = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    ).data[0].embedding
    memory_store.append({"text": text, "embedding": embedding})

def retrieve_memory(query, memory_store, top_k=3):
    query_embedding = client.embeddings.create(
        model="text-embedding-3-small",
        input=query
    ).data[0].embedding
    # Score each memory by its dot product with the query vector
    scores = [
        np.dot(query_embedding, mem["embedding"])
        for mem in memory_store
    ]
    # Return the text of the top_k highest-scoring memories
    top_indices = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    return [memory_store[i]["text"] for i in top_indices]
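
The retrieval code above ranks memories by raw dot product, which equals cosine similarity only when the vectors have unit length. If you switch to an embedding source that does not normalize its vectors, a cosine-based ranking such as the sketch below may be safer. The function names are illustrative, and the store layout matches the `{"text", "embedding"}` dictionaries used above.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_memories(query_vec, memory_store, top_k=3):
    """Return the texts of the top_k memories by cosine similarity."""
    ranked = sorted(
        memory_store,
        key=lambda mem: cosine_similarity(query_vec, mem["embedding"]),
        reverse=True,
    )
    return [mem["text"] for mem in ranked[:top_k]]
```

With unnormalized vectors, a long vector pointing in a mediocre direction can outscore a short vector pointing in exactly the right one under dot product; cosine similarity removes that length bias.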

Memory in Copilot Agents

To integrate memory into a Copilot agent:

  1. At the start of each conversation, retrieve relevant memories based on the user's message
  2. Inject the retrieved memories into the system prompt as context
  3. At the end of the conversation, extract and store key facts from the exchange
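
The three steps above can be sketched without any external service. The example below swaps embeddings for simple word overlap so it runs offline; that substitution, the function names, and the prompt wording are all assumptions for illustration. Extracting the key facts from a conversation is left to the caller.

```python
def retrieve(query, store, top_k=3):
    """Step 1: rank stored facts by how many words they share with the query."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(fact.lower().split())), fact) for fact in store]
    relevant = [item for item in scored if item[0] > 0]
    relevant.sort(key=lambda item: item[0], reverse=True)
    return [fact for _, fact in relevant[:top_k]]


def build_system_prompt(base_prompt, memories):
    """Step 2: inject retrieved memories into the system prompt."""
    if not memories:
        return base_prompt
    bullet_list = "\n".join(f"- {memory}" for memory in memories)
    return f"{base_prompt}\n\nKnown facts about this user:\n{bullet_list}"


def store_facts(facts, store):
    """Step 3: persist new facts, skipping exact duplicates."""
    for fact in facts:
        if fact not in store:
            store.append(fact)
```

A plain list works as the store for experimenting; replacing `retrieve` with the embedding-based search shown earlier keeps the other two functions unchanged.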
Privacy Note

Always be transparent with users about what information is being stored and provide a way for them to review or delete their stored memories.

Hands-on Exercise

  1. Build a simple in-memory store for your agent using a Python list or dictionary
  2. Add a remember skill that stores a key-value fact provided by the user
  3. Add a recall skill that retrieves stored facts and injects them into the agent's response
