
Unlocking Agentic Data Science: A Step-by-Step Guide to marimo Pair Programming

2026-05-04 07:49:08

Overview

Modern data science workflows often involve repetitive tasks like data cleaning, exploration, and debugging. What if you could pair with an intelligent coding agent that understands your notebook and helps you iterate faster? That's exactly what marimo pair offers: an agentic layer within the marimo reactive notebook environment that assists with data wrangling, research, and code generation. This guide will walk you through adding agent skills to your data science pipeline using marimo pair. You'll learn how to set it up, start a pair session, and collaborate with the agent on real-world tasks such as wrangling messy datasets and performing exploratory analysis. By the end, you'll be able to accelerate your data science projects while maintaining full control over your code.

Source: realpython.com

Prerequisites

Step-by-Step Instructions

1. Installing marimo and Enabling marimo Pair

First, install marimo from PyPI. Open a terminal and run:

pip install marimo

After installation, start a new notebook:

marimo edit my_notebook.py

This launches the marimo editor in your default browser. To activate the pair agent, set your LLM API key as an environment variable in the same shell before launching marimo, so that the editor process inherits it. For example, if using OpenAI:

export OPENAI_API_KEY='your-api-key-here'
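Before opening the pair panel, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch using only the standard library; the helper name api_key_is_set is hypothetical and is not part of marimo.

```python
import os


def api_key_is_set(name: str = "OPENAI_API_KEY") -> bool:
    """Return True if the environment variable exists and is non-empty."""
    return bool(os.environ.get(name, "").strip())


# Report only whether the key is present; never print the key itself.
print("API key found" if api_key_is_set() else "API key missing")
```

If this reports a missing key, the variable was probably exported in a different shell session than the one that launched the editor.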

Then, inside the notebook, click on the Pair icon in the toolbar (or use the keyboard shortcut Cmd+Shift+P). This opens the pair panel where you can start a conversation with the agent.

2. Initializing Your Data Science Session

With the pair panel ready, load your dataset. Create a new cell and write:

import pandas as pd
df = pd.read_csv('sales_data.csv')
df.head()

marimo automatically runs the cell and displays the result. Now you can ask the agent for help. In the pair chat, type: "Is there any missing data in this dataframe?" The agent will inspect the notebook's state and respond with a code snippet you can insert.
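For a question like the one above, the suggested snippet would plausibly count missing values per column. Here is a minimal sketch of that check in plain pandas; the small DataFrame is a hypothetical stand-in for sales_data.csv, and its column names are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for sales_data.csv (column names are assumptions).
df = pd.DataFrame({
    "product_id": [1, 2, 3, 4],
    "revenue": [100.0, None, 250.0, None],
    "region": ["North", "South", None, "East"],
})

# Count of missing values in each column.
missing_counts = df.isna().sum()

# Fraction of rows that are missing in each column.
missing_share = df.isna().mean()

# Show only the columns that actually have gaps.
print(missing_counts[missing_counts > 0])
```

Looking at the share alongside the raw count makes it easier to judge whether a column is worth imputing or should be dropped.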

3. Invoking the marimo Pair Agent

The agent understands the current notebook context. For instance, you can ask: "Show me all rows where revenue is null". The agent will generate code like:

df[df['revenue'].isnull()]

Click Insert Code in the chat bubble to place it into a new cell. You can also request explanations: "Why are there 120 missing values in the date column?" The agent might suggest checking the data source or imputing with a forward fill.
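The forward fill mentioned above can be sketched as follows. The data is hypothetical, and whether forward filling is appropriate depends on how the rows are ordered, so treat this as one option rather than the recommended fix.

```python
import pandas as pd

# Hypothetical data with gaps in the date column.
df = pd.DataFrame({
    "date": ["2026-01-01", None, "2026-01-03", None],
    "revenue": [10.0, 20.0, 30.0, 40.0],
})

# Parse the strings; missing entries become NaT.
df["date"] = pd.to_datetime(df["date"])

# Count the gaps before filling so the change can be audited later.
n_missing = int(df["date"].isna().sum())

# Forward fill: each missing date takes the last known date above it.
df["date_filled"] = df["date"].ffill()
print(n_missing, "missing dates filled")
```

Writing the result to a new column keeps the original values available for comparison.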

4. Guided Data Wrangling with the Agent

Let's walk through a common wrangling task: cleaning a column and merging two datasets. Start by asking: "Help me clean the 'price' column – it has dollar signs and commas." The agent may propose:

# Strip dollar signs and thousands separators, then convert to float
df['price'] = df['price'].replace(r'[$,]', '', regex=True).astype(float)
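A conversion with astype(float) raises an error as soon as one entry is not numeric after stripping, for example "n/a". A slightly more defensive variant is sketched below; the helper name clean_price and the sample values are hypothetical.

```python
import pandas as pd


def clean_price(series: pd.Series) -> pd.Series:
    """Strip '$' and ',' from price strings and convert to float.

    Entries that still cannot be parsed become NaN instead of raising.
    """
    stripped = (
        series.astype(str)
        .str.replace(r"[$,]", "", regex=True)
        .str.strip()
    )
    return pd.to_numeric(stripped, errors="coerce")


# Hypothetical sample values, including one unparseable entry.
prices = pd.Series(["$1,200.50", "$15", "n/a", " $3,000 "])
cleaned = clean_price(prices)
print(cleaned)
```

Coercing to NaN trades a loud failure for a silent one, so it is worth counting the NaN values afterwards.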

Insert it and run. Next, load a second dataset and ask: "Merge this inventory data with the main sales table on product_id." The agent will generate the merge code. You can then request: "Create a summary table of total sales per region." The agent produces the appropriate groupby and aggregation.
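The merge and the per-region summary described above might look like the following sketch. Both tables are hypothetical, and the column names other than product_id (region, revenue, stock) are assumptions.

```python
import pandas as pd

# Hypothetical sales and inventory tables.
sales = pd.DataFrame({
    "product_id": [1, 2, 2, 3],
    "region": ["North", "North", "South", "South"],
    "revenue": [100.0, 50.0, 75.0, 20.0],
})
inventory = pd.DataFrame({
    "product_id": [1, 2],
    "stock": [10, 0],
})

# Left join keeps every sales row; validate raises an error if
# product_id is duplicated on the inventory side.
merged = sales.merge(
    inventory, on="product_id", how="left", validate="many_to_one"
)

# Sales rows with no matching inventory record.
unmatched = int(merged["stock"].isna().sum())

# Total revenue per region.
sales_by_region = merged.groupby("region", as_index=False)["revenue"].sum()
print(sales_by_region)
```

The validate argument is a cheap guard against a merge that silently duplicates sales rows and inflates the totals.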

5. Collaborative Research and Analysis

Beyond wrangling, marimo pair helps with exploratory research. For example, ask: "What are the top 3 factors affecting sales?" The agent cannot run statistical models alone, but it can suggest a correlation analysis or a simple linear regression. It writes the code, and you run it. The agent also helps interpret results, for example when you ask: "Explain this confusion matrix." This back‑and‑forth turns your notebook into a collaborative whiteboard.
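A correlation analysis of the kind mentioned above can be sketched in a few lines of pandas. The data and column names are hypothetical, and the ranking measures linear association with sales, not causal effect.

```python
import pandas as pd

# Hypothetical numeric features alongside a sales column.
df = pd.DataFrame({
    "sales": [10.0, 20.0, 30.0, 40.0, 50.0],
    "ad_spend": [1.0, 2.0, 3.0, 4.0, 5.0],
    "discount": [5.0, 4.0, 3.0, 2.0, 1.0],
    "store_id": [3.0, 1.0, 4.0, 1.0, 5.0],
})

# Pearson correlation of every numeric column with sales.
corr_with_sales = df.corr(numeric_only=True)["sales"].drop("sales")

# Rank by strength of association, ignoring sign.
ranked = corr_with_sales.abs().sort_values(ascending=False)
print(ranked.head(3))
```

A strong correlation is a starting point for a follow-up question to the agent, such as fitting a regression that controls for the other columns.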

Common Mistakes

Summary

marimo pair transforms your data science notebook into an interactive partner for wrangling, analysis, and research. By following the steps above – installing marimo, activating pair with an LLM, and collaborating on tasks – you can accelerate your workflow while maintaining code quality. With agentic pair programming, you gain a second pair of eyes that never sleeps.
