Herd Documentation
==


Herd is a powerful platform and SDK for automating your own browser, ten or millions of them. Similar to Puppeteer but with support for multiple devices and real-time events, and no infrastructure to setup.


Document: Automation Basics
URL: https://herd.garden/docs/automation-basics

# Automation Basics

Welcome to Monitoro Herd! This guide will walk you through creating your first browser automation step-by-step. We'll start with the basics and gradually build up to more complex examples, explaining each concept along the way.

## javascript

## JavaScript SDK

### Setting Up Your Environment

Before writing any code, you'll need to set up your JavaScript environment and install the Herd SDK:

1. Make sure you have Node.js installed (version 14 or higher recommended)
2. Create a new project directory
3. Install the SDK using npm:

Code (bash):
npm install @monitoro/herd

### Initializing the Client

The first step in any automation is to initialize the Herd client with your API credentials:

Code (javascript):
// Import the Herd client
import { HerdClient } from '@monitoro/herd';

// Initialize the client with your API URL and token
const client = new HerdClient('your-token');

// Always initialize the client before using it
await client.initialize();
Note: **Note:** Replace the token with your actual Herd API token from your dashboard.

### Connecting to a Device

After initializing the client, you need to connect to a device (browser) that will perform the automation:

Code (javascript):
// Get a list of available devices
const devices = await client.listDevices();

// Connect to the first available device
const device = devices[0];

console.log(`Connected to device: ${device.id}`);

This code retrieves all devices registered to your account and connects to the first one. In a production environment, you might want to select a specific device based on its properties or availability.

### Creating a Page and Navigating

Now that you're connected to a device, you can create a new browser page and navigate to a website:

Code (javascript):
// Create a new page in the browser
const page = await device.newPage();

// Navigate to a website
await page.goto('https://example.com');

console.log('Successfully navigated to example.com');

The `goto` method loads the specified URL and waits for the page to load. By default, it waits until the page's `load` event is fired, but you can customize this behavior with options.

### Extracting Basic Information

One of the most common automation tasks is extracting information from web pages. Here's how to extract basic elements:

Code (javascript):
// Extract content using CSS selectors
const content = await page.extract({
  title: 'h1',           // Extracts the main heading
  description: 'p',      // Extracts the first paragraph
  link: 'a'              // Extracts the first link text
});

// Display the extracted content
console.log('Extracted content:');
console.log(`Title: ${content.title}`);
console.log(`Description: ${content.description}`);
console.log(`Link: ${content.link}`);

The `extract` method uses CSS selectors to find elements on the page and extract their text content. This is a powerful way to scrape structured data from websites.

### Proper Resource Management

Always remember to close resources when you're done with them to prevent memory leaks:

Code (javascript):
// Close the page when done
await page.close();

// Close the client connection
await client.close();

### Putting It All Together

Here's a complete example that combines all the steps above into a single function:

Code (javascript):
import { HerdClient } from '@monitoro/herd';

async function runBasicAutomation() {
  const client = new HerdClient('your-token');
  
  try {
    // Initialize the client
    await client.initialize();
    console.log('Client initialized successfully');
    
    // Get the first available device
    const devices = await client.listDevices();
    if (devices.length === 0) {
      throw new Error('No devices available');
    }
    const device = devices[0];
    console.log(`Connected to device: ${device.id}`);
    
    // Create a new page
    const page = await device.newPage();
    console.log('New page created');
    
    // Navigate to a website
    console.log('Navigating to example.com...');
    await page.goto('https://example.com');
    console.log('Navigation complete');
    
    // Extract content
    console.log('Extracting content...');
    const content = await page.extract({
      title: 'h1',
      description: 'p',
      link: 'a'
    });
    
    // Display the extracted content
    console.log('\nExtracted content:');
    console.log(`Title: ${content.title}`);
    console.log(`Description: ${content.description}`);
    console.log(`Link: ${content.link}`);
    
  } catch (error) {
    console.error('Error during automation:', error);
  } finally {
    // Always close the client when done
    console.log('Closing client connection...');
    await client.close();
    console.log('Client connection closed');
  }
}

// Run the automation
runBasicAutomation();

### Interacting with Web Pages

Now let's explore how to interact with elements on a page. This includes clicking buttons, typing text, and handling forms.

#### Finding Elements

Before interacting with an element, you need to find it on the page:

Code (javascript):
// Find an element using a CSS selector
const searchBox = await page.$('input[name="q"]');

// Check if the element was found
if (searchBox) {
  console.log('Search box found');
} else {
  console.log('Search box not found');
}

The `$` method returns the first element that matches the CSS selector, or `null` if no element is found.

#### Typing Text

To type text into an input field:

Code (javascript):
// Type text into an input field
await page.type('input[name="q"]', 'Monitoro Herd automation');
console.log('Text entered into search box');

The `type` method finds the element using the CSS selector and simulates typing the specified text.

#### Clicking Elements

To click a button or link:

Code (javascript):
// Click a button
await page.click('input[type="submit"]');
console.log('Search button clicked');

By default, the `click` method just clicks the element. If you want to wait for navigation to complete after clicking:

Code (javascript):
// Click and wait for navigation
await page.click('input[type="submit"]', { 
  waitForNavigation: 'networkidle2' 
});
console.log('Search button clicked and navigation completed');

The `networkidle2` option waits until there are no more than 2 network connections for at least 500ms.

#### Waiting for Elements

Sometimes you need to wait for elements to appear on the page:

Code (javascript):
// Wait for an element to appear
await page.waitForSelector('#search');
console.log('Search results have loaded');

This is useful when dealing with dynamic content that loads after the initial page load.

#### Search Engine Example

Let's put these concepts together in a search engine example:

Code (javascript):
async function searchExample() {
  const client = new HerdClient('your-token');
  
  try {
    await client.initialize();
    const devices = await client.listDevices();
    const device = devices[0];
    const page = await device.newPage();
    
    // Navigate to a search engine
    console.log('Navigating to Google...');
    await page.goto('https://www.google.com');
    
    // Type in the search box
    console.log('Entering search query...');
    await page.type('input[name="q"]', 'Monitoro Herd automation');
    
    // Submit the search form and wait for results
    console.log('Submitting search...');
    await page.click('input[type="submit"]', { 
      waitForNavigation: 'networkidle2' 
    });
    
    // Wait for results to load completely
    console.log('Waiting for search results...');
    await page.waitForSelector('#search');
    
    // Extract search result titles
    console.log('Extracting search results...');
    const searchResults = await page.extract({
      titles: {
        _$r: '#search .g h3',  // _$r extracts multiple elements
        text: ':root'           // For each match, get its text
      }
    });
    
    // Display the search result titles
    console.log('\nSearch Results:');
    searchResults.titles.forEach((result, index) => {
      console.log(`${index + 1}. ${result.text}`);
    });
    
  } catch (error) {
    console.error('Error:', error);
  } finally {
    await client.close();
  }
}

## python

## Python SDK

### Setting Up Your Environment

Before writing any code, you'll need to set up your Python environment and install the Herd SDK:

1. Make sure you have Python 3.8+ installed
2. Create a virtual environment (recommended)
3. Install the SDK using pip:

Code (bash):
pip install monitoro-herd

### Initializing the Client

The first step in any automation is to initialize the Herd client with your API credentials:

Code (python):
# Import the Herd client
from monitoro_herd import HerdClient

# Initialize the client with your API URL and token
client = HerdClient('your-token')

# Always initialize the client before using it
client.initialize()
Note: **Note:** Replace the token with your actual Herd API token from your dashboard.

### Connecting to a Device

Next, connect to a device that will run your automation:

Code (python):
# Get available devices
devices = await client.list_devices()

# Connect to the first device
device = devices[0]

print(f"Connected to device: {device.id}")

### Creating a Page and Navigating

Now create a browser page and navigate to a website:

Code (python):
# Create a new page
page = await device.new_page()

# Navigate to a website
await page.goto("https://example.com")

print("Successfully navigated to example.com")

### Extracting Basic Information

Extract information from the page using CSS selectors:

Code (python):
# Extract basic information
data = await page.extract({
    "title": "h1",          # Main heading
    "description": "p",     # First paragraph
    "link": "a"             # First link text
})

# Display the extracted data
print("Extracted data:")
print(f"Title: {data['title']}")
print(f"Description: {data['description']}")
print(f"Link: {data['link']}")

### Resource Management

Always close resources when you're done:

Code (python):
# Close the page
await page.close()

# Close the client
await client.close()

### Complete Basic Example

Here's a complete example putting all these concepts together:

Code (python):
import asyncio
from monitoro_herd import HerdClient

async def basic_extraction():
    # Initialize the client
    client = HerdClient("your-token")
    
    try:
        # Initialize the connection
        await client.initialize()
        print("Client initialized successfully")
        
        # Get the first available device
        devices = await client.list_devices()
        if not devices:
            raise Exception("No devices available")
        device = devices[0]
        print(f"Connected to device: {device.id}")
        
        # Create a new page
        page = await device.new_page()
        print("New page created")
        
        # Navigate to a website
        print("Navigating to example.com...")
        await page.goto("https://example.com")
        print("Navigation complete")
        
        # Extract data using simple selectors
        print("Extracting content...")
        data = await page.extract({
            "title": "h1",
            "description": "p",
            "link": "a"
        })
        
        # Display the extracted data
        print("\nExtracted data:")
        print(f"Title: {data['title']}")
        print(f"Description: {data['description']}")
        print(f"Link: {data['link']}")
        
    except Exception as e:
        print(f"Error during automation: {e}")
    finally:
        # Always close resources
        print("Closing client connection...")
        await client.close()
        print("Client connection closed")

# Run the async function
asyncio.run(basic_extraction())

### Working with Lists and Structured Data

One of the most powerful features of Herd is the ability to extract structured data from lists of elements. This is perfect for scraping search results, product listings, or article collections.

#### The `_$r` Selector

To extract multiple elements that match a pattern, use the `_$r` (repeat) selector:

Code (python):
# Extract a list of items
data = await page.extract({
    "items": {
        "_$r": ".item",       # Find all elements with class "item"
        "name": ".item-name", # For each item, get the name
        "price": ".price"     # For each item, get the price
    }
})

# Access the extracted items
for item in data["items"]:
    print(f"Name: {item['name']}, Price: {item['price']}")

The `_$r` selector tells Herd to find all elements matching the selector and extract the specified properties for each one.

#### Extracting Attributes

Sometimes you need to extract an attribute rather than the text content:

Code (python):
# Extract links and their href attributes
data = await page.extract({
    "links": {
        "_$r": "a",              # Find all links
        "text": ":root",         # Get the link text
        "url": {
            "_$": ":root",       # Reference the same element
            "attribute": "href"  # Get its href attribute
        }
    }
})

# Display the links
for link in data["links"]:
    print(f"Link: {link['text']} -> {link['url']}")

#### Hacker News Example

Let's put these concepts together to scrape stories from Hacker News:

Code (python):
import asyncio
from monitoro_herd import HerdClient

async def scrape_hacker_news():
    client = HerdClient("your-token")
    
    try:
        await client.initialize()
        devices = await client.list_devices()
        device = devices[0]
        page = await device.new_page()
        
        # Navigate to Hacker News
        print("Navigating to Hacker News...")
        await page.goto("https://news.ycombinator.com")
        
        # Extract stories and their metadata
        print("Extracting stories...")
        data = await page.extract({
            # Extract the story elements
            "stories": {
                "_$r": ".athing",           # Each story row
                "title": ".titleline > a",  # Story title
                "site": ".sitestr",         # Source website
                "link": {
                    "_$": ".titleline > a", # Story link
                    "attribute": "href"     # Get the URL
                }
            },
            # Extract the metadata (points, author, etc.)
            "metadata": {
                "_$r": ".subline",          # Metadata rows
                "points": ".score",         # Points count
                "author": ".hnuser",        # Author username
                "time": ".age"              # Submission time
            }
        })
        
        # Combine stories with their metadata
        # (They're in separate lists but in the same order)
        combined_stories = list(zip(data["stories"], data["metadata"]))
        
        # Display the first 3 stories
        print(f"\nExtracted {len(combined_stories)} stories:")
        for i, (story, meta) in enumerate(combined_stories[:3]):
            print(f"\nStory {i+1}:")
            print(f"Title: {story['title']}")
            if "site" in story:
                print(f"Site: {story['site']}")
            print(f"Link: {story['link']}")
            if "points" in meta:
                print(f"Points: {meta['points']}")
            if "author" in meta:
                print(f"Author: {meta['author']}")
            if "time" in meta:
                print(f"Posted: {meta['time']}")
    
    finally:
        await page.close()
        await client.close()

# Run the function
asyncio.run(scrape_hacker_news())

## Tips for Successful Automation

1. **Start Simple**: Begin with basic extractions before moving to complex interactions
2. **Use Appropriate Selectors**: Learn CSS selectors to target elements precisely
3. **Handle Errors**: Always include try/catch (JavaScript) or try/except (Python) blocks
4. **Close Resources**: Always close pages and clients when done to avoid resource leaks
5. **Test Incrementally**: Build your automation step by step, testing each part
6. **Add Delays When Needed**: For dynamic content, use `waitForSelector` or similar methods
7. **Debug with Screenshots**: Take screenshots during automation to see what's happening

## Next Steps

Now that you've created your first automation, you can:

- Explore more complex selectors and extraction patterns
- Learn how to handle authentication and login flows
- Set up scheduled automations for regular data collection
- Integrate with your existing systems via APIs

==

Document: Data Extraction
URL: https://herd.garden/docs/data-extraction

# Data Extraction

Welcome to Monitoro Herd's powerful data extraction system! This guide will walk you through how to extract structured data from web pages using our intuitive selector system and transformation pipelines.

## Understanding Selectors

Herd provides a flexible and powerful way to extract data from web pages using a declarative JSON-based selector system.

### Basic Extraction

## javascript

The simplest form of extraction uses CSS selectors to target elements:

Code (javascript):
// Extract basic text content
const data = await page.extract({
  title: 'h1',           // Extracts the main heading
  description: 'p',      // Extracts the first paragraph
  link: 'a'              // Extracts the first link text
});

console.log(data.title);       // "Welcome to Our Website"
console.log(data.description); // "This is our homepage."

## python

The simplest form of extraction uses CSS selectors to target elements:

Code (python):
# Extract basic text content
data = await page.extract({
    "title": "h1",           # Extracts the main heading
    "description": "p",      # Extracts the first paragraph
    "link": "a"              # Extracts the first link text
})

print(data["title"])       # "Welcome to Our Website"
print(data["description"]) # "This is our homepage."

### Advanced Selector Syntax

## javascript

For more complex extraction needs, use the expanded object syntax:

Code (javascript):
const data = await page.extract({
  title: {
    _$: 'h1',            // CSS selector
    attribute: 'id'      // Extract the ID attribute instead of text
  },
  price: {
    _$: '.price',        // Target price element
    pipes: ['parseNumber'] // Apply transformation
  }
});

## python

For more complex extraction needs, use the expanded object syntax:

Code (python):
data = await page.extract({
    "title": {
        "_$": "h1",            # CSS selector
        "attribute": "id"      # Extract the ID attribute instead of text
    },
    "price": {
        "_$": ".price",        # Target price element
        "pipes": ["parseNumber"] # Apply transformation
    }
})

### Extracting Lists of Items

## javascript

To extract multiple elements that match a pattern, use the `_$r` (repeat) selector:

Code (javascript):
const data = await page.extract({
  items: {
    _$r: '.item',        // Find all elements with class "item"
    title: 'h2',         // For each item, get the title
    price: '.price',     // For each item, get the price
    date: 'time'         // For each item, get the date
  }
});

// Access the extracted items
data.items.forEach(item => {
  console.log(`${item.title}: ${item.price}, Posted: ${item.date}`);
});

## python

To extract multiple elements that match a pattern, use the `_$r` (repeat) selector:

Code (python):
data = await page.extract({
    "items": {
        "_$r": ".item",        # Find all elements with class "item"
        "title": "h2",         # For each item, get the title
        "price": ".price",     # For each item, get the price
        "date": "time"         # For each item, get the date
    }
})

# Access the extracted items
for item in data["items"]:
    print(f"{item['title']}: {item['price']}, Posted: {item['date']}")

### Nested Extraction

## javascript

You can nest selectors to extract hierarchical data:

Code (javascript):
const data = await page.extract({
  product: {
    name: '.product-name',
    details: {
      _$: '.product-details',
      specs: {
        _$r: '.spec-item',
        label: '.spec-label',
        value: '.spec-value'
      }
    }
  }
});

## python

You can nest selectors to extract hierarchical data:

Code (python):
data = await page.extract({
    "product": {
        "name": ".product-name",
        "details": {
            "_$": ".product-details",
            "specs": {
                "_$r": ".spec-item",
                "label": ".spec-label",
                "value": ".spec-value"
            }
        }
    }
})

## Special Selectors

Herd provides special selectors to handle various extraction scenarios:

### Root Selector (`:root`)

The `:root` selector refers to the current element in context:

## javascript

Code (javascript):
const data = await page.extract({
  items: {
    _$r: '.item',
    someElement: ':root',        // Extract text of the .item element itself
    classes: {
      _$: ':root',
      attribute: 'class'  // Extract class attribute of the same element
    }
  }
});

## python

Code (python):
data = await page.extract({
    "items": {
        "_$r": ".item",
        "someElement": ":root",        # Extract text of the .item element itself
        "classes": {
            "_$": ":root",
            "attribute": "class"  # Extract class attribute of the same element
        }
    }
})

### Property Extraction

You can extract JavaScript properties from elements:

## javascript

Code (javascript):
const data = await page.extract({
  dimensions: {
    _$: '.box',
    property: 'getBoundingClientRect'  // Get element dimensions
  },
  html: {
    _$: '.content',
    property: 'innerHTML'  // Get inner HTML
  }
});

## python

Code (python):
data = await page.extract({
    "dimensions": {
        "_$": ".box",
        "property": "getBoundingClientRect"  # Get element dimensions
    },
    "html": {
        "_$": ".content",
        "property": "innerHTML"  # Get inner HTML
    }
})

## Transformation Pipelines

Herd includes powerful transformation pipelines to process extracted data:

### Available Transformations

| Pipe | Description | Example Input | Example Output |
|------|-------------|--------------|----------------|
| `trim` | Removes whitespace from start/end | `"  Hello  "` | `"Hello"` |
| `toLowerCase` | Converts text to lowercase | `"HELLO"` | `"hello"` |
| `toUpperCase` | Converts text to uppercase | `"hello"` | `"HELLO"` |
| `parseNumber` | Extracts numbers from text | `"$1,2K.45"` | `1200.45` |
| `parseDate` | Converts text to date | `"2024-01-15"` | `"2024-01-15T00:00:00.000Z"` |
| `parseDateTime` | Converts text to datetime | `"2024-01-15T12:00:00Z"` | `"2024-01-15T12:00:00.000Z"` |

### Using Transformations

Apply transformations using the `pipes` property:

## javascript

Code (javascript):
const data = await page.extract({
  price: {
    _$: '.price',
    pipes: ['parseNumber']  // Convert "$1,234.56" to 1234.56
  },
  title: {
    _$: 'h1',
    pipes: ['trim', 'toLowerCase']  // Apply multiple transformations
  }
});

## python

Code (python):
data = await page.extract({
    "price": {
        "_$": ".price",
        "pipes": ["parseNumber"]  # Convert "$1,234.56" to 1234.56
    },
    "title": {
        "_$": "h1",
        "pipes": ["trim", "toLowerCase"]  # Apply multiple transformations
    }
})

### Handling Currency and Large Numbers

The `parseNumber` transformation handles various formats:

## javascript

Code (javascript):
const data = await page.extract({
  price1: {
    _$: '.price-1',  // Contains "$1,234.56"
    pipes: ['parseNumber']  // Result: 1234.56
  },
  price2: {
    _$: '.price-2',  // Contains "$1.5M"
    pipes: ['parseNumber']  // Result: 1500000
  },
  price3: {
    _$: '.price-3',  // Contains "1.5T€"
    pipes: ['parseNumber']  // Result: 1500000000000
  }
});

## python

Code (python):
data = await page.extract({
    "price1": {
        "_$": ".price-1",  # Contains "$1,234.56"
        "pipes": ["parseNumber"]  # Result: 1234.56
    },
    "price2": {
        "_$": ".price-2",  # Contains "$1.5M"
        "pipes": ["parseNumber"]  # Result: 1500000
    },
    "price3": {
        "_$": ".price-3",  # Contains "1.5T€"
        "pipes": ["parseNumber"]  # Result: 1500000000000
    }
})

## Real-World Examples

Let's look at some practical examples of data extraction:

### E-commerce Product Listing

Extract products from a search results page:

## javascript

Code (javascript):
const searchResults = await page.extract({
  products: {
    _$r: '[data-component-type="s-search-result"]',
    title: {
      _$: 'h2 .a-link-normal',
      pipes: ['trim']
    },
    price: {
      _$: '.a-price .a-offscreen',
      pipes: ['parseNumber']
    },
    rating: {
      _$: '.a-icon-star-small .a-icon-alt',
      pipes: ['trim']
    },
    reviews: {
      _$: '.a-size-base.s-underline-text',
      pipes: ['trim']
    }
  }
});

## python

Code (python):
searchResults = await page.extract({
    "products": {
        "_$r": '[data-component-type="s-search-result"]',
        "title": {
            "_$": "h2 .a-link-normal",
            "pipes": ["trim"]
        },
        "price": {
            "_$": ".a-price .a-offscreen",
            "pipes": ["parseNumber"]
        },
        "rating": {
            "_$": ".a-icon-star-small .a-icon-alt",
            "pipes": ["trim"]
        },
        "reviews": {
            "_$": ".a-size-base.s-underline-text",
            "pipes": ["trim"]
        }
    }
})

### News Article List

Extract articles from a news site:

## javascript

Code (javascript):
const articles = await page.extract({
  items: {
    _$r: '.item',
    title: {
      _$: 'h2',
      pipes: ['trim', 'toLowerCase']
    },
    price: {
      _$: '.price',
      pipes: ['parseNumber']
    },
    date: {
      _$: 'time',
      pipes: ['parseDate']
    }
  }
});

## python

Code (python):
articles = await page.extract({
    "items": {
        "_$r": ".item",
        "title": {
            "_$": "h2",
            "pipes": ["trim", "toLowerCase"]
        },
        "price": {
            "_$": ".price",
            "pipes": ["parseNumber"]
        },
        "date": {
            "_$": "time",
            "pipes": ["parseDate"]
        }
    }
})

## Advanced Techniques

### Handling Dynamic Content

For dynamic content that loads after the page is ready:

## javascript

Code (javascript):
// Wait for dynamic content to load
await page.waitForElement('#dynamic span');

// Then extract the content
const data = await page.extract({
  content: '#dynamic span'
});

## python

Code (python):
# Wait for dynamic content to load
await page.waitForElement('#dynamic span')

# Then extract the content
data = await page.extract({
    "content": "#dynamic span"
})

### Extracting Page Metadata

Extract information about the page itself:

## javascript

Code (javascript):
const pageInfo = await page.extract({
  title: 'title',
  metaDescription: 'meta[name="description"]',
  canonicalUrl: {
    _$: 'link[rel="canonical"]',
    attribute: 'href'
  }
});

## python

Code (python):
pageInfo = await page.extract({
    "title": "title",
    "metaDescription": 'meta[name="description"]',
    "canonicalUrl": {
        "_$": 'link[rel="canonical"]',
        "attribute": "href"
    }
})

## Tips for Effective Extraction

1. **Use Specific Selectors**: The more specific your CSS selectors, the more reliable your extraction
2. **Test Incrementally**: Build your extraction schema step by step, testing each part
3. **Handle Missing Data**: Always account for elements that might not exist on the page
4. **Apply Appropriate Transformations**: Use pipes to clean and format data as needed
5. **Combine with Interactions**: For complex sites, interact with the page before extraction

## Next Steps

Now that you understand Herd's data extraction system, you can:

- Create complex extraction schemas for any website
- Transform raw data into structured, usable formats
- Build powerful automations that collect and process web data

==

Document: Getting Started
URL: https://herd.garden/docs/getting-started

# Getting Started with Herd

This guide will help you get up and running with Herd quickly to run your first trail.

## What is Herd?

Herd connects AI Agents to websites using your own browser credentials. It enables you to:

- **Run Trails** - pre-built automations for specific websites and tasks
- **Extract data and interact with websites** using your logged-in browser sessions
- **Interact with web pages** through AI Agents like OpenAI's ChatGPT and Anthropic's Claude

## Quick Start

### 1. Install the Browser Extension

     Chrome

     Edge

     Brave

### 2. Register Your Browser

After installing the extension:

1. Click the Herd icon in your browser toolbar
2. Sign in with your Herd account (or create one)
3. Name your device and register it

![Browser Registration](https://herd.garden/register-device.png)

### 3. Install the Herd SDK

Install the Herd SDK using npm:

## npm

Code (bash):
npm install -g @monitoro/herd

## yarn

Code (bash):
yarn global add @monitoro/herd

## pnpm

Code (bash):
pnpm add -g @monitoro/herd

### 4. Run Your First Trail

The browser trail provides core functionality for navigating and extracting data from any website. Run this command to test it out:

Code (bash):
herd trail run @herd/browser -a markdown -p '{"url": "https://example.com"}'

That's it! Add it to your MCP config to use it in your AI agents like in this example. Note, you can add as many trails as you want to your MCP config:

Code (json):
{
    "mcpServers": {
        "browser": {
            "command": "herd",
            "args": [
                "trail",
                "server",
                "@herd/browser"
            ]
        }
    }
}

## For Developers

You can also automate your browser with the Herd SDK. Connect to it with your AI agents or code:

## javascript

Code (javascript):
// Connect to your Herd device
const client = new HerdClient('your-token');
await client.initialize();
const devices = await client.listDevices();
const device = devices[0];

// Create a new page and navigate
const page = await device.newPage();
await page.goto("https://example.com");

// Extract data using simple selectors
const data = await page.extract({
  title: "h1",
  description: "p",
  link: "a"
});

console.log("Extracted data:", data);

## python

Code (python):
from monitoro_herd import HerdClient

# Connect to your Herd device
client = HerdClient("your-token")
await client.initialize()
devices = await client.list_devices()
device = devices[0]

# Create a new page and navigate
page = await device.new_page()
await page.goto("https://example.com")

# Extract data using simple selectors
data = await page.extract({
  "title": "h1",
  "description": "p",
  "link": "a"
})

print("Extracted data:", data)

## What's Next?

Now that you've run your first trail, you can:

- [Explore available trails](/trails) - Browse pre-built trails for various websites
- [Learn about data extraction](/docs/data-extraction) - Extract structured data from web pages
- [Create your own trail](/docs/trails-automations) - Build and share your own custom trails

## Need Help?

If you encounter any issues during setup:

- Make sure your browser extension is correctly installed and you're signed in
- Check that your device is registered in the [device dashboard](/devices)
- Visit our [troubleshooting guide](/docs/troubleshooting) for common solutions

.browser-btn {
  display: inline-flex;
  align-items: center;
  padding: 0.2rem 1rem;
  background-color: #1f2937;
  color: white;
  border-radius: 0.375rem;
  font-size: 1.2rem;
  font-weight: 500;
  text-decoration: none;
  border: 1px solid rgba(255, 255, 255, 0.1);
}

.browser-btn:hover {
  background-color: #374151;
}

==

Document: Trails Automations
URL: https://herd.garden/docs/trails-automations

# Trails

Trails in the Herd platform are packaged automations that perform specific tasks such as extracting data or submitting forms. They make it easy to reuse automation logic across different projects and achieve high reliability through a solid and accessible testing process.

## javascript

Trails are fully supported in the JavaScript SDK.

## Creating a Trail

A trail defines the following components:

- **urls.ts**: Exports an array of URL definitions
- **selectors.ts**: Exports an array of selector configurations
- **actions.ts**: Exports action classes that implement the `TrailAction` interface

A trail at the minimum has the following structure:

Code:
google-search/
    urls.ts
    selectors.ts
    actions.ts
    package.json

The `package.json` file defines the trail and its dependencies:

Code (json):
{
    "name": "google-search",
    "description": "Search google for webpages.",
    "version": "1.0.0",
    "dependencies": {
        "@monitoro/herd": "latest"
    }
}

You should never run the .ts files manually. Instead, use the Herd CLI to run, test, debug, and publish trails. Make sure to use version control to manage your trails, as publishing submits a built version to the Herd registry, not the source code.

## Trail Implementation Guide

### Step 1: Set up the trail structure

Create a new directory for your trail with the following files:

Code:
my-trail/
    urls.ts
    selectors.ts
    actions.ts
    package.json

Add the basic package.json configuration:

Code (json):
{
    "name": "my-trail",
    "version": "1.0.0",
    "dependencies": {
        "@monitoro/herd": "latest"
    }
}

### Step 2: Define URLs

In `urls.ts`, export an array of URL definitions that your trail will interact with:

Code (typescript):
export default [{
    "id": "my-url",
    "template": "https://example.com/path?param1={param1}&param2={param2}",
    "description": "Description of what this URL represents",
    "examples": [
        { "param1": "value1", "param2": "value2" }
    ],
    "params": {
        "param1": {
            "type": "string",
            "required": true,
            "description": "Description of param1"
        },
        "param2": {
            "type": "string",
            "required": false,
            "default": "defaultValue",
            "description": "Description of param2"
        }
    }
}];

### Step 3: Define Selectors

In `selectors.ts`, export an array of selector configurations that define how to extract data from web pages:

Code (typescript):
export default [{
    "id": "my-selector",
    "value": {
        "dataKey": {
            "_$r": "#main-container",
            "title": { "_$": ".title" },
            "description": { "_$": ".description" },
            "link": { "_$": "a.link", "attribute": "href" }
        }
    },
    "description": "Selector for extracting specific data",
    "examples": [
        {
            "urlId": "my-url",
            "urlParams": { "param1": "value1", "param2": "value2" }
        }
    ]
}];

### Step 4: Implement Actions

In `actions.ts`, define one or more action classes that implement the `TrailAction` interface:

Code (typescript):
import type { Device, TrailAction, TrailActionManifest, TrailRunResources } from "@monitoro/herd";

export class MyTrailAction implements TrailAction {
    manifest: TrailActionManifest = {
        name: "my-action",
        description: "Description of what this action does",
        params: {
            param1: {
                type: "string",
                description: "Description of param1"
            },
            param2: {
                type: "number",
                description: "Description of param2",
                default: 10
            }
        },
        result: {
            type: "array",
            description: "Description of the result",
            items: {
                type: "object",
                properties: {
                    property1: { type: "string" },
                    property2: { type: "number" }
                }
            }
        },
        examples: [
            {
                "param1": "value1",
                "param2": 10
            }
        ]
    }

    async test(device: Device, params: Record, resources: TrailRunResources) {
        try {
            const result = await this.run(device, params, resources);
            // Validate result
            if (!result || result.length === 0) {
                return { status: "error", message: "No results found", result: [] };
            }
            return { status: "success", result };
        } catch (e) {
            return { status: "error", message: `Error: ${e}`, result: null };
        }
    }

    async run(device: Device, params: Record, resources: TrailRunResources) {
        const { param1, param2 } = params;
        const page = await device.newPage();
        
        try {
            // Navigate to URL using the URL template from urls.ts
            await page.goto(resources.url('my-url', { param1, param2 }));
            
            // Extract data using selectors from selectors.ts
            const extracted = await page.extract(resources.selector('my-selector'));
            
            // Process extracted data
            const results = (extracted as any)?.dataKey || [];
            
            // Return processed results
            return results;
        } finally {
            await page.close();
        }
    }
}

### Step 5: Testing and debugging

Use the Herd CLI to test your trail locally (make sure to define test cases as `examples` in the trail action manifest):

Code (bash):
herd trail test --action my-trail

You can also watch for changes in the trail and re-run tests:

Code (bash):
herd trail test --action my-trail --watch  # or herd trail test -a my-trail -w

And you can also test selectors only:

Code (bash):
herd trail test --selector my-selector

And watch for changes in the selectors:

Code (bash):
herd trail test --selector my-selector --watch  # or herd trail test -s my-selector -w

### Step 6: Publish your trail

Publishing is coming soon!

### Best Practices

1. **Error handling**: Implement robust error handling in your actions to handle network issues, missing elements, etc.
2. **Performance**: Minimize page loads and extract as much data as possible from each page.
3. **Maintainability**: Use descriptive names and add comments to make your trail easier to maintain.
4. **Testing**: Test your trail with different parameters to ensure it works in various scenarios.
5. **Versioning**: Increment your trail's version in package.json when making changes.

## python

Trails support for Python SDK is coming soon. Stay tuned for updates!

In the meantime, you can use the JavaScript SDK to create trails and then use the Herd CLI to publish them.

==