Once you’ve mastered basic parsers, you can combine them to parse complex nested structures. This guide covers composition patterns, recursive parsers, and real-world examples.

Parser Composition

Parsers are built by composing smaller parsers together. The key combinators are:
  • parser(function* () { ... }) - Sequence parsers with generator syntax
  • or() - Try alternatives
  • sepBy() - Parse lists with separators
  • between() - Parse content between delimiters

Email Parser

Here’s a real example from examples/email.ts that parses email addresses:
import { alphabet, char, digit, many1, or, parser } from 'parserator';

const email = parser(function* () {
  // Parse username (letters, digits, dots)
  const username = yield* many1(or(alphabet, digit, char('.'))).expect('username');
  
  yield* char('@').expect('@');
  
  // Parse domain name
  const domain = yield* many1(or(alphabet, digit))
    .map(chars => chars.join(''))
    .expect('domain name');
  
  yield* char('.').expect('.');
  
  // Parse top-level domain
  const tld = yield* many1(alphabet)
    .map(chars => chars.join(''))
    .expect('top-level domain (TLD)');

  return { username: username.join(''), domain: domain + '.' + tld };
});

email.parse('john.doe@example.com');
// ✓ { username: 'john.doe', domain: 'example.com' }
The .expect() method provides semantic error messages. Instead of “Expected ‘a’”, you get “Expected username”.
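
To make the idea of labeled errors concrete, here is a minimal sketch that does not use parserator at all. MiniParser, charP, and withLabel are hypothetical names invented for this illustration, and the sketch only assumes that a label replaces the low-level description of what was expected; parserator's real implementation of .expect() may differ.

```typescript
// A tiny stand-in for a parser. These are NOT parserator's real types.
type Result<T> =
  | { ok: true; value: T; rest: string }
  | { ok: false; expected: string };

type MiniParser<T> = (input: string) => Result<T>;

// Match one exact character; on failure, report the raw character.
const charP = (c: string): MiniParser<string> => input =>
  input.startsWith(c)
    ? { ok: true, value: c, rest: input.slice(c.length) }
    : { ok: false, expected: `'${c}'` };

// Hypothetical helper: keep successes as they are, but replace the
// low-level failure description with a human-readable label.
const withLabel = <T>(p: MiniParser<T>, label: string): MiniParser<T> => input => {
  const r = p(input);
  return r.ok ? r : { ok: false, expected: label };
};

const atSign = withLabel(charP('@'), 'at sign');

const lowLevel = charP('@')('x'); // { ok: false, expected: "'@'" }
const labeled = atSign('x');      // { ok: false, expected: 'at sign' }
```

The label describes what the grammar wanted at that point, which is usually more useful to the person reading the error than the specific character the parser tried.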

Phone Number Parser

From examples/phone-number.ts, here’s a parser for formatted phone numbers:
import { char, digit, many1, parser } from 'parserator';

const phoneNumber = parser(function* () {
  yield* char('(');
  const areaCode = yield* many1(digit).expect('area code');
  yield* char(')');
  yield* char(' ');
  const exchange = yield* many1(digit).expect('exchange');
  yield* char('-');
  const number = yield* many1(digit).expect('number');

  return `(${areaCode.join('')}) ${exchange.join('')}-${number.join('')}`;
});

phoneNumber.parse('(555) 123-4567');
// ✓ '(555) 123-4567'

Parsing Lists with sepBy

The sepBy combinator parses zero or more elements separated by a delimiter:
import { char, sepBy, digit, many1 } from 'parserator';

const number = many1(digit).map(d => parseInt(d.join(''), 10));
const comma = char(',');

const numberList = sepBy(number, comma);

numberList.parse('1,2,3,4,5'); // ✓ [1, 2, 3, 4, 5]
numberList.parse('');          // ✓ [] (empty is valid)
numberList.parse('42');        // ✓ [42] (single element)
Use sepBy1 when you need at least one element:
import { sepBy1 } from 'parserator';

const nonEmptyList = sepBy1(number, comma);

nonEmptyList.parse('1,2,3'); // ✓ [1, 2, 3]
nonEmptyList.parse('');      // ✗ fails (needs at least one)
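
The difference between the two combinators is easier to see in a hand-rolled version. The following is a sketch under stated assumptions, not parserator's implementation: MiniParser, numberP, commaP, sepByMini, and sepBy1Mini are hypothetical names, and a parser is modeled as a plain function that returns null on failure.

```typescript
// A tiny stand-in for a parser: returns null on failure. NOT parserator's types.
type MiniParser<T> = (input: string) => { value: T; rest: string } | null;

// One or more ASCII digits, converted to a number.
const numberP: MiniParser<number> = input => {
  let end = 0;
  while (end < input.length && input.charAt(end) >= '0' && input.charAt(end) <= '9') end++;
  return end > 0 ? { value: Number(input.slice(0, end)), rest: input.slice(end) } : null;
};

const commaP: MiniParser<string> = input =>
  input.startsWith(',') ? { value: ',', rest: input.slice(1) } : null;

// Zero or more elements: if the first element is missing, succeed with [].
function sepByMini<T>(element: MiniParser<T>, sep: MiniParser<unknown>): MiniParser<T[]> {
  return input => {
    const first = element(input);
    if (first === null) return { value: [], rest: input };
    const values = [first.value];
    let rest = first.rest;
    for (;;) {
      const s = sep(rest);
      if (s === null) break;
      const next = element(s.rest);
      if (next === null) break; // in this sketch a trailing separator is left unconsumed
      values.push(next.value);
      rest = next.rest;
    }
    return { value: values, rest };
  };
}

// At least one element: same loop, but an empty result counts as a failure.
function sepBy1Mini<T>(element: MiniParser<T>, sep: MiniParser<unknown>): MiniParser<T[]> {
  return input => {
    const r = sepByMini(element, sep)(input);
    return r !== null && r.value.length > 0 ? r : null;
  };
}

const numberListMini = sepByMini(numberP, commaP);
const nonEmptyListMini = sepBy1Mini(numberP, commaP);
```

In this sketch numberListMini('') succeeds with an empty array, while nonEmptyListMini('') returns null. This mirrors the behavior described above for sepBy and sepBy1.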

Using between for Delimiters

The between combinator parses content between opening and closing delimiters:
import { alphabet, between, char, many, sepBy } from 'parserator';
// Reuses the `number` parser from the sepBy example above.

const quoted = between(
  char('"'),
  char('"'),
  many(alphabet).map(chars => chars.join(''))
);

quoted.parse('"hello"'); // ✓ 'hello'

const bracketed = between(
  char('['),
  char(']'),
  sepBy(number, char(','))
);

bracketed.parse('[1,2,3]'); // ✓ [1, 2, 3]
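
Conceptually, between runs three parsers in sequence and keeps only the middle result. The sketch below illustrates that idea with plain functions. MiniParser, charP, lettersP, and betweenMini are hypothetical names for this illustration and are not part of parserator.

```typescript
// A tiny stand-in for a parser: returns null on failure. NOT parserator's types.
type MiniParser<T> = (input: string) => { value: T; rest: string } | null;

const charP = (c: string): MiniParser<string> => input =>
  input.startsWith(c) ? { value: c, rest: input.slice(c.length) } : null;

// Zero or more ASCII letters, joined into one string (never fails).
const lettersP: MiniParser<string> = input => {
  let end = 0;
  while (end < input.length && /[a-zA-Z]/.test(input.charAt(end))) end++;
  return { value: input.slice(0, end), rest: input.slice(end) };
};

// Run open, then content, then close; discard both delimiters.
function betweenMini<T>(
  open: MiniParser<unknown>,
  close: MiniParser<unknown>,
  content: MiniParser<T>
): MiniParser<T> {
  return input => {
    const o = open(input);
    if (o === null) return null;
    const body = content(o.rest);
    if (body === null) return null;
    const c = close(body.rest);
    if (c === null) return null; // a missing closing delimiter fails the whole parser
    return { value: body.value, rest: c.rest };
  };
}

const quotedMini = betweenMini(charP('"'), charP('"'), lettersP);

const closed = quotedMini('"hello"');  // { value: 'hello', rest: '' }
const unclosed = quotedMini('"hello'); // null
```

The argument order (open, close, content) follows the between calls shown above.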

Recursive Parsers with Parser.lazy()

To parse nested structures like JSON, you need recursive parsers. Use Parser.lazy() to define parsers that reference themselves:
import { Parser, or, string, char, regex, sepBy, between, parser } from 'parserator';

// Forward declaration - parser defined later
const jsonValue: Parser<any> = Parser.lazy(() =>
  or(jsonNull, or(jsonBool, or(jsonNumber, or(jsonString, or(jsonArray, jsonObject)))))
);

const jsonNull = string('null').map(() => null);
const jsonBool = or(
  string('true').map(() => true),
  string('false').map(() => false)
);
const jsonNumber = regex(/-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?/)
  .map(Number);
const jsonString = /* ... string parser ... */;

// Array can contain any JSON value (including nested arrays)
const jsonArray: Parser<any[]> = between(
  char('['),
  char(']'),
  sepBy(jsonValue, char(','))
);

// Object can contain any JSON value
const jsonObject: Parser<Record<string, any>> = parser(function* () {
  yield* char('{');
  const pairs = yield* sepBy(
    parser(function* () {
      const key = yield* jsonString;
      yield* char(':');
      const value = yield* jsonValue; // Recursive!
      return [key, value] as const;
    }),
    char(',')
  );
  yield* char('}');
  return Object.fromEntries(pairs);
});
This example is simplified from examples/json-parser.ts.
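Always use Parser.lazy() for recursive parsers! Without it, a parser that references a const declared later in the file is evaluated before that const is initialized, and JavaScript throws a ReferenceError such as “Cannot access 'jsonArray' before initialization”.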
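
The reason a lazy wrapper helps can be shown without the library. In this sketch, lazyMini is a hypothetical stand-in for Parser.lazy(), assuming only that it defers evaluation of its callback until the parser first runs. MiniParser, valueP, and nestedP are likewise invented for the illustration.

```typescript
// A tiny stand-in for a parser: returns null on failure. NOT parserator's types.
type MiniParser<T> = (input: string) => { value: T; rest: string } | null;

// Hypothetical stand-in for Parser.lazy: build the real parser on first use.
function lazyMini<T>(thunk: () => MiniParser<T>): MiniParser<T> {
  let cached: MiniParser<T> | undefined;
  return input => {
    if (cached === undefined) cached = thunk();
    return cached(input);
  };
}

// `nestedP` is mentioned before its declaration, but only inside the thunk.
// The thunk runs at parse time, after every const below has been initialized.
const valueP: MiniParser<number> = lazyMini(() => nestedP);

// Counts bracket nesting depth: '' gives 0, '[]' gives 1, '[[]]' gives 2.
const nestedP: MiniParser<number> = input => {
  if (!input.startsWith('[')) return { value: 0, rest: input };
  const inner = valueP(input.slice(1)); // recursion goes through the lazy wrapper
  if (inner === null || !inner.rest.startsWith(']')) return null;
  return { value: inner.value + 1, rest: inner.rest.slice(1) };
};

const depth = valueP('[[[]]]');    // { value: 3, rest: '' }
const unbalanced = valueP('[[]');  // null
```

Writing `const valueP = nestedP` directly would read nestedP while it is still uninitialized. The arrow function postpones that read until the first call.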

Whitespace Handling in Complex Parsers

Real-world formats often have flexible whitespace. Use the token pattern:
import { between, char, Parser, sepBy, skipSpaces } from 'parserator';
// Reuses `jsonValue` from the recursive example above.

// Wrap any parser to skip leading whitespace
function token<T>(p: Parser<T>): Parser<T> {
  return skipSpaces.then(p);
}

// Now use token() to make parsers whitespace-insensitive
const jsonArray = between(
  token(char('[')),
  token(char(']')),
  sepBy(token(jsonValue), token(char(',')))
);

// This now handles arbitrary whitespace:
jsonArray.parse('[  1  ,  2  ,  3  ]'); // ✓ [1, 2, 3]
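
The token pattern itself is small enough to sketch with plain functions. MiniParser, charP, and tokenMini are hypothetical names for this illustration. The sketch assumes only that a token skips leading whitespace before running the wrapped parser, and it is not how parserator implements skipSpaces.

```typescript
// A tiny stand-in for a parser: returns null on failure. NOT parserator's types.
type MiniParser<T> = (input: string) => { value: T; rest: string } | null;

const charP = (c: string): MiniParser<string> => input =>
  input.startsWith(c) ? { value: c, rest: input.slice(c.length) } : null;

// Hypothetical token helper: strip leading whitespace, then run the wrapped parser.
const tokenMini = <T>(p: MiniParser<T>): MiniParser<T> => input =>
  p(input.replace(/^\s+/, ''));

const openBracket = tokenMini(charP('['));

const raw = charP('[')('   [1]');    // null: the raw parser sees the spaces first
const skipped = openBracket('   [1]'); // { value: '[', rest: '1]' }
```

Because this helper only skips leading whitespace, any whitespace after the final token stays in the remaining input. In this sketch the top-level parser would have to consume it separately.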

Real Example: JSON Parser

Here’s the complete structure from examples/json-parser.ts:
import { parser, char, string, regex, or, sepBy, between, Parser } from 'parserator';

const whitespace = regex(/\s*/);
function token<T>(p: Parser<T>): Parser<T> {
  return p.trimLeft(whitespace);
}

const jsonNull = string('null').map(() => null);
const jsonTrue = string('true').map(() => true);
const jsonFalse = string('false').map(() => false);
const jsonBool = or(jsonTrue, jsonFalse);

const jsonNumber = regex(/-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?/).map(Number);

const jsonString = parser(function* () {
  yield* char('"');
  const chars: string[] = [];
  
  while (true) {
    const next = yield* or(
      string('\\"').map(() => '"'),
      string('\\\\').map(() => '\\'),
      string('\\/').map(() => '/'),
      string('\\b').map(() => '\b'),
      string('\\f').map(() => '\f'),
      string('\\n').map(() => '\n'),
      string('\\r').map(() => '\r'),
      string('\\t').map(() => '\t'),
      regex(/\\u[0-9a-fA-F]{4}/).map(s => String.fromCharCode(parseInt(s.slice(2), 16))),
      regex(/[^"\\]+/),
      char('"').map(() => null)
    );
    
    if (next === null) break;
    chars.push(next);
  }
  
  return chars.join('');
});

const jsonValue: Parser<any> = Parser.lazy(() =>
  or(jsonNull, jsonBool, jsonNumber, jsonString, jsonArray, jsonObject)
);

const jsonArray: Parser<any[]> = between(
  token(char('[')),
  token(char(']')),
  sepBy(token(jsonValue), token(char(',')))
);

const jsonObject: Parser<Record<string, any>> = parser(function* () {
  yield* token(char('{'));
  
  const pairs = yield* sepBy(
    parser(function* () {
      const key = yield* token(jsonString);
      yield* token(char(':'));
      const value = yield* token(jsonValue);
      return [key, value] as const;
    }),
    token(char(','))
  );
  
  yield* token(char('}'));
  
  return Object.fromEntries(pairs);
});

export const json = token(jsonValue);

Key Patterns

  1. Use sepBy for lists. sepBy(element, separator) handles comma-separated lists, space-separated tokens, etc.
  2. Use between for brackets. between(open, close, content) parses content inside delimiters like (), [], {}.
  3. Use Parser.lazy() for recursion. Wrap recursive parser references in Parser.lazy(() => ...) to avoid initialization errors.
  4. Create token helpers. Make a token() helper to handle whitespace consistently across your parser.

Next Steps