Step 5: Parsing Positions

Extract location information from parsed results for error reporting, source mapping, and debugging.

Parse Results and Positions

Every successful parse returns a ParseResult<T> containing the parsed value (value) and the offsets where the match starts and ends (start and end).

Position information is always available, but you typically work with simple types like Parser<Int> until you need the positions.

Simple Transformations with map

The map combinator works with just the value, keeping types simple:

import io.github.mirrgieriana.xarpeg.*
import io.github.mirrgieriana.xarpeg.parsers.*

val number = (+Regex("[0-9]+")).value map { it.toInt() } named "number"

fun main() {
    val result = number.parseAll("42").getOrThrow()
    check(result == 42)  // Just the value, no position info
}

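Both map and mapEx (introduced in the next section) accept blocks that return a nullable value. Returning null from the block makes the parse fail at that point, which lets you reject input based on the parsed value rather than its syntax: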

import io.github.mirrgieriana.xarpeg.*
import io.github.mirrgieriana.xarpeg.parsers.*

fun main() {
    val positiveNumber = (+Regex("[0-9]+")).value map {
        val n = it.toInt()
        if (n > 0) n else null  // Reject zero
    } named "positive number"

    val fallback = (+Regex("[0-9]+")).value map { it.toInt() } named "number"
    val parser = positiveNumber + fallback

    check(parser.parseAll("42").getOrThrow() == 42)  // positiveNumber accepts 42
    check(parser.parseAll("0").getOrThrow() == 0)    // positiveNumber rejects 0, so the choice falls back to number
}

Accessing Positions with mapEx

Use mapEx when you need position information. It receives the ParseContext and full ParseResult:

import io.github.mirrgieriana.xarpeg.*
import io.github.mirrgieriana.xarpeg.parsers.*

val identifier = (+Regex("[a-zA-Z][a-zA-Z0-9_]*")).value named "identifier"

val identifierWithPosition = identifier mapEx { ctx, result ->
    "${result.value}@${result.start}-${result.end}"
}

fun main() {
    val result = identifierWithPosition.parseAll("hello").getOrThrow()
    check(result == "hello@0-5")  // Includes position info
}

Note: +Regex(...) returns Parser<MatchResult>. Use .value to get Parser<String>, so result.value is already a String.

Getting the Full ParseResult

When you need the complete ParseResult object (including value, start, and end positions), use the .result extension:

import io.github.mirrgieriana.xarpeg.*
import io.github.mirrgieriana.xarpeg.parsers.*

val word = +"hello"
val wordWithResult = word.result

fun main() {
    val result = wordWithResult.parseAll("hello").getOrThrow()
    check(result.value == "hello")
    check(result.start == 0)
    check(result.end == 5)
}

The .result extension transforms Parser<T> into Parser<ParseResult<T>>, giving you direct access to all position information without needing to use mapEx.

Extracting Matched Text

Get the original matched substring using the text() extension:

import io.github.mirrgieriana.xarpeg.*
import io.github.mirrgieriana.xarpeg.parsers.*

val number = (+Regex("[0-9]+")).value named "number"

val numberWithText = number mapEx { ctx, result ->
    val matched = result.text(ctx)
    val value = matched.toInt()
    "Parsed '$matched' as $value"
}

fun main() {
    val result = numberWithText.parseAll("123").getOrThrow()
    check(result == "Parsed '123' as 123")  // Matched text extracted
}

Calculating Line and Column Numbers

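The start and end positions are character offsets into the source. To report a line and column instead, count the newlines before the match start and measure the distance from the last one: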

import io.github.mirrgieriana.xarpeg.*
import io.github.mirrgieriana.xarpeg.parsers.*

data class Located<T>(val value: T, val line: Int, val column: Int)

fun <T : Any> Parser<T>.withLocation(): Parser<Located<T>> = this mapEx { ctx, result ->
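    // Everything before the match start; its newlines determine the line number
    val text = ctx.src.substring(0, result.start)
    val line = text.count { it == '\n' } + 1
    // Characters since the last newline (or since the start of input), 1-based
    val column = text.length - (text.lastIndexOf('\n') + 1) + 1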
    Located(result.value, line, column)
}

val keyword = (+Regex("[a-z]+")).value named "keyword"
val keywordWithLocation = keyword.withLocation()

fun main() {
    val result = keywordWithLocation.parseAll("hello").getOrThrow()
    check(result.value == "hello" && result.line == 1 && result.column == 1)
}
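The helper above rescans the input prefix on every call, which is fine for occasional lookups. If many positions need converting (for example, one per token), a precomputed table of line-start offsets may be cheaper. The following is a minimal sketch in plain Kotlin; LineIndex and locate are hypothetical names, not part of xarpeg, and it assumes '\n' line endings and 0-based offsets like result.start.

```kotlin
// Hypothetical helper, not part of xarpeg: maps 0-based offsets to 1-based (line, column).
class LineIndex(src: String) {
    // Offset at which each line begins; line 1 always begins at offset 0.
    private val lineStarts: List<Int> = buildList {
        add(0)
        src.forEachIndexed { i, c -> if (c == '\n') add(i + 1) }
    }

    fun locate(offset: Int): Pair<Int, Int> {
        val found = lineStarts.binarySearch(offset)
        // When the offset is not itself a line start, binarySearch returns
        // -(insertionPoint) - 1; the containing line is the one just before that point.
        val lineIdx = if (found >= 0) found else -found - 2
        return (lineIdx + 1) to (offset - lineStarts[lineIdx] + 1)
    }
}

fun main() {
    val index = LineIndex("ab\ncd\n\nef")
    check(index.locate(0) == (1 to 1))  // 'a'
    check(index.locate(4) == (2 to 2))  // 'd'
    check(index.locate(8) == (4 to 2))  // 'f'
}
```

Inside a mapEx block, such an index could be built once from ctx.src and reused, though whether that is worthwhile depends on input size and how often positions are converted.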

Multi-line Position Tracking

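The same prefix-based arithmetic holds when the input spans several lines: each '\n' before the match start increments the line, and the column restarts after the last one. Here the helper is packaged to return a Token built from the matched text: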

import io.github.mirrgieriana.xarpeg.*
import io.github.mirrgieriana.xarpeg.parsers.*

data class Token(val value: String, val line: Int, val col: Int)

fun <T : Any> Parser<T>.withPos(): Parser<Token> = this mapEx { ctx, result ->
    val prefix = ctx.src.substring(0, result.start)
    val line = prefix.count { it == '\n' } + 1
    val col = prefix.length - (prefix.lastIndexOf('\n') + 1) + 1
    Token(result.text(ctx), line, col)
}

fun main() {
    val word = (+Regex("[a-z]+")).value named "word"
    val wordWithPos = word.withPos()

    // Single-line input: the match starts at offset 0, so it is reported at line 1, column 1
    val result = wordWithPos.parseAll("hello").getOrThrow()
    check(result == Token("hello", 1, 1))
}
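To see the arithmetic on input that actually contains newlines, the formula can be exercised on its own, without a parser. This is a sketch in plain Kotlin; lineAndColumn is a hypothetical name, not an xarpeg function, and the offsets passed to it stand in for result.start.

```kotlin
// Hypothetical helper, not part of xarpeg: same formula as withPos, applied to a raw offset.
fun lineAndColumn(src: String, offset: Int): Pair<Int, Int> {
    val prefix = src.substring(0, offset)
    val line = prefix.count { it == '\n' } + 1
    val column = prefix.length - (prefix.lastIndexOf('\n') + 1) + 1
    return line to column
}

fun main() {
    val src = "let x\nlet y\n  z"
    check(lineAndColumn(src, 0) == (1 to 1))   // 'l' at the very start
    check(lineAndColumn(src, 6) == (2 to 1))   // first character after the first newline
    check(lineAndColumn(src, 10) == (2 to 5))  // 'y'
    check(lineAndColumn(src, 14) == (3 to 3))  // 'z', after two leading spaces
}
```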

Practical Example: Error Messages

Combine position tracking with error context for helpful messages:

import io.github.mirrgieriana.xarpeg.*
import io.github.mirrgieriana.xarpeg.parsers.*

fun main() {
    val parser = (+Regex("[0-9]+")).value map { it.toInt() } named "number"

    fun parseWithErrors(input: String): Result<Int> {
        val result = parser.parseAll(input)
        val exception = result.exceptionOrNull() as? ParseException

        return if (exception != null) {
            val pos = exception.context.errorPosition ?: 0
            val prefix = input.substring(0, pos)
            val line = prefix.count { it == '\n' } + 1
            val column = prefix.length - (prefix.lastIndexOf('\n') + 1) + 1
            val expected = exception.context.suggestedParsers.orEmpty().mapNotNull { it.name }

            Result.failure(Exception(
                "Syntax error at line $line, column $column. Expected: ${expected.joinToString()}"
            ))
        } else {
            result
        }
    }

    val result = parseWithErrors("abc")
    check(result.isFailure)  // Parsing fails as expected
}
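A line and column are often easier to act on when the offending line is shown with a marker under the error position. The sketch below is plain Kotlin and does not call xarpeg; renderErrorSnippet is a hypothetical helper, and its offset parameter is assumed to be a 0-based position such as the pos value computed in the example above.

```kotlin
// Hypothetical helper, not part of xarpeg: formats a message plus the source line and a caret.
fun renderErrorSnippet(src: String, offset: Int, message: String): String {
    // Clamp so that an out-of-range offset still yields a usable snippet
    val pos = offset.coerceIn(0, src.length)
    val prefix = src.substring(0, pos)
    val line = prefix.count { it == '\n' } + 1
    val lineStart = prefix.lastIndexOf('\n') + 1
    val column = pos - lineStart + 1
    // The line ends at the next newline, or at the end of input if there is none
    val lineEnd = src.indexOf('\n', pos).let { if (it == -1) src.length else it }
    val lineText = src.substring(lineStart, lineEnd)
    val caret = " ".repeat(column - 1) + "^"
    return "$message at line $line, column $column\n$lineText\n$caret"
}

fun main() {
    val src = "x = 1\ny = ?\nz = 3"
    val rendered = renderErrorSnippet(src, 10, "Syntax error")
    check(rendered == "Syntax error at line 2, column 5\ny = ?\n    ^")
    println(rendered)
}
```

The caret is aligned by counting characters, so lines containing tabs or wide characters may need extra handling in a real tool.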

Best Practices

Use map by default - Keep types simple when positions aren’t needed (example: val simple = (+Regex("[0-9]+")).value map { it.toInt() } named "number").

Use mapEx when needed - Extract positions only where required.

Return null to reject - Use nullable return in map or mapEx to reject parsed values based on semantic conditions, combined with choice (+) for fallback alternatives.

Isolate position logic - Create reusable helpers like fun <T : Any> Parser<T>.withLocation(): Parser<Located<T>> for position tracking.

Remember: positions are always there - You don’t need to change your parser’s return type throughout your grammar. Extract position information at boundaries where you need it.

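Key Takeaways

Every successful parse carries value, start, and end in its ParseResult.

map transforms only the value; mapEx also receives the ParseContext and the full ParseResult.

The .result extension exposes the whole ParseResult, and text(ctx) recovers the matched substring.

Line and column numbers are derived from offsets by counting newlines in the source prefix.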

Next Steps

Discover how PEG parsers naturally handle template strings with embedded expressions.

Step 6: Template Strings