resurgo

v0.4.2
Published: Mar 15, 2026 License: MIT Imports: 8 Imported by: 0

README

ResurGo

ResurGo is a Go library for static function recovery from stripped executable binaries.

Features

  • Disassembly-based detection: function entry recovery via three complementary signals - prologue pattern matching, call-site analysis, and alignment boundary analysis
  • DWARF CFI-based detection: high-confidence function entries extracted from .eh_frame FDE records - compiler-written, survives strip --strip-all
  • False positive filtering: discards intra-function jump targets and linker-generated PLT stubs from the candidate set
  • Format-agnostic core: works on raw machine code bytes from any binary format
  • ELF convenience wrapper: built-in support for parsing ELF executables and inferring architecture

Supported architectures

  • x86_64 (AMD64)
  • ARM64 (AArch64)

Detection strategies

Disassembly-based

ResurGo disassembles the .text section, evaluates three independent signals in parallel, and then merges the results:

  • Prologue matching - recognizes architecture-specific function entry instruction sequences. See docs/PROLOGUES.md.
  • Call-site analysis - extracts CALL and JMP targets; functions called or jumped to from many sites carry higher confidence. See docs/CALLSITES.md.
  • Alignment boundary analysis - recovers pure-leaf and never-called functions by detecting the alignment gap compilers emit between adjacent functions. See docs/BOUNDARY.md.

Candidates from all three signals are merged and scored. ELF-specific false-positive filters (PLT ranges, intra-function jump anchor check) are applied before the final result is returned.
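
The merge step can be pictured with a minimal sketch. It is not the library's implementation: the candidate type and mergeByAddr below are hypothetical stand-ins, and scoring is left out. The sketch only shows per-signal lists being deduplicated by address while remembering which signals reported each address.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical, simplified stand-in for a function candidate.
type candidate struct {
	Addr    uint64
	Signals []string // names of the signals that reported this address
}

// mergeByAddr combines per-signal lists into one list, deduplicated by
// address and sorted in ascending address order.
func mergeByAddr(lists ...[]candidate) []candidate {
	byAddr := make(map[uint64]*candidate)
	for _, list := range lists {
		for _, c := range list {
			m, ok := byAddr[c.Addr]
			if !ok {
				m = &candidate{Addr: c.Addr}
				byAddr[c.Addr] = m
			}
			m.Signals = append(m.Signals, c.Signals...)
		}
	}
	out := make([]candidate, 0, len(byAddr))
	for _, m := range byAddr {
		out = append(out, *m)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Addr < out[j].Addr })
	return out
}

func main() {
	prologues := []candidate{{Addr: 0x401000, Signals: []string{"prologue"}}}
	calls := []candidate{
		{Addr: 0x401200, Signals: []string{"call-target"}},
		{Addr: 0x401000, Signals: []string{"call-target"}},
	}
	for _, c := range mergeByAddr(prologues, calls) {
		fmt.Printf("0x%x: %v\n", c.Addr, c.Signals)
	}
}
```

An address reported by more than one signal ends up as a single entry carrying every signal name, which is the information a scoring step would need.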

DWARF CFI-based

When the binary contains an .eh_frame section, ResurGo parses its FDE (Frame Description Entry) records and uses their initial_location fields as a high-confidence set of function entries. These addresses were written by the compiler, not inferred by heuristics, and they typically remain present in stripped ELF binaries even after .symtab and the .debug_* sections have been removed.

The EhFrameDetector emits these addresses as candidates. The EhFrameFilter then retains only candidates confirmed by an FDE, dropping disassembly noise. See docs/CFI.md.

Usage

Detect functions from a stripped ELF

package main

import (
    "debug/elf"
    "fmt"
    "log"

    "github.com/maxgio92/resurgo"
)

func main() {
    f, err := elf.Open("./myapp")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    candidates, err := resurgo.DetectFunctionsFromELF(f)
    if err != nil {
        log.Fatal(err)
    }

    for _, c := range candidates {
        fmt.Printf("0x%x: %s (confidence: %s)\n",
            c.Address, c.DetectionType, c.Confidence)
    }
}
Example output
0x401000: prologue-callsite (confidence: high)
0x401100: prologue-only (confidence: medium)
0x401200: call-target (confidence: medium)
0x401300: aligned-entry (confidence: low)
0x401400: cfi (confidence: high)

Raw bytes (format-agnostic)

For non-ELF binaries or raw memory dumps, use the lower-level primitives directly:

prologues, err := resurgo.DetectPrologues(data, 0x400000, resurgo.ArchAMD64)
edges, err := resurgo.DetectCallSites(data, 0x400000, resurgo.ArchAMD64)

API Reference

// DetectFunctionsFromELF runs all detectors then all filters against f and
// returns a deduplicated, sorted slice of function candidates.
// Architecture is inferred from the ELF header.
// opts may include WithDetectors or WithFilters to replace either pipeline.
func DetectFunctionsFromELF(f *elf.File, opts ...Option) ([]FunctionCandidate, error)

// WithDetectors replaces the default detector pipeline.
// Detectors run in order; results are merged before filtering.
func WithDetectors(detectors ...CandidateDetector) Option

// WithFilters replaces the default filter pipeline.
// Filters run in order. Pass no arguments to disable all filters.
func WithFilters(filters ...CandidateFilter) Option

// Built-in detectors, enabled by default in the order listed:
var DisasmDetector   CandidateDetector  // prologue, call-site, and alignment-boundary detection
var EhFrameDetector  CandidateDetector  // emits candidates from .eh_frame FDE records

// Built-in filters, enabled by default in the order listed:
var CETFilter     CandidateFilter  // drops non-ENDBR64 aligned entries on CET AMD64 binaries
var EhFrameFilter CandidateFilter  // retains only FDE-confirmed candidates
var PLTFilter     CandidateFilter  // removes PLT-section candidates (always last)

// DetectPrologues scans raw machine code bytes for architecture-specific
// function prologue patterns. Works on any binary format.
func DetectPrologues(code []byte, baseAddr uint64, arch Arch) ([]Prologue, error)

// DetectCallSites scans raw machine code bytes for CALL and JMP instructions
// and returns their resolved target addresses. Works on any binary format.
func DetectCallSites(code []byte, baseAddr uint64, arch Arch) ([]CallSiteEdge, error)

Key types:

type DetectionType string

const (
    DetectionPrologueOnly     DetectionType = "prologue-only"
    DetectionCallTarget       DetectionType = "call-target"
    DetectionJumpTarget       DetectionType = "jump-target"
    DetectionPrologueCallSite DetectionType = "prologue-callsite"
    DetectionAlignedEntry     DetectionType = "aligned-entry"
    DetectionCFI              DetectionType = "cfi"
)

type FunctionCandidate struct {
    Address       uint64        `json:"address"`
    DetectionType DetectionType `json:"detection_type"`
    PrologueType  PrologueType  `json:"prologue_type,omitempty"`
    CalledFrom    []uint64      `json:"called_from,omitempty"`
    JumpedFrom    []uint64      `json:"jumped_from,omitempty"`
    Confidence    Confidence    `json:"confidence"`
}

Implementation

+------------------+
|   *elf.File      |
+------------------+
         |
         +-------------------------------+
         |                               |
         v                               v
+------------------+           +------------------+
|  DisasmDetector  |           | EhFrameDetector  |
|  (.text bytes)   |           |   (.eh_frame)    |
+---+---------+----+           +--------+---------+
    |         |    |                    |
    v         v    v                    v
+------+ +------+ +--------+  +------------------+
|Prolog| |Call  | |Boundary|  | FDE entry VAs    |
|ues   | |Sites | |Analysis|  | (DetectionCFI)   |
+--+---+ +--+---+ +---+----+  +--------+---------+
   |        |         |                |
   +--------+---------+----------------+
            v
   +------------------+
   | mergeCandidates  |
   | (dedup by addr)  |
   +--------+---------+
            |
            v
   +------------------+
   |   CETFilter      |  drops non-ENDBR64 aligned entries (CET binaries)
   +--------+---------+
            |
            v
   +------------------+
   |  EhFrameFilter   |  retains only FDE-confirmed candidates
   +--------+---------+
            |
            v
   +------------------+
   |   PLTFilter      |  removes PLT-section candidates
   +--------+---------+
            |
            v
   +-------------------+
   |[]FunctionCandidate|
   +-------------------+

Limitations

  • Reports addresses only - no symbol names on stripped binaries
  • Disassembly signals are heuristic; CRT scaffolding on aligned addresses can still produce false positives when .eh_frame is absent
  • Linear disassembly - indirect jumps and computed addresses are not resolved

Dependencies

  • Go 1.25.7+
  • golang.org/x/arch - x86 and ARM64 disassembler
  • debug/elf (standard library) - ELF parser

Documentation

Overview

Package resurgo provides static function recovery from stripped ELF binaries.

It combines two complementary detection strategies:

  • Disassembly-based: three parallel signals (prologue pattern matching, call-site analysis, and alignment boundary analysis) are merged and scored to produce a ranked set of function candidates.

  • DWARF CFI-based: when the binary contains an .eh_frame section, the initial_location fields from its FDE records are used as an authoritative whitelist. These addresses were written by the compiler and survive stripping, making CFI the highest-confidence source available.

The primary entry point is DetectFunctionsFromELF, which accepts a parsed *elf.File, runs all detectors and filters, and returns a deduplicated, filtered slice of FunctionCandidate values.

For format-agnostic use (non-ELF binaries, raw memory dumps) the lower-level DetectPrologues and DetectCallSites APIs accept raw machine code bytes.

Supported architectures: x86_64 (AMD64) and ARM64 (AArch64).

Constants

const (
	// Recognized call site instruction types.
	CallSiteCall CallSiteType = "call"
	CallSiteJump CallSiteType = "jump"

	// Recognized addressing modes for call site instructions.
	AddressingModePCRelative       AddressingMode = "pc-relative"
	AddressingModeAbsolute         AddressingMode = "absolute"
	AddressingModeRegisterIndirect AddressingMode = "register-indirect"

	// DetectionCallTarget indicates the candidate was found only as a target
	// of one or more CALL instructions.
	DetectionCallTarget DetectionType = "call-target"

	// DetectionJumpTarget indicates the candidate was found only as a target
	// of one or more JMP instructions.
	DetectionJumpTarget DetectionType = "jump-target"

	// DetectionPrologueCallSite indicates the candidate was confirmed by both
	// prologue matching and call-site analysis.
	DetectionPrologueCallSite DetectionType = "prologue-callsite"
)
const (
	// Supported architectures.
	ArchAMD64 Arch = "amd64"
	ArchARM64 Arch = "arm64"

	// DetectionPrologueOnly indicates the candidate was found by prologue
	// pattern matching only.
	DetectionPrologueOnly DetectionType = "prologue-only"

	// Recognized x86_64 function prologue patterns.
	PrologueClassic        PrologueType = "classic"
	PrologueNoFramePointer PrologueType = "no-frame-pointer"
	ProloguePushOnly       PrologueType = "push-only"
	PrologueLEABased       PrologueType = "lea-based"

	// Recognized ARM64 function prologue patterns.
	PrologueSTPFramePair  PrologueType = "stp-frame-pair"
	PrologueSTRLRPreIndex PrologueType = "str-lr-preindex"
	PrologueSubSP         PrologueType = "sub-sp"
	PrologueSTPOnly       PrologueType = "stp-only"
)

Variables

This section is empty.

Functions

This section is empty.

Types

type AddressingMode added in v0.3.0

type AddressingMode string

AddressingMode represents how the target address is specified.

type Arch

type Arch string

Arch represents a CPU architecture.

type CallSiteEdge added in v0.3.0

type CallSiteEdge struct {
	// SourceAddr is the virtual address of the call or jump instruction.
	SourceAddr uint64 `json:"source_addr"`
	// TargetAddr is the virtual address of the call or jump target.
	TargetAddr uint64 `json:"target_addr"`
	// Type indicates whether this edge was produced by a call or jump
	// instruction.
	Type CallSiteType `json:"type"`
	// AddressMode describes how the target address is encoded in the
	// instruction.
	AddressMode AddressingMode `json:"address_mode"`
	// Confidence is the reliability level of this edge.
	Confidence Confidence `json:"confidence"`
}

CallSiteEdge represents a detected call site (call or jump to a function).

func DetectCallSites added in v0.3.0

func DetectCallSites(code []byte, baseAddr uint64, arch Arch) ([]CallSiteEdge, error)

DetectCallSites analyzes raw machine code bytes and returns detected call sites (CALL and JMP instructions with their targets). baseAddr is the virtual address corresponding to the start of code. arch selects the architecture-specific detection logic. This function performs no I/O and works with any binary format.

Example
package main

import (
	"fmt"
	"log"

	"github.com/maxgio92/resurgo"
)

func main() {
	// x86-64 machine code: call $+0x20 (E8 1B 00 00 00)
	// At address 0x1000, calls target at 0x1000 + 5 + 0x1B = 0x1020
	code := []byte{0xE8, 0x1B, 0x00, 0x00, 0x00}
	edges, err := resurgo.DetectCallSites(code, 0x1000, resurgo.ArchAMD64)
	if err != nil {
		log.Fatal(err)
	}
	for _, e := range edges {
		fmt.Printf("[%s] 0x%x -> 0x%x (%s, %s)\n",
			e.Type, e.SourceAddr, e.TargetAddr, e.AddressMode, e.Confidence)
	}
}
Output:
[call] 0x1000 -> 0x1020 (pc-relative, high)
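
The target arithmetic in the example comment can be checked by hand. The following sketch uses only the standard library; callRel32Target is a hypothetical helper, not part of resurgo. It decodes an x86-64 near call (opcode E8 followed by a signed 32-bit little-endian displacement) and resolves the target relative to the end of the 5-byte instruction.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// callRel32Target resolves the target of an E8 rel32 call located at addr.
// It reports false when code does not start with such an instruction.
func callRel32Target(code []byte, addr uint64) (uint64, bool) {
	if len(code) < 5 || code[0] != 0xE8 {
		return 0, false
	}
	// The displacement is signed and relative to the next instruction.
	rel := int32(binary.LittleEndian.Uint32(code[1:5]))
	// Converting through int64 sign-extends; unsigned wraparound then
	// yields the correct result for backward calls as well.
	return addr + 5 + uint64(int64(rel)), true
}

func main() {
	target, ok := callRel32Target([]byte{0xE8, 0x1B, 0x00, 0x00, 0x00}, 0x1000)
	fmt.Printf("0x%x %v\n", target, ok) // 0x1020 true
}
```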

type CallSiteType added in v0.3.0

type CallSiteType string

CallSiteType represents the type of call site instruction.

type CandidateDetector added in v0.4.2

type CandidateDetector func(*elf.File) ([]FunctionCandidate, error)

CandidateDetector reads an ELF file and emits function candidates. Detectors run before filters; their results are merged with those of other detectors (deduplicated by address) before the filter pipeline is applied.

type CandidateFilter added in v0.4.0

type CandidateFilter func([]FunctionCandidate, *elf.File) ([]FunctionCandidate, error)

CandidateFilter applies an ELF-aware transformation to a candidate slice. Each filter reads only what it needs from f and returns the updated slice.

type Confidence added in v0.3.0

type Confidence string

Confidence represents the reliability level of a detected function candidate.

const (
	// Confidence levels ordered from highest to lowest reliability.
	ConfidenceHigh   Confidence = "high"
	ConfidenceMedium Confidence = "medium"
	ConfidenceLow    Confidence = "low"
	ConfidenceNone   Confidence = "none"
)
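
Because the levels are plain strings, sorting them lexically does not follow reliability order ("high" sorts before "low", which sorts before "medium"). A caller that wants to rank candidates may therefore map each level to a number first. The sketch below is one way to do that; the lowercase type and the rank and sortByConfidence helpers are hypothetical and mirror the exported constants only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// confidence mirrors the string values of resurgo.Confidence (assumption:
// only the four documented values are in use).
type confidence string

const (
	high   confidence = "high"
	medium confidence = "medium"
	low    confidence = "low"
	none   confidence = "none"
)

// rank maps a level to an integer; unknown values rank lowest.
func rank(c confidence) int {
	switch c {
	case high:
		return 3
	case medium:
		return 2
	case low:
		return 1
	default:
		return 0
	}
}

// sortByConfidence orders levels from most to least reliable, in place.
func sortByConfidence(levels []confidence) {
	sort.SliceStable(levels, func(i, j int) bool {
		return rank(levels[i]) > rank(levels[j])
	})
}

func main() {
	levels := []confidence{low, high, none, medium}
	sortByConfidence(levels)
	fmt.Println(levels) // [high medium low none]
}
```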

type DetectionType added in v0.3.0

type DetectionType string

DetectionType represents the signal or combination of signals that produced a function candidate.

const (

	// DetectionAlignedEntry indicates the candidate was found by alignment-
	// boundary analysis: a ret/jmp terminator followed by NOP padding ending
	// at a 16-byte aligned address.
	DetectionAlignedEntry DetectionType = "aligned-entry"
)
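
The pattern described in the comment above can be illustrated with a deliberately naive byte scan. This is a sketch under strong assumptions, not the library's boundary analysis: it works on raw bytes instead of decoded instructions (so a 0xC3 byte inside another instruction would be misread as ret), it recognises only single-byte NOPs (0x90) although compilers often emit multi-byte NOPs or int3 padding, and alignedEntries is a hypothetical name.

```go
package main

import "fmt"

// alignedEntries reports addresses that follow "ret + NOP padding" and
// fall on a 16-byte boundary. base is the virtual address of code[0].
func alignedEntries(code []byte, base uint64) []uint64 {
	var out []uint64
	for i := 0; i < len(code); i++ {
		if code[i] != 0xC3 { // ret
			continue
		}
		j := i + 1
		for j < len(code) && code[j] == 0x90 { // skip single-byte NOPs
			j++
		}
		hasPadding := j > i+1
		hasBody := j < len(code)
		if hasPadding && hasBody && (base+uint64(j))%16 == 0 {
			out = append(out, base+uint64(j))
		}
	}
	return out
}

func main() {
	code := make([]byte, 20)
	code[12] = 0xC3                                 // ret ends the previous function
	code[13], code[14], code[15] = 0x90, 0x90, 0x90 // padding up to offset 16
	code[16] = 0x55                                 // first byte of the next function
	for _, a := range alignedEntries(code, 0x1000) {
		fmt.Printf("0x%x\n", a) // 0x1010
	}
}
```
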
const (
	// DetectionCFI is assigned to function candidates whose entry address was
	// read from DWARF Call Frame Information (CFI) rather than inferred by
	// disassembly heuristics. On ELF binaries the CFI is stored in .eh_frame.
	// These addresses are written by the compiler and are the highest-confidence
	// source available on stripped binaries.
	DetectionCFI DetectionType = "cfi"
)

type FunctionCandidate added in v0.3.0

type FunctionCandidate struct {
	// Address is the virtual address of the function entry point.
	Address uint64 `json:"address"`
	// DetectionType is the signal or combination of signals that produced
	// this candidate.
	DetectionType DetectionType `json:"detection_type"`
	// PrologueType is the matched prologue pattern, if any.
	PrologueType PrologueType `json:"prologue_type,omitempty"`
	// CalledFrom holds the virtual addresses of instructions that call this
	// candidate directly.
	CalledFrom []uint64 `json:"called_from,omitempty"`
	// JumpedFrom holds the virtual addresses of instructions that jump to
	// this candidate.
	JumpedFrom []uint64 `json:"jumped_from,omitempty"`
	// Confidence is the reliability level of this candidate.
	Confidence Confidence `json:"confidence"`
}

FunctionCandidate represents a potential function entry point detected through one or more signals (prologue matching, call-site analysis, boundary analysis, or CFI).
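
The struct tags determine the JSON shape. The sketch below copies the tags onto a local mirror type (functionCandidate, with plain string fields standing in for the named string types) so that it runs without the module; it shows that the omitempty fields disappear when unset.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// functionCandidate mirrors the JSON tags of resurgo.FunctionCandidate.
// It is a local stand-in for illustration only.
type functionCandidate struct {
	Address       uint64   `json:"address"`
	DetectionType string   `json:"detection_type"`
	PrologueType  string   `json:"prologue_type,omitempty"`
	CalledFrom    []uint64 `json:"called_from,omitempty"`
	JumpedFrom    []uint64 `json:"jumped_from,omitempty"`
	Confidence    string   `json:"confidence"`
}

// encode marshals a candidate to its compact JSON form.
func encode(c functionCandidate) (string, error) {
	b, err := json.Marshal(c)
	return string(b), err
}

func main() {
	s, err := encode(functionCandidate{
		Address:       0x401000,
		DetectionType: "cfi",
		Confidence:    "high",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // {"address":4198400,"detection_type":"cfi","confidence":"high"}
}
```

Note that addresses are encoded as decimal JSON numbers, so consumers that want hexadecimal output need to format them themselves.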

func CETFilter added in v0.4.0

func CETFilter(candidates []FunctionCandidate, f *elf.File) ([]FunctionCandidate, error)

CETFilter filters candidates using the CET-aware ENDBR64 heuristic, reading the .text section from f. Non-AMD64 binaries are returned unchanged. The ELF entry point is exempt from the ENDBR64 requirement: it is not an indirect branch target and therefore never carries ENDBR64 even in CET binaries (e.g. _start). The filter must run before EhFrameFilter.

func DetectFunctionsFromELF added in v0.3.0

func DetectFunctionsFromELF(f *elf.File, opts ...Option) ([]FunctionCandidate, error)

DetectFunctionsFromELF returns detected function candidates from f by running all detectors then all filters in order.

By default the detector pipeline is [DisasmDetector, EhFrameDetector] and the filter pipeline is [CETFilter, EhFrameFilter, PLTFilter]. opts may include WithDetectors or WithFilters to replace either pipeline.

Example
package main

import (
	"debug/elf"
	"fmt"
	"log"

	"github.com/maxgio92/resurgo"
)

func main() {
	f, err := elf.Open("/usr/bin/ls")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	candidates, err := resurgo.DetectFunctionsFromELF(f)
	if err != nil {
		log.Fatal(err)
	}

	// Count candidates by detection type.
	counts := make(map[resurgo.DetectionType]int)
	for _, c := range candidates {
		counts[c.DetectionType]++
	}
	fmt.Printf("total: %d\n", len(candidates))
	fmt.Printf("prologue+callsite: %d\n", counts[resurgo.DetectionPrologueCallSite])
	fmt.Printf("cfi: %d\n", counts[resurgo.DetectionCFI])
}

func DisasmDetector added in v0.4.2

func DisasmDetector(f *elf.File) ([]FunctionCandidate, error)

DisasmDetector is a CandidateDetector that runs the disassembly-based pipeline (prologue matching, call-site analysis, alignment-based boundary detection) against the .text section of f. The architecture is inferred from the ELF header.

func EhFrameDetector added in v0.4.2

func EhFrameDetector(f *elf.File) ([]FunctionCandidate, error)

EhFrameDetector is a CandidateDetector that emits function candidates sourced from .eh_frame FDE records. Each candidate carries DetectionCFI and ConfidenceHigh. Returns an empty slice (no error) when .eh_frame is absent; the caller falls back to disassembly-only results.

func EhFrameFilter added in v0.4.0

func EhFrameFilter(candidates []FunctionCandidate, f *elf.File) ([]FunctionCandidate, error)

EhFrameFilter retains only candidates whose address is confirmed by an FDE record in .eh_frame, upgrading their confidence to ConfidenceHigh. When .eh_frame is absent the slice is returned unchanged.
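
The documented behaviour amounts to a set intersection plus a confidence upgrade. The sketch below illustrates that idea only: candidate and retainConfirmed are hypothetical, the FDE start addresses are passed in as a plain slice rather than parsed from .eh_frame, and an empty slice is assumed to model the "section absent" case.

```go
package main

import "fmt"

// candidate is a hypothetical, simplified function candidate.
type candidate struct {
	Addr       uint64
	Confidence string
}

// retainConfirmed keeps only candidates whose address is an FDE start and
// marks them "high". With no FDE addresses the input is returned unchanged.
func retainConfirmed(cands []candidate, fdeStarts []uint64) []candidate {
	if len(fdeStarts) == 0 {
		return cands
	}
	confirmed := make(map[uint64]struct{}, len(fdeStarts))
	for _, a := range fdeStarts {
		confirmed[a] = struct{}{}
	}
	out := make([]candidate, 0, len(cands))
	for _, c := range cands {
		if _, ok := confirmed[c.Addr]; ok {
			c.Confidence = "high" // c is a copy; the input is not modified
			out = append(out, c)
		}
	}
	return out
}

func main() {
	cands := []candidate{{0x401000, "medium"}, {0x401008, "low"}}
	for _, c := range retainConfirmed(cands, []uint64{0x401000, 0x401100}) {
		fmt.Printf("0x%x %s\n", c.Addr, c.Confidence) // 0x401000 high
	}
}
```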

func FilterAlignedEntriesCETAMD64 added in v0.4.2

func FilterAlignedEntriesCETAMD64(candidates []FunctionCandidate, textBytes []byte, textVA, entryVA uint64) []FunctionCandidate

FilterAlignedEntriesCETAMD64 drops aligned-entry candidates lacking ENDBR64 on CET-enabled AMD64 binaries. On CET binaries every indirect-branch-target function entry carries ENDBR64; an aligned address inside a function body (reached by a jump or NOP padding) never does, making it a reliable discriminator for aligned-entry false positives.

The ELF entry point (e.g. _start) is exempt: it is not an indirect branch target and therefore never carries ENDBR64 even in CET binaries.

CET is detected when >= 5 aligned-entry candidates carry ENDBR64; this avoids false triggering on non-CET binaries that may have a few incidental ENDBR64 hits from CRT helpers. Non-CET binaries are returned unchanged. Only DetectionAlignedEntry candidates are affected.
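
A minimal sketch of the check described above, assuming the standard ENDBR64 encoding (F3 0F 1E FA) and the threshold of five stated in the text. The helper names hasENDBR64 and looksCET are hypothetical, and the real filter may differ in how it selects and counts candidates.

```go
package main

import (
	"bytes"
	"fmt"
)

// endbr64 is the 4-byte encoding of the ENDBR64 instruction.
var endbr64 = []byte{0xF3, 0x0F, 0x1E, 0xFA}

// hasENDBR64 reports whether the bytes at virtual address addr start with
// ENDBR64. textVA is the virtual address of text[0].
func hasENDBR64(text []byte, textVA, addr uint64) bool {
	if addr < textVA {
		return false
	}
	off := addr - textVA
	if off > uint64(len(text)) || uint64(len(text))-off < 4 {
		return false
	}
	return bytes.Equal(text[off:off+4], endbr64)
}

// looksCET applies the ">= 5 aligned entries carry ENDBR64" heuristic.
func looksCET(text []byte, textVA uint64, aligned []uint64) bool {
	n := 0
	for _, a := range aligned {
		if hasENDBR64(text, textVA, a) {
			n++
		}
	}
	return n >= 5
}

func main() {
	text := make([]byte, 96)
	addrs := make([]uint64, 0, 6)
	for i := 0; i < 6; i++ {
		if i < 5 {
			copy(text[i*16:], endbr64) // five entries carry ENDBR64
		}
		addrs = append(addrs, 0x2000+uint64(i*16))
	}
	fmt.Println(looksCET(text, 0x2000, addrs)) // true
}
```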

func FilterCandidatesInRanges added in v0.4.2

func FilterCandidatesInRanges(candidates []FunctionCandidate, ranges [][2]uint64) []FunctionCandidate

FilterCandidatesInRanges removes candidates whose addresses fall within any of the given address ranges. Each range is a [lo, hi) pair.

Used to discard candidates that land inside linker-generated sections (e.g. PLT stubs) that the call-site scanner can detect as CALL/JMP targets even though they are not real function entries in the binary under analysis.
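
The half-open convention means lo is excluded from the result and hi is kept. The sketch below works on bare addresses to show the boundary behaviour; dropInRanges is a hypothetical helper, not the library function.

```go
package main

import "fmt"

// dropInRanges returns the addresses that fall outside every [lo, hi) range.
func dropInRanges(addrs []uint64, ranges [][2]uint64) []uint64 {
	out := make([]uint64, 0, len(addrs))
	for _, a := range addrs {
		inside := false
		for _, r := range ranges {
			if a >= r[0] && a < r[1] { // lo inclusive, hi exclusive
				inside = true
				break
			}
		}
		if !inside {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	plt := [][2]uint64{{0x1020, 0x1060}}
	addrs := []uint64{0x1010, 0x1020, 0x105f, 0x1060}
	for _, a := range dropInRanges(addrs, plt) {
		fmt.Printf("0x%x\n", a) // 0x1010, then 0x1060
	}
}
```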

func PLTFilter added in v0.4.0

func PLTFilter(candidates []FunctionCandidate, f *elf.File) ([]FunctionCandidate, error)

PLTFilter removes candidates that land inside linker-generated PLT sections (.plt, .plt.got, .plt.sec, .iplt) as reported by f.
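
One plausible way to obtain those ranges is to look the sections up by name with the standard debug/elf package. The sketch below is an assumption about the approach, not the library's code: pltRanges and syntheticFile are hypothetical, and the file is built in memory so the example runs without a binary on disk.

```go
package main

import (
	"debug/elf"
	"fmt"
)

// pltNames lists the section names mentioned in the PLTFilter description.
var pltNames = []string{".plt", ".plt.got", ".plt.sec", ".iplt"}

// pltRanges returns a [lo, hi) address pair for each PLT section in f.
func pltRanges(f *elf.File) [][2]uint64 {
	var out [][2]uint64
	for _, name := range pltNames {
		if s := f.Section(name); s != nil {
			out = append(out, [2]uint64{s.Addr, s.Addr + s.Size})
		}
	}
	return out
}

// syntheticFile builds an in-memory *elf.File with section headers only.
func syntheticFile() *elf.File {
	return &elf.File{Sections: []*elf.Section{
		{SectionHeader: elf.SectionHeader{Name: ".text", Addr: 0x1100, Size: 0x400}},
		{SectionHeader: elf.SectionHeader{Name: ".plt", Addr: 0x1020, Size: 0x40}},
		{SectionHeader: elf.SectionHeader{Name: ".plt.sec", Addr: 0x1060, Size: 0x30}},
	}}
}

func main() {
	for _, r := range pltRanges(syntheticFile()) {
		fmt.Printf("[0x%x, 0x%x)\n", r[0], r[1])
	}
}
```

For a real binary, the value returned by elf.Open would take the place of syntheticFile.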

type Option added in v0.4.0

type Option func(*options)

Option configures the behaviour of DetectFunctionsFromELF.

func WithDetectors added in v0.4.2

func WithDetectors(detectors ...CandidateDetector) Option

WithDetectors replaces the default detector pipeline with the provided detectors. They run in the order provided and their results are merged before filtering. Pass no arguments to disable all detectors.

func WithFilters added in v0.4.0

func WithFilters(filters ...CandidateFilter) Option

WithFilters replaces the default filter pipeline with the provided filters. They run in the order provided. Pass no arguments to disable all filters.
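
The replace-not-append semantics follow naturally from the functional options pattern. The sketch below is a self-contained illustration with hypothetical names (filter, config, withFilters, run) operating on bare addresses; it is not the library's internals. Defaults are installed first, and an option overwrites them, so calling the option with no arguments leaves an empty pipeline.

```go
package main

import "fmt"

// filter is a hypothetical stand-in for a candidate filter.
type filter func([]uint64) []uint64

type config struct{ filters []filter }

type option func(*config)

// withFilters replaces the configured pipeline; no arguments disables it.
func withFilters(fs ...filter) option {
	return func(c *config) { c.filters = fs }
}

// keep returns the addresses for which pred is true, without mutating addrs.
func keep(addrs []uint64, pred func(uint64) bool) []uint64 {
	out := make([]uint64, 0, len(addrs))
	for _, a := range addrs {
		if pred(a) {
			out = append(out, a)
		}
	}
	return out
}

func dropZero(addrs []uint64) []uint64 {
	return keep(addrs, func(a uint64) bool { return a != 0 })
}

func dropOdd(addrs []uint64) []uint64 {
	return keep(addrs, func(a uint64) bool { return a%2 == 0 })
}

// run applies the default pipeline unless an option replaces it.
func run(addrs []uint64, opts ...option) []uint64 {
	cfg := config{filters: []filter{dropZero, dropOdd}}
	for _, o := range opts {
		o(&cfg)
	}
	for _, f := range cfg.filters {
		addrs = f(addrs)
	}
	return addrs
}

func main() {
	in := []uint64{0, 1, 2}
	fmt.Println(run(in))                       // [2]
	fmt.Println(run(in, withFilters()))        // [0 1 2]
	fmt.Println(run(in, withFilters(dropOdd))) // [0 2]
}
```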

type Prologue

type Prologue struct {
	// Address is the virtual address of the detected prologue.
	Address uint64 `json:"address"`
	// Type is the matched prologue pattern.
	Type PrologueType `json:"type"`
	// Instructions is a human-readable representation of the matched
	// prologue instructions.
	Instructions string `json:"instructions"`
}

Prologue represents a detected function prologue.

func DetectPrologues

func DetectPrologues(code []byte, baseAddr uint64, arch Arch) ([]Prologue, error)

DetectPrologues analyzes raw machine code bytes and returns detected function prologues. baseAddr is the virtual address corresponding to the start of code. arch selects the architecture-specific detection logic. This function performs no I/O and works with any binary format.

Example
package main

import (
	"fmt"
	"log"

	"github.com/maxgio92/resurgo"
)

func main() {
	// x86-64 machine code: nop; push rbp; mov rbp, rsp
	code := []byte{0x90, 0x55, 0x48, 0x89, 0xe5}
	prologues, err := resurgo.DetectPrologues(code, 0x1000, resurgo.ArchAMD64)
	if err != nil {
		log.Fatal(err)
	}
	for _, p := range prologues {
		fmt.Printf("[%s] 0x%x: %s\n", p.Type, p.Address, p.Instructions)
	}
}
Output:
[classic] 0x1001: push rbp; mov rbp, rsp
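
To see why the reported address is 0x1001, the classic pattern can be located with a plain byte search. This is only a sketch: findClassic is a hypothetical helper that matches the fixed encoding 55 48 89 E5 (push rbp; mov rbp, rsp) at every offset, whereas the library matches prologues on disassembled instructions and recognises several other patterns.

```go
package main

import (
	"bytes"
	"fmt"
)

// classic is the byte encoding of "push rbp; mov rbp, rsp" on x86-64.
var classic = []byte{0x55, 0x48, 0x89, 0xE5}

// findClassic returns the virtual address of every occurrence of the
// classic prologue bytes. base is the virtual address of code[0].
func findClassic(code []byte, base uint64) []uint64 {
	var out []uint64
	for i := 0; i+len(classic) <= len(code); i++ {
		if bytes.Equal(code[i:i+len(classic)], classic) {
			out = append(out, base+uint64(i))
		}
	}
	return out
}

func main() {
	// nop; push rbp; mov rbp, rsp
	code := []byte{0x90, 0x55, 0x48, 0x89, 0xe5}
	for _, a := range findClassic(code, 0x1000) {
		fmt.Printf("0x%x\n", a) // 0x1001
	}
}
```

The leading nop occupies offset 0, so the prologue starts one byte past baseAddr.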

type PrologueType

type PrologueType string

PrologueType represents the type of function prologue.
