Documentation
Overview ¶
Package resurgo provides static function recovery from stripped ELF binaries.
It combines two complementary detection strategies:
- Disassembly-based: three parallel signals (prologue pattern matching, call-site analysis, and alignment boundary analysis) are merged and scored to produce a ranked set of function candidates.
- DWARF CFI-based: when the binary contains an .eh_frame section, the initial_location fields from its FDE records are used as an authoritative whitelist. These addresses were written by the compiler and survive stripping, making CFI the highest-confidence source available.
The primary entry point is DetectFunctionsFromELF, which accepts a parsed *elf.File, runs all detectors and filters, and returns a deduplicated, filtered slice of FunctionCandidate values.
For format-agnostic use (non-ELF binaries, raw memory dumps) the lower-level DetectPrologues and DetectCallSites APIs accept raw machine code bytes.
Supported architectures: x86_64 (AMD64) and ARM64 (AArch64).
Index ¶
- Constants
- type AddressingMode
- type Arch
- type CallSiteEdge
- type CallSiteType
- type CandidateDetector
- type CandidateFilter
- type Confidence
- type DetectionType
- type FunctionCandidate
- func CETFilter(candidates []FunctionCandidate, f *elf.File) ([]FunctionCandidate, error)
- func DetectFunctionsFromELF(f *elf.File, opts ...Option) ([]FunctionCandidate, error)
- func DisasmDetector(f *elf.File) ([]FunctionCandidate, error)
- func EhFrameDetector(f *elf.File) ([]FunctionCandidate, error)
- func EhFrameFilter(candidates []FunctionCandidate, f *elf.File) ([]FunctionCandidate, error)
- func FilterAlignedEntriesCETAMD64(candidates []FunctionCandidate, textBytes []byte, textVA, entryVA uint64) []FunctionCandidate
- func FilterCandidatesInRanges(candidates []FunctionCandidate, ranges [][2]uint64) []FunctionCandidate
- func PLTFilter(candidates []FunctionCandidate, f *elf.File) ([]FunctionCandidate, error)
- type Option
- type Prologue
- type PrologueType
Constants ¶
const (
	// Recognized call site instruction types.
	CallSiteCall CallSiteType = "call"
	CallSiteJump CallSiteType = "jump"

	// Recognized addressing modes for call site instructions.
	AddressingModePCRelative       AddressingMode = "pc-relative"
	AddressingModeAbsolute         AddressingMode = "absolute"
	AddressingModeRegisterIndirect AddressingMode = "register-indirect"

	// DetectionCallTarget indicates the candidate was found only as a target
	// of one or more CALL instructions.
	DetectionCallTarget DetectionType = "call-target"

	// DetectionJumpTarget indicates the candidate was found only as a target
	// of one or more JMP instructions.
	DetectionJumpTarget DetectionType = "jump-target"

	// DetectionPrologueCallSite indicates the candidate was confirmed by both
	// prologue matching and call-site analysis.
	DetectionPrologueCallSite DetectionType = "prologue-callsite"
)
const (
	// Supported architectures.
	ArchAMD64 Arch = "amd64"
	ArchARM64 Arch = "arm64"

	// DetectionPrologueOnly indicates the candidate was found by prologue
	// pattern matching only.
	DetectionPrologueOnly DetectionType = "prologue-only"

	// Recognized x86_64 function prologue patterns.
	PrologueClassic        PrologueType = "classic"
	PrologueNoFramePointer PrologueType = "no-frame-pointer"
	ProloguePushOnly       PrologueType = "push-only"
	PrologueLEABased       PrologueType = "lea-based"

	// Recognized ARM64 function prologue patterns.
	PrologueSTPFramePair  PrologueType = "stp-frame-pair"
	PrologueSTRLRPreIndex PrologueType = "str-lr-preindex"
	PrologueSubSP         PrologueType = "sub-sp"
	PrologueSTPOnly       PrologueType = "stp-only"
)
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type AddressingMode ¶ added in v0.3.0
type AddressingMode string
AddressingMode represents how the target address is specified.
type CallSiteEdge ¶ added in v0.3.0
type CallSiteEdge struct {
	// SourceAddr is the virtual address of the call or jump instruction.
	SourceAddr uint64 `json:"source_addr"`
	// TargetAddr is the virtual address of the call or jump target.
	TargetAddr uint64 `json:"target_addr"`
	// Type indicates whether this edge was produced by a call or jump
	// instruction.
	Type CallSiteType `json:"type"`
	// AddressMode describes how the target address is encoded in the
	// instruction.
	AddressMode AddressingMode `json:"address_mode"`
	// Confidence is the reliability level of this edge.
	Confidence Confidence `json:"confidence"`
}
CallSiteEdge represents a detected call site (call or jump to a function).
func DetectCallSites ¶ added in v0.3.0
func DetectCallSites(code []byte, baseAddr uint64, arch Arch) ([]CallSiteEdge, error)
DetectCallSites analyzes raw machine code bytes and returns detected call sites (CALL and JMP instructions with their targets). baseAddr is the virtual address corresponding to the start of code. arch selects the architecture-specific detection logic. This function performs no I/O and works with any binary format.
Example ¶
package main

import (
	"fmt"
	"log"

	"github.com/maxgio92/resurgo"
)

func main() {
	// x86-64 machine code: call $+0x20 (E8 1B 00 00 00)
	// At address 0x1000, calls target at 0x1000 + 5 + 0x1B = 0x1020
	code := []byte{0xE8, 0x1B, 0x00, 0x00, 0x00}
	edges, err := resurgo.DetectCallSites(code, 0x1000, resurgo.ArchAMD64)
	if err != nil {
		log.Fatal(err)
	}
	for _, e := range edges {
		fmt.Printf("[%s] 0x%x -> 0x%x (%s, %s)\n",
			e.Type, e.SourceAddr, e.TargetAddr, e.AddressMode, e.Confidence)
	}
}
Output: [call] 0x1000 -> 0x1020 (pc-relative, high)
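The target arithmetic in the example's comment can be reproduced by hand. The following is a minimal sketch, not resurgo's implementation: relCallTarget is a hypothetical helper that decodes an x86-64 near call (opcode E8 followed by a little-endian signed 32-bit displacement) and adds the displacement to the address of the next instruction.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// relCallTarget is a hypothetical helper (not part of resurgo). It decodes
// an x86-64 near call (E8 rel32) located at addr and returns its target.
// ok is false when code does not start with a complete E8 instruction.
func relCallTarget(code []byte, addr uint64) (target uint64, ok bool) {
	const insnLen = 5 // 1 opcode byte + 4 displacement bytes
	if len(code) < insnLen || code[0] != 0xE8 {
		return 0, false
	}
	// The displacement is signed and relative to the next instruction.
	disp := int32(binary.LittleEndian.Uint32(code[1:insnLen]))
	return addr + insnLen + uint64(int64(disp)), true
}

func main() {
	code := []byte{0xE8, 0x1B, 0x00, 0x00, 0x00}
	if target, ok := relCallTarget(code, 0x1000); ok {
		fmt.Printf("0x%x\n", target) // 0x1000 + 5 + 0x1B
	}
}
```

Converting the displacement through int64 before adding keeps backward calls (negative displacements) correct under unsigned wraparound.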
type CallSiteType ¶ added in v0.3.0
type CallSiteType string
CallSiteType represents the type of call site instruction.
type CandidateDetector ¶ added in v0.4.2
type CandidateDetector func(*elf.File) ([]FunctionCandidate, error)
CandidateDetector reads an ELF file and emits function candidates. Detectors run before filters; their results are merged with those of other detectors (deduplicated by address) before the filter pipeline is applied.
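Merging detector outputs with deduplication by address could look roughly like the sketch below. The candidate type and mergeByAddress helper are hypothetical, and the tie-breaking rule (the first detector to report an address wins) is an assumption: the documentation does not state how resurgo resolves duplicates.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a simplified stand-in for resurgo.FunctionCandidate.
type candidate struct {
	Address uint64
	Source  string
}

// mergeByAddress concatenates detector outputs and keeps one candidate per
// address. Assumption: the first detector to report an address wins; the
// package documentation does not say how resurgo resolves duplicates.
func mergeByAddress(results ...[]candidate) []candidate {
	seen := make(map[uint64]bool)
	var merged []candidate
	for _, r := range results {
		for _, c := range r {
			if seen[c.Address] {
				continue
			}
			seen[c.Address] = true
			merged = append(merged, c)
		}
	}
	// Sort by address so the result is stable regardless of detector order.
	sort.Slice(merged, func(i, j int) bool { return merged[i].Address < merged[j].Address })
	return merged
}

func main() {
	disasm := []candidate{{0x1020, "disasm"}, {0x1000, "disasm"}}
	cfi := []candidate{{0x1000, "cfi"}, {0x1040, "cfi"}}
	for _, c := range mergeByAddress(disasm, cfi) {
		fmt.Printf("0x%x %s\n", c.Address, c.Source)
	}
}
```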
type CandidateFilter ¶ added in v0.4.0
type CandidateFilter func([]FunctionCandidate, *elf.File) ([]FunctionCandidate, error)
CandidateFilter applies an ELF-aware transformation to a candidate slice. Each filter reads only what it needs from f and returns the updated slice.
type Confidence ¶ added in v0.3.0
type Confidence string
Confidence represents the reliability level of a detected function candidate.
const (
	// Confidence levels ordered from highest to lowest reliability.
	ConfidenceHigh   Confidence = "high"
	ConfidenceMedium Confidence = "medium"
	ConfidenceLow    Confidence = "low"
	ConfidenceNone   Confidence = "none"
)
type DetectionType ¶ added in v0.3.0
type DetectionType string
DetectionType represents the signal or combination of signals that produced a function candidate.
const (
	// DetectionAlignedEntry indicates the candidate was found by alignment-
	// boundary analysis: a ret/jmp terminator followed by NOP padding ending
	// at a 16-byte aligned address.
	DetectionAlignedEntry DetectionType = "aligned-entry"
)
const (
	// DetectionCFI is assigned to function candidates whose entry address was
	// read from DWARF Call Frame Information (CFI) rather than inferred by
	// disassembly heuristics. On ELF binaries the CFI is stored in .eh_frame.
	// These addresses are written by the compiler and are the highest-confidence
	// source available on stripped binaries.
	DetectionCFI DetectionType = "cfi"
)
type FunctionCandidate ¶ added in v0.3.0
type FunctionCandidate struct {
	// Address is the virtual address of the function entry point.
	Address uint64 `json:"address"`
	// DetectionType is the signal or combination of signals that produced
	// this candidate.
	DetectionType DetectionType `json:"detection_type"`
	// PrologueType is the matched prologue pattern, if any.
	PrologueType PrologueType `json:"prologue_type,omitempty"`
	// CalledFrom holds the virtual addresses of instructions that call this
	// candidate directly.
	CalledFrom []uint64 `json:"called_from,omitempty"`
	// JumpedFrom holds the virtual addresses of instructions that jump to
	// this candidate.
	JumpedFrom []uint64 `json:"jumped_from,omitempty"`
	// Confidence is the reliability level of this candidate.
	Confidence Confidence `json:"confidence"`
}
FunctionCandidate represents a potential function entry point detected through one or more signals (prologue matching, call-site analysis, boundary analysis, or CFI).
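The struct tags determine how a candidate serializes to JSON. The sketch below uses a local functionCandidate type that copies the documented field names and tags, with plain strings standing in for the package's named string types, so it runs without importing resurgo. It shows that omitempty fields disappear when unset and that addresses are emitted as decimal numbers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// functionCandidate mirrors the field names and JSON tags documented for
// resurgo.FunctionCandidate, with plain strings standing in for the
// package's named string types.
type functionCandidate struct {
	Address       uint64   `json:"address"`
	DetectionType string   `json:"detection_type"`
	PrologueType  string   `json:"prologue_type,omitempty"`
	CalledFrom    []uint64 `json:"called_from,omitempty"`
	JumpedFrom    []uint64 `json:"jumped_from,omitempty"`
	Confidence    string   `json:"confidence"`
}

// encode marshals c and returns the JSON text, or "" on error.
func encode(c functionCandidate) string {
	b, err := json.Marshal(c)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Only the three always-present fields are emitted here.
	fmt.Println(encode(functionCandidate{
		Address:       0x1000,
		DetectionType: "cfi",
		Confidence:    "high",
	}))
	// prologue_type and called_from appear once they are set.
	fmt.Println(encode(functionCandidate{
		Address:       0x1000,
		DetectionType: "prologue-callsite",
		PrologueType:  "classic",
		CalledFrom:    []uint64{0x2000},
		Confidence:    "high",
	}))
}
```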
func CETFilter ¶ added in v0.4.0
func CETFilter(candidates []FunctionCandidate, f *elf.File) ([]FunctionCandidate, error)
CETFilter filters candidates using the CET-aware ENDBR64 heuristic, reading the .text section from f. Non-AMD64 binaries are returned unchanged. The ELF entry point (e.g. _start) is exempt from the ENDBR64 requirement: it is not an indirect branch target and therefore never carries ENDBR64, even in CET binaries. The filter must run before EhFrameFilter.
func DetectFunctionsFromELF ¶ added in v0.3.0
func DetectFunctionsFromELF(f *elf.File, opts ...Option) ([]FunctionCandidate, error)
DetectFunctionsFromELF returns detected function candidates from f by running all detectors then all filters in order.
By default the detector pipeline is [DisasmDetector, EhFrameDetector] and the filter pipeline is [CETFilter, EhFrameFilter, PLTFilter]. opts may include WithDetectors or WithFilters to replace either pipeline.
Example ¶
package main

import (
	"debug/elf"
	"fmt"
	"log"

	"github.com/maxgio92/resurgo"
)

func main() {
	f, err := elf.Open("/usr/bin/ls")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	candidates, err := resurgo.DetectFunctionsFromELF(f)
	if err != nil {
		log.Fatal(err)
	}
	// Count candidates by detection type.
	counts := make(map[resurgo.DetectionType]int)
	for _, c := range candidates {
		counts[c.DetectionType]++
	}
	fmt.Printf("total: %d\n", len(candidates))
	fmt.Printf("prologue+callsite: %d\n", counts[resurgo.DetectionPrologueCallSite])
	fmt.Printf("cfi: %d\n", counts[resurgo.DetectionCFI])
}
Output:
func DisasmDetector ¶ added in v0.4.2
func DisasmDetector(f *elf.File) ([]FunctionCandidate, error)
DisasmDetector is a CandidateDetector that runs the disassembly-based pipeline (prologue matching, call-site analysis, alignment-based boundary detection) against the .text section of f. The architecture is inferred from the ELF header.
func EhFrameDetector ¶ added in v0.4.2
func EhFrameDetector(f *elf.File) ([]FunctionCandidate, error)
EhFrameDetector is a CandidateDetector that emits function candidates sourced from .eh_frame FDE records. Each candidate carries DetectionCFI and ConfidenceHigh. It returns an empty slice and no error when .eh_frame is absent; the caller then falls back to disassembly-only results.
func EhFrameFilter ¶ added in v0.4.0
func EhFrameFilter(candidates []FunctionCandidate, f *elf.File) ([]FunctionCandidate, error)
EhFrameFilter retains only candidates whose address is confirmed by an FDE record in .eh_frame, upgrading their confidence to ConfidenceHigh. When .eh_frame is absent the slice is returned unchanged.
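The whitelist behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the package's code: candidate and keepConfirmed are hypothetical, the FDE start addresses are passed in as a ready-made set rather than parsed from .eh_frame, and a nil set models a binary without that section.

```go
package main

import "fmt"

// candidate is a simplified stand-in for resurgo.FunctionCandidate.
type candidate struct {
	Address    uint64
	Confidence string
}

// keepConfirmed is a hypothetical helper. fdeStarts holds the
// initial_location values read from .eh_frame; a nil map models a binary
// without an .eh_frame section.
func keepConfirmed(cands []candidate, fdeStarts map[uint64]bool) []candidate {
	if fdeStarts == nil {
		return cands // no CFI available: leave the slice unchanged
	}
	kept := make([]candidate, 0, len(cands))
	for _, c := range cands {
		if !fdeStarts[c.Address] {
			continue // not confirmed by any FDE record
		}
		c.Confidence = "high" // confirmed entries are upgraded
		kept = append(kept, c)
	}
	return kept
}

func main() {
	cands := []candidate{{0x1000, "medium"}, {0x1008, "low"}}
	fdes := map[uint64]bool{0x1000: true}
	for _, c := range keepConfirmed(cands, fdes) {
		fmt.Printf("0x%x %s\n", c.Address, c.Confidence)
	}
}
```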
func FilterAlignedEntriesCETAMD64 ¶ added in v0.4.2
func FilterAlignedEntriesCETAMD64(candidates []FunctionCandidate, textBytes []byte, textVA, entryVA uint64) []FunctionCandidate
FilterAlignedEntriesCETAMD64 drops aligned-entry candidates lacking ENDBR64 on CET-enabled AMD64 binaries. On CET binaries every indirect-branch-target function entry carries ENDBR64; an aligned address inside a function body (reached by a jump or NOP padding) never does, making it a reliable discriminator for aligned-entry false positives.
The ELF entry point (e.g. _start) is exempt: it is not an indirect branch target and therefore never carries ENDBR64 even in CET binaries.
CET is detected when >= 5 aligned-entry candidates carry ENDBR64; this avoids false triggering on non-CET binaries that may have a few incidental ENDBR64 hits from CRT helpers. Non-CET binaries are returned unchanged. Only DetectionAlignedEntry candidates are affected.
func FilterCandidatesInRanges ¶ added in v0.4.2
func FilterCandidatesInRanges(candidates []FunctionCandidate, ranges [][2]uint64) []FunctionCandidate
FilterCandidatesInRanges removes candidates whose addresses fall within any of the given address ranges. Each range is a [lo, hi) pair.
Used to discard candidates that land inside linker-generated sections (e.g. PLT stubs) that the call-site scanner can detect as CALL/JMP targets even though they are not real function entries in the binary under analysis.
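The half-open [lo, hi) convention means lo is excluded from the result and hi is kept. A small sketch over bare addresses (the inAnyRange and dropInRanges helpers are hypothetical and operate on uint64 values rather than FunctionCandidate) illustrates the boundary behaviour:

```go
package main

import "fmt"

// inAnyRange reports whether addr lies in any half-open [lo, hi) range.
func inAnyRange(addr uint64, ranges [][2]uint64) bool {
	for _, r := range ranges {
		if addr >= r[0] && addr < r[1] {
			return true
		}
	}
	return false
}

// dropInRanges returns the addresses that fall outside every range.
func dropInRanges(addrs []uint64, ranges [][2]uint64) []uint64 {
	var kept []uint64
	for _, a := range addrs {
		if !inAnyRange(a, ranges) {
			kept = append(kept, a)
		}
	}
	return kept
}

func main() {
	plt := [][2]uint64{{0x1020, 0x1060}}
	addrs := []uint64{0x1010, 0x1020, 0x105f, 0x1060}
	// 0x1020 and 0x105f are inside the range; 0x1060 is the excluded end.
	for _, a := range dropInRanges(addrs, plt) {
		fmt.Printf("0x%x\n", a)
	}
}
```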
func PLTFilter ¶ added in v0.4.0
func PLTFilter(candidates []FunctionCandidate, f *elf.File) ([]FunctionCandidate, error)
PLTFilter removes candidates that land inside linker-generated PLT sections (.plt, .plt.got, .plt.sec, .iplt) as reported by f.
type Option ¶ added in v0.4.0
type Option func(*options)
Option configures the behaviour of DetectFunctionsFromELF.
func WithDetectors ¶ added in v0.4.2
func WithDetectors(detectors ...CandidateDetector) Option
WithDetectors replaces the default detector pipeline with the provided detectors. They run in the order provided and their results are merged before filtering. Pass no arguments to disable all detectors.
func WithFilters ¶ added in v0.4.0
func WithFilters(filters ...CandidateFilter) Option
WithFilters replaces the default filter pipeline with the provided filters. They run in the order provided. Pass no arguments to disable all filters.
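The replace-not-append semantics follow naturally from the functional options pattern. The sketch below re-creates that pattern in isolation: the stage type, the options field names, and the resolve helper are assumptions made for illustration, while the default pipeline names are taken from the DetectFunctionsFromELF documentation.

```go
package main

import "fmt"

// stage stands in for a detector or filter; only its name matters here.
type stage string

// options and Option mirror the shape documented above; the field names
// are assumptions made for this sketch.
type options struct {
	detectors []stage
	filters   []stage
}

type Option func(*options)

// WithDetectors replaces the detector list (it does not append to it).
func WithDetectors(d ...stage) Option {
	return func(o *options) { o.detectors = d }
}

// WithFilters replaces the filter list (it does not append to it).
func WithFilters(f ...stage) Option {
	return func(o *options) { o.filters = f }
}

// resolve starts from the documented defaults and applies opts in order.
func resolve(opts ...Option) options {
	o := options{
		detectors: []stage{"DisasmDetector", "EhFrameDetector"},
		filters:   []stage{"CETFilter", "EhFrameFilter", "PLTFilter"},
	}
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	o := resolve(WithFilters()) // no arguments: all filters disabled
	fmt.Println(len(o.detectors), len(o.filters))
}
```

With the real package, the equivalent call would pass resurgo.WithFilters() to DetectFunctionsFromELF to obtain unfiltered detector output.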
type Prologue ¶
type Prologue struct {
	// Address is the virtual address of the detected prologue.
	Address uint64 `json:"address"`
	// Type is the matched prologue pattern.
	Type PrologueType `json:"type"`
	// Instructions is a human-readable representation of the matched
	// prologue instructions.
	Instructions string `json:"instructions"`
}
Prologue represents a detected function prologue.
func DetectPrologues ¶
func DetectPrologues(code []byte, baseAddr uint64, arch Arch) ([]Prologue, error)
DetectPrologues analyzes raw machine code bytes and returns detected function prologues. baseAddr is the virtual address corresponding to the start of code. arch selects the architecture-specific detection logic. This function performs no I/O and works with any binary format.
Example ¶
package main

import (
	"fmt"
	"log"

	"github.com/maxgio92/resurgo"
)

func main() {
	// x86-64 machine code: nop; push rbp; mov rbp, rsp
	code := []byte{0x90, 0x55, 0x48, 0x89, 0xe5}
	prologues, err := resurgo.DetectPrologues(code, 0x1000, resurgo.ArchAMD64)
	if err != nil {
		log.Fatal(err)
	}
	for _, p := range prologues {
		fmt.Printf("[%s] 0x%x: %s\n", p.Type, p.Address, p.Instructions)
	}
}
Output: [classic] 0x1001: push rbp; mov rbp, rsp
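For intuition about why the example reports 0x1001, the classic pattern can be located with a plain byte scan. This is a deliberately naive sketch and an assumption about the general idea only: findClassic is a hypothetical helper that does not decode instructions, so unlike a disassembly-based detector it could match bytes in the middle of another instruction.

```go
package main

import (
	"bytes"
	"fmt"
)

// classicPrologue is the encoding of "push rbp; mov rbp, rsp".
var classicPrologue = []byte{0x55, 0x48, 0x89, 0xE5}

// findClassic returns the virtual address of every occurrence of the
// classic prologue bytes in code. It is a naive byte scan: it does not
// decode instructions, so it can match in the middle of an instruction.
func findClassic(code []byte, baseAddr uint64) []uint64 {
	var addrs []uint64
	for off := 0; off+len(classicPrologue) <= len(code); off++ {
		if bytes.Equal(code[off:off+len(classicPrologue)], classicPrologue) {
			addrs = append(addrs, baseAddr+uint64(off))
		}
	}
	return addrs
}

func main() {
	// nop; push rbp; mov rbp, rsp (the prologue starts one byte in)
	code := []byte{0x90, 0x55, 0x48, 0x89, 0xE5}
	for _, a := range findClassic(code, 0x1000) {
		fmt.Printf("0x%x\n", a)
	}
}
```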
