Natnael Alemseged
© 2026 Natnael Alemseged. All Rights Reserved.
Secure Agent Protocol // Latency Critical // Addis Ababa

GitHub Evaluator / Automaton Auditor

Agentic Code Review Systems Engineer

LangGraph forensic codebase evaluator using a Digital Courtroom protocol: evidence-gathering detectives, three judge personas, deterministic conflict resolution, and hallucination policing.

"Made repository evaluation adversarial and evidence-bound instead of letting a single judge model grade from vibes."
GitHub Evaluator forensic codebase audit workflow
Detective evidence fan-out, adversarial judge synthesis, deterministic verdict rules, and hallucination audit

Problem

Automated code evaluators can hallucinate file evidence, collapse into one weak opinion, or miss security and production-readiness failures when grading complex repositories.

Solution

Designed a fan-out/fan-in LangGraph workflow where investigators collect AST, git, security, PDF, and vision evidence, judges argue independently, and Chief Justice rules apply deterministic vetoes and synthesis.
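The fan-out/fan-in shape described above can be sketched in plain Python. This is only an illustration using the standard library, not the project's LangGraph graph; the detective functions, their return shape, and the repository name are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical investigators: each returns a list of evidence records.
def ast_detective(repo):
    return [{"source": "ast", "finding": f"parsed {repo}"}]

def git_detective(repo):
    return [{"source": "git", "finding": f"read history of {repo}"}]

def security_detective(repo):
    return [{"source": "security", "finding": f"scanned {repo}"}]

DETECTIVES = [ast_detective, git_detective, security_detective]

def gather_evidence(repo):
    """Fan out to every detective in parallel, then fan in to one list."""
    with ThreadPoolExecutor(max_workers=len(DETECTIVES)) as pool:
        # pool.map preserves the order of DETECTIVES, so the merge is stable.
        batches = list(pool.map(lambda detective: detective(repo), DETECTIVES))
    # Fan-in: flatten the per-detective batches into one evidence list.
    return [item for batch in batches for item in batch]

evidence = gather_evidence("example-repo")
```

Keeping the merge order fixed means the same repository always yields the same evidence list, which helps when later stages need to be reproducible.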

Deep Dive

What It Audits

Automaton Auditor evaluates repositories through staged graph execution. It gathers objective evidence, merges it with integrity checks, asks prosecutor/defense/tech-lead personas to score independently, and writes a final audit report.
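A minimal sketch of how independent persona scores might be settled by fixed rules rather than another model call. The 1-5 scale, the score caps, and the dissent threshold are assumptions made for illustration and are not taken from the project.

```python
from statistics import median

def chief_justice(scores, security_flaw=False, evidence_verified=True):
    """Resolve three persona scores with fixed rules and no LLM call.

    The caps and thresholds below are illustrative assumptions.
    """
    final = median(scores.values())
    notes = []
    if security_flaw:
        # Security veto: a confirmed flaw caps the score.
        final = min(final, 2)
        notes.append("security veto")
    if not evidence_verified:
        # Unverified evidence cannot support a high score.
        final = min(final, 3)
        notes.append("unverified evidence cap")
    spread = max(scores.values()) - min(scores.values())
    if spread >= 2:
        # Large disagreement is recorded for the report's dissent summary.
        notes.append("dissent recorded")
    return {"score": final, "notes": notes}

ruling = chief_justice(
    {"prosecutor": 2, "defense": 5, "tech_lead": 4},
    security_flaw=True,
)
```

Because every branch is a plain comparison, the same inputs always produce the same ruling, which is the property a deterministic verdict step relies on.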

Engineering Highlights

  • Evidence integrity: cited file paths are checked against the repo manifest and hallucinated paths are penalized.
  • Deterministic justice node: security, evidence, and functionality rules settle disputes without another LLM call.
  • Report output: final Markdown includes executive summary, per-criterion scores, dissent summaries, remediation steps, and evidence audit.
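The evidence integrity check in the first highlight could look roughly like the sketch below. The function name, the manifest format (a list of repo-relative paths), and the one-point penalty per hallucinated path are assumptions for illustration.

```python
from pathlib import PurePosixPath

def audit_citations(cited_paths, manifest, penalty_per_path=1):
    """Split cited paths into verified and hallucinated against a manifest."""
    # Normalize so that "./src/a.py" and "src/a.py" compare equal.
    known = {PurePosixPath(p).as_posix() for p in manifest}
    verified, hallucinated = [], []
    for path in cited_paths:
        if PurePosixPath(path).as_posix() in known:
            verified.append(path)
        else:
            hallucinated.append(path)
    return {
        "verified": verified,
        "hallucinated": hallucinated,
        # Hypothetical scoring rule: a flat deduction per invented path.
        "penalty": penalty_per_path * len(hallucinated),
    }

result = audit_citations(
    cited_paths=["./src/graph.py", "src/judges.py"],
    manifest=["src/graph.py", "README.md"],
)
```

Here `src/judges.py` is absent from the manifest, so it lands in the hallucinated list and incurs the penalty.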

Tech Stack

LangGraph, Python, RAG, AST Analysis, LangSmith, Docker

Tags

#Code Review, #Agent Workflows, #LLM Evaluation, #LangGraph

More Software

Case studies in similar engineering domains.

Axiom Ledger

Event-sourced lending pipeline for document intake, extraction, credit analysis, fraud, compliance, and decision orchestration over an append-only ledger.

Brownfield Cartographer

Multi-agent codebase cartography tool that analyzes local or GitHub repositories with Surveyor and Hydrologist agents to produce module graphs and data lineage artifacts.

Data Contract Enforcer

Schema integrity and lineage attribution system that turns inter-system dependencies into formal contracts, detects schema/type/statistical drift, and reports downstream blast radius.