Fundamental Analyst
Lead Research
Analyze 10-K and 10-Q filings. Identify revenue drivers, margin risks, earnings quality, and guidance shifts.
- sec_filing
- earnings_cal
- av_overview
Case Study / 01 — Market Intelligence Crew
Built for a trading desk. CrewAI hierarchical orchestration, LightGBM with walk-forward validation, FinBERT and LunarCrush sentiment, Interactive Brokers integration. Out-of-sample across 90 tickers: +323.1% return, Sharpe 0.82, 53.9% win rate, 1,910 trades.
Agents: 8
ML features: 200+
Risk gates: 7
Data sources: 10+
A note on this case. This case is included as a technical systems example, not as a financial performance promise. The purpose is to show multi-agent orchestration, data validation, risk gates, and refusal logic.
The system in motion
The CRO orchestrator receives each specialist’s signal, weighs it against the LightGBM and HMM priors, and produces a single direction with conviction, or refuses to enter. Across 1,910 trades on 90 tickers in the out-of-sample backtest, the resulting strategy returned +323.1% while SPY buy & hold lagged.
The manual baseline
Before the system, a single research analyst reviewed every SEC filing, earnings call, peer comparison, technical chart, and sentiment shift by hand — then made a call. Coverage: one or two instruments per day, with no audit trail of why a decision was made or rejected.
The crew
Hierarchical orchestration via CrewAI Process.hierarchical
market_intel_crew.flow · live
● running
CRO · Risk Manager
└── synthesis
Specialists →
├── Fundamental (10-K · earnings)
├── Competitor (peers · P/E)
├── Sentiment (news · social)
├── Technical (price · volume)
├── Quant (Z · ATR · HMM)
├── Dev Infra (APIs · risk)
└── ML Data QA (leakage · quality)
ML Brain (parallel signal) ┄┄→
└── LightGBM · 200+ features
Lead Research
Analyze 10-K and 10-Q filings. Identify revenue drivers, margin risks, earnings quality, and guidance shifts.
Market Positioning
Compare the asset to peers on P/E, PEG, revenue growth, and margin profile. Score relative positioning.
Narrative Voyager
Read news and social signals. Identify the dominant narrative, sentiment shift, and retail-flow impact.
Price & Volume
Read price action, volume, order flow, HFT-interest zones, support/resistance, and VPVR context. Define trade setup.
Math Core
Compute and interpret quantitative features: Z-score, ATR, Hurst exponent, ADX, and market-regime hypotheses (HMM).
Infrastructure
Audit pipeline stability: API availability, error handling, code security, infrastructure readiness for backtest.
Model Optimization
Assess data quality for ML readiness: completeness, consistency, leakage risks, bias, and overfitting indicators.
Chief Risk Officer · CRO
Synthesize the team's findings. Define trade decision, entry/exit levels, stop-loss, and position size using ATR and risk limits. Must explicitly declare agreement with the ML signal.
no direct tools — delegation only
Two of the eight system prompts are shown verbatim. The CRO prompt below is bilingual: the left column is the actual prompt running in production, the right column is the English reference.
role: "Fundamental Analyst (Lead Research)"
goal: Analyze 10-K and 10-Q filings. Identify revenue drivers, margin risks, earnings quality, and guidance shifts.
backstory: Senior buy-side fundamental analyst with experience valuing public companies in the US and Europe. Extracts key factors from SEC filings and translates them into investment conclusions.

role: "Risk Manager (CRO)"
goal: Synthesize the team's findings. Define the trade decision, entry/exit levels, stop-loss, and position size using ATR and risk limits.
backstory: You are the Chief Risk Officer with final-call authority. Pragmatic, disciplined in risk-adjusted thinking, requires transparent reasoning.
The ML brain
Separate from the agent crew, a LightGBM 3-class classifier (long / flat / short) produces directional probabilities. It is trained on 200+ engineered features: technical indicators, FRED macro data, FinBERT-scored news sentiment, and LunarCrush social signals. Validation is walk-forward rather than one-shot, with purged splits to prevent leakage near fold boundaries.
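The purged walk-forward idea above can be sketched in a few lines. This is a minimal illustration, not the project's splitter: the function name `purged_walk_forward`, the rolling (rather than expanding) window, and every size used here are assumptions chosen for the example.

```python
from typing import Iterator, List, Tuple


def purged_walk_forward(
    n_samples: int,
    train_size: int,
    test_size: int,
    purge: int,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs that roll forward in time.

    The `purge` bars between the end of train and the start of test are
    dropped, so forward-looking labels cannot overlap the test window.
    """
    start = 0
    while start + train_size + purge + test_size <= n_samples:
        train_end = start + train_size
        test_start = train_end + purge
        train_idx = list(range(start, train_end))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx
        start += test_size  # roll the whole window forward by one test block


# Hypothetical sizes: 100 bars, 50-bar train, 10-bar test, 7-bar purge gap.
folds = list(purged_walk_forward(n_samples=100, train_size=50, test_size=10, purge=7))
for train_idx, test_idx in folds:
    print(train_idx[0], train_idx[-1], "->", test_idx[0], test_idx[-1])
```

Each fold trains only on bars that end at least `purge` bars before its test block begins, which is the property that keeps labels near the boundary from leaking.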
A 3-state Gaussian HMM runs in parallel, classifying market regimes as bear / range / bull with posterior entropy as a confidence score. Both signals feed into the CRO as a quantitative prior — not a vote in a poll.
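Both priors reduce to simple scalars. The sketch below shows one way to compute the directional score P(long) − P(short) and an entropy-based regime confidence. Mapping confidence to one minus the normalized Shannon entropy is an assumption made for illustration; the source only says that posterior entropy is used as a confidence score. The function names are hypothetical.

```python
import math
from typing import Sequence


def directional_score(p_long: float, p_flat: float, p_short: float) -> float:
    """Collapse 3-class probabilities into P(long) - P(short), in [-1, +1]."""
    total = p_long + p_flat + p_short
    if not math.isclose(total, 1.0, abs_tol=1e-6):
        raise ValueError("class probabilities must sum to 1")
    return p_long - p_short


def regime_confidence(posterior: Sequence[float]) -> float:
    """Return 1 - normalized Shannon entropy of the regime posterior.

    1.0 means all mass sits on one regime; 0.0 means the posterior is uniform.
    """
    entropy = -sum(p * math.log(p) for p in posterior if p > 0.0)
    return 1.0 - entropy / math.log(len(posterior))


score = directional_score(p_long=0.6, p_flat=0.3, p_short=0.1)
confident = regime_confidence([0.9, 0.08, 0.02])   # mass concentrated on one regime
uncertain = regime_confidence([0.34, 0.33, 0.33])  # close to uniform
```

A sharply peaked posterior scores near 1 and a near-uniform one scores near 0, so the CRO can discount the regime label when the HMM itself is unsure.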
P(long) − P(short) ∈ [−1, +1]
Total features: 202+
Categories: 8
Largest cluster: Quant · 44
Feature space · cluster map
200+ engineered features across eight categories · projected layout (illustrative)
v1 of this system had hmm_regime_detection_placeholder() returning '[STUB] not implemented yet'. The current implementation in src/features/regime.py is a real GaussianHMM with leak-safe walk-forward fitting. The stub was kept in the codebase as a visible reminder of the maturation path.
Synthesis
The Risk Manager agent receives every specialist's structured output plus the ML signal as a separate prior. It must produce a single JSON decision — conviction, entry, stop, take-profit, position size — all ATR-driven, not gut estimates.
Most importantly: the CRO must explicitly declare its agreement with the ML signal. The output schema has a required agreement_with_ml field: agree | partial | disagree. If the CRO disagrees with the quant prior, it must populate disagreement_reason with a concrete cause — a news event, a regime shift, a data gap.
This is institutional discipline encoded in a JSON schema. No silent overrides, no convenient consensus.
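The "no silent overrides" rule can be enforced with a small validator that runs on the CRO's raw output. The following is a sketch under assumptions: the field names come from the decision schema in this case study, but `validate_decision` and its error messages are hypothetical, and the production system may well use a schema library instead.

```python
import json
from typing import Any, Dict

REQUIRED = {
    "direction", "conviction", "entry", "stop", "take_profit",
    "position_size_pct", "rationale", "risks", "invalidation",
    "agreement_with_ml",
}
DIRECTIONS = {"long", "short", "flat"}
AGREEMENT = {"agree", "partial", "disagree"}


def validate_decision(raw: str) -> Dict[str, Any]:
    """Parse the CRO's JSON and reject silent overrides of the ML prior."""
    decision = json.loads(raw)
    if not isinstance(decision, dict):
        raise ValueError("decision must be a JSON object")
    missing = REQUIRED - decision.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if decision["direction"] not in DIRECTIONS:
        raise ValueError("direction must be long, short, or flat")
    if decision["agreement_with_ml"] not in AGREEMENT:
        raise ValueError("agreement_with_ml must be agree, partial, or disagree")
    for key in ("conviction", "position_size_pct"):
        if not 0.0 <= decision[key] <= 1.0:
            raise ValueError(f"{key} must lie in [0, 1]")
    # Disagreeing with the quant prior is allowed, but only with a stated cause.
    reason = str(decision.get("disagreement_reason", "")).strip()
    if decision["agreement_with_ml"] == "disagree" and not reason:
        raise ValueError("disagreement_reason is required when disagreeing")
    return decision
```

Rejecting the whole decision, rather than patching it, keeps the audit trail honest: a disagreement without a cause never reaches the risk gate.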
{
"direction": "long | short | flat",
"conviction": 0.0–1.0,
"entry": <number>,
"stop": <number>,
"take_profit": <number>,
"position_size_pct": 0.0–1.0,
"rationale": "<short reason>",
"risks": ["risk1", "risk2"],
"invalidation": "<what kills the thesis>",
"agreement_with_ml": "agree | partial | disagree",
"disagreement_reason": "<required if disagree>"
}
The risk gate
“Fail-closed: any ambiguous state returns allowed=False. Graceful degradation only for buying-power.”
01
Realized loss above 3% of session-start equity flips the session kill-switch. No further positions open until manual reset.
daily_loss_pct: 0.03
02
LLM proposes 25%? Gate silently shrinks to the 10% hard cap with an audit-trail entry. The LLM never sees the override.
max_position_pct: 0.10
03
Stop further than 5% from entry? Trade is forced to flat. No retry path, no escalation.
max_stop_distance_pct: 0.05
If the LLM proposes a 25% position size — maybe high conviction on a clean setup — the gate doesn't argue with it. It shrinks the size to 10% in place, logs size_shrunk_cap:0.2500->0.1000, and forwards the order. The LLM never sees the override. It can't retry. It can't escalate. The cap is non-negotiable.
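Three of the gates described above fit in one short function. This is a minimal sketch, not the production gate: the thresholds and the `size_shrunk_cap` log format come from this case study, while `Limits`, `GateResult`, `risk_gate`, and the rejection reason strings are hypothetical. The real kill-switch is session-scoped with a manual reset; here the realized loss is simply passed in.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Limits:
    daily_loss_pct: float = 0.03
    max_position_pct: float = 0.10
    max_stop_distance_pct: float = 0.05


@dataclass
class GateResult:
    allowed: bool
    direction: str
    position_size_pct: float
    audit: List[str] = field(default_factory=list)


def _is_number(value: object) -> bool:
    return isinstance(value, (int, float)) and math.isfinite(value)


def risk_gate(direction, entry, stop, size_pct, realized_loss_pct,
              limits: Limits = Limits()) -> GateResult:
    def reject(reason: str) -> GateResult:
        return GateResult(False, "flat", 0.0, [reason])

    # Fail closed: anything missing, non-finite, or nonsensical is rejected.
    if direction not in ("long", "short"):
        return reject("no_valid_direction")
    if not all(_is_number(v) for v in (entry, stop, size_pct, realized_loss_pct)):
        return reject("ambiguous_input")
    if entry <= 0 or size_pct <= 0:
        return reject("ambiguous_input")
    if realized_loss_pct > limits.daily_loss_pct:
        return reject("daily_loss_kill_switch")
    if abs(entry - stop) / entry > limits.max_stop_distance_pct:
        return reject("stop_distance_exceeded")

    audit: List[str] = []
    if size_pct > limits.max_position_pct:
        # Shrink in place and record it; nothing is reported back to the LLM.
        audit.append(f"size_shrunk_cap:{size_pct:.4f}->{limits.max_position_pct:.4f}")
        size_pct = limits.max_position_pct
    return GateResult(True, direction, size_pct, audit)


result = risk_gate("long", entry=100.0, stop=97.0, size_pct=0.25, realized_loss_pct=0.0)
print(result.position_size_pct, result.audit)
```

Only the size cap degrades gracefully in this sketch; every other failed check returns `allowed=False` with the trade forced to flat.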
Daily-loss kill-switch
Session-scoped. Trips when realized loss exceeds threshold. Manual reset required.
Concurrent position cap
Maximum simultaneous open positions across the strategy.
Buying-power / leverage
Hard leverage ceiling. Position size shrinks gracefully when nearing limit.
Cash buffer floor
Minimum cash to retain as buffer. Below this, no new positions open.
Position size hard cap
Upper bound on what the LLM can propose per trade. Silently enforced.
Stop distance sanity
Maximum permitted stop distance from entry. Trades forced to flat if exceeded.
Overnight position cap
Maximum positions carried into the next session.
Below the gate, every external API call has rate limiting, exponential backoff, fallback chains (FMP → yfinance, Polygon → cached profile), and redact_secrets() scrubbing all 12 known API keys and URL token patterns from logs. The CRO never sees a leaked credential. The user never sees a stuck pipeline.
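A fallback chain with exponential backoff, as described above, might look like the following. It is a sketch only: `call_with_backoff` and `first_available` are hypothetical names, the providers are plain callables standing in for real API clients, and rate limiting is left out for brevity.

```python
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry fn, doubling the wait after each failure; re-raise the last error."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("retries must be >= 1")


def first_available(providers: Sequence[Callable[[], T]], **backoff_kwargs) -> T:
    """Try each provider in order, e.g. a primary source and then its fallback."""
    last_error: Optional[Exception] = None
    for provider in providers:
        try:
            return call_with_backoff(provider, **backoff_kwargs)
        except Exception as exc:
            last_error = exc  # remember the failure and move down the chain
    raise RuntimeError("all providers failed") from last_error
```

Injecting `sleep` keeps the helper testable without real waiting; in production the default `time.sleep` applies.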
The result
Out-of-sample walk-forward backtest across 91 tickers loaded, 90 traded, 1,910 trades, average hold of 7 bars: Total return +323.1%. Sharpe 0.82. Sortino 1.03. Profit factor 1.33. Win rate 53.9%. Max drawdown −24.6%. Calmar 0.59.
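For readers less familiar with these metrics, the sketch below shows the conventional definitions of win rate, profit factor, max drawdown, and Calmar. The source does not state exactly how its figures were computed, so treat these formulas as common textbook forms, not as the project's code; the sample inputs are made up.

```python
from typing import Sequence


def win_rate(trade_pnls: Sequence[float]) -> float:
    """Share of trades that closed with a positive P&L."""
    return sum(1 for p in trade_pnls if p > 0) / len(trade_pnls)


def profit_factor(trade_pnls: Sequence[float]) -> float:
    """Gross profit divided by gross loss."""
    gains = sum(p for p in trade_pnls if p > 0)
    losses = -sum(p for p in trade_pnls if p < 0)
    return gains / losses if losses > 0 else float("inf")


def max_drawdown(equity: Sequence[float]) -> float:
    """Worst peak-to-trough decline of the equity curve, as a negative fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


def calmar(annual_return: float, max_dd: float) -> float:
    """Annualized return divided by the magnitude of the max drawdown."""
    return annual_return / abs(max_dd)


# Made-up inputs, only to exercise the functions.
pnls = [2.0, -1.0, 3.0, -2.0]
curve = [100.0, 120.0, 90.0, 130.0]
print(win_rate(pnls), profit_factor(pnls), max_drawdown(curve))
```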
The strategy outperformed the SPY buy & hold benchmark on the same window. The risk gate kept max drawdown contained at −24.6% while position-size caps and the CRO veto blocked the high-conviction trades that did not pass the ML prior — preventing the kind of concentration losses that show up after a regime shift.
The engineering — agent crew, ML model, risk gate, CRO orchestrator — is what produced both the upside and the contained downside.
Total return: +323.1%
Sharpe: 0.82
Sortino: 1.03
Max drawdown: −24.6%
Win rate: 53.9%
Profit factor: 1.33
Total trades: 1,910
Tickers traded: 90 / 91
Calmar: 0.59
Most agencies show you a win without the engineering. We show you both — the metrics and the system that produced them.
The engineering
Multi-LLM with cost discipline — Qwen 32B via Aliyun is the default; GPT-4-turbo and Gemini 2.0 Flash are drop-in alternatives via env var. No vendor lock-in.
LaunchDarkly feature flags — demo-rollout-enabled for percentage rollout; agent-comments-level toggles agent verbosity (brief / full) at runtime. Local fallback works without the SDK.
PDF report generation — fpdf2 with DejaVuSans for Cyrillic, ANSI escape stripping, long-token wrapping, last-resort line-by-line render when multi_cell fails.
Preflight model ping — a 16-token call to the configured LLM before the crew runs. Fail-fast on bad keys, bad quota, or bad base URL. Never burn 90 seconds in agent #1 to discover the API is dead.
Secret redaction across logs — redact_secrets() scrubs 12 known env-key values and 5 URL-token patterns from every error message and log line.
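The redaction step in the last item can be approximated as below. Only the function name `redact_secrets` comes from the source; the body is a sketch, and the environment key names and the single URL pattern are hypothetical stand-ins for the 12 keys and 5 patterns mentioned.

```python
import os
import re
from typing import Mapping, Optional, Sequence

# Hypothetical key names; the real list is not given in the source.
SECRET_ENV_KEYS = ("OPENAI_API_KEY", "FMP_API_KEY")

# One illustrative URL-token pattern: ?apikey=..., &api_key=..., &token=...
URL_TOKEN_PATTERNS = (
    re.compile(r"([?&](?:apikey|api_key|token)=)[^&\s]+", re.IGNORECASE),
)


def redact_secrets(
    text: str,
    env: Optional[Mapping[str, str]] = None,
    env_keys: Sequence[str] = SECRET_ENV_KEYS,
) -> str:
    """Replace known secret values and URL tokens with a placeholder."""
    env = os.environ if env is None else env
    for key in env_keys:
        value = env.get(key)
        if value:  # skip unset or empty values, which would match everywhere
            text = text.replace(value, "[REDACTED]")
    for pattern in URL_TOKEN_PATTERNS:
        text = pattern.sub(r"\g<1>[REDACTED]", text)
    return text


fake_env = {"FMP_API_KEY": "abc123secret"}
line = "GET https://x.test/quote?symbol=AAPL&apikey=abc123secret failed: key abc123secret rejected"
print(redact_secrets(line, env=fake_env))
```

Matching on literal values catches keys that leak outside URLs (for example inside an exception message), while the pattern pass catches tokens that are not in the known-key list.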
The pattern
Anywhere a business needs to synthesize a decision from many noisy signals — and where a wrong call has real consequences — this architecture transplants directly. The pattern is multi-specialist + ML prior + manager synthesizer + fail-closed gate. The domain plugs in. Specialist roles, ML features, and risk limits change. Everything else stays.
15-minute fit call. We map the agents, signals, and gates around your decision.