Implementing TRNG in Embedded Devices: Best Practices
True Random Number Generators (TRNGs) provide nondeterministic randomness derived from physical processes. In embedded systems, where security, performance, power, and cost are tightly constrained, a properly implemented TRNG is often a foundation for secure keys, nonces, salts, and secure protocol operations. This article explains TRNG basics, common entropy sources, architecture patterns, design and validation best practices, integration strategies, performance and power trade-offs, and practical recommendations for deployment in embedded devices.
What is a TRNG and why it matters in embedded systems
A TRNG produces unpredictable bits based on physical phenomena (thermal noise, oscillator jitter, avalanche breakdown, photon arrival times, etc.). Unlike deterministic pseudo-random number generators (PRNGs), TRNG output cannot be reproduced from a seed. Embedded systems use TRNGs for:
- Key generation (private keys, symmetric keys)
- Nonces and IVs for encryption and authenticated protocols
- Challenge-response values for authentication
- Randomized timing or addressing to mitigate side-channel attacks
- Seeding PRNGs to ensure long-term unpredictability
In constrained devices (IoT sensors, smartcards, microcontrollers) weak randomness is a common root cause of compromise — for example predictable keys or repeated nonces enabling cryptographic attacks.
Entropy sources commonly used in embedded TRNGs
Choose a physical source based on available peripherals, power and cost budgets:
- Thermal noise (amplified resistor or diode noise): simple analog front-end, common in low-cost designs.
- Oscillator jitter (ring oscillator vs. reference clock): digital-friendly, uses existing oscillators but requires careful conditioning to avoid deterministic bias.
- Avalanche noise (reverse-biased diode breakdown): high entropy but may require higher voltage and careful lifetime testing.
- SRAM startup patterns: uses intrinsic memory power-up state as entropy for systems with SRAM; useful for seed generation but requires health testing.
- RC timing noise (comparing LO/HI slew or capacitor charge time): often low-power and simple to implement.
- Photonic / photon counting (for devices with optical sensors): high-quality but needs optics and may be impractical for many embedded contexts.
- Clock drift between independent oscillators: can be harvested with low overhead in multi-clock systems.
Each source has trade-offs in entropy rate, required analog/digital circuitry, sensitivity to environment, aging, and manufacturing variation.
TRNG architecture patterns
- On-chip dedicated TRNG module: an IP block providing conditioned random bits and health monitors. Common in higher-end MCUs and SoCs.
- Hybrid TRNG + PRNG: TRNG supplies entropy seeds to a cryptographically secure PRNG (CSPRNG/DRBG) that provides bulk random data and rate smoothing.
- Boot-time entropy harvesting: collect entropy at boot (SRAM startup, oscillator jitter) to seed a CSPRNG; useful for power-cycled devices.
- Continuous background harvesting: continuously sample entropy source(s) and feed into an entropy pool with a CSPRNG to smooth bursts and provide on-demand randomness.
- External TRNG chip/module: discrete device connected via SPI/I²C for high-assurance applications or when local resources are insufficient.
Best practice is usually to have the TRNG seed a CSPRNG. TRNGs have a limited bit rate and can show short-term bias, whereas a vetted CSPRNG (e.g., HMAC-DRBG, CTR-DRBG) produces high-rate, uniformly distributed output and provides forward/backward security properties when reseeded appropriately.
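As a rough illustration of the TRNG-seeds-CSPRNG pattern, here is a minimal HMAC-DRBG sketch built on SHA-256. It follows the instantiate/update/generate/reseed structure described in NIST SP 800-90A, but it has not been checked against the official test vectors and omits self-tests and per-request limits. Python is used as a host-side modeling language; firmware would typically port the same logic to C. The class name, the 32-byte minimum entropy input, and the RESEED_INTERVAL value are assumptions made for this example.

```python
import hashlib
import hmac


class HmacDrbg:
    """Minimal HMAC-DRBG (SHA-256) sketch; not a validated implementation."""

    OUTLEN = 32                 # SHA-256 digest size in bytes
    RESEED_INTERVAL = 1 << 20   # assumed policy: generate calls allowed per seed

    def __init__(self, entropy, nonce=b"", personalization=b""):
        if len(entropy) < 32:
            raise ValueError("need at least 32 bytes of entropy input")
        self._key = b"\x00" * self.OUTLEN
        self._v = b"\x01" * self.OUTLEN
        self._update(entropy + nonce + personalization)
        self._reseed_counter = 1

    @staticmethod
    def _hmac(key, data):
        return hmac.new(key, data, hashlib.sha256).digest()

    def _update(self, provided=b""):
        # Mix provided data (possibly empty) into the internal state (K, V).
        self._key = self._hmac(self._key, self._v + b"\x00" + provided)
        self._v = self._hmac(self._key, self._v)
        if provided:
            self._key = self._hmac(self._key, self._v + b"\x01" + provided)
            self._v = self._hmac(self._key, self._v)

    def reseed(self, entropy, additional=b""):
        # Fold fresh TRNG output into the state and restart the counter.
        self._update(entropy + additional)
        self._reseed_counter = 1

    def generate(self, n_bytes):
        if self._reseed_counter > self.RESEED_INTERVAL:
            raise RuntimeError("reseed required")
        out = b""
        while len(out) < n_bytes:
            self._v = self._hmac(self._key, self._v)
            out += self._v
        self._update()              # state moves on after every request
        self._reseed_counter += 1
        return out[:n_bytes]
```

In use, the entropy argument would come from the conditioned TRNG output, and generate() would serve bulk requests until the reseed policy calls for fresh entropy.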
Conditioning and post-processing
Raw physical outputs often contain bias and correlations. Use cryptographic conditioning to obtain uniformly distributed bits and to reduce dependence on source assumptions:
- Whitening using cryptographic primitives: XOR folding alone is fragile; use a vetted extractor or a CSPRNG. Common choices: AES-based hash/DRBG, SHA-2/SHA-3 hashing, HMAC-DRBG.
- Von Neumann correction: simple and useful for bit bias removal but discards data and is insufficient alone for complex biases.
- Entropy estimators: maintain conservative estimates of min-entropy per sample to decide reseed intervals and detect failures.
- Use standards and reference designs where possible (NIST SP 800-90B/C guidance for entropy sources and DRBG usage).
Conditioning should be implemented in a way that a single point of failure in the source cannot trivially produce attacker-controlled output.
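The two post-processing styles above can be sketched side by side. The snippet below shows Von Neumann debiasing on a list of bits and a hash-based conditioner that compresses raw sample bytes with SHA-256. The function names are made up for this example, and Python is used only as a modeling language. Note that hashing does not create entropy: the raw input must already contain at least as much min-entropy as the number of output bits claimed.

```python
import hashlib


def von_neumann(bits):
    """Von Neumann debiasing: pair 01 -> 0, pair 10 -> 1, pairs 00/11 dropped."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            out.append(first)
    return out


def condition(raw, out_len=32):
    """Compress raw sample bytes into out_len bytes with SHA-256."""
    if not 1 <= out_len <= 32:
        raise ValueError("out_len must be between 1 and 32 bytes")
    return hashlib.sha256(raw).digest()[:out_len]
```

Von Neumann debiasing only removes bias from independent bits and discards at least half of the input on average, which is why it is usually combined with a cryptographic conditioner rather than used alone.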
Health monitoring and failure detection
Continuously or periodically run health tests to detect source failures (stuck bits, reduced variance, strong bias). Implement both online (continuous) and startup (initial) checks:
- Repetition count test: detect long runs of identical samples.
- Adaptive proportion test: detect unexpected bias changes.
- Entropy estimation monitoring: compare measured entropy against expected thresholds.
- Frequency/monobit tests: simple counts of ones vs zeros over windows.
- More sophisticated tests: autocorrelation checks, power spectral density checks.
Health tests should trigger alarms, fallback strategies, or device lockdown when failures occur. Logging or telemetry (securely) helps root-cause analysis.
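A minimal sketch of the repetition count and adaptive proportion tests, loosely modeled on the continuous health tests in NIST SP 800-90B, might look like the following. The class and function names are illustrative. The repetition count cutoff uses the formula C = 1 + ceil(20 / H) for a false-positive rate of 2^-20, where H is the assessed min-entropy per sample. The adaptive proportion cutoff is passed in as a parameter and should be derived from the binomial distribution for the assessed H and window size.

```python
import math


def rct_cutoff(h_min, alpha_exp=20):
    """Repetition count cutoff for a false-positive rate of 2**-alpha_exp."""
    return 1 + math.ceil(alpha_exp / h_min)


class RepetitionCountTest:
    """Fails when the same sample value repeats `cutoff` times in a row."""

    def __init__(self, cutoff):
        self.cutoff = cutoff
        self.last = None
        self.run = 0

    def feed(self, sample):
        """Return True while healthy, False when the cutoff is reached."""
        if sample == self.last:
            self.run += 1
        else:
            self.last = sample
            self.run = 1
        return self.run < self.cutoff


class AdaptiveProportionTest:
    """Fails when the first value of a window recurs `cutoff` times within it."""

    def __init__(self, cutoff, window=1024):
        self.cutoff = cutoff
        self.window = window
        self.ref = None
        self.count = 0
        self.seen = 0

    def feed(self, sample):
        if self.seen == 0:          # start a new window with this sample
            self.ref = sample
            self.count = 1
            self.seen = 1
            return True
        self.seen += 1
        if sample == self.ref:
            self.count += 1
        ok = self.count < self.cutoff
        if self.seen == self.window:
            self.seen = 0           # next sample opens a fresh window
        return ok
```

A False return would feed the alarm, fallback, or lockdown logic. Both tests are cheap enough to run on every raw sample before conditioning.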
Security considerations and attack surfaces
- Physical tampering: attackers with physical access may influence analog sources (inject current, temperature shifting, RF interference). Mitigate with shielding, tamper detection, and health tests.
- Environmental manipulation: sources sensitive to temperature or supply voltage can be biased. Use sensors to detect abnormal conditions and consider mixing sources with different failure modes.
- Fault injection: glitching power/clock to force predictable behavior. Harden with brown-out detection and clock-monotonicity checks.
- Side-channel leakage: ensure that random sampling or conditioning paths don’t leak entropy through electromagnetic or timing channels. Avoid easily observable patterns during conditioning.
- Supply-chain attacks: ensure TRNG IP or external modules come from trusted suppliers and are authenticity-checked.
- Software attacks: protect CSPRNG state, seed material, and health-test code; use secure storage and access control if persistent state is stored.
Design for defense-in-depth: multiple entropy sources, continuous health tests, conditioning, and secure key storage reduce single-point failures.
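One common way to combine sources with different failure modes is to hash them together, so the result stays unpredictable as long as at least one input does. The sketch below is an assumption about how that could be done, not a prescribed construction: the function name, the "trng-mix-v1" label, and the source names in the test are hypothetical. Each input is length-prefixed so that bytes cannot shift from one source to another without changing the result.

```python
import hashlib
import struct


def mix_sources(samples):
    """Hash several named entropy inputs into one 32-byte value.

    `samples` maps a source name (str) to its raw bytes. Names are
    processed in sorted order so the result does not depend on dict order.
    """
    h = hashlib.sha256()
    h.update(b"trng-mix-v1")        # hypothetical domain-separation label
    for name in sorted(samples):
        label = name.encode("utf-8")
        data = samples[name]
        h.update(struct.pack(">H", len(label)) + label)
        h.update(struct.pack(">I", len(data)) + data)
    return h.digest()
```

Entropy should be credited only for sources that pass their health tests. A stuck source adds nothing to the mix but does not weaken it either.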
Integration with cryptographic stacks and OS
- Seed CSPRNG early: collect entropy at earliest feasible stage (secure boot) to seed the OS crypto subsystem.
- Reseeding policy: periodically reseed the CSPRNG from TRNG based on time, data volume generated, or entropy thresholds.
- API design: provide a secure, minimal API for requesting random bytes, with blocking/non-blocking options. Blocking calls should wait until sufficient entropy is available.
- Access control: restrict direct hardware TRNG access; prefer a kernel/firmware service that mediates access, health monitoring, and rate limiting.
- Secure persistence: if saving entropy state across reboots (stateful DRBG), protect it with device-specific keys and integrity checks. Consider stateless designs when persistent secure storage is unavailable.
- Compliance: follow platform conventions (e.g., Linux getrandom() and /dev/urandom behavior), and align with NIST guidelines if certification is required.
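The seeding, reseeding, and blocking rules above can be captured in a small bookkeeping object that the random service consults. The sketch below is a hypothetical policy tracker: the class name and the default thresholds (256 bits to seed, 64 KiB or one hour between reseeds) are assumptions chosen for illustration. Time is passed in explicitly so the logic is easy to test and does not depend on a particular clock.

```python
class ReseedPolicy:
    """Tracks credited entropy and decides when to seed, block, or reseed."""

    def __init__(self, min_seed_bits=256, max_bytes=1 << 16, max_age_s=3600.0):
        self.min_seed_bits = min_seed_bits
        self.max_bytes = max_bytes
        self.max_age_s = max_age_s
        self.pending_bits = 0.0     # conservative min-entropy collected so far
        self.bytes_since_seed = 0
        self.seeded_at = None       # None until the first successful seed

    def credit(self, n_samples, h_min_per_sample):
        """Credit n_samples raw samples at the assessed min-entropy per sample."""
        self.pending_bits += n_samples * h_min_per_sample

    def can_seed(self):
        return self.pending_bits >= self.min_seed_bits

    def mark_seeded(self, now_s):
        if not self.can_seed():
            raise RuntimeError("not enough credited entropy to (re)seed")
        self.pending_bits = 0.0
        self.bytes_since_seed = 0
        self.seeded_at = now_s

    def may_generate(self):
        """Blocking callers wait until this is True; non-blocking callers fail."""
        return self.seeded_at is not None

    def note_output(self, n_bytes):
        self.bytes_since_seed += n_bytes

    def reseed_due(self, now_s):
        if self.seeded_at is None:
            return True
        return (self.bytes_since_seed >= self.max_bytes
                or now_s - self.seeded_at >= self.max_age_s)
```

A firmware service would call credit() as healthy samples arrive, refuse or block requests while may_generate() is False, and reseed when reseed_due() returns True and can_seed() allows it.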
Performance, power, and cost trade-offs
- Bit-rate vs. power: high-entropy sources (avalanche, photonic) may consume more power. Use a hybrid approach: TRNG seeds CSPRNG, which serves high-rate low-power requests.
- Circuit complexity vs. cost: pure analog TRNGs add BOM and validation costs. Leverage existing peripherals (oscillators, ADC noise) where acceptable.
- Latency: boot-time entropy harvesting can delay cryptographic services. Mitigate by seeding a CSPRNG with modest entropy and marking some services as delayed until more entropy is collected.
- Throughput scaling: for devices that require high random throughput (secure communications stacks, high-frequency key generation), consider hardware accelerators or external TRNG modules.
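To get a feel for the latency trade-off, a back-of-the-envelope helper can estimate how long it takes to gather a seed. All numbers in the usage note are hypothetical, and the function name and the safety_factor parameter are assumptions for this sketch. The idea is to collect more raw samples than the bare minimum implied by the assessed min-entropy per sample.

```python
import math


def seed_collection_time(target_bits, h_min_per_sample, sample_rate_hz,
                         safety_factor=2.0):
    """Return (raw samples needed, seconds needed) to gather a seed.

    target_bits:       min-entropy wanted in the seed (e.g. 256)
    h_min_per_sample:  assessed min-entropy per raw sample, in bits
    sample_rate_hz:    raw samples produced per second
    safety_factor:     oversampling margin applied on top of the estimate
    """
    samples = math.ceil(target_bits * safety_factor / h_min_per_sample)
    return samples, samples / sample_rate_hz
```

For instance, a 256-bit seed from a source assessed at 0.25 bits per sample, sampled at 10 kHz with a 2x margin, needs 2048 samples, or roughly 0.2 seconds of boot time.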
Testing, validation, and certification
- Statistical test suites: run NIST STS, Dieharder, TestU01 as part of development testing. These detect many classes of bias but do not replace entropy source analysis.
- Entropy source characterization: measure environmental sensitivity, aging, manufacturing spread, and min-entropy per sample under worst-case conditions.
- Fault-injection testing: evaluate behavior under temp/vcc extremes, clock glitches, and EM injection.
- Code review and formal methods: review conditioning and health-test implementations; consider formal verification for critical modules.
- Certification: for high-assurance applications, follow the relevant certification schemes and guidance (Common Criteria, FIPS 140-3, NIST guidelines). Certification often requires reproducible test vectors, documentation, and independent evaluation.
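As one concrete piece of entropy source characterization, the sketch below computes a most-common-value min-entropy estimate in the style of NIST SP 800-90B: take the observed frequency of the most common sample, raise it to the upper bound of a 99% confidence interval, and convert it to bits. The function name is illustrative. This single estimator is only a starting point, since a full assessment combines several estimators and takes the lowest result.

```python
import math
from collections import Counter


def mcv_min_entropy(samples):
    """Most-common-value min-entropy estimate, in bits per sample."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    p_hat = Counter(samples).most_common(1)[0][1] / n
    # Upper bound of a 99% confidence interval on the most likely value.
    p_upper = min(1.0, p_hat + 2.576 * math.sqrt(p_hat * (1.0 - p_hat) / (n - 1)))
    return -math.log2(p_upper)
```

Running this on raw (unconditioned) samples captured at temperature and voltage corners gives a first view of how the estimate moves under worst-case conditions.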
Practical implementation checklist
- Select at least one primary and one supplementary independent entropy source.
- Design analog/digital front-end with proper filtering, amplification, and anti-tamper measures.
- Implement cryptographic conditioning (CSPRNG) using vetted algorithms (HMAC-DRBG, CTR-DRBG, or hash-based extractors).
- Implement continuous and startup health tests; define thresholds and fault responses.
- Seed the CSPRNG early at boot; define a secure reseeding policy and persistence practices.
- Protect TRNG and CSPRNG state in memory (secure RAM regions, memory erasure on fault).
- Perform extensive characterization (statistical tests, environmental variation) and document min-entropy estimates.
- Plan for firmware updates and a secure way to change health-test thresholds or algorithms if field data shows issues.
- Consider the threat model (physical, remote, and environmental attacks) and design mitigations accordingly.
Example: simple MCU TRNG pattern
- Use ring oscillator jitter sampler: sample a fast ring oscillator with a slower reference clock into a flip-flop.
- Collect batches, XOR-fold samples into a buffer.
- Feed buffer into HMAC-DRBG (seed) and use DRBG output for requests.
- Run basic health checks (run length, frequency) and reseed periodically or on event (boot, after generating N bytes).
- Protect access via a firmware API that enforces blocking until initial seed entropy is sufficient.
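The XOR-fold step in this pattern can be modeled as follows, along with the bias it leaves behind. The function names are made up for this sketch, and the bias formula (the piling-up lemma) assumes the raw bits are independent, which jitter samples taken too close together may not be. That is one reason the folded buffer is treated as DRBG seed material rather than used directly.

```python
def xor_fold(bits, k):
    """XOR each group of k consecutive bits into one output bit.

    A trailing group shorter than k is dropped.
    """
    out = []
    for i in range(0, len(bits) - k + 1, k):
        acc = 0
        for b in bits[i:i + k]:
            acc ^= b
        out.append(acc)
    return out


def folded_p_one(p_one, k):
    """P(output bit = 1) after XOR-folding k independent bits with P(1) = p_one."""
    return 0.5 - 0.5 * (1.0 - 2.0 * p_one) ** k
```

With raw bits that are 1 with probability 0.6, folding pairs moves the output to 0.48, and folding groups of eight brings it within about 0.000002 of 0.5, at the cost of an eightfold drop in bit rate.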
Closing recommendations
- Prefer multiple independent entropy sources and a conservative conditioning approach.
- Treat TRNG as one part of an overall security architecture: protecting keys, enforcing secure boot, and hardening software are equally important.
- Invest in characterization, health tests, and secure integration — most field failures stem from overlooked environmental or lifecycle effects.
- Use proven primitives and standards (NIST guidance, established DRBGs) rather than inventing custom conditioning.
Implementing a reliable TRNG in embedded devices is a balance of physics, secure engineering, and practical constraints. With careful source selection, robust conditioning, continuous health checks, and conservative integration, TRNGs can significantly strengthen device security without excessive power or cost overhead.