PROJECTS — 2023 / 2025

Built under real constraints

AI inference, distributed systems, 6G networking, robotics. Every project begins with a bottleneck and ends with a benchmark.

01 / 07 AI Inference Optimization

CachePilot

KV cache eviction policy engine for LLM inference. Benchmarked against H2O, SCISSORHANDS, and SnapKV under real GPU memory pressure — TTFT, throughput, and perplexity degradation measured end-to-end.

TTFT reduction: −38.2%
vs H2O at 16K ctx: 940ms → 582ms
Perplexity delta: < 0.8 ppl
Python · CUDA · PyTorch · KV Cache · LLM Inference
Category: AI Inference
Stack: Python · CUDA · PyTorch
Year: 2024
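
The CachePilot description names H2O as a baseline. As a rough illustration of what a heavy-hitter style eviction policy does, the sketch below keeps the most recent tokens plus the older tokens with the highest accumulated attention scores. It is not CachePilot's code: `select_kept_positions`, `budget`, `recent_window`, and the score values are hypothetical, and a real engine would operate on GPU tensors per layer and head.

```python
def select_kept_positions(acc_scores, budget, recent_window):
    """Return sorted cache positions to keep under a fixed token budget.

    acc_scores: accumulated attention mass per cached token (assumed input).
    budget: maximum number of tokens to keep.
    recent_window: number of most recent tokens that are always kept.
    """
    n = len(acc_scores)
    if n <= budget:
        return list(range(n))
    recent_window = min(recent_window, budget)
    recent = list(range(n - recent_window, n))
    older = range(n - recent_window)
    heavy_budget = budget - recent_window
    # Highest accumulated score first; ties broken by lower position.
    heavy = sorted(older, key=lambda i: (-acc_scores[i], i))[:heavy_budget]
    return sorted(heavy) + recent


scores = [0.9, 0.1, 0.05, 0.7, 0.2, 0.3, 0.15, 0.4]  # illustrative values
kept = select_kept_positions(scores, budget=5, recent_window=2)
print(kept)  # [0, 3, 5, 6, 7]
```

Everything not in `kept` would be evicted; the trade-off being benchmarked is how much perplexity degrades as `budget` shrinks.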
02 / 07 6G / AI-Native Networking

Q-AIRAN-SLICE

Quantum-inspired RAN slicing for 5G/6G. QUBO-formulated slice allocation across heterogeneous UE profiles. Benchmarked against greedy and LP baselines under traffic variance — outperforms both at scale.

Throughput gain: +18.7% vs LP
E2E latency: 9.4ms avg
Slice utilization: 91.3%
Python · QUBO · 6G · O-RAN · Quantum Opt.
Category: 6G Networking
Stack: Python · QUBO · NumPy
Year: 2024
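
To make "QUBO-formulated slice allocation" concrete, here is a toy sketch, assuming one binary variable per (UE, slice) pair, a penalty term P·(Σ_s x[u,s] − 1)² that forces each UE onto exactly one slice, and a pairwise congestion cost when two UEs share a slice. The cost numbers, `build_qubo`, and the brute-force solve are illustrative assumptions, not the project's actual formulation or solver.

```python
from itertools import product


def build_qubo(cost, congestion, penalty):
    """Build an upper-triangular QUBO as a dict {(i, j): weight} with i <= j.

    cost[u][s]: cost of placing UE u on slice s (illustrative numbers).
    congestion: extra cost when two UEs share a slice.
    penalty: weight P of the one-slice-per-UE constraint.
    """
    n_ue, n_slices = len(cost), len(cost[0])
    Q = {}

    def var(u, s):
        return u * n_slices + s

    def add(i, j, w):
        key = (min(i, j), max(i, j))
        Q[key] = Q.get(key, 0.0) + w

    for u in range(n_ue):
        for s in range(n_slices):
            # Expanding P * (sum_s x - 1)^2 with x^2 = x gives -P on the
            # diagonal and +2P on each same-UE pair (constant dropped).
            add(var(u, s), var(u, s), cost[u][s] - penalty)
            for t in range(s + 1, n_slices):
                add(var(u, s), var(u, t), 2 * penalty)
            for v in range(u + 1, n_ue):
                add(var(u, s), var(v, s), congestion)
    return Q


def energy(Q, x):
    return sum(w * x[i] * x[j] for (i, j), w in Q.items())


cost = [[1.0, 3.0], [2.0, 1.0], [1.5, 1.0]]  # 3 UEs, 2 slices
Q = build_qubo(cost, congestion=0.8, penalty=10.0)
# Exhaustive search stands in for an annealer or QUBO solver on this tiny case.
best = min(product((0, 1), repeat=6), key=lambda x: energy(Q, x))
assignment = [best[2 * u:2 * u + 2].index(1) for u in range(3)]
print(assignment)  # [0, 1, 1]
```

The penalty has to dominate the cost terms; otherwise the minimum-energy bitstring can leave a UE unassigned.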
03 / 07 Distributed Systems

NimbusMesh-X

Distributed KV cache sharing layer for multi-node LLM serving. Eliminates redundant prefill recomputation across inference workers in disaggregated GPU clusters — reduces TTFT under memory contention.

Cache hit rate: 82% warm
TTFT p95: −44% vs baseline
Cluster nodes: 8-node tested
Python · KV Cache · LLM Infra · Distributed
Category: Distributed Systems
Stack: Python · Redis · gRPC
Year: 2025
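
One common way to avoid redundant prefill across workers is to key cached KV blocks by a chained hash of the token prefix, so any worker can look up the longest prefix that has already been computed. The sketch below assumes that scheme; the page does not say how NimbusMesh-X keys its cache. A plain dict stands in for the shared store, and `BLOCK`, `block_keys`, and `cached_prefix_len` are hypothetical names.

```python
import hashlib

BLOCK = 4  # tokens per cache block (illustrative)


def block_keys(tokens, block=BLOCK):
    """Chained hashes: key k identifies the entire prefix up to block k."""
    keys = []
    h = hashlib.sha256()
    full = len(tokens) - len(tokens) % block  # ignore a trailing partial block
    for start in range(0, full, block):
        chunk = ",".join(map(str, tokens[start:start + block]))
        h.update(chunk.encode() + b";")
        keys.append(h.hexdigest())
    return keys


def cached_prefix_len(tokens, store, block=BLOCK):
    """Number of leading tokens whose KV blocks are already in the store."""
    hit = 0
    for key in block_keys(tokens, block):
        if key not in store:
            break
        hit += block
    return hit


store = {}  # stand-in for a shared store reachable by every worker
first = [1, 2, 3, 4, 5, 6, 7, 8, 9]
for key in block_keys(first):
    store[key] = "kv-block-placeholder"

second = [1, 2, 3, 4, 5, 6, 7, 8, 42, 43, 44, 45]
print(cached_prefix_len(second, store))  # 8
```

A worker receiving `second` would only need to prefill the last four tokens; the first eight are served from the shared blocks.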
04 / 07 Data Engineering / Space

Satellite Anomaly

High-throughput telemetry ingestion and anomaly detection for satellite systems. Kafka-based streaming, real-time signal classification, reproducible benchmarks against historical mission data at scale.

Ingest rate: 3.2M events/s
Anomaly F1: 0.943
Kafka lag: < 2ms
Python · Kafka · Streaming · Time-Series
Category: Data Engineering
Stack: Python · Kafka · Pandas
Year: 2024
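
The page does not say which detector the Satellite Anomaly pipeline uses. As a simple stand-in for streaming anomaly detection on a telemetry channel, the sketch below flags samples whose rolling z-score exceeds a threshold. The function name, window size, threshold, and sample values are all illustrative assumptions.

```python
from collections import deque
from statistics import mean, pstdev


def rolling_zscore_flags(values, window=5, threshold=3.0):
    """Yield (index, value, is_anomaly) using the previous `window` samples."""
    history = deque(maxlen=window)
    for i, v in enumerate(values):
        flagged = False
        if len(history) == window:
            mu, sigma = mean(history), pstdev(history)
            flagged = sigma > 0 and abs(v - mu) / sigma > threshold
        # Flagged samples are kept out of the baseline so one spike does
        # not inflate the variance for the samples that follow.
        if not flagged:
            history.append(v)
        yield i, v, flagged


telemetry = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 25.0, 10.0, 9.9]
anomalies = [i for i, _, bad in rolling_zscore_flags(telemetry) if bad]
print(anomalies)  # [6]
```

In a Kafka setting the `values` iterable would be the consumer stream for one channel, with one detector state per channel.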
05 / 07 Robotics / Edge AI

ArduPilot Diagnosis

Automated flight log analysis and fault detection for autonomous drones. Real-time sensor fusion, ROS2 telemetry pipeline, sub-50ms anomaly classification running on embedded hardware.

Fault detection: 97% confidence
Inference latency: < 48ms
Logs analyzed: 760s missions
ROS2 · Python · OpenCV · C++ · Edge AI
Category: Robotics
Stack: ROS2 · C++ · OpenCV
Year: 2024
06 / 07 Visual AI / SLAM

Visual SLAM

Real-time visual SLAM for autonomous systems. Feature tracking, pose estimation, and loop closure on live camera feeds at 30fps on embedded hardware, tested in indoor and outdoor environments.

Frame rate: 30fps live
Pose RMSE: 0.021m
Tracked features: 180+ pts/frame
C++ · OpenCV · SLAM · ROS2
Category: Computer Vision
Stack: C++ · OpenCV · ROS2
Year: 2025
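
For readers unfamiliar with the pose RMSE metric, a minimal sketch of a translational RMSE between an estimated and a ground-truth trajectory follows. It assumes the two trajectories are already time-synchronized and aligned to the same frame; the page does not state which variant (absolute or relative error) or alignment the reported 0.021m uses, and the sample poses are made up.

```python
import math


def translation_rmse(estimated, ground_truth):
    """Root-mean-square Euclidean distance between matched positions."""
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and equal length")
    squared = [
        sum((e - g) ** 2 for e, g in zip(p, q))
        for p, q in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared) / len(squared))


ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
estimated = [(0.0, 0.0, 0.0), (1.0, 0.03, 0.0), (2.0, 0.0, 0.04)]
rmse = translation_rmse(estimated, ground_truth)
print(f"{rmse:.4f} m")
```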
// what drives the work
01
AI Inference Optimization
KV cache constraints, attention approximation, speculative decoding. The gap between research throughput and production latency is the problem.
02
6G and AI-Native Networks
Intelligent RAN slicing, URLLC, O-RAN disaggregation. Networks that adapt to inference workloads, not the other way around.
03
Distributed Systems
Disaggregated architectures, fault-tolerant pipelines, latency under contention. Systems that degrade gracefully, not catastrophically.
04
Compute Optimization
Memory bandwidth vs compute tradeoffs, CUDA kernel design, operator fusion. Every cycle is a constraint worth reasoning about.
05
Robotics and Edge AI
Real-time perception, sensor fusion, ROS2 pipelines. AI that works when the network doesn't exist and latency is physical.
06
Quantum Computing
QUBO, QAOA, quantum-inspired optimization. Already in Q-AIRAN-SLICE — next step is real circuit implementation.
GitHub: aryanputta