Founder @pyxis3-ai · model-agnostic LLM serving infrastructure · London
ex-Seldon (vLLM, LLM inference) · ex-AWS Industry Specialist (semiconductors, AI/ML) · ex-Dell EMC · ex-IBM · author of the canonical AWS guide on decoupling RDS from Elastic Beanstalk; AWS's official YouTube channel covers the same procedure and credits it with "Watch Omar's video to learn more" (uploaded 2020-05-20).
I work on production AI infrastructure: LLM inference serving, multi-tenant MLOps, Kubernetes-native operations.
PYXIS3 — model-agnostic LLM serving infrastructure. The control plane that lets enterprises run language models without locking into a single cloud, model vendor, or inference runtime. Org: @pyxis3-ai.
All under @pyxis3-ai:
- pyxis-arch: architecture thesis for model-agnostic LLM serving infrastructure. Public design notes, operating-model argument, decision rationale.
- vllm-bench: throughput + latency benchmark for OpenAI-compatible LLM endpoints (vLLM, TGI, llama.cpp, Ollama). Measures TTFT, TPOT, request and token throughput at percentiles. Async; two-dependency footprint. MIT.
- llm-serving-cookbook: production recipes for K8s-native vLLM-first serving. vLLM-on-EKS, KEDA autoscaling, token economics, TTFT optimisation, runtime selection. Apache-2.0.
- awesome-model-agnostic-llm: curated list of model-agnostic LLM tooling: serving runtimes, routers, evaluators, observability, standards, open weights. CC0.
- noor: semantic search over the Quran + Hadith corpus. Arabic-aware multilingual embeddings on sqlite-vec. FastAPI + Vue. Runs as a single Docker image, no external services.
- lens: in-cluster Kubernetes observability with in-browser kubectl exec. Vue 3 + Bun. Single binary, ServiceAccount-token auth. Built for ML-serving and GPU clusters.
- Decouple Amazon RDS instances from Elastic Beanstalk environments — AWS Knowledge Center. Canonical AWS guidance; still ranks #1 on Google for the topic. Authorship attributed on AWS's official YouTube channel since May 2020.
- Seldon Technologies · 2025–2026 · Senior Solutions Engineer on the production MLOps platform. vLLM-based LLM inference, multi-tenant model serving on Kubernetes.
- AWS London · 2022–2025 · Solutions Architect, Industry Specialist for the semiconductor vertical: AI/ML workloads on Inferentia, Trainium, SageMaker, and Bedrock.
- AWS Cape Town · 2017–2022 · Cloud DevOps Engineer. Authored the canonical AWS Knowledge Center article on decoupling RDS from Elastic Beanstalk and its companion AWS YouTube video.
- Dell EMC · 2016–2017 · Storage engineering (Isilon).
- IBM · 2016 · Cloud infrastructure.
- Earlier · OrecX, HONEST · Egypt, 2007–2012.
vLLM · Triton · Kubernetes · KEDA · Helm · Prometheus · Caddy · AWS · GCP · Azure · Python · Go · TypeScript




