oabdrabo/README.md

Hi, I'm Omar 👋

Founder @pyxis3-ai · model-agnostic LLM serving infrastructure · London

ex-Seldon (vLLM, LLM inference) · ex-AWS Industry Specialist (semiconductors, AI/ML) · ex-Dell EMC · ex-IBM · author of the canonical AWS guide on decoupling RDS from Elastic Beanstalk. AWS's official YouTube channel points to the companion video for that procedure with "Watch Omar's video to learn more" (uploaded 2020-05-20).

I work on production AI infrastructure: LLM inference serving, multi-tenant MLOps, Kubernetes-native operations.


🚀 What I'm building

PYXIS3 — model-agnostic LLM serving infrastructure. The control plane that lets enterprises run language models without locking into a single cloud, model vendor, or inference runtime. Org: @pyxis3-ai.

🛠️ Public projects

All under @pyxis3-ai:

  • pyxis-arch — architecture thesis for model-agnostic LLM serving infrastructure. Public design notes, operating-model argument, decision rationale.
  • vllm-bench — throughput + latency benchmark for OpenAI-compatible LLM endpoints (vLLM, TGI, llama.cpp, Ollama). Measures TTFT, TPOT, request and token throughput at percentiles. Async; two-dependency footprint. MIT.
  • llm-serving-cookbook — production recipes for K8s-native vLLM-first serving. vLLM-on-EKS, KEDA autoscaling, token economics, TTFT optimisation, runtime selection. Apache-2.0.
  • awesome-model-agnostic-llm — curated list of model-agnostic LLM tooling: serving runtimes, routers, evaluators, observability, standards, open weights. CC0.
  • noor — semantic search over the Quran + Hadith corpus. Arabic-aware multilingual embeddings on sqlite-vec. FastAPI + Vue. Runs as a single Docker image, no external services.
  • lens — in-cluster Kubernetes observability with in-browser kubectl exec. Vue 3 + Bun. Single binary, ServiceAccount-token auth. Built for ML-serving and GPU clusters.
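
As a rough illustration of the metrics vllm-bench reports (TTFT, TPOT, token throughput, percentiles), here is a minimal Python sketch. It is not taken from vllm-bench: `RequestTrace` and the helper names are hypothetical, and it assumes per-token arrival timestamps were recorded from a streaming response and that percentiles use the nearest-rank method.

```python
import math
from dataclasses import dataclass


@dataclass
class RequestTrace:
    """Hypothetical record of one streamed request (all times in seconds)."""
    sent_at: float            # when the request was sent
    token_times: list[float]  # arrival time of each output token


def ttft(trace: RequestTrace) -> float:
    """Time to first token: delay between sending and the first token arriving."""
    return trace.token_times[0] - trace.sent_at


def tpot(trace: RequestTrace) -> float:
    """Time per output token: mean gap between tokens after the first one."""
    n = len(trace.token_times)
    if n < 2:
        return 0.0  # a single token has no inter-token gap
    return (trace.token_times[-1] - trace.token_times[0]) / (n - 1)


def percentile(values: list[float], p: float) -> float:
    """Nearest-rank percentile (an assumption; other definitions interpolate)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def token_throughput(traces: list[RequestTrace]) -> float:
    """Output tokens per second over the whole run (first send to last token)."""
    start = min(t.sent_at for t in traces)
    end = max(t.token_times[-1] for t in traces)
    total_tokens = sum(len(t.token_times) for t in traces)
    return total_tokens / (end - start)


if __name__ == "__main__":
    traces = [
        RequestTrace(sent_at=0.0, token_times=[0.5, 0.6, 0.7]),
        RequestTrace(sent_at=0.0, token_times=[1.0, 1.5, 2.0]),
    ]
    ttfts = [ttft(t) for t in traces]
    print("p50 TTFT:", percentile(ttfts, 50))
    print("p99 TTFT:", percentile(ttfts, 99))
    print("tokens/s:", token_throughput(traces))
```

Request throughput follows the same pattern: completed requests divided by the same wall-clock window.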

📚 Published

🏛️ Background

  • Seldon Technologies · 2025–2026 · Senior Solutions Engineer on the production MLOps platform. vLLM-based LLM inference, multi-tenant model serving on Kubernetes.
  • AWS London · 2022–2025 · Solutions Architect, Industry Specialist for the semiconductor industry vertical — AI/ML workloads on Inferentia, Trainium, SageMaker, Bedrock.
  • AWS Cape Town · 2017–2022 · Cloud DevOps Engineer. Authored the canonical AWS Knowledge Center article on decoupling RDS from Elastic Beanstalk + companion AWS YouTube video.
  • Dell EMC · 2016–2017 · Storage engineering (Isilon).
  • IBM · 2016 · Cloud infrastructure.
  • Earlier · OrecX, HONEST · Egypt, 2007–2012.

🧰 Stack

vLLM · Triton · Kubernetes · KEDA · Helm · Prometheus · Caddy · AWS · GCP · Azure · Python · Go · TypeScript


📫 Reach me

LinkedIn · Email

Pinned

  1. pyxis3-ai/lens (Public, Vue)

    AI/LLM-serving observability for Kubernetes. Inspect vLLM/TGI/llama.cpp inference pods, browse resources, kubectl exec in-browser. Single Bun binary, in-cluster SA-token auth.

  2. pyxis3-ai/noor (Public, HTML)

    Multilingual AI semantic search (Arabic + English). Sentence-Transformers embeddings + sqlite-vec vector store + Arabic NLP normalisation. Sub-second on CPU. Quran + Hadith demo corpus.

  3. pyxis3-ai/vllm-bench (Public, Python)

    Throughput + latency benchmark for OpenAI-compatible LLM endpoints (vLLM, TGI, llama.cpp, Ollama). TTFT, TPOT, throughput, percentiles. Model-agnostic.

  4. pyxis3-ai/awesome-model-agnostic-llm (Public)

    Curated list of model-agnostic LLM serving runtimes, routers, evaluators, and standards. Run LLMs without locking into one vendor.

  5. pyxis3-ai/llm-serving-cookbook (Public)

    Production recipes for running open-source LLMs on Kubernetes. vLLM-first, model-agnostic. vLLM-on-EKS, KEDA autoscaling, token economics, TTFT optimisation, runtime selection.

  6. pyxis3-ai/pyxis-arch (Public)

    Pyxis architecture: public design notes, model-agnostic LLM serving infrastructure, and the operating-model argument behind the platform.