About xieydd
📫 If you wish to contact me, you can send an email to [email protected], or add my WeChat: `echo -n 'eGlleWRkX2hhaGEK' | base64 -d`.
Follow me on Xiaohongshu (Chinese) and WeChat Official Account: 东仔 AI Infra (Dongzai AI Infra)
- Xiaohongshu: 东仔 AI Infra

💻 As of 2026, I have over 8 years of experience in AI infrastructure, MLSys, Agent infrastructure, databases, and systems engineering. I have long worked on cloud-native AI training and inference platforms, managed vector database platforms, and systems performance optimization. I am familiar with Kubernetes, GCP, AWS, PostgreSQL, storage acceleration, resource scheduling, and Agent infrastructure, and can independently design and deliver large-scale distributed systems and cloud services. I also have extensive project management and cross-team collaboration experience, and keep contributing to AI Infra and cloud-native technology through technical exploration and open source work.
2023.5-present Tensorchord
- Leading the development of the serverless model inference platform ModelZ on GCP, optimizing inference performance through cold-start reduction, image preheating, P2P distribution, lazy loading, and model caching:
  - Reduced model service cold start time through model caching services and image preheating
  - Deployed open source models and hosted user LoRA models with vLLM, improving LLM throughput and reducing TTFT and tail latency through KVCache and scheduling optimizations
- Cloud Team Leader, building and supporting VectorChord Cloud, the managed cloud service for the vector database VectorChord:
  - Built a PostgreSQL-based vector database service on AWS with control plane and data plane separation, BYOC (Bring Your Own Cloud), and BYOD (Bring Your Own Data) capabilities
  - Implemented a cloud-native architecture providing PostgreSQL storage and compute separation, high availability, backup, PITR (Point-In-Time Recovery), and in-place upgrades
- Agent Harness and new business exploration:
  - Built an enterprise-grade knowledge-base RAG system on PostgreSQL as infrastructure for Agent memory; integrated 15 categories of workplace systems, including Linear, Gmail, and GitHub, with hybrid vector and BM25 retrieval
  - Developed eBPF-based Agent security projects that use a CLI and Skills to provide Prompt Injection protection, Agent Behavior Detection, and Agent Native security capabilities; also building a Skill Security SaaS platform to help enterprises improve Skill security and reduce supply-chain risk
2021.2-2023.5 Tencent Cloud
- Developed a large-scale AI platform for public cloud:
  - Built a high-performance, scalable elastic offline training platform using Tencent Cloud EKS (Elastic Kubernetes Service)
  - Integrated Tencent Cloud object storage and the GooseFS accelerator to create a high-performance cloud cache scheduling system and improve training data access efficiency
- Established FinOps infrastructure to help public cloud customers manage and optimize cloud costs more effectively, enhancing cloud resource utilization:
  - Optimized scheduling and rescheduling, identified high and low priority tasks, and implemented intelligent elastic scaling
  - Combined Tencent’s Ruyi kernel scheduler optimizations with observability to reduce costs while maintaining service quality
  - Launched a large-scale cost reduction initiative in the internal cloud, improving resource utilization through efficient resource allocation
2018-2021.2 (including internship) Unisound
- At the AI algorithm company Unisound, I was responsible for the development and operation of the Atlas supercomputing platform, supporting AI Labs teams running NLP, TTS, ASR, and CV model training. Key responsibilities included:
  - Developing a large-scale intelligent scheduling system to optimize multi-tenant resource allocation and training resource utilization, improving MFU from 15% to 30%
  - Enhancing the performance of the high-performance distributed file system Lustre
  - Building a multi-layer cache cloud-native architecture to accelerate AI model training
- Worked on 8-bit training and inference optimization at Unisound, including vision model inference optimization on Kunlun chips and NVIDIA edge devices.
Skill set: Kubernetes, EKS, Kubeflow, GCP, AWS, Serverless, Lustre, JuiceFS, GooseFS, scheduling, eBPF, PostgreSQL, vector databases, RAG, AI training platforms, inference services, cold start optimization, GPU / Edge, Agent infrastructure, etc.
Open Source Projects
🌱 Currently focusing on MLOps, MLSys, PostgreSQL, Agent infrastructure, eBPF, and FinOps, contributing to several open source projects:
- fluid: Fluid, elastic data abstraction and acceleration for BigData/AI applications in the cloud. (Project under CNCF)
- crane: Crane is a FinOps Platform for Cloud Resource Analytics and Economics in Kubernetes clusters. The goal is to help users manage cloud costs more easily while ensuring application quality.
- crane-scheduler: Crane scheduler is a Kubernetes scheduler that can schedule pods based on actual node load.
- creator: Creator is the brain of the crane project, containing the core algorithm module and evaluation module.
- openmodelz: One-click machine learning deployment (LLM, text-to-image, etc.) at scale on any cluster (GCP, AWS, Lambda labs, your home lab, or even a single machine).
- clusternet: [CNCF Sandbox Project] Managing your Kubernetes clusters (including public, private, edge, etc.) as easily as browsing the Internet.
- vectorchord: Scalable, fast, and disk-friendly vector search in Postgres, the successor of pgvecto.rs.
Recommended Blogs
| Type | Author/Company | Blog URL |
|---|---|---|
| Infra | Chris Riccomini | https://materializedview.io/ |
| Infra | Jack Vanlightly | https://jack-vanlightly.com/ |
| Math and Science | Su Jianlin (苏剑林) | https://kexue.fm/ |
| AI Infra | Colfax | https://research.colfax-intl.com/blog/ |
| Postgres | Gabriele Bartolini | https://www.gabrielebartolini.it/articles/ |
| AI | Sebastian Raschka | https://magazine.sebastianraschka.com/archive?sort=new |
| AI Algorithm | Tom Yeh | https://www.byhand.ai/ |
| AI Infra | Chip Huyen | https://huyenchip.com/blog/ |