Building at the frontier of
AI, data & health
Computer Engineer specialising in LLM deployment and data systems. Based in Ottawa, Canada. 10+ years turning complex data into real-world impact.
I'm Neel Shah — a Computer Engineer with deep roots in data engineering, NLP, and healthcare informatics, now focused on the rapidly evolving world of Large Language Models. My work spans building scalable data infrastructure, applying ML to health and social data, and integrating LLMs into production systems.
As Tech Lead at CIHI (Canadian Institute for Health Information), I lead large-scale PySpark pipelines that process over 1 billion Canadian health data points, covering all national registry, diagnosis, and pharma data, for government and NPO clients. I manage client relationships, lead the engineering team end-to-end through the SDLC, and operate across Azure, AWS, and Databricks. Working with PII at national population scale in a PIPEDA-regulated environment is a daily reality, so data governance and privacy compliance are non-negotiable. Before CIHI, at EXL Service, I built PySpark credit risk platforms for Goldman Sachs that powered risk decisioning for Apple Card, Walmart Card, and GM Card. Earlier, as a researcher at Lakehead University, I published peer-reviewed work in NLP and distributed data systems that has accumulated 89+ citations.
Today, my focus is on the AI layer: integrating Claude, GPT, and open-weight models into data workflows, deploying local LLMs for privacy-sensitive environments, and building the AI-ready datasets that make these systems actually work. I believe the next decade of impact will be won by engineers who can bridge raw data and LLM capabilities.
I also created emot, an early open-source contribution that grew to 1M+ downloads. It's a reminder that the best tools solve one thing really well.
Originally from Vadodara, India, where I graduated first in my engineering class, I moved to Canada for graduate studies and have since contributed to both the tech community and volunteer AI initiatives.
- 📍 Ottawa, Ontario, Canada
- 🏢 CIHI (current)
- 🎓 Lakehead University
- 💻 10+ years experience
- 📄 3 research papers · 89+ citations
- 📦 1M+ open source downloads
- 🌍 5 languages
- English: Native
- Hindi: Native
- Gujarati: Native
- French: Elementary
- Sanskrit: Limited
AI & LLM Skills
LLM Integration
Local LLM Deployment
AI-Ready Data Generation
ML & NLP
Other Technical Skills
Experience
Leads large-scale PySpark pipelines processing 1B+ Canadian health data points (registry, diagnosis, pharma) for government and NPO clients. Manages client relationships, leads the engineering team end-to-end through the full SDLC, and handles PII at national scale under PIPEDA and provincial privacy legislation.
Built PySpark-based credit risk management platforms for Goldman Sachs — covering Apple Card, Walmart Card, and GM Card portfolios. Processed high-volume financial transaction data with strict PII, audit, and regulatory compliance requirements.
Published 3 peer-reviewed papers on NLP and distributed social media analytics. Developed ML models and Elasticsearch-based pipelines for large-scale health data analysis.
Contributed to deep learning research, data analysis, and web infrastructure for an Indian AI research initiative.