Neel Shah
Academic Work

Research & Publications

Peer-reviewed work at the intersection of NLP, public health, and distributed data systems.

Papers published: 3
Total citations: 89+
Max citations (single paper): 64
Publication range: 2018–2020
Wireless Networks · Springer · 2018 · Most Cited · 64 citations

A framework for social media data analytics using Elasticsearch and Kibana

Neel Shah, Darryl L. Willick, Vijay K. Mago

Presents a scalable framework for real-time social media data processing using Elasticsearch and Kibana. The distributed architecture handles large-scale social media datasets, enabling real-time analytics, visualisation, and pattern discovery from streaming data sources.
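
As a rough illustration of the ingestion side of such a pipeline, the sketch below serialises a small batch of posts into the newline-delimited JSON body that Elasticsearch's `_bulk` endpoint accepts. The index name `social-posts`, the field names, and the two sample posts are assumptions made up for this example; the summary above does not say how the paper's framework actually loads data.

```python
import json
from typing import Iterable


def build_bulk_payload(posts: Iterable[dict], index: str = "social-posts") -> str:
    """Serialise posts as newline-delimited JSON for Elasticsearch's _bulk endpoint.

    The index name and document fields are hypothetical, chosen for illustration.
    """
    lines = []
    for post in posts:
        # Action line: index this document under its own id.
        lines.append(json.dumps({"index": {"_index": index, "_id": post["id"]}}))
        # Source line: the document body that gets stored and searched.
        lines.append(json.dumps({
            "text": post["text"],
            "platform": post["platform"],
            "created_at": post["created_at"],
        }))
    # The bulk API expects the request body to end with a newline.
    return "\n".join(lines) + "\n"


# Made-up sample posts, not data from the paper.
sample_posts = [
    {"id": "1", "text": "Morning coffee", "platform": "twitter",
     "created_at": "2018-01-01T08:00:00Z"},
    {"id": "2", "text": "Pizza night", "platform": "twitter",
     "created_at": "2018-01-01T19:30:00Z"},
]

payload = build_bulk_payload(sample_posts)
print(payload, end="")
```

The resulting string could be sent as the body of a `POST /_bulk` request with the `application/x-ndjson` content type, after which the indexed documents would be available for Kibana dashboards.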

Key Findings
  • Real-time data pipeline for large-scale social media streams
  • Distributed architecture using Elasticsearch + Kibana
  • Scalable to multiple social media platforms
  • 64 citations — most-cited work
Elasticsearch · Kibana · Big Data · Real-time Analytics · Distributed Systems
DOI: 10.1007/s11276-018-01896-2
Frontiers in Public Health · 2020 · 25 citations

Assessing Canadians Health Activity and Nutritional Habits Through Social Media

Neel Shah, Gautam Srivastava, David W. Savage, Vijay Mago

NLP algorithms analyse Canadian social media posts to evaluate population health trends. A Random Forest classifier achieved 93.4% accuracy in distinguishing food-related content. The study mapped the ratio of caloric intake to expenditure across Canadian provinces and found that 77.92% of Canadians showed a caloric imbalance.
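
To make the intake-to-expenditure comparison concrete, here is a minimal sketch of how such a ratio might be computed per region and flagged as imbalanced. The 10% tolerance, the class and function names, and the sample figures are all hypothetical; the summary above does not state how the study defined imbalance, so this should not be read as the paper's method.

```python
from dataclasses import dataclass


@dataclass
class RegionCalories:
    """Estimated daily calories for one region (illustrative structure)."""
    region: str
    intake_kcal: float
    expenditure_kcal: float

    @property
    def ratio(self) -> float:
        # Ratio above 1 means more calories consumed than burned.
        return self.intake_kcal / self.expenditure_kcal


def is_imbalanced(entry: RegionCalories, tolerance: float = 0.10) -> bool:
    """Flag a region whose ratio deviates from 1.0 by more than the tolerance.

    The tolerance is an assumed threshold, not a value from the study.
    """
    return abs(entry.ratio - 1.0) > tolerance


# Placeholder figures for two unnamed regions, not results from the paper.
samples = [
    RegionCalories("Region A", intake_kcal=2400, expenditure_kcal=2000),
    RegionCalories("Region B", intake_kcal=2050, expenditure_kcal=2000),
]

imbalanced_share = sum(is_imbalanced(s) for s in samples) / len(samples)
for s in samples:
    print(f"{s.region}: ratio={s.ratio:.3f}, imbalanced={is_imbalanced(s)}")
print(f"Share imbalanced: {imbalanced_share:.0%}")
```

A different threshold, or a per-person rather than per-region calculation, would change which entries count as imbalanced, so the cut-off is the main design choice in a calculation of this kind.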

Key Findings
  • 93.4% accuracy — Random Forest classifier on food tweets
  • 77.92% of Canadians showed caloric imbalance
  • Created "Food in One" dataset — 338,889 food items with nutrition data
  • Top tweets: Coffee (38,785), Burgers (35,166), Pizza (34,369)
NLP · Public Health · Machine Learning · Canada · Random Forest
DOI: 10.3389/fpubh.2019.00400
Conference Paper / Technical Report · 2019

The analysis of Canada's health through social media using machine learning

Neel Shah

An early study applying machine learning techniques to Canadian social media data for population health analysis. Laid the groundwork for the 2020 Frontiers in Public Health publication, refining methodology and expanding dataset coverage.

Key Findings
  • Foundational ML study on Canadian social media health data
  • Precursor to the 2020 Frontiers paper
  • Applied early NLP and classification techniques at scale
Machine Learning · Canada · Social Media · Health Analytics

Research Interests

🤖
LLMs & Health AI
Applying large language models to clinical and population health data for decision support.
🏥
Public Health Analytics
Using social signals and large-scale data to understand population health at national scale.
🧠
NLP & Text Mining
Text classification, sentiment analysis, and emoji/emoticon semantics in large real-world datasets.
⚙️
Distributed Data Systems
Real-time pipelines and Elasticsearch-based architectures for high-throughput analytics.
📦
AI-Ready Data Curation
Building and sharing clean, structured datasets optimised for LLM fine-tuning and RAG.
📡
Social Media Mining
Extracting health and behavioural signals from social platforms at scale.