Sobhan Hota

Founder and Solution Principal in Boston, Massachusetts, United States

Results-oriented and research-driven professional with more than two decades of experience building software solutions and products while adhering to the principles of professional services.

Extensive experience in data integration, analytics, and ML/DL, with a strong focus on agentic and generative AI-driven NLP applications using LLMs and Retrieval-Augmented Generation (RAG) to address complex NLP and multimodal challenges. Demonstrated success in solving advanced text mining and information extraction problems, including high-accuracy sentiment analysis, through sophisticated feature engineering and selection techniques. This work has been presented at academic conferences and colloquia. Building on this background, founded CX Data & Analytics to offer products and services in AI/NLP, data integration, and analytics.

Interest areas:

  • Gen-AI and LLMs: OpenAI [gpt-5.x, 4.x, 3.x, Mini, Turbo, Vision], Gemini 3, 2.5 Pro/Flash, Vision, Hugging Face, DeepSeek, Qwen, TTS, OCR, Llama, Ollama, LM Studio
  • Deep Learning (CNN/RNN, BERT, ColBERT and ColBERTv2)
  • Machine Learning (Information Extraction) and Deep Learning for solving NLP problems with reasonably high accuracy. Tasks include sentiment analysis (Voice of Customer data, reviews), NER, translation between human languages, information retrieval (IR), web crawling, parsing XML/JSON, and GIS data.
  • Enterprise Data Integration (DI) via ETL/ELT and Analytics/Reporting (OLAP/OLTP) using products such as Pentaho, Superset, Talend, Informatica, and Linux scripting, to build data lakes and warehouses and glean analytics via dashboards on Pentaho, Tableau, Streamlit, and Python libraries
  • Expertise in observability platforms and tooling: Grafana Labs, OpenTelemetry, Splunk
  • Cloud-AWS, ELK, LAMP

Apps in Streamlit (LLMs/NLP/ML/DL/Statistics/Maths): https://hotstore.cxloop.co/

Presented/Authored: Insightful results with reasonably high accuracy, shared in industry, at academic conferences, and in journals and symposiums. Below are just a few.

Performing Gender: Automatic Stylistic Analysis of Shakespeare's Characters - https://dh-abstracts.library.virginia.edu/works/593

Stylistic text classification using functional lexical features: https://dl.acm.org/doi/10.5555/1234722.1234730

https://www.digitalhumanities.org/dh2007/dh2007.fullprogram.pdf

Scaling Financial Data Operations with Cloud-Ready ETL: https://pentaho.com/insights/blogs/scaling-financial-data-operations-with-cloud-ready-etl/

Slides:

Sentiment Analysis on VoC (Text mining): https://www.slideshare.net/slideshow/sentiment-analysis-25568788/25568788

Gender in Shakespeare using feature selection via Text Mining: https://www.slideshare.net/slideshow/dhcs/16736998

Book Review: Communication in a Virtual Organization: https://www.semanticscholar.org/paper/Communication-in-a-Virtual-Organization-Hota/8ccc7f18bf5213721faf878eefabf21164175ba3

  • Memberships:
    • ACM (Senior)
    • SIG-AI
    • SIG-HCI
  • Fueling Innovation to Illinois Tech
  • Kiva (NGO)
  • Work
    • CX Data & Analytics LLC
  • Education
    • Illinois Institute of Technology
    • University of Hyderabad
    • Utkal University