Sobhan Hota
Founder Solution Principal in Boston, Massachusetts, United States
Results-oriented and research-driven professional with more than a couple of decades of experience in building software solutions and products, while adhering to the principles of professional services.
Extensive experience in data integration, analytics, and ML/DL, with a strong focus on agentic and generative AI-driven NLP applications using LLMs and Retrieval-Augmented Generation (RAG) to address complex NLP and multimodal challenges. Demonstrated success in solving advanced text mining and information extraction problems, including high-accuracy sentiment analysis, through sophisticated feature engineering and selection techniques. This work has been presented at academic conferences and colloquia. Building on this background, founded CX Data & Analytics to deliver products and services in AI/NLP, data integration, and analytics.
Interest areas:
- Gen-AI and LLMs: OpenAI [gpt-5.x, 4.x, 3.x, Mini, Turbo, Vision], Gemini 3, 2.5 Pro/Flash, Vision, Hugging Face, DeepSeek, Qwen, TTS, OCR, Llama, Ollama, LM Studio
- Deep Learning (CNN/RNN, BERT, ColBERT and ColBERTv2)
- Machine Learning (Information Extraction) and Deep Learning for solving NLP problems with reasonably high accuracy. Tasks include [sentiment analysis (Voice of Customer data, reviews), NER, translation of human languages], information retrieval (IR), web crawling, XML/JSON parsing, GIS data.
- Enterprise Data Integration (DI) via ETL/ELT and Analytics/Reporting (OLAP/OLTP) using products such as Pentaho, Superset, Talend, Informatica, and Linux scripting to build data lakes and warehouses, with analytics delivered through dashboards in Pentaho, Tableau, Streamlit, and Python libraries
- Expertise in using observability platforms from Grafana Labs, OpenTelemetry, Splunk
- Cloud: AWS, ELK, LAMP
Apps in Streamlit (LLMs/NLP/ML/DL/Statistics/Maths): https://hotstore.cxloop.co/
Presented/Authored: Insightful results with reasonably high accuracy, shared in industry venues, academic conferences, journals, and symposiums. A few are listed below.
Performing Gender: Automatic Stylistic Analysis of Shakespeare's Characters - https://dh-abstracts.library.virginia.edu/works/593
Stylistic text classification using functional lexical features: https://dl.acm.org/doi/10.5555/1234722.1234730
https://www.digitalhumanities.org/dh2007/dh2007.fullprogram.pdf
Scaling Financial Data Operations with Cloud-Ready ETL: https://pentaho.com/insights/blogs/scaling-financial-data-operations-with-cloud-ready-etl/
Slides:
Sentiment Analysis on VoC (Text mining): https://www.slideshare.net/slideshow/sentiment-analysis-25568788/25568788
Gender in Shakespeare using feature selection via Text Mining: https://www.slideshare.net/slideshow/dhcs/16736998
Book Review - Communication in a Virtual Organization: https://www.semanticscholar.org/paper/Communication-in-a-Virtual-Organization-Hota/8ccc7f18bf5213721faf878eefabf21164175ba3
Memberships:
- ACM (Senior)
- SIG-AI
- SIG-HCI
- Fueling Innovation to Illinois Tech
- Kiva (NGO)