Kaleem Ullah Qasim
PhD Candidate in Artificial Intelligence
Technical Skills
AI & LLM.
LangChain, LlamaIndex, CrewAI, AutoGPT, AgentGPT, Semantic Kernel, RAG (Retrieval-Augmented Generation), Prompt Engineering, Fine-tuning (LoRA, QLoRA), GPT-4, Claude, Llama, BERT
ML & DL.
PyTorch, TensorFlow, HuggingFace Transformers, OpenAI API, FastAI, Neural Networks, Deep Learning, Computer Vision, Natural Language Processing, Transfer Learning, Few-shot Learning
Languages.
Python, TypeScript, JavaScript, SQL, REST APIs, GraphQL, Bash
Cloud & Infrastructure.
AWS (SageMaker, Lambda, S3), Google Cloud Platform, Azure, Docker, Kubernetes, CI/CD (GitHub Actions), Model Deployment
MLOps & Tools.
Git, MLflow, Weights & Biases, DVC, Kubeflow, Redis, PostgreSQL, MongoDB, Elasticsearch
Frameworks.
Streamlit, FastAPI, Django, Flask, Gradio, Dash, Plotly, Matplotlib, React
Vector DBs.
Pinecone, Weaviate, ChromaDB, Milvus, Faiss, Annoy
Professional Experience
AI Engineer & LLM Specialist, Upwork
- Top Rated freelancer on Upwork (top 10%), with a 100% job success rate and all 5-star ratings from 20+ clients.
- Developed production RAG (Retrieval-Augmented Generation) chatbots using LangChain, LlamaIndex, and CrewAI across 20+ projects, reducing client task completion time by 20% (from 50 min to 40 min) with 95% answer accuracy on domain-specific queries.
- Optimized local LLMs (Llama, Mistral) for privacy-focused enterprise deployments, ensuring GDPR compliance while improving task accuracy by 25% through fine-tuning and prompt engineering.
- Built modular AI agents with LangChain and Streamlit, integrating vector databases (Pinecone, Weaviate) to boost semantic search accuracy by 35% and reduce API response latency by 40%.
- Architected multimodal AI systems combining GPT-4 Vision and Claude for document analysis and data extraction, processing 10K+ documents monthly with 92% accuracy and achieving 100% client satisfaction across all engagements.
Research Contractor - Traffic Management via LLM Reasoning, University of Jeddah (Dr. Tariq Alsahfi)
- Co-authored two research papers on LLM-based traffic flow and accident severity prediction, published in Alexandria Engineering Journal (Q1) and on arXiv, contributing to novel frameworks for spatio-temporal reasoning in large language models.
- Developed the TraffiCoT-R framework using Chain-of-Thought prompting and recursive decomposition, improving traffic prediction accuracy by 19% over traditional ML baselines and 23% over deep learning models on real-world datasets.
- Reduced research paper acceptance timelines by 30% through systematic literature reviews and automated citation analysis using Python and NLP techniques.
- Collaborated on projects integrating LLMs with GIS data and graph neural networks to enhance traffic analysis, achieving a 15% improvement in congestion prediction accuracy for smart city applications.
Research Contractor - AI Security & Cyber Deception, Zhejiang University (Dr. Haitao Xu)
- Co-authored a research paper on AI-driven cybersecurity, published in Journal of Artificial Intelligence Research (JAIR, Q1), focusing on deceptive affiliate marketing detection using machine learning and network analysis.
- Developed the AdsFlow algorithm for automated ad detection and classification using BERT embeddings and XGBoost, improving ad analysis accuracy by 18% over rule-based systems with 89% precision on 50K+ samples.
- Built Chrome extensions and Python tools for real-time deceptive ad detection using computer vision and NLP, preventing URL spoofing attempts with a 94% detection rate and reducing false positives by 25%.
- Designed an ad-intent classification system using transformer models (RoBERTa) to identify malicious patterns, enhancing fraud detection accuracy by 22% and processing 100K+ URLs daily.
- Applied NLP techniques (Named Entity Recognition, Text Classification) to streamline e-crime investigations, reducing manual analysis time by 40% (from 10 hours to 6 hours per case) for law enforcement agencies.
Data Scientist, Chengdu Ayurveda Biotechnology Co., Ltd
- Led the company to the #1 position on Alibaba's medical equipment marketplace in China, achieving 180% YoY sales growth through ML-driven demand forecasting and pricing optimization models.
- Applied data-driven SEO optimization using predictive models for keyword ranking and click-through rate analysis, resulting in a 95% increase in product search appearances and a 65% improvement in organic traffic.
- Increased company revenue by 35% through time-series forecasting (ARIMA, Prophet) and trend analysis of medical equipment sales, predicting demand spikes with 87% accuracy.
- Developed ensemble ML models (Random Forest, Gradient Boosting) for inventory optimization, reducing stockout incidents by 45% and storage costs by 28%, saving $120K annually.
- Implemented a real-time sales analytics dashboard using Python, SQL, and Plotly, integrating data from 5+ sources and improving decision-making response time by 60% for the executive team.
- Created a customer segmentation model using K-means clustering and RFM analysis, identifying 4 distinct customer groups and increasing targeted marketing campaign efficiency by 40% with 25% higher conversion rates.
Education
Ph.D. in Artificial Intelligence
Southwest Jiaotong University - China
Specialization: Reasoning in LLMs, Prompt Engineering
Master's in Computer Application Technology
Southwestern University of Finance and Economics - China
Specialization: NLP, Machine Learning, NLU, NLI
Selected Publications
Kaleem Ullah Qasim; Jiashu Zhang; Tariq Alsahfi; Ateeq Ur Rehman Butt
Journal of Artificial Intelligence Research (JAIR) • 8 citations
From Data to Decisions: Enhancing Financial Forecasts with LSTM for AI Token Prices
Rizwan Ali; Jian Xu; Muhammad Waqas Aslam; Kaleem Ullah Qasim; et al.
Journal of Economic Studies • 7 citations
TraffiCoT-R: A Framework for Advanced Spatio-Temporal Reasoning in Large Language Models
Tariq Alsahfi; Kaleem Ullah Qasim
Alexandria Engineering Journal • 1 citation
Understanding the Business of Online Affiliate Marketing: An Empirical Study
Haitao Xu; Yiwen Sun; Kaleem Ullah Qasim; et al.
IEEE INFOCOM 2025 • 1 citation
Accelerating Training Speed of Tiny Recursive Models via Curriculum Guided Adaptive Recursion
Kaleem Ullah Qasim; Jiashu Zhang
arXiv preprint
Ateeq Ur Rehman; Muhammad Asif; Kaleem Ullah Qasim; et al.
In Press
MARBLE: A Multi-Agent Rule-Based LLM Reasoning Engine for Accident Severity Prediction
Kaleem Ullah Qasim; Jiashu Zhang
arXiv preprint
Muhammad Waqas Aslam; Zhe Zhang; Kaleem Ullah Qasim
Available at SSRN
CORTEX-V: A Cognitive Reasoning Toolkit for Vision-Based, Cost-Efficient Layout Optimization
Muhammad Waqas Aslam; Zhe Zhang; Kaleem Ullah Qasim
Under Review
Selected Projects
AutoGen, CrewAI, LangChain, GPT-4, Vector DB
- Developed a novel generational learning framework for autonomous AI agents featuring failure-driven evolution and a persistent memory architecture, enabling agents to learn from past mistakes across multiple generations.
- Implemented an automated failure-to-wisdom pipeline that transforms execution errors into structured preventive heuristics, reducing repeated failures by 73% and improving task success rates by 41% across agent generations.
- Designed a dynamic memory architecture with a formal Synthesis Operator for abstracting experiential data into generalizable heuristics and universal principles, achieving 89% knowledge transfer efficiency between generations.
- Demonstrated measurable performance gains in complex reasoning tasks, with a 35% improvement in operational efficiency and a 28% reduction in average task completion time across successive agent generations.
Multi-Agent Workflow Orchestration System
CrewAI, LangGraph, OpenAI, Pinecone, Redis
- Architected a multi-agent system with 5 specialized agents (researcher, analyst, writer, critic, editor) for automated content generation, achieving an 87% human-quality score and reducing content creation time by 65%.
- Implemented hierarchical task decomposition using LangGraph for complex workflow orchestration, enabling parallel agent execution and reducing overall processing time by 52% compared to sequential approaches.
- Integrated a vector database (Pinecone) for agent memory and context sharing, enabling cross-agent knowledge transfer with 91% information retention and eliminating 83% of redundant API calls.
- Deployed a production system processing 1K+ multi-step tasks monthly with a 94% success rate, using Redis-based queue management and real-time progress tracking for 20+ concurrent workflows.
LangChain, GPT-4, SQLAlchemy, FastAPI, PostgreSQL
- Developed an intelligent NL2SQL agent with intent classification using fine-tuned BERT, routing 500+ daily queries between SQL generation, data visualization, and Q&A with 94% routing accuracy and 91% SQL generation accuracy.
- Built an automated SQL generation, validation, and execution pipeline with error handling, reducing query time from 5 minutes (manual) to 10 seconds (automated) and eliminating 98% of syntax errors.
- Created a follow-up question generation system using GPT-4 to suggest contextual queries, increasing user engagement by 45% and session duration by 60% across 200+ non-technical users.
- Simplified data interaction for 5 client organizations, eliminating dependency on data analysts and reducing the data request backlog by 70% while maintaining 100% query privacy compliance.
SaRGeN (Suspicious Activity Report Generator)
Streamlit, XGBoost, Groq, Plotly, HuggingFace
- Implemented ML-based suspicious transaction detection using XGBoost and anomaly detection algorithms, processing 10K+ daily transactions with 92% precision and 87% recall and reducing false positives by 45%.
- Automated SAR generation using Groq-powered LLM inference (300+ tokens/sec), reducing compliance reporting time from 4 hours to 15 minutes per case with 95% report accuracy and full regulatory compliance.
- Built a real-time transaction monitoring dashboard with Plotly visualizations for pattern analysis, risk scoring, and anomaly trends, enabling compliance officers to investigate 3x more cases daily.
- Integrated Redis caching and queue management to handle concurrent SAR generation requests, maintaining system stability under high load with 99.5% uptime and <500 ms average response time.