datacortex.in — agentic
$
> ✓ LLMs ✓ RAG ✓ Multi-Agent ✓ 7+ years

I build production-grade AI systems (LLMs, RAG, Agents) that actually work.

AI Engineer with 7+ years of experience building scalable data platforms, multi-agent systems, and LLM-powered applications.

Full-time, contract, consulting | Remote / India
$datacortex.stats
7+ years · 11+ PyPI · 100+ sources · India · Hong Kong · France · US
datacortex.stack — pip list
$pip list --datacortex
> Package Name · Version · Status
Python
FastAPI
LangChain
Playwright
PostgreSQL
MongoDB
Docker
Azure
GCP
LLM APIs
RAG
11 packages in active use
datacortex --capabilities
$datacortex --capabilities
> AI systems that go beyond demos — built for reliability, scale, and real-world usage.
Data → Ingest → Retrieve → Reason → Output
cap-1

LLM Applications & Agents

  • RAG systems
  • Copilots
  • Multi-agent workflows
cap-2

AI System Architecture

  • Ingestion → retrieval → reasoning → output
  • End-to-end pipelines
  • Search→RAG hybrid flows
cap-3

Backend AI Engineering

  • FastAPI
  • Pipelines
  • Async systems
  • APIs
cap-4

Data & Intelligence Systems

  • Web-scale data extraction
  • Pipeline orchestration
  • Structured extraction
  • LLM-driven extraction from web, documents, and images
  • Validation
cap-5

Optimization

  • Latency
  • Cost
  • Reliability
  • Fallback systems
I design and ship AI systems built for reliability, scale, and real-world usage.
$ reflecta.live

Reflecta — Voice AI System

Production-ready voice-enabled AI for real-time conversations, structured extraction, and post-call analytics.

Voice AI • Real-time • Structured extraction

app.getreflecta.com
LLM reasoning · Conversation memory · STT/TTS pipeline · Post-call analytics
agentensemble.py
# AgentEnsemble - Production multi-agent orchestration
from agentensemble import Agent, Pipeline

# Define agents with tools
# (web_search, read_doc, and writer are assumed to be defined elsewhere)
researcher = Agent(
    role="researcher",
    tools=[web_search, read_doc],
)

# Build pipeline
pipeline = Pipeline(
    agents=[researcher, writer],
    workflow="sequential",
)

# Run with observability
result = pipeline.run(prompt="...")
$ codebase

Production-ready code, not prototypes

Real snippets from libraries I maintain. AgentEnsemble, ragfallback, and others—used by developers worldwide.

View on GitHub
datacortex.pipeline — stages
$datacortex pipeline --show
> Most AI systems fail because they stop at the model. I focus on the full system — from ingestion to optimization.
step-1

Ingestion

Structured + unstructured pipelines with orchestration and fallback

step-2

Retrieval

Hybrid RAG with query variation fallback and retrieval confidence
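
To make "query variation fallback and retrieval confidence" concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the API of ragfallback or any other library: the toy corpus, the token-overlap scorer, and the names `retrieve` and `retrieve_with_fallback` are all hypothetical stand-ins for a real index and confidence measure.

```python
# Toy corpus; a stand-in for a real vector or keyword index (assumption).
DOCS = [
    "FastAPI serves the retrieval endpoint",
    "Playwright handles browser automation for scraping",
    "PostgreSQL stores extracted records",
]


def overlap_score(query: str, doc: str) -> float:
    """Fraction of query tokens that also appear in the document."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def retrieve(query: str) -> tuple[str, float]:
    """Return the best-matching document and its confidence score."""
    best = max(DOCS, key=lambda doc: overlap_score(query, doc))
    return best, overlap_score(query, best)


def retrieve_with_fallback(query, variations, threshold=0.5):
    """Try the original query, then each variation in order.

    Stops at the first attempt whose confidence reaches the threshold.
    If none does, the highest-confidence attempt is returned so the
    caller can still see how weak the evidence was.
    Returns (document, confidence, query_used).
    """
    best = (None, -1.0, query)
    for candidate in [query, *variations]:
        doc, conf = retrieve(candidate)
        if conf > best[1]:
            best = (doc, conf, candidate)
        if conf >= threshold:
            break
    return best
```

Returning the confidence alongside the document lets a downstream step decide whether to answer, ask for clarification, or decline, instead of failing silently on weak retrieval.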

step-3

Reasoning

LLM orchestration, multi-step workflows, agents

step-4

Evaluation

Validation gates, fallback strategies, output quality checks
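
A validation gate can be as simple as refusing to pass model output downstream until it parses and matches an expected shape. The sketch below assumes the model is asked for a JSON object; the schema in `REQUIRED_FIELDS` and the function names `validate` and `gated_output` are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical schema: field name -> accepted Python type(s).
REQUIRED_FIELDS = {"name": str, "amount": (int, float)}


def validate(raw: str) -> tuple[bool, list[str]]:
    """Check that raw is a JSON object containing the required typed fields."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return False, ["expected a JSON object"]
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif not isinstance(data[field], expected):
            errors.append(f"wrong type for {field}")
    return not errors, errors


def gated_output(generate, fallback, max_attempts=2):
    """Call generate() up to max_attempts times.

    Returns the first parsed output that passes validation,
    or the fallback value if every attempt fails the gate.
    """
    for _ in range(max_attempts):
        raw = generate()
        ok, _errors = validate(raw)
        if ok:
            return json.loads(raw)
    return fallback
```

In practice `generate` would wrap an LLM call and the error list would be logged or fed back into a retry prompt; both are left out here to keep the gate itself visible.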

step-5

Observability

Logging, metrics, cost tracking
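
As one possible shape for cost tracking, the sketch below accumulates spend per model call from token counts. The model name and the prices in `PRICES` are made-up placeholders, not real vendor pricing, and `CostTracker` is a hypothetical helper rather than part of any library mentioned on this page.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("llm.cost")

# Hypothetical prices in USD per 1,000 tokens (placeholders, not real pricing).
PRICES = {"small-model": {"input": 0.5, "output": 1.5}}


@dataclass
class CostTracker:
    """Accumulates call counts and estimated spend across LLM calls."""

    calls: int = 0
    total_usd: float = 0.0
    by_model: dict = field(default_factory=dict)

    def record(self, model: str, input_tokens: int, output_tokens: int) -> float:
        """Add one call to the running totals and return its estimated cost."""
        price = PRICES[model]
        cost = (
            input_tokens / 1000 * price["input"]
            + output_tokens / 1000 * price["output"]
        )
        self.calls += 1
        self.total_usd += cost
        self.by_model[model] = self.by_model.get(model, 0.0) + cost
        logger.info("model=%s in=%d out=%d cost=%.4f", model, input_tokens, output_tokens, cost)
        return cost
```

Keeping a per-model breakdown makes it easier to see which stage of a pipeline dominates spend before optimizing it.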

step-6

Optimization

Latency, token usage, infra efficiency

Pipeline flow
architecture.svg
Data → Ingest → Retrieve → Reason → Eval → Output
This is what makes AI systems production-ready — not just the model, but the entire pipeline from data to deployment.
resume --experience
$resume --experience
> 7+ years · AI, Data, Engineering
ROLES
  • Kuration AI

    Founding AI Engineer

    AI & Scalable Data Engineering

  • Luminous Power Technologies

    Senior Manager — Data & Analytics, R&D

    Enterprise analytics & BI

  • Brainsfeed

    Head of Data & Analytics

    AI research platform → acquisition

  • Lynk

    Data Analytics and Automation

    Data pipelines

  • RightCust Technologies

    Data Scientist

    ML & analytics

BUILT_FOR
  • Web-scale intelligence extraction
  • NLP search & knowledge systems
  • Business-critical analytics pipelines
  • Startups, Enterprise R&D, Global platforms
India · Hong Kong · France · US
Built systems used in startups, enterprise R&D, and global platforms.
pypi.org/user/irfanalidv — 11 packages
$pip search irfanalidv
> Production-ready tools for AI agents, retrieval, data extraction, NLP
AgentEnsemble

Build coordinated AI agents with ReAct, Swarm, Pipeline, Debate, and WorkflowGraph patterns. Includes routing, planning, tool usage, RAG integration, and cost tracking. Comparable to LangGraph and CrewAI.

AgentCare

Voice AI framework for healthcare: call intake, structured extraction, missing-data recovery, appointment orchestration, and post-call analytics. Built for HIPAA-aware voice workflows.

ragfallback

Stop RAG systems from failing silently. Adds query rewriting, retrieval confidence scoring, fallback strategies, and retry logic. Improves answer quality when retrieval is uncertain.

RAGNav

Navigation-first RAG for long documents (PDFs, papers). Routes queries to the right pages, follows cross-references, retrieves coherent evidence. Better than chunk-and-embed for structured docs.

scrapeflow-py

Production web scraping on Playwright. LLM extraction, hybrid selectors, session persistence, rate limiting, anti-detection. Workflow engine for large-scale data acquisition.

AskPandas

Query CSV data with natural language. Uses local LLMs for privacy—no data leaves your machine. AI-powered data engineering and analytics for tabular data.

lingo-nlp-toolkit

Lightweight NLP toolkit bridging traditional pipelines and transformer-ready workflows. Fast preprocessing, tokenization, and language-powered features for ML applications.

PyroChain

Agentic feature engineering: PyTorch + LangChain agents that automate feature extraction from text, images, and multimodal data. AI agents collaborate to understand and process complex inputs.

toxic-comment-classifier

Classify toxic comments using deep learning. Detects obscene language, threats, insults, and identity hate. Returns per-category scores and overall toxicity. Useful for content moderation and community safety.

datacortex.opportunities — available
$datacortex opportunities --available
> Seeking: teams building real AI products — not experiments
Build · Scale · Solve · Ship
ROLES_I_THRIVE_IN
  • Building production-grade AI systems (LLMs, RAG, Agents)
  • Designing end-to-end architectures from data → reasoning → deployment
  • Solving messy, real-world problems where AI needs to actually work
  • Early-stage (0→1) or scaling systems (1→100)
ENGAGEMENT_TYPES

Full-time roles

Remote / India

Contract / freelance

Typical: ₹50,000–₹1,50,000/month (~$600–$1,800/month) depending on scope

Early-stage startups

Builder role, high ownership

Short-term consulting

Architecture, system design, debugging

VALUE_WHEN

I'm most useful when:

  • Your AI system works in demo but breaks in production
  • Your RAG pipeline is inconsistent or hallucinating
  • You need to move from prototype → real product
  • You want to build agent-based workflows, not just chatbots
  • You're dealing with complex data + LLM reasoning together
  • You need pipeline orchestration with reliable fallback across stages
  • You need to turn unstructured web data into structured tables at scale
EXPECT
  • End-to-end ownership (not just model work)
  • Strong system thinking (not "prompt hacks")
  • Fast execution with clean, scalable architecture
  • Honest technical decisions (build vs buy vs simplify)
PRIORITIZING_NOW
  • AI-native startups building core products
  • Teams working on agentic systems / copilots / automation
  • Roles where I can contribute to architecture + execution
Get in touch to discuss your project, role, or architecture review.
whoami --verbose
$whoami --verbose
> Senior AI systems builder — turning messy data into production intelligence
founder.png
Irfan Ali - AI Engineer & Data Scientist
Irfan Ali
LLM Systems · Agentic AI · Data Platforms
SYSTEM_THINKING
I focus on:
  • Reliability over hype
  • Systems over scripts
  • Long-term maintainability over short-term hacks

I care about failure modes, cost constraints, data quality, and real-world deployment challenges.

NAME

Irfan Ali

EDUCATION

M.Sc. Data Science (IISER Tirupati) · B.Tech CSE (Alliance University) · ISEP Paris Exchange

FOCUS

Designing and deploying production-grade AI systems at the intersection of LLM architectures, agentic workflows, and large-scale data platforms.

I build systems that ingest fragmented, real-world data and transform it into reliable, decision-ready intelligence.

BACKGROUND

Built and scaled AI/data platforms across startups and enterprise R&D (Kuration AI, Luminous, Brainsfeed). Owned systems end-to-end — from data acquisition and enrichment to modeling, orchestration, and deployment.

  • 11+ Python libraries on PyPI (AI/NLP/data systems)
  • Architected autonomous data extraction & enrichment pipelines operating at web scale
  • Designed cost-optimized, multi-LLM systems with intelligent routing and fallback logic
  • Published research in neural-symbolic NLP and temporal topic modeling
ACHIEVEMENTS
  • Part of winning team — Philips Digital Healthcare Conclave
  • Led global, cross-functional data teams (India, Hong Kong, Europe, US)
  • Built production AI systems influencing real business decisions (not internal demos)
  • Designed platforms that contributed to international business expansion and acquisitions

I don't just build models — I build systems that survive production.

datacortex.contact
$datacortex contact
> Have a project in mind? Drop a message or schedule a call. Currently available for new projects; I'll respond within 24 hours.
Send a message

Tell me about your project, role, or what you're building.

Quick & Secure

Your information is protected and will only be used to respond to your inquiry.

Privacy Protected • We'll respond within 24 hours

$cal.com/datacortex/30min

Prefer to talk?

Book a 30-minute call to discuss your project, role, or architecture review.

Fast response

Typically within 24 hours on business days.

Privacy first

Your info is only used to respond — never shared.

What I can help with

LLM Systems · RAG Pipelines · AI Agents · Architecture Review · Consulting · 0→1 Builds

What to expect

1. Reply within 24 hours
2. Schedule a call if it's a fit
3. Discuss next steps together
Get in Touch — I'm here to help with your AI systems.