Add Local Ollama NLP Backend for Natural-Language Computed Fields #491

kkhatke · 2025-11-29T10:33:55Z

Overview

This PR adds a local, LLM-powered backend service that uses Ollama to generate SQL expressions for custom computed fields in Graphic Walker. The service runs entirely on-premises with no external API dependencies, which keeps data private, avoids per-request API costs, and allows offline use.

📹 Includes demonstration video: NLP-Based Compute Field

Purpose

Enable users to generate SQL computed field expressions from natural language descriptions using their local Ollama installation, providing a seamless integration between Graphic Walker's UI and local language models.

Key Features Implemented

1. Natural Language to SQL Conversion

Local LLM processing via Ollama integration
Intelligent prompt engineering for proper SQL generation
Support for conditional logic (CASE WHEN...THEN...ELSE...END)
Field-aware SQL expression generation
Mathematical and arithmetic operations support
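
As a rough illustration of field-aware prompt construction, a prompt builder might look like the sketch below. The function name `build_prompt`, the instruction wording, and the field-list layout are assumptions made for illustration, not the prompt shipped in this PR.

```python
from typing import Sequence


def build_prompt(description: str, fields: Sequence[str]) -> str:
    """Build a prompt that asks the model for exactly one SQL expression.

    Hypothetical helper: wording and layout are illustrative assumptions.
    """
    description = description.strip()
    if not description:
        raise ValueError("description must not be empty")
    # Listing the known fields nudges the model to use real column names.
    field_list = ", ".join(fields) if fields else "(none provided)"
    return (
        "You write a single SQL expression for a computed field.\n"
        f"Available fields: {field_list}\n"
        "Use CASE WHEN ... THEN ... ELSE ... END for conditional logic.\n"
        "Return only the expression, with no explanation or markdown.\n"
        f"Request: {description}\n"
        "SQL:"
    )
```

Ending the prompt with `SQL:` is one common way to encourage the model to answer with the expression directly.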

2. Advanced SQL Processing

Comprehensive SQL processor (sql_processor.py):
- Markdown and code block removal
- Ollama-specific response pattern cleaning
- Model hallucination detection and filtering
- SQL validation and syntax checking
- Complex expression extraction from mixed content
- Configurable regex patterns for maximum flexibility

3. Robust Service Architecture

Ollama client (ollama_client.py):
- Async/await support for non-blocking operations
- Model availability validation and fallback logic
- Graceful error handling with specific error categorization
- Health checking with timeout protection
Service layer (ollama_service.py):
- Request processing with comprehensive logging
- Error categorization (timeout, connection, model, validation errors)
- Detailed request tracking and monitoring
Model management (model_manager.py):
- Dynamic model selection and validation
- Fallback model support
- Model availability caching
- Recommended models identification
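
Health checking with timeout protection can be sketched with `asyncio.wait_for`. The `check_health` name, the status strings, and the idea of passing the probe in as a coroutine function are assumptions for illustration; the actual client may be structured differently.

```python
import asyncio
from typing import Awaitable, Callable


async def check_health(probe: Callable[[], Awaitable[bool]], timeout_s: float = 5.0) -> str:
    """Run a probe coroutine and map the outcome to a coarse status string."""
    try:
        # wait_for cancels the probe if it does not finish within timeout_s.
        ok = await asyncio.wait_for(probe(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return "timeout"
    except Exception:
        # Any other failure (for example a refused connection) counts as unreachable.
        return "unreachable"
    return "healthy" if ok else "degraded"
```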

4. Health Monitoring & Observability

Health monitoring (health_monitor.py):
- Comprehensive startup validation
- Multi-level health checks (connectivity, model availability, performance, resources)
- Health status tracking and history
- Degraded state detection
Metrics collection (metrics_collector.py):
- Request metrics tracking (response times, success rates, error types)
- Performance time series data collection
- Request history with filtering
- Detailed performance reports and trend analysis
- Thread-safe metrics collection
Monitoring dashboard (monitoring_dashboard.py):
- Real-time metrics aggregation
- Performance trend analysis
- Health status formatting
- Report generation (hourly, daily, weekly)
- Alert system for anomalies
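
A minimal sketch of lock-based, bounded metrics collection is shown here. `RequestMetrics` and its method names are hypothetical, and the real `metrics_collector.py` may track more dimensions (error types, time series, retention by age).

```python
import threading
import time
from collections import deque
from typing import Deque, Dict, Optional, Tuple


class RequestMetrics:
    """Thread-safe store of (timestamp, duration, success) samples."""

    def __init__(self, max_points: int = 10000) -> None:
        self._lock = threading.Lock()
        # deque(maxlen=...) drops the oldest sample once the limit is reached.
        self._samples: Deque[Tuple[float, float, bool]] = deque(maxlen=max_points)

    def record(self, duration_s: float, success: bool, now: Optional[float] = None) -> None:
        ts = time.time() if now is None else now
        with self._lock:
            self._samples.append((ts, duration_s, success))

    def summary(self) -> Dict[str, float]:
        # Copy under the lock, then compute outside it to keep the lock short.
        with self._lock:
            samples = list(self._samples)
        if not samples:
            return {"count": 0, "success_rate": 0.0, "avg_duration_s": 0.0}
        count = len(samples)
        return {
            "count": count,
            "success_rate": sum(1 for _, _, ok in samples if ok) / count,
            "avg_duration_s": sum(d for _, d, _ in samples) / count,
        }
```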

5. Enterprise-Grade Logging

Structured logging (logging_config.py):
- JSON-formatted logs for log aggregation
- Contextual logging with request tracing
- Error categorization and detailed error logging
- Multiple output handlers (console, file)
- Configurable log levels and formats
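
JSON-formatted logging with request tracing can be approximated with a custom `logging.Formatter`. This is a sketch under assumptions: the `JsonFormatter` class and the `request_id` attribute are illustrative names, not necessarily those used in `logging_config.py`.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # `request_id` is an assumed field, supplied via logging's `extra=` argument.
        request_id = getattr(record, "request_id", None)
        if request_id is not None:
            payload["request_id"] = request_id
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)
```

A handler would use it via `handler.setFormatter(JsonFormatter())`, and callers could attach the trace ID with `logger.info("generated", extra={"request_id": "abc"})`.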

6. FastAPI Integration

Complete REST API (app.py):
- Text-to-SQL endpoint with model override support
- Detailed SQL generation endpoint with processing info
- Raw SQL response processing endpoint
- Comprehensive health check endpoints
- Service information and model listing endpoints
- Full monitoring dashboard endpoints
- Metrics export endpoints (JSON, Prometheus formats)

Technical Implementation Details

API Endpoints

Primary Endpoints

POST /api/ollama-text2sql - Basic SQL generation from natural language
POST /api/ollama-text2sql/detailed - SQL generation with detailed processing information
POST /api/sql/process - Process raw SQL responses for testing/debugging
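
For orientation, a client call to the basic endpoint might be built as follows. The request schema is not spelled out in this description, so the JSON keys `prompt` and `model` are assumptions, as is the helper name; the port matches the Quick Start command.

```python
import json
import urllib.request
from typing import Optional


def build_text2sql_request(
    prompt: str,
    model: Optional[str] = None,
    base_url: str = "http://localhost:3002",
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/ollama-text2sql.

    The JSON keys "prompt" and "model" are assumed; check the service's
    request schema for the real field names.
    """
    body = {"prompt": prompt}
    if model is not None:
        body["model"] = model  # optional model override
    return urllib.request.Request(
        url=f"{base_url}/api/ollama-text2sql",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it is then a matter of `urllib.request.urlopen(req, timeout=90)` against a running service.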

Health & Monitoring Endpoints

GET /api/health - Basic health check
GET /api/health/detailed - Comprehensive health status
GET /api/health/metrics - Health metrics and statistics
GET /api/health/history - Recent health check history
GET /api/service-info - Service configuration and status
GET /api/models - Available models from Ollama

Monitoring & Analytics Endpoints

GET /api/monitoring/dashboard - Full monitoring dashboard data
GET /api/monitoring/real-time - Real-time metrics
GET /api/monitoring/health-status - Health status for monitoring systems
GET /api/monitoring/performance-trends - Performance trends (configurable period)
GET /api/monitoring/report/{report_type} - Comprehensive reports
GET /api/metrics/summary - Metrics summary
GET /api/metrics/export?format=json|prometheus - Metrics export
GET /api/metrics/time-series/{metric_name} - Time series data
GET /api/metrics/requests - Request history
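
The Prometheus export format is plain text with one `# TYPE` line and one sample line per metric. The sketch below renders a flat dictionary as gauges; the function name, the `nlp_backend` prefix, and the metric names are hypothetical, and the real exporter may expose different metrics.

```python
from typing import Mapping


def to_prometheus(metrics: Mapping[str, float], prefix: str = "nlp_backend") -> str:
    """Render numeric metrics as gauges in the Prometheus text exposition format."""
    lines = []
    for name in sorted(metrics):  # sorted for stable output
        full_name = f"{prefix}_{name}"
        lines.append(f"# TYPE {full_name} gauge")
        lines.append(f"{full_name} {float(metrics[name])}")
    return "\n".join(lines) + "\n"
```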

Configuration

Environment variables for flexibility:

OLLAMA_BASE_URL - Ollama server URL (default: http://localhost:11434)
OLLAMA_MODEL - Primary model (default: codellama:7b-instruct)
OLLAMA_FALLBACK_MODEL - Fallback model (default: llama3.2:latest)
OLLAMA_TIMEOUT - Request timeout in seconds (default: 90)
LOG_LEVEL - Logging level (default: INFO)
LOG_FORMAT - Log format: json or text (default: json)
LOG_FILE - Optional log file path
METRICS_RETENTION_HOURS - Metrics retention period (default: 24)
MAX_POINTS_PER_METRIC - Maximum metrics data points (default: 10000)
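
One way to read these variables with the listed defaults is sketched below. The `Settings` dataclass and `load_settings` function are illustrative names and may not match `config.py`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    ollama_base_url: str
    ollama_model: str
    ollama_fallback_model: str
    ollama_timeout: int
    log_level: str
    log_format: str
    log_file: Optional[str]
    metrics_retention_hours: int
    max_points_per_metric: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from the environment, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        ollama_model=env.get("OLLAMA_MODEL", "codellama:7b-instruct"),
        ollama_fallback_model=env.get("OLLAMA_FALLBACK_MODEL", "llama3.2:latest"),
        ollama_timeout=int(env.get("OLLAMA_TIMEOUT", "90")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        log_format=env.get("LOG_FORMAT", "json").lower(),
        log_file=env.get("LOG_FILE") or None,  # unset or empty means no log file
        metrics_retention_hours=int(env.get("METRICS_RETENTION_HOURS", "24")),
        max_points_per_metric=int(env.get("MAX_POINTS_PER_METRIC", "10000")),
    )
```

Accepting an explicit mapping makes the loader easy to test without touching the process environment.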

SQL Processing Pipeline

1. Input Processing: Validate and prepare raw response
2. Markdown Removal: Strip code blocks and formatting
3. Ollama Pattern Cleaning: Remove model-specific response patterns
4. Whitespace Cleanup: Normalize formatting and spacing
5. Validation & Extraction: Validate SQL keywords and extract core expression
6. Metadata Generation: Create processing metadata and complexity estimation

Error Handling Strategy

Error Categories:

TIMEOUT_ERROR - Request exceeded timeout
CONNECTION_ERROR - Cannot connect to Ollama
MODEL_ERROR - Model unavailable or generation failed
VALIDATION_ERROR - Invalid input
CONFIGURATION_ERROR - Configuration issues
UNKNOWN_ERROR - Other errors
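
These categories map naturally onto an enum plus a small classifier. The sketch below is an assumption about how such a mapping could work: `ModelError`, `ConfigurationError`, and the choice of which built-in exceptions fall into which category are illustrative.

```python
import asyncio
from enum import Enum


class ErrorCategory(str, Enum):
    TIMEOUT_ERROR = "TIMEOUT_ERROR"
    CONNECTION_ERROR = "CONNECTION_ERROR"
    MODEL_ERROR = "MODEL_ERROR"
    VALIDATION_ERROR = "VALIDATION_ERROR"
    CONFIGURATION_ERROR = "CONFIGURATION_ERROR"
    UNKNOWN_ERROR = "UNKNOWN_ERROR"


class ModelError(Exception):
    """Hypothetical: model unavailable or generation failed."""


class ConfigurationError(Exception):
    """Hypothetical: invalid or missing configuration."""


def categorize(exc: BaseException) -> ErrorCategory:
    """Map an exception to a category; the mapping itself is an assumption."""
    if isinstance(exc, (TimeoutError, asyncio.TimeoutError)):
        return ErrorCategory.TIMEOUT_ERROR
    if isinstance(exc, ConnectionError):
        return ErrorCategory.CONNECTION_ERROR
    if isinstance(exc, ModelError):
        return ErrorCategory.MODEL_ERROR
    if isinstance(exc, ValueError):
        return ErrorCategory.VALIDATION_ERROR
    if isinstance(exc, ConfigurationError):
        return ErrorCategory.CONFIGURATION_ERROR
    return ErrorCategory.UNKNOWN_ERROR
```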

Fallback Mechanism:

1. Try primary model
2. On failure, automatically attempt fallback model
3. Return error only if both fail
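
The fallback mechanism amounts to trying models in order and surfacing an error only when every attempt fails. A minimal sketch, assuming a caller-supplied `generate(model, prompt)` callable (a hypothetical signature, not the Ollama client API):

```python
from typing import Callable, List, Sequence, Tuple


def generate_with_fallback(
    generate: Callable[[str, str], str],
    prompt: str,
    models: Sequence[str],
) -> Tuple[str, str]:
    """Try each model in order; return (model, output) from the first success."""
    errors: List[str] = []
    for model in models:
        try:
            return model, generate(model, prompt)
        except Exception as exc:  # broad on purpose: any failure triggers the fallback
            errors.append(f"{model}: {exc}")
    # Reached only if every model failed.
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

With the defaults from the Configuration section, `models` would be `["codellama:7b-instruct", "llama3.2:latest"]`.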

Files Added

Core Service Files

app.py - FastAPI application with all endpoints
config.py - Configuration management
ollama_client.py - Ollama integration client
ollama_service.py - Service layer

Processing & Validation

sql_processor.py - Advanced SQL processing and cleaning
model_manager.py - Model selection and validation

Monitoring & Observability

health_monitor.py - Health checking and monitoring
metrics_collector.py - Metrics collection and analytics
monitoring_dashboard.py - Dashboard data aggregation
logging_config.py - Structured logging configuration

Documentation & Configuration

requirements.txt - Python dependencies
README.md - Quick start guide
OLLAMA_NLP_BACKEND_GUIDE.md - Comprehensive guide
.env.example - Environment configuration template

Dependencies

fastapi - Web framework
uvicorn - ASGI server
python-dotenv - Environment configuration
ollama - Ollama Python client
psutil - System resource monitoring
pytest - Testing framework
pytest-asyncio - Async testing
pytest-cov - Coverage reporting

Testing

The implementation includes comprehensive testing capabilities:

Async operation support with pytest-asyncio
Mock health checks and model availability
Edge case handling (empty prompts, malformed responses)
Error categorization and fallback testing
Metrics collection validation
Performance benchmarking support

Deployment Considerations

Prerequisites

Ollama server running (default: localhost:11434)
At least one model pulled (recommended: codellama:7b-instruct)
Python 3.8+

Quick Start

cd packages/nlp-backend
pip install -r requirements.txt
uvicorn app:app --reload --port 3002

Production Deployment

Use a production ASGI server setup (Gunicorn with Uvicorn workers)
Configure logging to file
Set up monitoring/alerting
Use environment variables for configuration
Implement load balancing if needed

Performance Characteristics

Typical Response Time: 3-15 seconds (model-dependent)
Fallback Logic: <1 second additional delay
Health Check Overhead: <10ms
Metrics Collection: Lock-based thread-safe implementation
Memory Usage: ~50-200MB base + model size

Security Considerations

SQL Injection Prevention:
- Basic pattern detection for dangerous keywords
- User disclaimer for custom expressions
- SQL validation before output
Local Processing:
- No external API calls
- Data remains on user's infrastructure
- No cloud dependencies
Input Validation:
- Prompt length checking
- Empty input rejection
- Response validation
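
The input and output checks listed above could be approximated as follows. The keyword list, the 500-character limit, and the function name are assumptions for illustration. A denylist like this is only a basic safeguard and does not replace the user disclaimer.

```python
import re
from typing import List

# Assumed denylist of statement-level keywords that a computed-field
# expression should never need.
_DANGEROUS = re.compile(
    r"\b(DROP|DELETE|INSERT|UPDATE|ALTER|TRUNCATE|CREATE|GRANT|EXEC)\b",
    re.IGNORECASE,
)
MAX_PROMPT_CHARS = 500  # assumed limit


def find_issues(prompt: str, sql: str) -> List[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    issues: List[str] = []
    if not prompt.strip():
        issues.append("empty prompt")
    if len(prompt) > MAX_PROMPT_CHARS:
        issues.append("prompt too long")
    if ";" in sql:
        issues.append("statement separator in SQL")
    hit = _DANGEROUS.search(sql)
    if hit:
        issues.append(f"dangerous keyword: {hit.group(1).upper()}")
    return issues
```

The word-boundary anchors keep column names such as `updated_at` from being flagged, although a column literally named `update` still would be.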

Backward Compatibility

The service is new, so it introduces no breaking changes to existing code
Can be optionally enabled/disabled in UI
Falls back to manual SQL entry if service unavailable

Documentation Provided

Comprehensive setup guide with platform-specific instructions
API documentation for all endpoints
Configuration reference
Usage examples
Troubleshooting section

Demonstration

Video Walkthrough

A comprehensive demonstration of the Ollama NLP Backend for Custom Computed Fields:

📹 Download: NLP-Based Compute Field (Right-click → Save link as... to download)

The video covers:

✅ Service startup and initialization
✅ Health check system in action
✅ Natural language to SQL conversion examples
✅ Fallback model mechanism
✅ Error handling and recovery
✅ Monitoring dashboard overview
✅ Metrics collection and analysis
✅ Real-world usage scenarios

Example Conversions

Example 1: Conditional Logic

Prompt: "If revenue > 1000 then High else Low"
Generated SQL: CASE WHEN revenue > 1000 THEN 'High' ELSE 'Low' END

Example 2: Mathematical Operation

Prompt: "Calculate 10% of price"
Generated SQL: price * 0.1

Example 3: String Comparison

Prompt: "If status is active then Premium else Basic"
Generated SQL: CASE WHEN status = 'active' THEN 'Premium' ELSE 'Basic' END

Example 4: Multi-field Logic

Prompt: "If revenue > cost then profit else loss"
Generated SQL: CASE WHEN revenue > cost THEN 'profit' ELSE 'loss' END

Related Issues

Enables custom computed fields feature in Graphic Walker
Provides foundation for AI-powered UI enhancements

Testing Checklist

✅ Service startup and initialization
✅ Health check endpoints
✅ Basic SQL generation
✅ Model fallback on failure
✅ Error categorization
✅ Metrics collection
✅ Request history tracking
✅ Monitoring dashboard data
✅ Different model types
✅ Edge cases (empty input, timeouts, etc.)

Notes for Reviewers

SQL Processing: The regex-based cleaning pipeline is designed to handle various Ollama model output formats
Async Implementation: Uses thread pool for sync Ollama client operations
Health Monitoring: Comprehensive but can be extended for specific infrastructure needs
Metrics: Thread-safe collection with configurable retention
Logging: Structured JSON format for easy log aggregation
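
The thread-pool approach for synchronous client calls can be sketched with `loop.run_in_executor`, which works on Python 3.8+. The helper name `run_blocking` and the pool size are assumptions; the actual client may wrap its calls differently.

```python
import asyncio
import functools
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable

_EXECUTOR = ThreadPoolExecutor(max_workers=4)  # assumed pool size


async def run_blocking(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Run a synchronous call in a worker thread so the event loop stays responsive."""
    loop = asyncio.get_running_loop()
    # run_in_executor accepts only positional arguments, so bind kwargs with partial.
    return await loop.run_in_executor(_EXECUTOR, functools.partial(func, *args, **kwargs))
```

An endpoint would then `await run_blocking(sync_client_call, ...)` instead of calling the blocking function directly.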

vercel · 2025-11-29T10:34:02Z

@kkhatke is attempting to deploy a commit to the kanaries Team on Vercel.

A member of the Team first needs to authorize it.

kkhatke · 2025-11-29T10:45:08Z

Hi! This PR is ready for review.
Since this comes from a fork, GitHub requires maintainers to approve the workflow run.
Please approve the checks whenever convenient.
Thank you!

islxyqwe and others added 30 commits September 28, 2023 01:40

feat: add range, domain for channels & color scheme selection (Kanari…

5af72ce

…es#171) * fix: modal height * feat: color palette * fix: didnot fetch data in sometimes * feat: range scale

feat: optimize map color scheme

40cb4ee

Feat delete chart (Kanaries#172)

ba11b8e

* feat: delete chart * feat: add slider and menu

fix: analytic type transform bug (Kanaries#175)

79c0f94

Feat fold new (Kanaries#174)

85eb2c3

* fix: typo * feat: fold new version * fix: remove console * fix: fold with computed fields

release: 0.4.7

12089ea

fix: remove initial primary color (Kanaries#178)

40d8459

fix: themeConfig type & form re-design

43c0ae3

design: fix form design in light

11348ab

Fix: pure renderer props (Kanaries#182)

9f0ff2d

* fix: remove props from pure renderer * fix: show map when geoData is not set

feat: change temporal range (Kanaries#183)

947824c

feat: export function to mod filter (Kanaries#184)

0a0f613

release: 0.4.20

282a84a

fix: make eaiser to import old format data (Kanaries#185)

477786f

fix: make map center effective (Kanaries#188)

c0fb197

* fix: make map center effective * fix: manual upload topojson

feat: pivot sorting (Kanaries#191)

150e492

* feat: pivot sorting * fix: sorting when data is incomplete * fix: pivottable reactive

fix: computed field with filter (Kanaries#192)

90fccd9

* fix: inner field cannot move to filter & meta * fix: field stat with computed field * feat: workflow with computed field * fix: computed filter not in graph

release: 0.4.21 (Kanaries#194)

33181bf

feat: add gog lint (Kanaries#195)

7725826

feat: add new dataset (Kanaries#197)

6b596a2

* feat: add new dataset * feat: force semantic for geo field

fix: timeout img not working (Kanaries#200)

307033b

release: 0.4.22 (Kanaries#202)

87bba87

fix: add report compuation error (Kanaries#199)

74ce5b9

fix: pure renderer pivot table (Kanaries#204)

17108d2

fix: map mark with color, size, opacity scales (Kanaries#203)

a5eae83

* fix: map mark with color, size, opacity scales * fix: size & opacity

chore: control VizEmbedMenu in GLBOAL_CONFIG (Kanaries#205)

72f7f8a

Feat export filter mod (Kanaries#198)

dcbbe5c

* feat: add component and hooks * fix: filter * chore: add dataSource for filter mod * chore: code readabilty update * fix: count field id

doc: fix typo (Kanaries#206)

adda99a

ObservedObserver and others added 27 commits May 17, 2025 11:25

feat: auto detect uploaded file type (Kanaries#441)

d29a3ea

optimize misc config ui (Kanaries#443)

090ab07

* fix: absolute ele cause page hight wrong * feat: optimize misc config

Add disabling options for TableWalker (Kanaries#445)

d6a2c28

* feat(tablewalker): add options to disable filter sorting and semantic type * doc: add showcase in storybook * chore: remove code review action

Add export chart example (Kanaries#447)

9b8daf8

feat: add theme selection (Kanaries#449)

91063eb

Fix newOffsetDate handling of zero timestamp (Kanaries#451)

ce4d687

chore:improve height style (Kanaries#457)

74d0db0

* fix: height style * fix: field height * chore: gallery page style

chore: report edited (Kanaries#454)

3af37a0

* chore: report event when edited * chore: add disposer for reactions

fix: paint info (Kanaries#455)

6eb3170

* fix: paint info * fix: container export * fix: paintmap is null

fix(theme): ensure text mark visible in dark mode (Kanaries#459)

cc64c53

chore: hide tab nav (Kanaries#456)

f5a2628

fix: use sparse array (Kanaries#453)

94e5140

release: Update graphic-walker to 0.4.76 (Kanaries#458)

6f15b0f

Co-authored-by: GitHub Action <action@github.com>

fix: drag issue when dragging into color field cause of scroll context (

353eeb5

Kanaries#462)

fix: warning in boxplot (Kanaries#473)

c14c9ce

doc: add runcell

75f4c5f

chore: upgrade node engine

1aeb5f3

chore: add props of hide profiling in the main component (Kanaries#484)

9d391f4

* chore: add props of hide profiling in the main component * chore: update testing * fix: playground

fix: github actions (Kanaries#485)

d291683

release: Update graphic-walker to 0.4.77 (Kanaries#486)

e19d2f9

Co-authored-by: GitHub Action <action@github.com>

fix: import and export with vega-lite (Kanaries#489)

db7d20c

* fix: vlSpec import and export * fix: map chart name

feat: upgrade to react-19 (Kanaries#487)

709214f

* feat: upgrade to react-19 * fix: Restore styled-components global * fix: command

release: Update graphic-walker to 0.5.0 (Kanaries#490)

a4467ca

Co-authored-by: GitHub Action <action@github.com>

Add local Ollama NLP backend for natural-language to SQL computed fields

f9c2bd4

Added Compressed Demo Video

3d25908

Merge branch 'main' into feature/nlp-custom-field

78b38c5
