πŸ“Š Snowflake Data Analyst Agent

An AI-powered data analysis system that converts natural language queries into SQL and provides comprehensive insights and visualizations for Snowflake data warehouses.

πŸš€ Features

  • Natural Language to SQL: Convert human-readable queries into optimized SQL for Snowflake
  • Multi-Portfolio Analysis: Analyze specific portfolios or all portfolios for a client
  • Intelligent Insights: AI-generated insights and key metrics from query results
  • Interactive Visualizations: Automatic generation of charts and graphs using Plotly
  • Agentic Architecture: Built with LangGraph for robust workflow management
  • Modern UI: Beautiful Streamlit interface with real-time data analysis
  • RESTful API: FastAPI backend for easy integration and scalability

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Streamlit     β”‚    β”‚   FastAPI       β”‚    β”‚   Snowflake     β”‚
β”‚   Frontend      │◄──►│   Backend       │◄──►│   Database      β”‚
β”‚   (Port 8501)   β”‚    β”‚   (Port 8000)   β”‚    β”‚                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚
                              β–Ό
                       β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                       β”‚   LangGraph     β”‚
                       β”‚   Agent         β”‚
                       β”‚   (Groq LLM)    β”‚
                       β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸ› οΈ Technology Stack

  • Frontend: Streamlit
  • Backend: FastAPI
  • AI/ML: LangGraph, Groq (LLaMA3-8b-8192)
  • Database: Snowflake
  • Visualization: Plotly
  • Data Processing: Pandas
  • Language: Python 3.11+

πŸ“‹ Prerequisites

  • Python 3.11 or higher
  • Snowflake account with appropriate permissions
  • Groq API key
  • Network access to Snowflake and Groq APIs

πŸš€ Quick Start

1. Clone and Setup

git clone https://github.com/vishalpatel72/Snowflake-Data-Analyst.git
cd Snowflake-Data-Analyst

2. Install Dependencies

# Install using uv (recommended)
uv sync

# Or using pip
pip install -e .

3. Configure Environment

# Copy the example environment file
cp env.example .env

# Edit .env with your credentials
nano .env

Required environment variables:

# Snowflake Configuration
SNOWFLAKE_USER=your_username
SNOWFLAKE_PASSWORD=your_password
SNOWFLAKE_ACCOUNT=your_account
SNOWFLAKE_WAREHOUSE=your_warehouse
SNOWFLAKE_DATABASE=your_database
SNOWFLAKE_SCHEMA=your_schema

# Groq API Configuration
GROQ_API_KEY=your_groq_api_key
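Missing or misnamed variables are a common setup error. A small sketch of how the required settings could be validated before connecting (the helper name is hypothetical, not part of the project):

```python
import os

# The variables listed in .env above; all are required.
REQUIRED_VARS = [
    "SNOWFLAKE_USER", "SNOWFLAKE_PASSWORD", "SNOWFLAKE_ACCOUNT",
    "SNOWFLAKE_WAREHOUSE", "SNOWFLAKE_DATABASE", "SNOWFLAKE_SCHEMA",
    "GROQ_API_KEY",
]

def load_config(env=os.environ):
    """Return the required settings, raising early if any are missing."""
    missing = [v for v in REQUIRED_VARS if not env.get(v)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {v: env[v] for v in REQUIRED_VARS}
```

Failing fast here gives a clearer message than a connection error deep inside the Snowflake client.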

4. Run the Application

Option 1: Run Both Components

# Start both frontend and backend
python main.py

Option 2: Run Components Separately

Backend Only:

# Using the main script
python main.py --backend-only

# Or using uvicorn directly
uvicorn api.server:app --host 0.0.0.0 --port 8000 --reload

Frontend Only:

# Using the main script
python main.py --frontend-only

# Or using streamlit directly
streamlit run frontend/app.py --server.port 8501

Option 3: Check Setup Only

# Check dependencies and environment without starting services
python main.py --check-only

5. Access the Application

Once both services are running:

  • Streamlit frontend: http://localhost:8501
  • FastAPI backend: http://localhost:8000

πŸ“– Usage

1. Web Interface

  1. Open http://localhost:8501 in your browser
  2. Enter your Client ID in the sidebar
  3. Optionally specify a Portfolio ID (leave empty for all portfolios)
  4. Type your natural language query
  5. Click "Analyze Data" to process
  6. View results, insights, and visualizations

2. API Usage

import requests

# Process a query
response = requests.post("http://localhost:8000/analyze", json={
    "client_id": "12345",
    "portfolio_id": "P001",  # Optional
    "user_query": "Show me the total portfolio value"
})

result = response.json()
print(result["analysis"])
print(result["insights"])

3. Example Queries

Portfolio Performance

  • "Show me the total portfolio value for client 12345"
  • "What is the performance of portfolio P001 over the last 6 months?"
  • "Compare returns across all portfolios for client 12345"

Asset Allocation

  • "What is the current asset allocation for portfolio P001?"
  • "Show me the top 10 holdings by value for client 12345"
  • "What percentage is invested in technology stocks?"

Risk Analysis

  • "Calculate the volatility of portfolio P001"
  • "Show me the Sharpe ratio for all portfolios"
  • "What is the maximum drawdown for client 12345?"

Transaction Analysis

  • "Show me all transactions for portfolio P001 in the last month"
  • "What are the largest trades made for client 12345?"
  • "Calculate the average trade size by asset class"

πŸ”§ Configuration

Snowflake Schema Requirements

The application expects your Snowflake database to have tables with the following structure:

-- Example portfolio table
CREATE TABLE portfolios (
    client_id STRING,
    portfolio_id STRING,
    portfolio_name STRING,
    total_value DECIMAL(18,2),
    created_date DATE,
    updated_date TIMESTAMP
);

-- Example holdings table
CREATE TABLE holdings (
    client_id STRING,
    portfolio_id STRING,
    asset_symbol STRING,
    asset_name STRING,
    quantity DECIMAL(18,4),
    market_value DECIMAL(18,2),
    allocation_percent DECIMAL(5,2),
    as_of_date DATE
);

-- Example transactions table
CREATE TABLE transactions (
    client_id STRING,
    portfolio_id STRING,
    transaction_id STRING,
    transaction_date DATE,
    asset_symbol STRING,
    transaction_type STRING,
    quantity DECIMAL(18,4),
    price DECIMAL(18,4),
    total_amount DECIMAL(18,2)
);
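As an example of how the holdings schema fits together, allocation_percent is each holding's share of its portfolio's total market value. A hedged pandas sketch (assumes a single as_of_date per portfolio; the function name is illustrative):

```python
import pandas as pd

def compute_allocation(holdings: pd.DataFrame) -> pd.DataFrame:
    """Derive allocation_percent from market_value within each portfolio.

    Expects the holdings columns shown above (at least portfolio_id and
    market_value); assumes the frame covers one as_of_date.
    """
    totals = holdings.groupby("portfolio_id")["market_value"].transform("sum")
    holdings = holdings.copy()
    holdings["allocation_percent"] = (100 * holdings["market_value"] / totals).round(2)
    return holdings
```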

Customizing the Agent

You can customize the agent behavior by modifying:

  • SQL Generation: Edit prompts in agents/snowflake_agent.py
  • Visualizations: Modify VisualizationGenerator class
  • Insights: Customize QueryAnalyzer class
  • API Endpoints: Add new endpoints in api/main.py

πŸ“Š Features in Detail

Natural Language Processing

  • Converts complex queries into optimized SQL
  • Handles portfolio-specific and client-wide analysis
  • Supports time-based queries and aggregations

Data Visualization

  • Time Series: Line charts for temporal data
  • Distributions: Histograms and bar charts
  • Correlations: Scatter plots for relationships
  • Summary Tables: Statistical overviews
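One way to sketch the chart-type selection described above is to key it off column dtypes. This is illustrative pandas logic only; the project's actual VisualizationGenerator may choose differently:

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype

def choose_chart(df: pd.DataFrame) -> str:
    """Pick a chart type from a result set's column types (illustrative)."""
    has_time = any(is_datetime64_any_dtype(df[c]) for c in df.columns)
    numeric = [c for c in df.columns if is_numeric_dtype(df[c])]
    if has_time and numeric:
        return "line"        # time series -> line chart
    if len(numeric) >= 2:
        return "scatter"     # two numeric columns -> correlation scatter
    if numeric:
        return "bar"         # one numeric + categories -> bar chart
    return "table"           # nothing plottable -> summary table
```

The returned label would then map to the corresponding Plotly figure constructor.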

Insights Generation

  • Statistical analysis of numeric columns
  • Categorical data distribution analysis
  • Key metrics and trends identification
  • Financial analyst-style summaries
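A minimal sketch of the statistical pass described above (the real QueryAnalyzer additionally uses the LLM to phrase analyst-style summaries; this helper is hypothetical):

```python
import pandas as pd

def basic_insights(df: pd.DataFrame) -> list[str]:
    """Produce simple per-column observations from a query result."""
    notes = []
    # Numeric columns: central tendency and range.
    for col in df.select_dtypes(include="number").columns:
        notes.append(
            f"{col}: mean={df[col].mean():.2f}, "
            f"min={df[col].min()}, max={df[col].max()}"
        )
    # Categorical columns: cardinality and the dominant value.
    for col in df.select_dtypes(include="object").columns:
        top = df[col].mode().iloc[0]
        notes.append(f"{col}: {df[col].nunique()} distinct values, most common '{top}'")
    return notes
```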

Error Handling

  • Graceful handling of connection issues
  • SQL error recovery and suggestions
  • User-friendly error messages
  • Comprehensive logging
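Transient connection problems are typically handled with bounded retries. A hedged sketch of such a wrapper (the project's actual recovery logic may differ):

```python
import time

def with_retries(fn, attempts=3, delay=0.5, retry_on=(ConnectionError,)):
    """Call fn, retrying on transient errors with a simple linear backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on as exc:
            if attempt == attempts:
                raise                      # out of retries: surface the error
            time.sleep(delay * attempt)    # back off before the next attempt
```

Non-transient failures (e.g. SQL syntax errors) are excluded from retry_on so they surface immediately with a user-friendly message.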

πŸ”’ Security Considerations

  • Store sensitive credentials in environment variables
  • Use Snowflake role-based access control
  • Implement proper API authentication for production
  • Secure network connections to Snowflake
  • Regular credential rotation

πŸš€ Deployment

Production Setup

  1. Environment: Use a production Python environment
  2. Process Management: Use systemd or Docker
  3. Reverse Proxy: Configure nginx for SSL termination
  4. Monitoring: Implement health checks and logging
  5. Scaling: Use multiple API instances behind a load balancer

Docker Deployment

FROM python:3.11-slim

WORKDIR /app
COPY . .

RUN pip install --no-cache-dir -e .

EXPOSE 8000 8501

CMD ["python", "main.py"]

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ†˜ Support

For issues and questions:

  1. Check the documentation
  2. Review existing issues
  3. Create a new issue with detailed information
  4. Include error logs and configuration details

πŸ”„ Updates

Stay updated with the latest features and improvements:

git pull origin main
pip install -e .

Built with ❀️ using LangGraph, Groq, FastAPI, and Streamlit
