An AI-powered data analysis system that converts natural language queries into SQL and provides comprehensive insights and visualizations for Snowflake data warehouses.
- Natural Language to SQL: Convert human-readable queries into optimized SQL for Snowflake
- Multi-Portfolio Analysis: Analyze specific portfolios or all portfolios for a client
- Intelligent Insights: AI-generated insights and key metrics from query results
- Interactive Visualizations: Automatic generation of charts and graphs using Plotly
- Agentic Architecture: Built with LangGraph for robust workflow management
- Modern UI: Beautiful Streamlit interface with real-time data analysis
- RESTful API: FastAPI backend for easy integration and scalability
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Streamlit     │    │   FastAPI       │    │   Snowflake     │
│   Frontend      │◄──►│   Backend       │◄──►│   Database      │
│   (Port 8501)   │    │   (Port 8000)   │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌─────────────────┐
                       │   LangGraph     │
                       │   Agent         │
                       │   (Groq LLM)    │
                       └─────────────────┘
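The diagram suggests that the backend hands each question to an agent that works through a sequence of steps. The sketch below models that idea in plain Python as a list of functions passing along a shared state dictionary. It does not use LangGraph's API, and the stage names (`generate_sql`, `run_query`, `summarize`), their stub bodies, and the sample value are all hypothetical, since this README does not list the agent's actual nodes.

```python
from typing import Callable

State = dict[str, object]
Step = Callable[[State], State]


def generate_sql(state: State) -> State:
    # Stub: a real agent would ask the LLM to translate state["user_query"].
    return {**state, "sql": "SELECT SUM(total_value) FROM portfolios WHERE client_id = ?"}


def run_query(state: State) -> State:
    # Stub: a real agent would execute state["sql"] against Snowflake.
    return {**state, "rows": [(1250000.00,)]}  # made-up result row


def summarize(state: State) -> State:
    # Stub: a real agent would ask the LLM for insights on the rows.
    total = state["rows"][0][0]
    return {**state, "insights": f"Total portfolio value: {total:,.2f}"}


def run_pipeline(state: State, steps: list[Step]) -> State:
    """Pass the state through each step in order and return the final state."""
    for step in steps:
        state = step(state)
    return state


result = run_pipeline(
    {"client_id": "12345", "user_query": "Show me the total portfolio value"},
    [generate_sql, run_query, summarize],
)
print(result["insights"])  # Total portfolio value: 1,250,000.00
```

Keeping every stage as a function from state to state makes each one easy to test on its own, which is roughly the property a graph-based workflow library formalizes.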
- Frontend: Streamlit
- Backend: FastAPI
- AI/ML: LangGraph, Groq (LLaMA3-8b-8192)
- Database: Snowflake
- Visualization: Plotly
- Data Processing: Pandas
- Language: Python 3.11+
- Python 3.11 or higher
- Snowflake account with appropriate permissions
- Groq API key
- Network access to Snowflake and Groq APIs
git clone https://github.com/vishalpatel72/Snowflake-Data-Analyst.git
cd Snowflake-Data-Analyst
# Install using uv (recommended)
uv sync
# Or using pip
pip install -e .
# Copy the example environment file
cp env.example .env
# Edit .env with your credentials
nano .env
Required environment variables:
# Snowflake Configuration
SNOWFLAKE_USER=your_username
SNOWFLAKE_PASSWORD=your_password
SNOWFLAKE_ACCOUNT=your_account
SNOWFLAKE_WAREHOUSE=your_warehouse
SNOWFLAKE_DATABASE=your_database
SNOWFLAKE_SCHEMA=your_schema
# Groq API Configuration
GROQ_API_KEY=your_groq_api_key
# Start both frontend and backend
python main.py
Backend Only:
# Using the main script
python main.py --backend-only
# Or using uvicorn directly
uvicorn api.server:app --host 0.0.0.0 --port 8000 --reload
Frontend Only:
# Using the main script
python main.py --frontend-only
# Or using streamlit directly
streamlit run frontend/app.py --server.port 8501
# Check dependencies and environment without starting services
python main.py --check-only
- Frontend: http://localhost:8501
- API Documentation: http://localhost:8000/docs
- API Health Check: http://localhost:8000/health
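When the services are started from a script, it can help to wait for the health endpoint before opening the frontend. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and it assumes only that the health URL answers with HTTP 200 when the API is up; the README does not describe the response body, so none is parsed.

```python
import time
import urllib.request
from typing import Callable


def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, timeouts and refused connections all derive from OSError.
        return False


def wait_until_ready(check: Callable[[], bool], attempts: int = 10, delay: float = 1.0) -> bool:
    """Call `check` up to `attempts` times, sleeping `delay` seconds between failures."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A typical call would be `wait_until_ready(lambda: is_healthy("http://localhost:8000/health"))`. Passing the check in as a callable keeps the polling loop testable without a running server.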
- Open http://localhost:8501 in your browser
- Enter your Client ID in the sidebar
- Optionally specify a Portfolio ID (leave empty for all portfolios)
- Type your natural language query
- Click "Analyze Data" to process
- View results, insights, and visualizations
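The sidebar fields map onto a small JSON request body. Below is a minimal sketch of building that body, using the field names from the API example in this README. The helper name is hypothetical, and it assumes the API treats a missing `portfolio_id` as "all portfolios", mirroring the "leave empty" behavior described for the UI.

```python
from typing import Optional


def build_analyze_payload(client_id: str, user_query: str, portfolio_id: Optional[str] = None) -> dict:
    """Build the JSON body for POST /analyze, dropping a blank portfolio ID."""
    if not client_id.strip():
        raise ValueError("client_id is required")
    if not user_query.strip():
        raise ValueError("user_query is required")

    payload = {"client_id": client_id.strip(), "user_query": user_query.strip()}
    # An empty or whitespace-only portfolio ID means "all portfolios" (assumed behavior).
    if portfolio_id and portfolio_id.strip():
        payload["portfolio_id"] = portfolio_id.strip()
    return payload


print(build_analyze_payload("12345", "Show me the total portfolio value"))
print(build_analyze_payload("12345", "Show me the top 10 holdings by value", "P001"))
```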
import requests
# Process a query
response = requests.post(
    "http://localhost:8000/analyze",
    json={
        "client_id": "12345",
        "portfolio_id": "P001",  # Optional
        "user_query": "Show me the total portfolio value",
    },
    timeout=60,
)
response.raise_for_status()  # Fail early on HTTP errors instead of parsing an error body
result = response.json()
print(result["analysis"])
print(result["insights"])
- "Show me the total portfolio value for client 12345"
- "What is the performance of portfolio P001 over the last 6 months?"
- "Compare returns across all portfolios for client 12345"
- "What is the current asset allocation for portfolio P001?"
- "Show me the top 10 holdings by value for client 12345"
- "What percentage is invested in technology stocks?"
- "Calculate the volatility of portfolio P001"
- "Show me the Sharpe ratio for all portfolios"
- "What is the maximum drawdown for client 12345?"
- "Show me all transactions for portfolio P001 in the last month"
- "What are the largest trades made for client 12345?"
- "Calculate the average trade size by asset class"
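Several of the queries above ask for risk metrics. This README does not say how the system computes them, so the sketch below only pins down the conventional definitions in plain Python. The function names, the 252-period annualization, the zero default risk-free rate, and the sample values are assumptions for illustration.

```python
import math
import statistics


def period_returns(values: list[float]) -> list[float]:
    """Simple returns between consecutive portfolio values."""
    return [later / earlier - 1 for earlier, later in zip(values, values[1:])]


def volatility(returns: list[float], periods_per_year: int = 252) -> float:
    """Annualized sample standard deviation of returns."""
    return statistics.stdev(returns) * math.sqrt(periods_per_year)


def sharpe_ratio(returns: list[float], risk_free_per_period: float = 0.0,
                 periods_per_year: int = 252) -> float:
    """Annualized mean excess return divided by its standard deviation."""
    excess = [r - risk_free_per_period for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)


def max_drawdown(values: list[float]) -> float:
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


# Made-up daily portfolio values, for illustration only.
values = [100.0, 120.0, 90.0, 110.0, 80.0, 130.0]
print(round(max_drawdown(values), 4))  # 0.3333: from the 120 peak down to 80
```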
The application expects your Snowflake database to have tables with the following structure:
-- Example portfolio table
CREATE TABLE portfolios (
    client_id STRING,
    portfolio_id STRING,
    portfolio_name STRING,
    total_value DECIMAL(18,2),
    created_date DATE,
    updated_date TIMESTAMP
);
-- Example holdings table
CREATE TABLE holdings (
    client_id STRING,
    portfolio_id STRING,
    asset_symbol STRING,
    asset_name STRING,
    quantity DECIMAL(18,4),
    market_value DECIMAL(18,2),
    allocation_percent DECIMAL(5,2),
    as_of_date DATE
);
-- Example transactions table
CREATE TABLE transactions (
    client_id STRING,
    portfolio_id STRING,
    transaction_id STRING,
    transaction_date DATE,
    asset_symbol STRING,
    transaction_type STRING,
    quantity DECIMAL(18,4),
    price DECIMAL(18,4),
    total_amount DECIMAL(18,2)
);
You can customize the agent behavior by modifying:
- SQL Generation: Edit prompts in agents/snowflake_agent.py
- Visualizations: Modify the VisualizationGenerator class
- Insights: Customize the QueryAnalyzer class
- API Endpoints: Add new endpoints in api/main.py
- Converts complex queries into optimized SQL
- Handles portfolio-specific and client-wide analysis
- Supports time-based queries and aggregations
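To make the difference between portfolio-specific and client-wide analysis concrete, here is a sketch of the kind of query a request such as "top holdings by value" could translate to. It is not the SQL the agent actually generates. An in-memory SQLite database stands in for Snowflake, the table is a trimmed-down version of the example `holdings` schema in this README, the rows are made up, and `top_holdings_query` is a hypothetical helper. IDs are passed as bound parameters rather than interpolated into the SQL string.

```python
import sqlite3
from typing import Optional


def top_holdings_query(client_id: str, portfolio_id: Optional[str] = None, limit: int = 10):
    """Return (sql, params) for the largest holdings, optionally for one portfolio."""
    sql = "SELECT asset_symbol, market_value FROM holdings WHERE client_id = ?"
    params: list = [client_id]
    if portfolio_id is not None:
        # Narrow to a single portfolio; otherwise all of the client's portfolios are included.
        sql += " AND portfolio_id = ?"
        params.append(portfolio_id)
    sql += " ORDER BY market_value DESC LIMIT ?"
    params.append(limit)
    return sql, params


# In-memory SQLite database with a trimmed-down holdings table and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE holdings (client_id TEXT, portfolio_id TEXT, asset_symbol TEXT, market_value REAL)"
)
conn.executemany(
    "INSERT INTO holdings VALUES (?, ?, ?, ?)",
    [
        ("12345", "P001", "AAPL", 5000.0),
        ("12345", "P001", "MSFT", 7000.0),
        ("12345", "P002", "GOOG", 6000.0),
        ("99999", "P009", "TSLA", 9000.0),
    ],
)

sql, params = top_holdings_query("12345", limit=2)
print(conn.execute(sql, params).fetchall())  # [('MSFT', 7000.0), ('GOOG', 6000.0)]

sql, params = top_holdings_query("12345", portfolio_id="P001")
print(conn.execute(sql, params).fetchall())  # [('MSFT', 7000.0), ('AAPL', 5000.0)]
```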
- Time Series: Line charts for temporal data
- Distributions: Histograms and bar charts
- Correlations: Scatter plots for relationships
- Summary Tables: Statistical overviews
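One way to read the list above is as a mapping from the shape of a result set to a chart type. The sketch below encodes that mapping as a small heuristic. It is an illustration only, not the logic of the project's VisualizationGenerator class; the function name and the three column kinds (`datetime`, `numeric`, `categorical`) are assumptions.

```python
def choose_chart(column_kinds: dict[str, str]) -> str:
    """Pick a chart type from column kinds: 'datetime', 'numeric' or 'categorical'."""
    kinds = list(column_kinds.values())
    numeric = kinds.count("numeric")
    if "datetime" in kinds and numeric >= 1:
        return "line"        # time series
    if "categorical" in kinds and numeric >= 1:
        return "bar"         # distribution across categories
    if numeric >= 2:
        return "scatter"     # relationship between two measures
    if numeric == 1:
        return "histogram"   # distribution of a single measure
    return "table"           # nothing to plot; fall back to a summary table


print(choose_chart({"transaction_date": "datetime", "total_amount": "numeric"}))  # line
print(choose_chart({"asset_symbol": "categorical", "market_value": "numeric"}))   # bar
print(choose_chart({"quantity": "numeric", "price": "numeric"}))                  # scatter
```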
- Statistical analysis of numeric columns
- Categorical data distribution analysis
- Key metrics and trends identification
- Financial analyst-style summaries
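The first two items can be illustrated with the standard library alone. This is a hedged sketch of what "statistical analysis" and "distribution analysis" might mean for a single column. It is not the project's QueryAnalyzer class, and the function names and sample rows are made up.

```python
import statistics
from collections import Counter


def summarize_numeric(values: list[float]) -> dict[str, float]:
    """Basic descriptive statistics for one numeric column."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


def summarize_categorical(values: list[str]) -> dict[str, float]:
    """Share of rows per category, largest first."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.most_common()}


# Made-up rows for illustration.
print(summarize_numeric([5000.0, 7000.0, 6000.0]))
print(summarize_categorical(["BUY", "SELL", "BUY", "BUY"]))  # {'BUY': 0.75, 'SELL': 0.25}
```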
- Graceful handling of connection issues
- SQL error recovery and suggestions
- User-friendly error messages
- Comprehensive logging
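The points above can be combined in one small pattern: retry transient connection failures, log each attempt, and surface a readable message once retries run out. The sketch below is an assumption about how this might look, not the project's code. The built-in `ConnectionError` stands in for whatever exception the database driver raises, and the helper and logger names are hypothetical.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("snowflake_analyst")


def with_retries(operation: Callable[[], T], attempts: int = 3, base_delay: float = 0.5) -> T:
    """Run `operation`, retrying connection errors with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError as exc:
            logger.warning("Attempt %d of %d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                # Out of retries: surface a message suitable for showing to the user.
                raise RuntimeError("Could not reach the database. Please try again shortly.") from exc
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

Chaining the original exception with `from exc` keeps the technical detail in the logs and traceback while the user sees only the friendly message.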
- Store sensitive credentials in environment variables
- Use Snowflake role-based access control
- Implement proper API authentication for production
- Secure network connections to Snowflake
- Regular credential rotation
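For the first point, a startup check that fails fast when a credential is missing avoids confusing errors later on. A minimal sketch follows, using the variable names from the environment configuration in this README. The `load_settings` helper is hypothetical, and it reports only the names of missing variables so that secret values never end up in logs.

```python
import os
from typing import Mapping

REQUIRED_VARS = (
    "SNOWFLAKE_USER",
    "SNOWFLAKE_PASSWORD",
    "SNOWFLAKE_ACCOUNT",
    "SNOWFLAKE_WAREHOUSE",
    "SNOWFLAKE_DATABASE",
    "SNOWFLAKE_SCHEMA",
    "GROQ_API_KEY",
)


def load_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the required settings, or raise listing every missing variable."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        # Report names only; never echo secret values into logs or error messages.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Accepting the environment as a parameter makes the check easy to exercise in tests with a plain dictionary.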
- Environment: Use a production Python environment
- Process Management: Use systemd or Docker
- Reverse Proxy: Configure nginx for SSL termination
- Monitoring: Implement health checks and logging
- Scaling: Use multiple API instances behind a load balancer
FROM python:3.11-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
EXPOSE 8000 8501
CMD ["python", "main.py"]
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
For issues and questions:
- Check the documentation
- Review existing issues
- Create a new issue with detailed information
- Include error logs and configuration details
Stay updated with the latest features and improvements:
git pull origin main
pip install -e .
Built with ❤️ using LangGraph, Groq, FastAPI, and Streamlit