Project Overview and Architecture
System Overview
The BYU-Idaho Support Agent is an enterprise-grade AI-powered student support system designed to provide instant, accurate, and values-aligned responses to student inquiries. The system leverages state-of-the-art language models, vector search capabilities, and cloud-native architecture to deliver 24/7 support services.
High-Level Architecture
The system follows a modern microservices architecture with clear separation of concerns:
┌──────────────────────────────────────────────────────────┐
│ Azure Cloud Platform │
├──────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ ┌──────────────────┐ │
│ │ Frontend │ HTTP/SSE │ Backend API │ │
│ │ React/TS ├──────────►│ Python/FastAPI │ │
│ │ Container App │ │ Container App │ │
│ └─────────────────┘ └────────┬─────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ AI Services & Data Layer │ │
│ ├─────────────────┬───────────────┬───────────────┤ │
│ │ OpenAI API │ Pinecone │ Azure │ │
│ │ (GPT-5.1) │ Vector DB │ Key Vault │ │
│ └─────────────────┴───────────────┴───────────────┘ │
│ │
└──────────────────────────────────────────────────────────┘
Core Components
React + TypeScript (Frontend)
The user interface is built with React and TypeScript. React, developed by Meta (Facebook), is the industry standard for building modern, responsive web apps: its component model supports rapid prototyping, straightforward state management, and complex interfaces that stay maintainable as they grow.
TypeScript, developed by Microsoft, is a superset of JavaScript that adds static typing. Catching type errors at compile time rather than at runtime reduces bugs and improves maintainability, which is why TypeScript is widely preferred for larger or mission-critical applications.
Key frontend capabilities:
- Single-page application providing the user interface
- Real-time streaming chat interface
- Responsive design supporting desktop, tablet, and mobile devices
- Theme management (light/dark/terminal modes)
- Markdown support for rich text formatting
Python + FastAPI (Backend)
The backend is built with Python and FastAPI. Python has become the de facto standard for AI and machine learning applications, powering everything from research prototypes to production-scale systems at organizations like Google, Meta, and OpenAI. Its dominance in the AI space stems from several key strengths:
- Rich AI Ecosystem: Native support for industry-leading frameworks including OpenAI Agents SDK, Claude Agent SDK (Claude Code), Model Context Protocol (MCP), LangChain, CrewAI, Hugging Face Transformers, scikit-learn, and more
- Simplicity and Readability: Clean, expressive syntax that prioritizes developer productivity and code maintainability
- Rapid Prototyping: Dynamic typing and extensive standard library enable quick iteration and experimentation
- Scientific Computing: Robust support for numerical operations, data manipulation, and statistical analysis through NumPy, Pandas, and SciPy
- Community and Resources: Largest community of AI/ML practitioners, ensuring extensive documentation, tutorials, and third-party libraries
Python's integration with cutting-edge AI services makes it the optimal choice for building intelligent applications that require seamless interaction with language models, vector databases, and other AI infrastructure.
The AI agent system is built with the OpenAI Agents SDK, OpenAI's official framework for building agentic systems, from simple question answering to complex multi-agent orchestration. Adopting it now lays the groundwork for future expansion into multi-agent support systems.
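As an illustration of what this looks like in code, the sketch below defines a single agent with one tool using the `agents` Python package. The agent name, instructions, and the `campus_hours` tool are hypothetical placeholders, not the production configuration.

```python
# Minimal sketch of an OpenAI Agents SDK agent with one tool.
# The agent name, instructions, and tool are illustrative only.
from agents import Agent, Runner, function_tool


@function_tool
def campus_hours(building: str) -> str:
    """Return operating hours for a campus building (placeholder data)."""
    return f"{building} is open 7:00 AM - 10:00 PM, Monday through Saturday."


support_agent = Agent(
    name="BYU-Idaho Support Agent",
    instructions=(
        "Answer student questions accurately and courteously. "
        "Use tools when a question requires campus-specific facts."
    ),
    tools=[campus_hours],
)

if __name__ == "__main__":
    # Runner.run_sync drives the agent loop until it produces a final answer.
    result = Runner.run_sync(support_agent, "What time does the library close?")
    print(result.final_output)
```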
FastAPI is a modern Python framework designed for speed, flexibility, and developer productivity. It offers automatic OpenAPI (Swagger) documentation generation, an async-first architecture, and request validation driven by Python type hints, making it straightforward for engineers to build high-performance APIs rapidly and reliably.
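As a sketch of what this looks like in practice (the route path and request model below are illustrative, not the actual API contract), a typed async endpoint can be defined as follows; FastAPI derives the interactive docs and validation from the type hints automatically.

```python
# Illustrative FastAPI endpoint; the route and model are placeholders,
# not the production API contract.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Support Agent API")


class ChatRequest(BaseModel):
    session_id: str
    message: str


@app.post("/chat")
async def chat(request: ChatRequest) -> dict:
    # In the real service this would invoke the agent and stream tokens via SSE;
    # here we just echo the input to show the typed request/response shape.
    return {"session_id": request.session_id, "reply": f"You said: {request.message}"}
```

Running this with `uvicorn main:app` exposes interactive Swagger documentation at `/docs` with no extra configuration.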
AI Agent System
- Agentic architecture with tool-calling capabilities
- Retrieval-Augmented Generation (RAG) for knowledge grounding (see the retrieval sketch after this list)
- Multi-agent system with specialized sub-agents
- Context-aware session and memory management
- Guardrails for ethical and secure behavior
- Tracing and logging for monitoring, evaluation, and fine-tuning
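The retrieval step at the heart of the RAG pipeline can be sketched as a single tool the agent may call. The index name, metadata field, and embedding model below are assumptions for illustration (text-embedding-3-large produces 3072-dimension vectors, consistent with the Pinecone configuration described later), not the production setup.

```python
# Sketch of a RAG retrieval tool: embed the query, search Pinecone, return context.
# Index name, metadata field, and embedding model are illustrative assumptions.
import os

from agents import function_tool
from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index("byui-support-kb")  # hypothetical index name


@function_tool
def search_knowledge_base(query: str) -> str:
    """Look up the most relevant knowledge-base passages for a student question."""
    # Embed the query; text-embedding-3-large returns 3072-dimension vectors.
    embedding = openai_client.embeddings.create(
        model="text-embedding-3-large",
        input=query,
    ).data[0].embedding

    # Retrieve the top matches by cosine similarity.
    results = index.query(vector=embedding, top_k=5, include_metadata=True)

    # Concatenate retrieved passages so the agent can ground its answer in them.
    return "\n\n".join((m.metadata or {}).get("text", "") for m in results.matches)
```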
Technology Stack and Dependencies
Frontend Stack
Core Technologies:
- Language: TypeScript
- Framework: React
- Build Tool: Vite
- Styling: Tailwind CSS
- State Management: React Context API
- Routing: React Router
Key Dependencies:
- Material UI - Component library
- Lucide React - Icon library
- React Markdown - Markdown rendering
- Highlight.js - Code syntax highlighting
- Prism React Renderer - Enhanced code blocks
Backend Stack
Core Technologies:
- Language: Python
- Framework: FastAPI
- Server: Uvicorn
- AI Framework: OpenAI Agents SDK
Infrastructure Stack
Container & Orchestration:
- Docker (multi-stage builds)
- Azure Container Apps
- Azure Container Registry
Cloud Services:
- OpenAI API
- Azure App Configuration
- Pinecone Vector Database
Data Protection
Encryption:
- In Transit: TLS 1.3 for all communications
- At Rest: AES-256 encryption for sensitive data
Performance Specifications
Response Time Requirements
API Response Times (P95):
- Health Check: < 100ms
- Authentication: < 500ms
- Chat Initial Response: < 1000ms
- Streaming Token Rate: > 30 tokens/second
Throughput Specifications
System Capacity:
- Concurrent Users: 1000+
- Requests per Second: 500
- Streaming Sessions: 100 concurrent
- Vector Search QPS: 1000
Caching Strategy
Cache Layers:
- Browser Cache: Static assets (1 year TTL)
- CDN Cache: Public content (1 hour TTL)
- Application Cache: Session data (30 minutes TTL)
- Vector Cache: Embedding results (24 hours TTL)
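As a sketch of the application- and vector-cache layers (the cachetools dependency and key scheme are assumptions, not the project's actual implementation), in-process TTL caches matching the TTLs above could look like this:

```python
# Hypothetical in-process TTL caches mirroring the TTLs listed above.
# The cachetools dependency and key scheme are illustrative assumptions.
import hashlib

from cachetools import TTLCache

session_cache = TTLCache(maxsize=10_000, ttl=30 * 60)         # session data, 30 minutes
embedding_cache = TTLCache(maxsize=50_000, ttl=24 * 60 * 60)  # embeddings, 24 hours


def cached_embedding(text: str, embed_fn) -> list[float]:
    """Return a cached embedding for text, computing it with embed_fn on a miss."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in embedding_cache:
        embedding_cache[key] = embed_fn(text)
    return embedding_cache[key]
```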
Integration Points
External Service Integrations
Azure OpenAI Service
- Model: GPT-5.1
- Rate Limit: 300 RPM
- Fallback: OpenAI API
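A hedged sketch of the fallback behavior (the endpoint variable, API version, and deployment name are placeholders): try the Azure OpenAI deployment first and fall back to the public OpenAI API if the call fails.

```python
# Illustrative fallback from Azure OpenAI to the public OpenAI API.
# Endpoint, API version, deployment name, and model identifier are placeholders.
import os

from openai import AzureOpenAI, OpenAI

azure_client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)
openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment


def complete(messages: list[dict]) -> str:
    """Try the Azure deployment first; fall back to the OpenAI API on failure."""
    try:
        response = azure_client.chat.completions.create(
            model="support-agent-deployment",  # Azure deployment name (placeholder)
            messages=messages,
        )
    except Exception:
        response = openai_client.chat.completions.create(
            model="gpt-5.1",  # fallback model identifier (placeholder)
            messages=messages,
        )
    return response.choices[0].message.content
```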
Pinecone Vector Database
- Environment: Production
- Dimensions: 3072
- Metric: Cosine similarity
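As a hedged sketch (index name, cloud, and region are placeholder assumptions), an index with this configuration would be created roughly as follows:

```python
# Illustrative Pinecone index creation matching the configuration above.
# Index name, cloud, and region are placeholder assumptions.
import os

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

pc.create_index(
    name="byui-support-kb",  # hypothetical index name
    dimension=3072,          # matches text-embedding-3-large output size
    metric="cosine",         # cosine similarity, as specified above
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)
```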
Tavily Search API
- Purpose: Web search for research agent
- Rate Limit: 1000 queries/day
- Timeout: 10 seconds
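For illustration only (the query text and result handling are placeholders; consult the tavily-python documentation for the full set of client options), the research agent's web-search call might look like:

```python
# Illustrative Tavily web search call for the research agent.
# Query text and result handling are placeholders.
import os

from tavily import TavilyClient

tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

results = tavily.search(query="BYU-Idaho spring semester registration deadline")
for item in results.get("results", []):
    print(item["title"], item["url"])
```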
Monitoring Integrations
Azure Application Insights
- Custom metrics for AI performance
- Distributed tracing enabled
- Log aggregation and analysis
- Alert rules for critical metrics
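A hedged sketch of wiring this up with the azure-monitor-opentelemetry distro (the connection-string variable and the span and attribute names are assumptions):

```python
# Illustrative Application Insights wiring via the Azure Monitor OpenTelemetry distro.
# The connection-string variable and span/attribute names are assumptions.
import os

from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry import trace

# Exports traces, logs, and metrics to Application Insights.
configure_azure_monitor(
    connection_string=os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"],
)

tracer = trace.get_tracer("support-agent")

with tracer.start_as_current_span("agent.chat_turn") as span:
    span.set_attribute("ai.model", "gpt-5.1")       # custom attribute for AI metrics
    span.set_attribute("ai.tokens_streamed", 512)   # placeholder value
```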
Development Workflow and Standards
Code Quality Standards
Frontend Standards:
- TypeScript strict mode enabled
- ESLint configuration enforced
- Prettier formatting required
- Component testing coverage > 80%
Backend Standards:
- Type hints required
- Black formatting
- Pylint score > 9.0
- Unit test coverage > 85%
CI/CD Pipeline
Build Pipeline:
- Code checkout
- Dependency installation
- Linting and formatting checks
- Unit test execution
- Security scanning
- Docker image build
- Container registry push
Deployment Pipeline:
- Infrastructure validation
- Blue-green deployment
- Health check verification
- Smoke test execution
- Traffic shifting (0% → 10% → 50% → 100%)
- Rollback on failure
Development Tools
Required Tools:
- Cursor IDE
- Docker Desktop for containerization
- Azure CLI/MCP for cloud operations
- Chrome DevTools for frontend debugging
Appendices
A. Glossary
- RAG: Retrieval-Augmented Generation
- SSE: Server-Sent Events
- LLM: Large Language Model
- FERPA: Family Educational Rights and Privacy Act
B. References
C. Contact Information
Technical Lead:
- Ron Vallejo, AI Engineer, BYU-Idaho
- Email: vallejor@byui.edu
Support:
This document is maintained by the BYU-Idaho AI Engineering Team and should be updated with each major release.