Feature: Improve Chatbot Functionality and User Data Handling (!4) · Merge requests · Alex Rudaev / Assistant-Systems-Project

Alex Rudaev requested to merge feature/chatbot-improvement into refac/docs_and_personas Nov 14, 2024

This merge request extends the chatbot's functionality and user data handling in the Assistant Systems project. It expands the user data inputs, refines the recommendation flow, adds fallback handling for unrecognized input, and updates the configuration and dependency files to support these changes.

Key Enhancements:

1. Expanded User Data Collection:

src/app.py:
- Updated the user input form to capture additional health metrics, including average glucose level, marital status, work type, residence type, and smoking status.
- Stored the extended user data as JSON files in data/user_data/, keyed by session, so that the data persists and can be reloaded later for that session.
- Displayed the stored user data within the application for user verification.
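
The storage step above could look roughly like the following. This is a minimal sketch, not the code from src/app.py: the function names save_user_data and load_user_data, the one-file-per-session naming scheme, and the example field names are assumptions. Only the data/user_data/ directory and the use of a session ID come from this description.

```python
import json
from pathlib import Path

# Directory named in the description; everything else here is illustrative.
USER_DATA_DIR = Path("data/user_data")


def save_user_data(session_id: str, user_data: dict, base_dir: Path = USER_DATA_DIR) -> Path:
    """Write one JSON file per session (hypothetical naming: <session_id>.json)."""
    base_dir.mkdir(parents=True, exist_ok=True)
    path = base_dir / f"{session_id}.json"
    path.write_text(json.dumps(user_data, indent=2), encoding="utf-8")
    return path


def load_user_data(session_id: str, base_dir: Path = USER_DATA_DIR) -> dict:
    """Read the stored data back, for example to display it for verification."""
    path = base_dir / f"{session_id}.json"
    return json.loads(path.read_text(encoding="utf-8"))
```

A real implementation would also need to validate the session ID before using it in a file name, since the sketch trusts it as given.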

2. Enhanced Recommendation Flow:

actions/actions.py:
- ActionGenerateRecommendation:
  - Modified to utilize the newly collected user data fields for generating more accurate stroke risk predictions.
  - Updated to load and preprocess comprehensive user data from JSON files based on the session ID.
  - Improved error handling and logging for data loading, model processing, and prediction generation.
  - Introduced a risk_level slot to manage the flow of recommendations, triggering additional advice if the risk is high.
- ActionProvideStrokeRiskReductionAdvice:
  - Added to provide personalized lifestyle recommendations to users identified as high-risk.
- ActionFallback:
  - Implemented to handle unrecognized user inputs gracefully, prompting users to rephrase their queries.
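
The risk_level flow described above can be sketched independently of Rasa as follows. The 0.5 threshold, the "high"/"low" values, and the function names classify_risk and build_recommendation are assumptions made for illustration; the merge request does not state them. The slot update is written as a plain dict that resembles a slot event, so the example stays dependency-free.

```python
from typing import Any, Dict

HIGH_RISK_THRESHOLD = 0.5  # assumed cut-off, not taken from the merge request


def classify_risk(stroke_probability: float, threshold: float = HIGH_RISK_THRESHOLD) -> str:
    """Map a predicted probability to a coarse risk level."""
    if not 0.0 <= stroke_probability <= 1.0:
        raise ValueError("stroke_probability must be between 0 and 1")
    return "high" if stroke_probability >= threshold else "low"


def build_recommendation(stroke_probability: float) -> Dict[str, Any]:
    """Return the message, the slot update, and whether to offer extra advice."""
    risk_level = classify_risk(stroke_probability)
    message = f"Your estimated stroke risk is {risk_level} ({stroke_probability:.0%})."
    # Plain-dict stand-in for a slot event that sets risk_level.
    events = [{"event": "slot", "name": "risk_level", "value": risk_level}]
    return {
        "message": message,
        "events": events,
        # Only high-risk users are offered the risk reduction advice.
        "offer_advice": risk_level == "high",
    }
```

In the actual action, the returned slot value would be what the rules and stories branch on when deciding whether to offer the stroke risk reduction advice.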

3. Improved Data Analysis Reporting:

actions/actions.py:
- ActionShowDataAnalysis:
  - Enhanced to display key factors correlating with stroke risk, presenting correlation coefficients and descriptive explanations.
  - Updated to identify and report the best-performing machine learning model based on evaluation metrics.
  - Added comprehensive error handling to ensure reliability in data reporting.
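
The reporting logic described above might be organized along these lines. This is a sketch under assumptions: the function names, the choice of "f1" as the default metric, and the fallback message are hypothetical, and the inputs are expected to be precomputed dictionaries rather than the project's actual data structures.

```python
from typing import Dict, List, Tuple


def rank_factors(correlations: Dict[str, float], top_n: int = 3) -> List[Tuple[str, float]]:
    """Order factors by the absolute value of their correlation with stroke."""
    ranked = sorted(correlations.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top_n]


def best_model(metrics: Dict[str, Dict[str, float]], metric: str = "f1") -> str:
    """Pick the model with the highest value of the chosen evaluation metric."""
    if not metrics:
        raise ValueError("no model metrics available")
    return max(metrics, key=lambda name: metrics[name][metric])


def build_report(correlations: Dict[str, float],
                 metrics: Dict[str, Dict[str, float]],
                 metric: str = "f1") -> str:
    """Assemble the report text, falling back to a notice if the data is incomplete."""
    try:
        factors = rank_factors(correlations)
        winner = best_model(metrics, metric)
    except (KeyError, ValueError) as exc:
        return f"Data analysis is currently unavailable ({exc})."
    lines = [f"- {name}: correlation {value:+.2f}" for name, value in factors]
    lines.append(f"Best model by {metric}: {winner}")
    return "\n".join(lines)
```

Ranking by absolute value keeps strongly negative correlations in the list, which a plain descending sort would push to the bottom.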

4. Updated Configuration and Rules:

config.yml:
- Upgraded to version 3.1, updating the pipeline with random_seed and constrain_similarities for more consistent model behavior.
- Simplified pipeline configurations by removing redundant ResponseSelector entries.
data/rules.yml:
- Added new rules to manage the flow of providing stroke risk reduction advice based on user responses.
- Implemented rules to handle various chitchat intents and fallback scenarios effectively.
data/nlu.yml:
- Expanded intents with additional examples to cover more user interactions and synonyms for affirmative and negative responses.
data/stories.yml:
- Created new stories to handle scenarios where users seek recommendations and either accept or decline additional advice.
- Included a fallback story to manage unrecognized user inputs.

5. Domain and Dependency Updates:

domain.yml:
- Updated intents, slots, and responses to accommodate new functionalities and user interactions.
- Defined new actions and slots, such as risk_level, to support enhanced recommendation flows.
requirements-actions.txt & requirements.txt:
- Specified exact versions for dependencies like numpy, joblib, scikit-learn, and rasa-sdk to ensure compatibility and stability.
- Organized dependencies for clarity and maintainability.
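
One way to confirm that every dependency is pinned to an exact version is a small check like the one below. It is an illustrative sketch, not part of this merge request: the function name unpinned_requirements is hypothetical, and the pattern only covers simple `name==version` lines (optionally with extras), not the full requirements file syntax.

```python
import re
from typing import Iterable, List

# Matches simple exact pins such as "package==1.2.3" or "package[extra]==1.2.3".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+$")


def unpinned_requirements(lines: Iterable[str]) -> List[str]:
    """Return requirement lines that are not pinned with '=='."""
    result = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            result.append(line)
    return result
```

Running it over the lines of requirements.txt and requirements-actions.txt should return an empty list if all entries are pinned as described.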

6. Docker and Ignore Configurations:

.gitignore & .ignore:
- Updated to exclude sensitive and unnecessary files from version control and Docker builds, including user data JSON files and various environment and cache directories.
docker-compose.yml:
- Refined volume mounts to streamline model management within Docker containers, ensuring organized access to chatbot and data analysis models.

7. Utility Script Enhancements:

dir_to_json.py:
- Improved to handle ignore patterns more effectively, ensuring that specific files and directories are appropriately excluded or represented in JSON outputs.
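
The ignore handling could be sketched as follows. This is not the code of dir_to_json.py: the function names, the glob-style matching via fnmatch, and the output shape (directories as nested dicts, files as None) are assumptions. The sketch only covers exclusion and does not cover how ignored entries might instead be represented in the output.

```python
import fnmatch
import os
from typing import Any, Dict, Iterable


def is_ignored(name: str, patterns: Iterable[str]) -> bool:
    """True if the file or directory name matches any glob-style pattern."""
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)


def dir_to_dict(root: str, patterns: Iterable[str] = ()) -> Dict[str, Any]:
    """Build a nested dict of the tree under root, skipping ignored names."""
    patterns = list(patterns)  # allow any iterable, reuse it across recursion
    tree: Dict[str, Any] = {}
    for entry in sorted(os.listdir(root)):
        if is_ignored(entry, patterns):
            continue  # ignored directories are not descended into
        path = os.path.join(root, entry)
        tree[entry] = dir_to_dict(path, patterns) if os.path.isdir(path) else None
    return tree
```

The resulting dict can be passed to json.dumps to produce the JSON output.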

Summary:

This merge request lets the chatbot collect a fuller set of user health data and use it to give more personalized stroke risk recommendations. Together with the refined data analysis reporting, the new fallback handling, and the updated configurations and pinned dependencies, these changes make the system more robust and easier to maintain.
