{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Getting Started with ySights\n", "\n", "This tutorial introduces the basics of ySights, a Python library for analyzing data from YSocial simulations.\n", "\n", "## What You'll Learn\n", "\n", "- Initializing the `YDataHandler`\n", "- Loading and exploring simulation data\n", "- Working with Agents and Posts\n", "- Running basic data queries\n", "\n", "## Prerequisites\n", "\n", "You need:\n", "- ySights installed (`pip install ysights`)\n", "- A YSocial simulation database file (`.db` format)\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Importing ySights\n", "\n", "First, let's import the main components we'll be using:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2025-10-17T12:37:39.959853Z", "start_time": "2025-10-17T12:37:39.539352Z" } }, "outputs": [], "source": [ "from ysights import YDataHandler\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "# Set up matplotlib for nice plots\n", "plt.style.use('seaborn-v0_8-darkgrid')\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Initializing the Data Handler\n", "\n", "The `YDataHandler` is your main interface to the simulation database. It provides methods to query and analyze the data.\n", "\n", "**Note**: Replace `'ysocial_db.db'` in the cell below with the actual path to your YSocial database file." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Initialize the data handler\n", "# Replace this path with your actual database path\n", "db_path = 'ysocial_db.db'\n", "\n", "try:\n", " ydh = YDataHandler(db_path)\n", " print(\"✓ Successfully connected to the database!\")\n", "except FileNotFoundError:\n", " print(\"✗ Database file not found. 
Please check the path.\")\n", " print(\" The cells below need a valid database and will fail until the path is fixed.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Exploring the Simulation\n", "\n", "Start with two basic facts about the simulation: its time range and its number of agents." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get simulation time range\n", "time_range = ydh.time_range()\n", "print(\"Simulation Time Range:\")\n", "print(f\" Min Round: {time_range['min_round']}\")\n", "print(f\" Max Round: {time_range['max_round']}\")\n", "print(f\" Duration: {time_range['max_round'] - time_range['min_round']} rounds\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get total number of agents\n", "num_agents = ydh.number_of_agents()\n", "print(f\"Total Agents in Simulation: {num_agents}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Working with Agents\n", "\n", "Agents represent the users in the simulation. Let's explore their properties." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get all agents\n", "agents = ydh.agents()\n", "print(f\"Retrieved {len(agents.get_agents())} agents\")\n", "\n", "# Look at the first agent\n", "first_agent = agents.get_agents()[0]\n", "print(\"\\nFirst Agent Properties:\")\n", "print(f\" ID: {first_agent.id}\")\n", "print(f\" Age: {first_agent.age}\")\n", "print(f\" Gender: {first_agent.gender}\")\n", "print(f\" Education: {first_agent.education}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Filtering Agents by Feature\n", "\n", "You can filter agents based on specific attributes:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get agents with age 25\n", "young_agents = ydh.agents_by_feature('age', 25)\n", "print(f\"Agents aged 25: {len(young_agents.get_agents())}\")\n", "\n", "# Get agents by gender\n", "female_agents = ydh.agents_by_feature('gender', 'F')\n", "print(f\"Female agents: {len(female_agents.get_agents())}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Age Distribution Visualization" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Collect ages\n", "ages = [agent.age for agent in agents.get_agents()]\n", "\n", "# Plot age distribution\n", "plt.figure(figsize=(10, 6))\n", "plt.hist(ages, bins=20, edgecolor='black', alpha=0.7)\n", "plt.xlabel('Age', fontsize=12)\n", "plt.ylabel('Number of Agents', fontsize=12)\n", "plt.title('Age Distribution of Agents', fontsize=14, fontweight='bold')\n", "plt.grid(True, alpha=0.3)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5. Working with Posts\n", "\n", "Posts represent the content created by agents in the simulation." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get posts by a specific agent\n", "agent_id = 1  # Change this to any valid agent ID\n", "agent_posts = ydh.posts_by_agent(agent_id)\n", "posts = agent_posts.get_posts()\n", "\n", "print(f\"Agent {agent_id} created {len(posts)} posts\")\n", "\n", "# Examine the first post, if the agent has any\n", "if posts:\n", "    first_post = posts[0]\n", "    print(\"\\nFirst Post Details:\")\n", "    print(f\" Post ID: {first_post.id}\")\n", "    print(f\" Author: {first_post.user_id}\")\n", "    print(f\" Round: {first_post.round}\")\n", "    print(f\" Topics: {first_post.topics}\")\n", "    print(f\" Emotions: {first_post.emotions}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 6. Agent Interest Profiles\n", "\n", "Each agent has an interest profile that maps topics to scores reflecting the agent's engagement with each topic." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get interest profile for an agent\n", "agent_id = 1\n", "profile = ydh.agent_interests(agent_id)\n", "\n", "# Sort by score so the five highest-scoring topics are shown\n", "top_five = sorted(profile.items(), key=lambda x: x[1], reverse=True)[:5]\n", "\n", "print(f\"Interest Profile for Agent {agent_id}:\")\n", "for topic, score in top_five:\n", "    print(f\" Topic {topic}: {score:.3f}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Visualizing Interest Profile" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get the 10 highest-scoring topics for visualization\n", "sorted_topics = sorted(profile.items(), key=lambda x: x[1], reverse=True)[:10]\n", "topics = [f\"Topic {t[0]}\" for t in sorted_topics]\n", "scores = [t[1] for t in sorted_topics]\n", "\n", "# Create bar plot\n", "plt.figure(figsize=(12, 6))\n", "plt.barh(topics, scores, color='steelblue', alpha=0.8)\n", "plt.gca().invert_yaxis()  # Put the highest-scoring topic at the top\n", "plt.xlabel('Interest Score', fontsize=12)\n", "plt.ylabel('Topics', fontsize=12)\n", "plt.title(f'Top 10 Topics for Agent {agent_id}', fontsize=14, fontweight='bold')\n", "plt.grid(True, 
alpha=0.3, axis='x')\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 7. Custom Queries\n", "\n", "For more complex analysis, you can execute custom SQL queries:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Example: Get top 5 most active agents\n", "query = \"\"\"\n", "    SELECT user_id, COUNT(*) AS post_count\n", "    FROM post\n", "    GROUP BY user_id\n", "    ORDER BY post_count DESC\n", "    LIMIT 5\n", "\"\"\"\n", "\n", "results = ydh.custom_query(query)\n", "print(\"Top 5 Most Active Agents:\")\n", "for i, row in enumerate(results, 1):\n", "    print(f\" {i}. Agent {row[0]}: {row[1]} posts\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary\n", "\n", "In this tutorial, you learned:\n", "\n", "✓ How to initialize `YDataHandler` with your simulation database  \n", "✓ Basic exploration of simulation time range and agent count  \n", "✓ Working with `Agents` and filtering by features  \n", "✓ Retrieving and examining `Posts`  \n", "✓ Analyzing agent interest profiles  \n", "✓ Creating visualizations of simulation data  \n", "✓ Executing custom SQL queries for advanced analysis  \n", "\n", "## Next Steps\n", "\n", "Continue with:\n", "- **Network Analysis Tutorial**: Learn how to extract and analyze social networks\n", "- **Algorithms Tutorial**: Explore profile similarity and recommendation metrics\n", "- **Visualization Tutorial**: Create advanced visualizations of simulation data" ] }, { "metadata": {}, "cell_type": "code", "outputs": [], "execution_count": null, "source": "" } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.13" } }, "nbformat": 4, 
"nbformat_minor": 4 }