Getting Started with ySights
This tutorial introduces the basics of ySights, a Python library for analyzing data from YSocial simulations.
What You’ll Learn
How to initialize the YDataHandler
Loading and exploring simulation data
Working with Agents and Posts
Basic data queries
Prerequisites
You need:
ySights installed (pip install ysights)
A YSocial simulation database file (.db format)
1. Importing ySights
First, let’s import the main components we’ll be using:
[1]:
from ysights import YDataHandler
import matplotlib.pyplot as plt
import numpy as np
# Set up matplotlib for nice plots
plt.style.use('seaborn-v0_8-darkgrid')
%matplotlib inline
2. Initializing the Data Handler
The YDataHandler is your main interface to the simulation database. It provides methods to query and analyze the data.
Note: Replace 'ysocial_db.db' in the cell below with the actual path to your YSocial database file.
[ ]:
# Initialize the data handler
# Replace this path with your actual database path
db_path = 'ysocial_db.db'
try:
    ydh = YDataHandler(db_path)
    print("✓ Successfully connected to the database!")
except FileNotFoundError:
    print("✗ Database file not found. Please check the path.")
    print(" For this tutorial, we'll show the expected outputs.")
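Before handing a path to YDataHandler, it can help to confirm that the file exists and can be opened as a database. The sketch below is independent of ySights and uses only the Python standard library. It assumes the .db file is a SQLite database, which the tutorial does not state explicitly; the helper name list_tables and the toy schema are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def list_tables(db_path):
    """Return the table names in a SQLite file (hypothetical helper, not part of ySights)."""
    path = Path(db_path)
    if not path.is_file():
        raise FileNotFoundError(f"No database file at {path}")
    # Open read-only so a mistyped path is never created as an empty database.
    conn = sqlite3.connect(f"{path.resolve().as_uri()}?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Demonstration on a throwaway database with a toy schema (not the real YSocial schema).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.db"
    demo_conn = sqlite3.connect(demo_path)
    demo_conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, user_id INTEGER)")
    demo_conn.commit()
    demo_conn.close()
    tables = list_tables(demo_path)
    print(tables)
```

In the notebook, you could call list_tables(db_path) before constructing the handler to see which tables are available for custom queries.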
3. Exploring the Simulation
Let’s get some basic information about the simulation.
[ ]:
# Get simulation time range
time_range = ydh.time_range()
print("Simulation Time Range:")
print(f" Min Round: {time_range['min_round']}")
print(f" Max Round: {time_range['max_round']}")
print(f" Duration: {time_range['max_round'] - time_range['min_round']} rounds")
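The duration printed above is the difference between the last and the first round. If you want the number of rounds the simulation actually covers, both endpoints count, so add one. A minimal sketch, assuming a dictionary with the same min_round and max_round keys used above; the values are made up for illustration.

```python
# Illustrative values only; in the notebook, use the dictionary returned by ydh.time_range().
example_range = {"min_round": 0, "max_round": 23}

# Difference between the two endpoints.
elapsed = example_range["max_round"] - example_range["min_round"]
# Inclusive count: rounds 0 through 23 are 24 distinct rounds.
rounds_covered = elapsed + 1

print(f"Elapsed: {elapsed} rounds, covered: {rounds_covered} rounds")
```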
[ ]:
# Get total number of agents
num_agents = ydh.number_of_agents()
print(f"Total Agents in Simulation: {num_agents}")
4. Working with Agents
Agents represent the users in the simulation. Let’s explore their properties.
[ ]:
# Get all agents
agents = ydh.agents()
print(f"Retrieved {len(agents.get_agents())} agents")
# Look at the first agent
first_agent = agents.get_agents()[0]
print("\nFirst Agent Properties:")
print(f" ID: {first_agent.id}")
print(f" Age: {first_agent.age}")
print(f" Gender: {first_agent.gender}")
print(f" Education: {first_agent.education}")
Filtering Agents by Feature
You can filter agents based on specific attributes:
[ ]:
# Get agents with age 25
young_agents = ydh.agents_by_feature('age', 25)
print(f"Agents aged 25: {len(young_agents.get_agents())}")
# Get agents by gender
female_agents = ydh.agents_by_feature('gender', 'F')
print(f"Female agents: {len(female_agents.get_agents())}")
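The calls above match a single exact value. For a range, such as ages 18 to 30, one option is to filter the retrieved agent objects in plain Python. The sketch below uses a hypothetical Agent stand-in with the attributes shown earlier (id, age, gender, education) so that it runs without a database; the helper agents_in_age_range and the toy data are also assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    """Stand-in for a ySights agent object; only the attributes used in this tutorial."""
    id: int
    age: int
    gender: str
    education: str


def agents_in_age_range(agent_list, min_age, max_age):
    """Return the agents whose age lies in [min_age, max_age] (hypothetical helper)."""
    return [a for a in agent_list if min_age <= a.age <= max_age]


# Toy data for illustration only.
toy_agents = [
    Agent(1, 17, "F", "high school"),
    Agent(2, 25, "M", "bachelor"),
    Agent(3, 30, "F", "master"),
    Agent(4, 41, "M", "phd"),
]

selected = agents_in_age_range(toy_agents, 18, 30)
print([a.id for a in selected])
```

In the notebook, you would pass agents.get_agents() in place of toy_agents.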
Age Distribution Visualization
[ ]:
# Collect ages
ages = [agent.age for agent in agents.get_agents()]
# Plot age distribution
plt.figure(figsize=(10, 6))
plt.hist(ages, bins=20, edgecolor='black', alpha=0.7)
plt.xlabel('Age', fontsize=12)
plt.ylabel('Number of Agents', fontsize=12)
plt.title('Age Distribution of Agents', fontsize=14, fontweight='bold')
plt.grid(True, alpha=0.3)
plt.show()
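A histogram is easier to interpret alongside a few summary numbers. The sketch below uses the standard-library statistics module on a toy list; in the notebook, you would use the ages list built from agents.get_agents() instead of sample_ages.

```python
import statistics

# Toy data for illustration only.
sample_ages = [18, 22, 25, 25, 31, 40, 52]

summary = {
    "mean": statistics.mean(sample_ages),
    "median": statistics.median(sample_ages),
    "stdev": statistics.stdev(sample_ages),  # sample standard deviation
}

print(f"count: {len(sample_ages)}")
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```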
5. Working with Posts
Posts represent the content created by agents in the simulation.
[ ]:
# Get posts by a specific agent
agent_id = 1 # Change this to any valid agent ID
agent_posts = ydh.posts_by_agent(agent_id)
print(f"Agent {agent_id} created {len(agent_posts.get_posts())} posts")
# Examine the first post
if agent_posts.get_posts():
    first_post = agent_posts.get_posts()[0]
    print("\nFirst Post Details:")
    print(f" Post ID: {first_post.id}")
    print(f" Author: {first_post.user_id}")
    print(f" Round: {first_post.round}")
    print(f" Topic: {first_post.topics}")
    print(f" Emotion: {first_post.emotions}")
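Once the posts are in hand, a common next step is to see when the agent was active. The sketch below counts posts per round with collections.Counter. The Post class is a hypothetical stand-in carrying only the id, user_id, and round attributes used above, and the data is invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Post:
    """Stand-in for a ySights post object; only a few of the attributes shown above."""
    id: int
    user_id: int
    round: int


# Toy data for illustration only.
toy_posts = [
    Post(10, 1, 0),
    Post(11, 1, 0),
    Post(12, 1, 3),
    Post(13, 1, 5),
    Post(14, 1, 5),
    Post(15, 1, 5),
]

# Count how many posts fall in each round.
posts_per_round = Counter(p.round for p in toy_posts)
for rnd, count in sorted(posts_per_round.items()):
    print(f"Round {rnd}: {count} posts")

# The round with the most posts.
busiest_round, busiest_count = posts_per_round.most_common(1)[0]
print(f"Busiest round: {busiest_round} ({busiest_count} posts)")
```

In the notebook, you would iterate over agent_posts.get_posts() in place of toy_posts.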
6. Agent Interest Profiles
Each agent has an interest profile showing their engagement with different topics.
[ ]:
# Get interest profile for an agent
agent_id = 1
profile = ydh.agent_interests(agent_id)
print(f"Interest Profile for Agent {agent_id}:")
for topic, score in list(profile.items())[:5]: # Show the first 5 entries (not necessarily the highest-scoring)
    print(f" Topic {topic}: {score:.3f}")
Visualizing Interest Profile
[ ]:
# Get top topics for visualization
sorted_topics = sorted(profile.items(), key=lambda x: x[1], reverse=True)[:10]
topics = [f"Topic {t[0]}" for t in sorted_topics]
scores = [t[1] for t in sorted_topics]
# Create bar plot
plt.figure(figsize=(12, 6))
plt.barh(topics, scores, color='steelblue', alpha=0.8)
plt.xlabel('Interest Score', fontsize=12)
plt.ylabel('Topics', fontsize=12)
plt.title(f'Top 10 Topics for Agent {agent_id}', fontsize=14, fontweight='bold')
plt.grid(True, alpha=0.3, axis='x')
plt.tight_layout()
plt.show()
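Raw scores are hard to compare across agents whose overall activity differs. One option is to rescale each profile so that its scores sum to 1. A minimal sketch, assuming (as the code above does) that a profile is a dictionary mapping topic identifiers to numeric scores, and further assuming the scores are non-negative; normalize_profile is a hypothetical helper and the values are toy data.

```python
def normalize_profile(profile):
    """Rescale scores so they sum to 1 (hypothetical helper, not part of ySights)."""
    total = sum(profile.values())
    if total == 0:
        # Empty or all-zero profile: nothing to rescale.
        return {topic: 0.0 for topic in profile}
    return {topic: score / total for topic, score in profile.items()}


# Toy data for illustration only.
toy_profile = {3: 6.0, 7: 3.0, 12: 1.0}

shares = normalize_profile(toy_profile)
for topic, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"Topic {topic}: {share:.1%}")
```

Normalized shares can be plotted with the same barh call as above, which puts different agents on a common scale.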
7. Custom Queries
For more complex analysis, you can execute custom SQL queries:
[ ]:
# Example: Get top 5 most active agents
query = """
SELECT user_id, COUNT(*) as post_count
FROM post
GROUP BY user_id
ORDER BY post_count DESC
LIMIT 5
"""
results = ydh.custom_query(query)
print("Top 5 Most Active Agents:")
for i, row in enumerate(results, 1):
    print(f" {i}. Agent {row[0]}: {row[1]} posts")
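To experiment with a query before running it against a real simulation, you can try it on a small in-memory SQLite database. The sketch below uses the standard-library sqlite3 module directly rather than custom_query, and the post table is a toy schema containing only the user_id column that the query above relies on, not the full YSocial schema. It also passes the limit as a bound parameter, which is a sqlite3 feature; whether custom_query accepts parameters is not covered here.

```python
import sqlite3

# Build a throwaway in-memory database with a toy "post" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany(
    "INSERT INTO post (user_id) VALUES (?)",
    [(1,), (1,), (1,), (2,), (2,), (3,)],
)

# Same aggregation as above, with the limit supplied as a bound parameter.
demo_query = """
SELECT user_id, COUNT(*) AS post_count
FROM post
GROUP BY user_id
ORDER BY post_count DESC
LIMIT ?
"""
top_agents = conn.execute(demo_query, (2,)).fetchall()
conn.close()

for rank, (user_id, post_count) in enumerate(top_agents, 1):
    print(f"{rank}. Agent {user_id}: {post_count} posts")
```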
Summary
In this tutorial, you learned:
How to initialize the YDataHandler with your simulation database
How to work with Agents and filter them by features
How to work with Posts
Next Steps
Continue with:
Network Analysis Tutorial: Learn how to extract and analyze social networks
Algorithms Tutorial: Explore profile similarity and recommendation metrics
Visualization Tutorial: Create advanced visualizations of simulation data