spirosgyros.net

Understanding Medium's Story Analytics: A Comprehensive Guide

Chapter 1: My Journey with Medium

Approximately ten months ago, I embarked on a writing journey on Medium, primarily to enhance my writing skills and maintain a record of my learnings. Initially, I paid little attention to the statistics of my articles until I noticed a spike in views for two specific stories about Meta's cache consistency and Twitter events. This prompted me to monitor story statistics more closely and analyze various metrics.

As I contemplated my next writing topic, the idea of creating a blog post about how Medium’s analytics operate emerged. This led me to conduct thorough research into event publishing in systems and data analysis methodologies.

Section 1.1: Exploring Medium's Event Publishing

In this blog, I will delve into several key areas:

  • The process of how events are transmitted while reading stories on Medium, which we will observe using developer tools.
  • The potential backend architecture responsible for processing these events.
  • The queries dispatched to retrieve data for the statistics page.

The focus will be on user-level monthly statistics and metrics for individual stories. Below, you'll find a screenshot illustrating the features I intend to cover.

Screenshot of Medium's story statistics features

Section 1.2: Understanding Story-Level Statistics

This blog focuses on an aspect that matters for any platform: event processing and data analysis. We will examine some design choices that Medium may have implemented or could consider in the future. Please remember that this blog reflects my interpretation of how the system may be structured.

As you read any story, Medium tracks your interactions with the page. Like most large platforms, it relies on analytics events to measure engagement, and you can watch those events being sent from your own browser. To observe this in practice, follow these steps:

  1. Open any Medium post.
  2. Access the developer tools in your browser and navigate to the network tab.
  3. Select Fetch/XHR and review the batch operations. You'll notice that events are triggered as you interact with the page.

Chapter 2: Analyzing Event Payloads

When we scrutinize the payload sent, it appears as follows:

[
  {
    "key": "post.streamScrolled",
    "data": {
      "postIds": ["621b3456c9dc"],
      "collectionIds": [""],
      "sequenceIds": [""],
      "sources": ["post_page"],
      "tops": [101],
      "bottoms": [11066],
      "areFullPosts": [true],
      "loggedAt": 1712237305802,
      "timeDiff": 1001,
      "scrollTop": 665,
      "scrollBottom": 1662,
      "scrollableHeight": 14941,
      "viewStartedAt": 1712237301647,
      "service": "lite",
      "browserWidth": 1114,
      "referrerSource": "your_stories_page"
    },
    "type": "e",
    "timestamp": 1712237305802,
    "eventId": "lul9vsnejbnsgug6yt"
  }
]

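This payload indicates that as we scroll through a story, events are sent to the backend in batches. Each event carries its type ("key"), the post it refers to ("postIds"), the timestamp of the scroll action, and the scroll position at that moment ("scrollTop", "scrollBottom", and "scrollableHeight").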
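To make the payload more concrete, here is a minimal Python sketch that parses a trimmed copy of it and derives two numbers from the raw fields. It rests on my own assumptions: that the timestamps are Unix epoch milliseconds, and that "scrollBottom" divided by "scrollableHeight" is a reasonable proxy for how far down the page the reader has reached. The function name `summarize_scroll_event` is mine, not Medium's.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the captured payload; only the fields used below are kept.
RAW_BATCH = """
[
  {
    "key": "post.streamScrolled",
    "data": {
      "postIds": ["621b3456c9dc"],
      "loggedAt": 1712237305802,
      "scrollTop": 665,
      "scrollBottom": 1662,
      "scrollableHeight": 14941,
      "viewStartedAt": 1712237301647
    },
    "type": "e",
    "timestamp": 1712237305802,
    "eventId": "lul9vsnejbnsgug6yt"
  }
]
"""


def summarize_scroll_event(event: dict) -> dict:
    """Turn one raw scroll event into a few human-readable numbers."""
    data = event["data"]
    # Assumption: timestamps are Unix epoch milliseconds.
    logged_at = datetime.fromtimestamp(data["loggedAt"] / 1000, tz=timezone.utc)
    # Assumption: bottom of the viewport over total height approximates scroll depth.
    depth = data["scrollBottom"] / data["scrollableHeight"]
    return {
        "event": event["key"],
        "post_id": data["postIds"][0],
        "logged_at": logged_at.isoformat(),
        "seconds_on_page": (data["loggedAt"] - data["viewStartedAt"]) / 1000,
        "scroll_depth": round(depth, 3),
    }


summaries = [summarize_scroll_event(event) for event in json.loads(RAW_BATCH)]
for summary in summaries:
    print(summary)
```

For this event, the reader had been on the page for a little over four seconds and the viewport had reached roughly 11% of the scrollable height.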

The first video titled "Medium Stats for Beginners (complete breakdown)" provides a comprehensive overview of how Medium's analytics are structured and the different metrics available to users.

Section 2.1: Backend Processing of Events

Upon receiving events in the backend, several actions must occur:

  • Store the event in a data store.
  • Execute processing jobs to analyze the event data.
  • Ensure that the user's monthly statistics can be quickly accessed.

For durable storage, the raw events could be written to a system such as HDFS or Amazon S3, or loaded into a warehouse such as Amazon Redshift. Once the data is collected, processing jobs analyze the events and store the results in a database that the statistics page can query quickly, which is why the choice of database technology becomes crucial here.
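The processing step can be sketched as a small batch job. This is only an illustration of the idea, not Medium's pipeline: the flattened event shape, the post ID "aaaaaaaaaaaa", and the rule that one distinct ("postId", "viewStartedAt") pair counts as one view are all assumptions of mine.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical, flattened events as they might sit in durable storage.
RAW_EVENTS = [
    {"key": "post.streamScrolled", "postId": "621b3456c9dc",
     "viewStartedAt": 1712237301647, "timestamp": 1712237305802},
    # A second scroll event from the same page view: must not count twice.
    {"key": "post.streamScrolled", "postId": "621b3456c9dc",
     "viewStartedAt": 1712237301647, "timestamp": 1712237306803},
    # A new view of the same post one day later.
    {"key": "post.streamScrolled", "postId": "621b3456c9dc",
     "viewStartedAt": 1712323701647, "timestamp": 1712323705000},
    # A view of another (made-up) post in the following month.
    {"key": "post.streamScrolled", "postId": "aaaaaaaaaaaa",
     "viewStartedAt": 1714608000000, "timestamp": 1714608004000},
]


def month_of(epoch_ms: int) -> str:
    """Return the UTC calendar month of a millisecond timestamp as YYYY-MM."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc).strftime("%Y-%m")


def aggregate_monthly_views(events: list[dict]) -> dict[tuple[str, str], int]:
    """Count distinct views per (post, month).

    Assumption: a distinct (postId, viewStartedAt) pair is one view.
    """
    seen_views = set()
    counts = defaultdict(int)
    for event in events:
        view = (event["postId"], event["viewStartedAt"])
        if view in seen_views:
            continue  # later events from a view that was already counted
        seen_views.add(view)
        counts[(event["postId"], month_of(event["viewStartedAt"]))] += 1
    return dict(counts)


monthly_views = aggregate_monthly_views(RAW_EVENTS)
for (post_id, month), views in sorted(monthly_views.items()):
    print(post_id, month, views)
```

The output of such a job is small and keyed by post and month, so it can be stored in a database that serves the statistics page with a simple lookup instead of scanning raw events.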

To understand the data retrieval process, let's examine what happens when a stats page loads on Medium.

The second video titled "How to Find Your Audience on Medium in 2024✍️ with Stats & Demographics" dives into audience engagement metrics and demographic insights available on Medium.

Chapter 3: Querying User Statistics

When the stats page is accessed, Medium sends a request to the backend to retrieve the user's monthly data. The query structure is as follows:

query UserMonthlyStoryStatsTimeseriesQuery($username: ID!, $input: UserPostsAggregateStatsInput!) {
  user(username: $username) {
    id
    postsAggregateTimeseriesStats(input: $input) {
      __typename
      ... on AggregatePostTimeseriesStats {
        ...MonthlyStoryStats_aggregatePostTimeseriesStats
        __typename
      }
    }
    __typename
  }
}

This query retrieves detailed timeseries statistics regarding a user’s posts, focusing on metrics like views and readers. The input parameters specify the user and the time frame for the data requested.
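As a rough sketch, this is how a client might assemble the JSON body for that query in Python. The "query", "operationName", and "variables" keys are the usual shape of a GraphQL request sent over HTTP. Everything inside "input" is a guess: I do not know the real fields of `UserPostsAggregateStatsInput`, so `startTime` and `endTime` are hypothetical, as are the helper names and the username. The fragment definition is also not shown in the captured query, so this body is illustrative rather than something a server would accept as-is.

```python
import json
from datetime import datetime, timezone

# The query text as captured; the fragment definition is not included.
QUERY = """
query UserMonthlyStoryStatsTimeseriesQuery($username: ID!, $input: UserPostsAggregateStatsInput!) {
  user(username: $username) {
    id
    postsAggregateTimeseriesStats(input: $input) {
      __typename
      ... on AggregatePostTimeseriesStats {
        ...MonthlyStoryStats_aggregatePostTimeseriesStats
        __typename
      }
    }
    __typename
  }
}
"""


def month_range_ms(year: int, month: int) -> tuple[int, int]:
    """Return [start, end) of a UTC calendar month in epoch milliseconds."""
    start = datetime(year, month, 1, tzinfo=timezone.utc)
    if month == 12:
        end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    else:
        end = datetime(year, month + 1, 1, tzinfo=timezone.utc)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


def build_stats_request(username: str, year: int, month: int) -> dict:
    """Assemble a GraphQL-over-HTTP style request body for one month."""
    start_ms, end_ms = month_range_ms(year, month)
    return {
        "operationName": "UserMonthlyStoryStatsTimeseriesQuery",
        "query": QUERY,
        "variables": {
            "username": username,
            # Hypothetical field names: the real input type is not shown.
            "input": {"startTime": start_ms, "endTime": end_ms},
        },
    }


body = build_stats_request("some_username", 2024, 4)
print(json.dumps(body["variables"], indent=2))
```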

In conclusion, this analysis of Medium's analytics architecture illustrates the critical role of event processing and data storage in providing valuable insights into user engagement. The system mirrors concepts from the batch layer of lambda architecture, although the speed layer is absent. Can you envision features that might be implemented using a speed layer?
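To make the speed-layer question more tangible, here is a minimal sketch of how a live view count could be served in a lambda-style design: the batch layer supplies totals as of its last run, and a speed layer holds only the views that arrived since then. The class and function names and the numbers are hypothetical, and nothing here describes what Medium actually does.

```python
from collections import Counter


class SpeedLayer:
    """Holds view counts that arrived after the last batch run (in memory)."""

    def __init__(self) -> None:
        self._recent = Counter()

    def record_view(self, post_id: str) -> None:
        self._recent[post_id] += 1

    def count(self, post_id: str) -> int:
        # Counter returns 0 for keys it has not seen.
        return self._recent[post_id]

    def reset(self) -> None:
        """Call once a batch run has absorbed the recent events."""
        self._recent.clear()


def total_views(batch_view: dict[str, int], speed: SpeedLayer, post_id: str) -> int:
    """Serve a query by merging the batch view with the recent increments."""
    return batch_view.get(post_id, 0) + speed.count(post_id)


# Hypothetical totals produced by the last batch job.
batch_view = {"621b3456c9dc": 120}

speed = SpeedLayer()
for _ in range(3):
    speed.record_view("621b3456c9dc")

live_total = total_views(batch_view, speed, "621b3456c9dc")
print(live_total)
```

A feature such as a live "views in the last few minutes" counter on a freshly published story is the kind of thing this merge would enable.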
