Twitter Follower Value, revisited

In my last post, I presented a Groovy class for computing Twitter Follower Value (TFV), based on Nat Dunn’s definition of the term (number of followers / number of friends). That worked just fine. Then I moved on to calculating Total Twitter Follower Value (TTFV), which sums the TFV’s of all your followers. My solution ground to a halt, however, when I ran into a rate limit at Twitter.

It turns out I didn’t read the API carefully enough. I thought that to calculate TTFV, I would have to get all the follower ID’s for a given person and loop over them, calculating each of their TFV’s. That’s actually not the case. There is a call in the Twitter API to retrieve all of an individual’s followers, and the returned XML lists the number of friends and followers for each.

It’s therefore time to redesign my original solution. I first added a TwitterUser class to my system.

package com.kousenit.twitter

class TwitterUser {
    def id
    def name
    def followersCount
    def friendsCount

    def getTfv() { followersCount / friendsCount }

    String toString() { "($id,$name,$followersCount,$friendsCount,${this.getTfv()})" }
}

Putting the computation of TFV in TwitterUser makes more sense, since the two counts are there already.
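As a quick check on the arithmetic, here is a minimal sketch using the counts that show up in the test case later in this post (108 followers, 90 friends). RatioUser is a trimmed-down illustration class, not the real TwitterUser. Two things are worth noticing: Groovy divides two integers into a BigDecimal rather than truncating, and a user with zero friends would throw an ArithmeticException, which the getter doesn't guard against.

```groovy
// Trimmed-down illustration of the TFV ratio; RatioUser is a hypothetical name.
class RatioUser {
    def followersCount
    def friendsCount

    def getTfv() { followersCount / friendsCount }
}

def me = new RatioUser(followersCount: 108, friendsCount: 90)
assert me.tfv == 1.2                  // Integer / Integer yields 1.2, not 1
assert me.tfv instanceof BigDecimal   // no integer truncation in Groovy
println me.tfv
```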

The TwitterFollowerValue class has also been redesigned. First of all, it expects an id for the user to be supplied, and stores that as an attribute. It also keeps the associated user instance around so that it doesn’t have to be retrieved again on every call.

package com.kousenit.twitter

class TwitterFollowerValue {
    def id
    TwitterUser user

    def getTwitterUser() {
        if (user) return user
        def url = "http://api.twitter.com/1/users/show.xml?id=$id"
        def response = new XmlSlurper().parse(url)
        user = new TwitterUser(id:id,name:response.name.toString(),
            friendsCount:response.friends_count.toInteger(),
            followersCount:response.followers_count.toInteger())
        return user
    }

    // ... more to come ...

The getTwitterUser method checks to see if we’ve already retrieved the user, and if so returns it. Otherwise it queries the Twitter API for a user, converts the resulting XML into an instance of the TwitterUser class, saves it locally, and returns it.
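The fetch-once-then-reuse idea can be tried offline with a small sketch. Everything here is a stand-in: CachingHolder, loadUser and getCachedUser are illustration names, and loadUser fakes the HTTP request with a map so that nothing touches the network.

```groovy
// Hypothetical stand-in class; loadUser fakes the remote call so this runs offline.
class CachingHolder {
    def id
    def user
    int loads = 0                       // counts how often the "remote" call happens

    def loadUser() {
        loads++
        [id: id, name: "user-$id"]      // a map standing in for a TwitterUser
    }

    def getCachedUser() {
        if (user) return user           // already fetched: no second request
        user = loadUser()
        return user
    }
}

def holder = new CachingHolder(id: '15783492')
def first = holder.getCachedUser()
def second = holder.getCachedUser()
assert first.is(second)                 // same object both times
assert holder.loads == 1                // the fake request ran only once
```

With a rate limit in play, that counter is the whole point: every repeated lookup that hits the cache is one API call saved.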

The next method is something I knew I’d need eventually.

    // ... from above ...

    def getRateLimitStatus() {
        def url = "http://api.twitter.com/1/account/rate_limit_status.xml"
        def response = new XmlSlurper().parse(url)
        return response.'remaining-hits'.toInteger()
    }

    // ... more to come ...

Twitter limits the number of API calls to 150 per hour, unless you apply to be on the whitelist (which I may do eventually). The URL shown in the getRateLimitStatus method checks on the number of calls remaining in that hour. Since the XML tag is <remaining-hits>, which includes a dash in the middle, I need to wrap it in quotes in order to traverse the XML tree.
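Here is a small offline sketch of why the quotes matter. The XML payload is made up for illustration (only the remaining-hits element name comes from the text above), and the import assumes Groovy 3 or later, where XmlSlurper lives in groovy.xml; older versions picked it up from groovy.util without an import.

```groovy
import groovy.xml.XmlSlurper   // assumption: Groovy 3+; older versions need no import

// Made-up payload; only the <remaining-hits> element name is taken from the post.
def xml = '''<hash>
  <remaining-hits type="integer">147</remaining-hits>
  <hourly-limit type="integer">150</hourly-limit>
</hash>'''

def response = new XmlSlurper().parseText(xml)

// Unquoted, response.remaining-hits would be read as (response.remaining) minus (hits),
// so the hyphenated element name goes in quotes.
def remaining = response.'remaining-hits'.toInteger()
assert remaining == 147
```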

I added one simple delegate method to retrieve the TFV, which also initializes the user field if it hasn’t been initialized already.

def getTfv() { user?.tfv ?: getTwitterUser().tfv }

This uses both the safe dereference operator ?. and the cool Elvis operator ?: to either return the user’s TFV if the user exists, or find the user and then get its TFV if it doesn’t. I’m not wild about relying on the side-effect of caching the user in my get method (philosophically, any get method shouldn’t change the system’s state), but I’m not sure what the best way to do that is. Maybe somebody will have a suggestion in the comments.

(For those who don’t know, the Elvis operator is like a specialized form of the standard ternary operator from Java. If the value to the left of the question mark is true by the Groovy Truth (not null, not zero, not empty), it’s returned; otherwise the expression to the right of the colon is evaluated and returned. If you turn your head to the side, you’ll see how the operator gets its name. Thank you, thank you very much.)
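A few one-liners show both operators in isolation. This is only a sketch with throwaway values, not code from the class.

```groovy
def missing = null

assert (missing ?: 'fallback') == 'fallback'   // null is false, so the right side wins
assert ('value' ?: 'fallback') == 'value'      // a non-empty string is true
assert (0 ?: 42) == 42                         // zero counts as false too
assert ('' ?: 'fallback') == 'fallback'        // and so does an empty string

// Safe dereference returns null instead of throwing a NullPointerException
assert missing?.size() == null
```

The zero case matters a little here: a user with no followers has a TFV of 0, so an expression like user?.tfv ?: getTwitterUser().tfv falls through to the right-hand side for that user. The answer is still correct; the cache check just doesn't short-circuit.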

Next comes a method to retrieve all the followers as a list.

def getFollowers() {
    def slurper = new XmlSlurper()
    def followers = []
    def next = -1
    while (next) {
        def url = "http://api.twitter.com/1/statuses/followers.xml?id=$id&cursor=$next"
        def response = slurper.parse(url)
        response.users.user.each { u ->
            followers << new TwitterUser(id:u.id.toString(),name:u.name.toString(),
                followersCount:u.followers_count.toInteger(),
                friendsCount:u.friends_count.toInteger())
        }
        next = response.next_cursor.toBigInteger()
    }
    return followers
}

The API request for followers only returns 100 at a time. If there are more than 100 followers, the <next_cursor> element holds the value of the cursor parameter for the next page. For users with lots of followers, this is going to be time-consuming, but there doesn’t appear to be any way around that. The value of next_cursor seems to be a randomly selected long value, so I just went with BigInteger to avoid any problems.

Note we’re relying on the Groovy Truth here, meaning that if the next value is not zero, the while condition is true and the loop continues.
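The cursor loop can be exercised without the network. In this sketch the pages map and the fetchPage closure are hypothetical stand-ins for the HTTP call, and the cursor values are invented; the loop itself has the same shape as the one in getFollowers.

```groovy
// Hypothetical in-memory "API": each cursor maps to a page of names plus the next cursor.
def pages = [:]
pages[-1G] = [users: ['alice', 'bob'], next: 1300794057949944903G]
pages[1300794057949944903G] = [users: ['carol'], next: 0G]

def fetchPage = { BigInteger cursor -> pages[cursor] }

def followers = []
BigInteger next = -1G                 // -1 asks for the first page
while (next) {                        // Groovy Truth: a zero BigInteger is false
    def page = fetchPage(next)
    followers.addAll(page.users)
    next = page.next
}

assert followers == ['alice', 'bob', 'carol']
assert !next                          // loop ended on cursor 0
```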

Finally we have the real goal, which is to compute the Total TFV. Actually, it’s pretty trivial now, but I do make sure to check to see if I have enough calls remaining to do it.

def getTTFV() {
    def totalTTFV = 0.0

    // check if we have enough calls left to do this
    def numFollowers = user?.followersCount ?: getTwitterUser().followersCount
    def numCallsRequired = (numFollowers + 99).intdiv(100)  // round up: a partial page still costs a call
    def callsRemaining = getRateLimitStatus()
    if (numCallsRequired > callsRemaining) {
        println "Not enough calls remaining this hour"
        return totalTTFV
    }

    // we're good, so do the calculation
    getFollowers().each { TwitterUser follower ->
        totalTTFV += follower.tfv
    }
    return totalTTFV
}
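One detail about the call estimate: each request returns up to 100 followers, so a partial last page still costs a full request, and the loop in getFollowers always makes at least one request. A rounding-up estimate can be sketched like this (pagesNeeded is a hypothetical helper, not part of the class):

```groovy
// Hypothetical helper: how many paged requests a follower count implies.
int pagesNeeded(int followers, int pageSize = 100) {
    // integer ceiling: add (pageSize - 1) before dividing; never less than one request
    Math.max(1, (followers + pageSize - 1).intdiv(pageSize))
}

assert pagesNeeded(108) == 2          // 100 on the first page, 8 on the second
assert pagesNeeded(100) == 1
assert pagesNeeded(0) == 1            // the first request happens regardless
assert ((int) (108 / 100)) == 1       // plain truncation undercounts by one here
```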

That’s all there is to it. Here’s my test case, which shows how everything is supposed to work.

package com.kousenit.twitter;

import static org.junit.Assert.*;

import org.junit.Before;
import org.junit.Test;

class TwitterValueTest {
    TwitterFollowerValue tv

    @Before
    public void setUp() throws Exception {
        tv = new TwitterFollowerValue(id:'15783492')
    }

    @Test
    public void testGetTwitterUser() {
        TwitterUser user = tv.getTwitterUser()
        assertEquals '15783492', user.id
        assertEquals 'Ken Kousen', user.name
        assertEquals 90, user.friendsCount
        assertEquals 108, user.followersCount
    }

    @Test
    public void testGetTFV() {
        assertEquals 1.2, tv.tfv, 0.0001
    }

    @Test
    public void testGetFollowers() {
        def followers = tv.getFollowers()
        assertEquals 109, followers.size()
    }

    @Test
    public void testGetTTFV() {
        assertEquals 135.08, tv.getTTFV(), 0.01
    }
}

As you can see, my TTFV as of this writing is a little over 135, though my TFV is only about 1.2.

I also put together a script to use this system for a general user and to output more information:

package com.kousenit.twitter

import java.text.NumberFormat;

NumberFormat nf = NumberFormat.instance
TwitterFollowerValue tfv = new TwitterFollowerValue(id:'kenkousen')
total = 0.0
tfv.followers.sort { -it.tfv }.each { follower ->
    total += follower.tfv
    println "${nf.format(follower.tfv)}\t$follower.name"
}
println total

I need to supply an id when I instantiate the TwitterFollowerValue class. That id can either be numeric, as I used in my test cases, or just the normal Twitter id used with an @ sign (e.g., @kenkousen, supplied without the @).

The cool part was sorting the followers right after retrieving them. The sort method takes a closure. With a single parameter, as here, the closure returns the value to sort by; a two-parameter closure would act as the comparison itself. If this were Java, that would be the “int compare(T o1, T o2)” method from the java.util.Comparator interface, likely implemented by an anonymous inner class. I think you’ll agree this is better. 🙂 Incidentally, I used a minus sign because I wanted the values sorted from highest to lowest.
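Both closure styles side by side, as a sketch with invented data (the maps stand in for TwitterUser instances). The false argument asks sort for a sorted copy instead of reordering the original list; that overload is available in reasonably recent Groovy versions.

```groovy
// Invented sample data; maps stand in for TwitterUser instances.
def people = [
    [name: 'a', tfv: 1.5],
    [name: 'b', tfv: 12.1],
    [name: 'c', tfv: 0.04]
]

// One-parameter closure: returns the sort key (negated for descending order)
def byKey = people.sort(false) { -it.tfv }

// Two-parameter closure: acts as the comparator itself
def byComparator = people.sort(false) { x, y -> y.tfv <=> x.tfv }

assert byKey*.name == ['b', 'a', 'c']
assert byComparator*.name == byKey*.name
assert people*.name == ['a', 'b', 'c']   // the original list is untouched
```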

My result is:

12.135 Dierk König
10.077 Graeme Rocher
9.621 Glen Smith
4.667 Kirill Grouchnikov
3.89 Mike Loukides
3.1 Christopher M. Judd
3.01 Robert Fischer
3 Marcel Overdijk
2.847 Andres Almiray
2.472 jeffscottbrown
2.363 Dave Klein
2.322 GroovyEclipse
2.238 James Williams
2.034 Safari Books Online
...
0.037 HortenseEnglish
0.007 Showoff Cook
135.0820584094

Since this was all Nat’s idea, here’s his value as well:

6.281 Pete Freitag
5.933 CNY ColdFusion Users
3.085 Barbara Binder
2.712 Mike Mayhew
2.537 Jill Hurst-Wahl
2.406 Andrew Hedges
2.333 roger sakowski
2.138 Raquel Hirsch
1.986 TweetDeck
...
0.1 Richard Banks
0.092 Team Gaia
0.05 AdrianByrd
0.043 OletaMullins
0.039 SuzySharpe
122.8286508850

My TTFV is higher than his, but his TFV is higher than mine. Read into that whatever you want.

The next step is to make this a web application so you can check your own value. I imagine that’ll be the subject of another blog post.

Twitter Follower Value in Groovy

Nat Dunn, who runs the training company Webucator, posted an interesting idea on his blog. He was thinking about Twitter, and wondering about how the number of followers and friends (he called them “followees”) affected the likelihood of someone actually reading your tweets. If a person is following too many people, they can’t possibly read them all, and if they have millions of followers they can’t possibly pay attention to all of them.

He proposed a metric he called Twitter Follower Value (TFV), which is simply the ratio of the number of followers to the number of friends a person has. A person with a huge TFV like Tim O’Reilly (1,428,799 followers / 644 friends = 2,218.6) is a valuable person to have following you.

He also proposed a Total Twitter Follower Value (TTFV), which is the sum of the TFV’s of all of your followers. He then finished by saying, “So there it is. Anyone want to build a tool that calculates TTFV? That would be cool?”

Say no more. Twitter has a URL-based API that returns XML, making it ideal for Groovy experimentation.

package com.kousenit.twitter

class TwitterFollowerValue {	

    def countFriendsAndFollowers(id) {
        def url = "http://api.twitter.com/1/users/show.xml?id=$id"
        def response = new XmlSlurper().parse(url)
        [response.friends_count,response.followers_count]*.toInteger()
    }

    def getTFV(id) {
        def (numFriends,numFollowers) = countFriendsAndFollowers(id)
        numFollowers / numFriends
    }
}

The countFriendsAndFollowers method takes an id as an argument (either a Twitter name or an ID). It uses an XmlSlurper to make a GET request and parse the resulting XML response, returning a reference to the root of the tree. If you check the documentation for the show request, you’ll see that the root element is <user>, which has among its direct children elements <friends_count> and <followers_count>. In Groovy, each can be accessed with a simple dot operator. Here I extract both values and convert them to integers before returning them as a list.
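To see the dot navigation without calling Twitter, the same traversal can be run against a string. The payload below is a cut-down, hand-written imitation of the show response, using the counts from the test case; the import assumes Groovy 3 or later (older versions found XmlSlurper in groovy.util without an import).

```groovy
import groovy.xml.XmlSlurper   // assumption: Groovy 3+; older versions need no import

// Hand-written imitation of the response; the real one has many more elements.
def xml = '''<user>
  <id>15783492</id>
  <name>Ken Kousen</name>
  <followers_count>108</followers_count>
  <friends_count>90</friends_count>
</user>'''

def response = new XmlSlurper().parseText(xml)

assert response.name() == 'user'               // name() is the root element's tag
assert response.name.text() == 'Ken Kousen'    // .name is the <name> child element

// The spread-dot operator calls toInteger() on each element of the list
def counts = [response.friends_count, response.followers_count]*.toInteger()
assert counts == [90, 108]
```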

The getTFV method invokes the countFriendsAndFollowers method and uses the cool Groovy 1.6+ capability of assigning multiple return values individually to variables. It then divides the number of followers by the number of friends and returns the result.
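Multiple assignment on its own looks like this. The countPair method is a hypothetical stand-in that returns fixed values instead of calling the API.

```groovy
// Hypothetical stand-in for countFriendsAndFollowers: fixed values, no network.
def countPair() { ['90', '108']*.toInteger() }

// Each list element is assigned to the variable in the same position
def (numFriends, numFollowers) = countPair()

assert numFriends == 90
assert numFollowers == 108
assert numFollowers / numFriends == 1.2
```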

Here’s a test case to demonstrate how it all works. It’s an integration test because I decided not to mock Twitter (insert your own joke here), so it will fail if I get any more followers or friends, but as a first pass it worked fine.

package com.kousenit.twitter;

import static org.junit.Assert.*;

import org.junit.Before;
import org.junit.Test;

class TwitterValueTest {
    TwitterFollowerValue tv

    @Before
    public void setUp() throws Exception {
        tv = new TwitterFollowerValue()
    }

    @Test
    public void testCountFriendsAndFollowers() {
        def (friendsCount,followersCount) = tv.countFriendsAndFollowers('kenkousen')
        assertEquals 90, friendsCount
        assertEquals 108, followersCount
    }

    @Test
    public void testGetTFV() {
        assertEquals 1.2, tv.getTFV('kenkousen'), 0.0001
    }
}

(Yes, my Twitter id is @kenkousen)

What about TTFV? To do that, you need to find out who is following a person and then compute the TFV for each of them. The Twitter API includes a URL for retrieving the follower id’s of a given user, which makes that doable.

    // add to TwitterFollowerValue class...

    def getFollowerIds(id) {
        def url = "http://api.twitter.com/1/followers/ids.xml?id=$id&cursor=-1"
        def response = new XmlSlurper().parse(url)
        return response.ids.id
    }

    def getTTFV(id) {
        def totalTTFV = 0.0
        getFollowerIds(id).each { followerId ->
            totalTTFV += getTFV(followerId)
        }
        return totalTTFV
    }
}

The documentation for getting the follower id’s shows that the result is an XML tree with root <id_list>, which has a child element <ids>, and then all the individual id’s are wrapped in <id> grandchildren. Again this is trivial Groovy; you just traverse the tree and it all works.
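The traversal can be checked against a made-up document with the shape just described. The id values and the next_cursor element here are invented for the sketch, and the import assumes Groovy 3 or later.

```groovy
import groovy.xml.XmlSlurper   // assumption: Groovy 3+; older versions need no import

// Invented payload following the <id_list>/<ids>/<id> shape described above.
def xml = '''<id_list>
  <ids>
    <id>101</id>
    <id>202</id>
    <id>303</id>
  </ids>
  <next_cursor>0</next_cursor>
</id_list>'''

def response = new XmlSlurper().parseText(xml)

// response is the root, so .ids.id selects every <id> grandchild
def ids = response.ids.id.collect { it.text() }

assert ids == ['101', '202', '303']
assert response.ids.id.size() == 3
assert !response.next_cursor.toBigInteger()   // cursor 0: no further pages
```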

There is a complication, in that the returned number of id’s is limited to 5000 a page. That’s what the cursor parameter is for. If someone has more than 5000 followers (and most of the big names do), then you’d have to go through the pages one by one to get all the id’s. As it turns out, that wasn’t my biggest problem, as you’ll see.

Computing the Total TFV is simply a case of computing the TFV for each of the followers and adding them up.

Now for the test to show that value. Unfortunately, I have no idea what that value is for me. I could have manufactured a test user and tried it out, but instead I just added the following to my test case:

    // add to test case above ...
    @Test
    public void testGetFollowerIds() {
        assertEquals 108, tv.getFollowerIds('kenkousen').size()
    }

// Serious rate-limiter problems guaranteed
//    @Test
//    public void testGetTTFV() {
//        println tv.getTTFV('kenkousen')
//    }

Checking the number of followers was easy. The problem was the commented-out method, which I was going to use just to see what the value was.

It turns out that the Twitter API limits the number of GET requests to 150/hour. If you apply and get on a whitelist, you can get that increased to 20,000/hour. For this calculation, though, that’s going to get used up really, really fast for almost anybody. Even checking my own followers (many of whom are no doubt spammers who follow tons of people) used up my quota almost right away.
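The arithmetic behind that is short. With this design, every follower costs one request on top of the id lookups, so the total grows linearly with the follower count. A rough estimate, as a sketch (callsNeeded is a hypothetical helper; the 5000, 150 and 20,000 figures are the ones quoted above):

```groovy
// Hypothetical helper: requests needed by the one-call-per-follower approach.
int callsNeeded(int followers) {
    int idPages = Math.max(1, (followers + 4999).intdiv(5000))   // follower ids, 5000 per page
    idPages + followers                                          // plus one lookup per follower
}

assert callsNeeded(108) == 109        // most of a 150-call hour in a single run
assert callsNeeded(200) > 150         // over the default limit
assert callsNeeded(25000) > 20000     // over even the whitelist limit
```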

Still, it was an interesting experiment, and any excuse to play with Groovy for an hour or so is a good one. 🙂
