Booksoup
Booksoup lets you analyse and traverse your downloaded Facebook data, with features such as sentiment analysis and analysis of message frequency over time.
Booksoup requires BeautifulSoup4 and TextBlob; matplotlib is also needed to run the demo graphs.
Usage
BookSoup
Basic usage
    from booksoup import BookSoup

    me = BookSoup("facebook-data")

    # Get a conversation by name
    convo = me.load_conversation("Jane Doe")

    # Print participants of the conversation
    print(convo.participants)

    # Print messages in the conversation
    for message in convo.messages:
        print(message.date, message.timestamp, message.name, message.content)
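As a small illustration of working with the message attributes shown above, the sketch below counts messages per sender. It assumes only that each message exposes a `name` attribute, as in the loop above. `Message` is a hypothetical stand-in so the snippet runs without any Facebook data, and `messages_per_sender` is an illustrative helper, not part of Booksoup.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in for Booksoup message objects; only the attribute
# names (date, timestamp, name, content) come from the example above.
Message = namedtuple("Message", ["date", "timestamp", "name", "content"])


def messages_per_sender(messages):
    """Count how many messages each participant sent."""
    return Counter(message.name for message in messages)


# Made-up sample data, for illustration only.
sample = [
    Message("2017-01-01", "12:00", "Jane Doe", "Hi!"),
    Message("2017-01-01", "12:01", "John Smith", "Hello"),
    Message("2017-01-02", "09:30", "Jane Doe", "How are you?"),
]
counts = messages_per_sender(sample)
print(counts.most_common())
```

With real data, the same helper could presumably be called as `messages_per_sender(convo.messages)`.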
Interaction frequency
Use interaction_freq() on a conversation:
    me = BookSoup("facebook-data")
    convo = me.load_conversation("John Smith")
    times = convo.interaction_freq()
demo_interaction_frequency.py
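For a quick look at frequency data without matplotlib, a text histogram can be enough. The sketch below assumes that the frequency data is a dictionary mapping a time label to a message count; that return shape is an assumption, not something this README confirms, and both the `text_histogram` helper and the sample keys are hypothetical.

```python
def text_histogram(freq, width=40):
    """Render a {label: count} mapping as rows of '#' bars."""
    if not freq:
        return ""
    peak = max(freq.values())
    lines = []
    for label in sorted(freq):
        count = freq[label]
        # Scale each bar relative to the largest count.
        bar = "#" * (round(width * count / peak) if peak else 0)
        lines.append(f"{label} {bar} {count}")
    return "\n".join(lines)


# Made-up sample; the real key format may differ (assumption).
sample_times = {"09:00": 4, "13:00": 10, "21:00": 5}
print(text_histogram(sample_times, width=20))
```

If the assumption holds, `text_histogram(convo.interaction_freq())` would print one bar per time slot.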
Interaction timeline
Use interaction_timeline(name) on a conversation, passing a participant's name:
    me = BookSoup("facebook-data")
    convo = me.load_conversation("Lewis, Andrew, Michelle and 4 others")
    times = convo.interaction_timeline(me.name)
demo_interaction_timeline.py
[Figure: interaction timeline with one friend over a longer period]
Sentiment
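Conversation.avg_sentiment(name) gives the average sentiment of the named participant in the conversation, and Conversation.sentiment_timeline(name) gives a timeline dictionary of that participant's average sentiment.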
    convo = me.load_conversation("David Grocer")

    # Print the average sentiment of David Grocer in the conversation
    print(convo.avg_sentiment("David Grocer"))

    # Print the timeline dictionary of my average sentiment in the conversation
    print(convo.sentiment_timeline(me.name))
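One thing a sentiment timeline makes easy is spotting the most negative and most positive intervals. The sketch below assumes the timeline dictionary maps an interval label to an average sentiment score; the key format and the `sentiment_extremes` helper are hypothetical, and the sample values are made up.

```python
def sentiment_extremes(timeline):
    """Return (lowest_key, highest_key) by sentiment, or None if empty."""
    if not timeline:
        return None
    lowest = min(timeline, key=timeline.get)
    highest = max(timeline, key=timeline.get)
    return lowest, highest


# Made-up sample; real keys depend on the chosen interval (assumption).
sample_timeline = {"2017-01": 0.12, "2017-02": -0.30, "2017-03": 0.45}
print(sentiment_extremes(sample_timeline))
```

Under that assumption, `sentiment_extremes(convo.sentiment_timeline(me.name))` would return the intervals with the lowest and highest average sentiment.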
Loading a conversation
A conversation can be loaded using either its title (as in all the previous examples) or its numerical ID (the filename of the conversation's HTML file).
    convo = me.load_conversation(40)
Specifying interval duration
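Pass the interval argument when loading a conversation to choose the interval used for timelines, for example month or day: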
    convo = me.load_conversation("David Grocer", interval="day")
Events
Booksoup can extract and categorise event information. This includes title, description, location, timestamp and a 2-element array containing the latitude and longitude of the event if available.
    me = BookSoup("facebook-data")
    events = me.load_all_events()

    # Events are organised into attending, maybe, declined and no_reply:
    for event in events.attending:
        print(event.title, event.description, event.location, event.timestamp, event.latlon)
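Because coordinates are only present when available, it can help to filter for events that actually have them. The sketch below uses the four category names from the example above and relies on a truthiness check, which covers both `None` and an empty list; how Booksoup represents a missing `latlon` is an assumption. `located_events` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for real event data.

```python
from types import SimpleNamespace

CATEGORIES = ("attending", "maybe", "declined", "no_reply")


def located_events(events):
    """Map each category to the titles of events that have a latlon value."""
    result = {}
    for category in CATEGORIES:
        group = getattr(events, category, [])
        # Truthiness check: skips None and empty sequences (assumption).
        result[category] = [event.title for event in group if event.latlon]
    return result


# Made-up stand-in for the object returned by load_all_events().
sample_events = SimpleNamespace(
    attending=[
        SimpleNamespace(title="Picnic", latlon=[51.5, -0.12]),
        SimpleNamespace(title="Quiz night", latlon=None),
    ],
    maybe=[],
    declined=[SimpleNamespace(title="Gig", latlon=[])],
    no_reply=[],
)
print(located_events(sample_events))
```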