When I first started
blogging, I remember googling the question: what makes a good blog
post? Back then I wanted to set up a blog that was part diary, part
marketing tool for the guest house. Of course it has evolved along
the way, and I've posted on a variety of topics with nature and fynbos
as the central themes. However, it's all just been winging it, and I've
never really known what makes a good blog post except from a personal
perspective. Now I have data to look at.
So, I've been blogging
ad hoc for four years now, accumulating around 200 posts along the
way. Finally, using my blog as the data, I can try to answer the
following questions:
- Do longer posts (those with more words) get more hits?
- Do posts with more pictures get more hits?
- Do posts written on a particular day get more hits (from my Facebook feed I know we're more social on the weekend, for instance)?
- Is hit rate a function of time: i.e. are older posts getting more hits just because they have been around longer, or am I getting more hits now because I have more followers?
- Finally, and perhaps of greatest interest: does the sentiment of a post have an influence on hit-rate? By sentiment I mean whether a post is positive or negative in its overall tone. This is an important question to a conservation biologist, where the fear is that the bad news that we are continually surrounded by may be putting people off. Sentiment is a difficult thing to measure, and I use the sentiment analysis tool Semantria to find out.
The short answer: ALL
of these are important, but some were important in ways I did not
expect.
- Do longer posts get more hits?
Generally, longer posts
do get more hits, but this is almost certainly because they are more
searchable over time. That is, more words do not make a good post, but
they are good for long-term exposure. So that answers question 4: clearly I
have not been making inroads into gaining more readers, but I don't
blog regularly enough and I don't advertise, so I can't complain.
- Do posts with more pictures get more hits?
Definitely. The more
photos the better. A picture paints a thousand words, blah blah. Since I
have hardly ever uploaded more than 20 photos, I cannot say whether too many
pictures is a bad thing, but I'd guess that 10 or so pics is a good
rule of thumb.
- Do posts written on a particular day get more hits?
The results here
surprised me. For me, Wednesday is a good post day (mid-week hump?),
Thursday is very bad, and surprisingly, so is Saturday. Maybe too
much competition from other digital media on a Saturday? Overall, my
Saturday posts have been shorter, so perhaps that is a confounder
in these measures, given the influence of time on hit rate. Friday,
after all, is a good day.
- Does the tone of an article influence hit-rate?
Anyone who works in the
conservation field knows that there are many depressing stories
around: climate change, species in danger of extinction, pollution,
over-population, etc. We also know that going on about these
things doesn't exactly make one the life of the party. So I try not
to focus on the negative when I write.
Quantifying tone is
pretty difficult to do objectively. To do this I used a cool
analytical tool developed by the company Semantria
(https://semantria.com/). You can try it out – they have a live demo
on their website where you can post an article, and it analyses words,
phrases, names and themes to come out with an overall score.
So while I was
encouraged to see that on balance my writing is neutral to positive
overall, the trend is towards negative articles having higher
hit-rate. Overall, the Semantria score was a poor predictor of
hit-rate though.
In summary, there are
many factors to take into account when writing up a good blog post,
and of course here I have only looked at trends from my blog –
factors could be very different for other blogs in other situations!
The technical bits (of
interest to data analysts only):
I used the R package
rvest to scrape and summarise my posts. I then used the MuMIn package
and the dredge function to choose the best model from these two
starting models:
model <- glm(views ~ charactercount + semantria_score + photos + blogageDays + day,
             data = blogdata, family = poisson)
lmmodel <- lm(log(views) ~ charactercount + semantria_score + photos + blogageDays + fday,
              data = blogdata)
I ran both because the
view counts followed a Poisson distribution, but the log-transformed data were
approximately Gaussian, and I find linear models easier to interpret.
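For anyone who hasn't used dredge before, here is a rough sketch of how the model selection step can be run. It is not my exact script: the data frame below is synthetic (made up purely so the example runs on its own), only the log-linear model is shown, and the column names simply follow the formulas above. The same dredge call should work on the glm as well.

```r
library(MuMIn)

# Synthetic stand-in for the real blog data (invented for illustration only)
set.seed(1)
n <- 200
blogdata <- data.frame(
  charactercount  = round(runif(n, 500, 8000)),
  semantria_score = runif(n, -1, 1),
  photos          = rpois(n, 5),
  blogageDays     = round(runif(n, 1, 1460)),
  fday            = factor(sample(c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"),
                                  n, replace = TRUE))
)
# Made-up view counts; the +1 keeps log(views) finite
blogdata$views <- rpois(n, exp(3 + 0.0001 * blogdata$charactercount +
                                 0.05 * blogdata$photos)) + 1

# dredge() refuses to run unless missing values cause an error
options(na.action = "na.fail")

lmmodel <- lm(log(views) ~ charactercount + semantria_score + photos +
                blogageDays + fday, data = blogdata)

# Fit every subset of the predictors and rank the fits (AICc by default)
candidates <- dredge(lmmodel)

# Pull out the top-ranked model
best <- get.models(candidates, subset = 1)[[1]]
summary(best)
```

With five predictors, dredge fits 2^5 = 32 candidate models, so this runs in a second or two. Printing candidates shows which variables each model kept and how far apart the models are in AICc.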
The best Poisson model
retained all variables, while the best model for the log-transformed
data dropped the Semantria score. Code and data available on request.
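As a taste of the scraping step mentioned above, here is a minimal sketch of how rvest can turn one post into the kind of summary row used in the models. The HTML snippet and the ".post-body" selector are assumptions for illustration, not my blog's actual markup; for a live page you would start from read_html() with the post's URL instead of minimal_html().

```r
library(rvest)

# Hypothetical post markup standing in for a real page
page <- minimal_html('
  <div class="post-body">
    <p>Spring in the fynbos.</p>
    <img src="protea.jpg">
    <img src="sugarbird.jpg">
  </div>')

# ".post-body" is an assumed selector; check your own blog template
body <- html_element(page, ".post-body")
post_text <- html_text2(body)

# One row per post: text length and number of images
post_summary <- data.frame(
  charactercount = nchar(post_text),
  wordcount      = lengths(strsplit(post_text, "\\s+")),
  photos         = length(html_elements(body, "img"))
)
post_summary
```

Looping this over a list of post URLs and binding the rows together gives a data frame along the lines of blogdata, to which views and sentiment scores can then be added.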
I don't worry about which day. Because of the 24/7 around the world my posts tend to get read over about 3 days. We are coming up to Google's deadline for slow-loading (especially on mobile) blogs so I'd be wary of too many photos, but 10 is my choice.
Hi, and thanks for the note on the slow-load heads up! Yeah, day of the week is not really important.