Friday, 16 April 2021

The problem with Purrr (for Biologists)

 

So, it took me forever to complete the DataCamp course on Functional Programming with Purrr.

https://learn.datacamp.com/courses/foundations-of-functional-programming-with-purrr

With a bit of programming background, I’m a fan of functional programming, the aim of which is to avoid copy-paste errors and to allow the same function to be run on various subsets of data.

I started this course ages ago, and gave up repeatedly, electing to do other courses instead (with easy, standard R code). It should have been an ideal course for me, with famous ornithologist Auriel Fournier and some bird data.

Doing this course made me realise this version of reality:

Venn diagrams of the intersection between programmers and biologists.

In essence, all the top blog posts on purrr are full of praise (this is a product of R code developer Messiah Hadley Wickham). The package implements ‘better’ versions of the ‘apply’ set of base R functions, which are ‘higher order’ functions that really are useful (if you can put in the time to master them, since they require multilevel thinking).

What is ‘better’? Well, I think there may well be two definitions, depending on where you sit on the coder-biologist spectrum. Better for a programmer means ‘readability’ and ‘succinctness’. Better for a biologist also means readability, but with trust in the final answer a lot more important. I.e. things break down at the ‘succinctness’ level, because in a line of piped code a biologist wants to know what happens at each step. A line of piped biologist code will have been built step by step to ensure that each line is doing what it should do. A programmer will weave these all together to achieve succinctness, while it may well make a lot more sense for a biologist to have ‘expanded’ code.

“BOO!” say the programmers.

I say: “That is alright”.
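To make the ‘expanded’ versus ‘succinct’ distinction concrete, here is a minimal sketch. The `sightings` data frame and its values are made up purely for illustration; the only assumption is that dplyr is installed.

```r
library(dplyr)

# Made-up data, purely for illustration
sightings <- data.frame(
  species = c("kestrel", "lark", "kestrel", "prinia"),
  count   = c(3, 1, 5, 2)
)

# 'Expanded' style: one step per line, so every intermediate
# result can be printed and checked before moving on
kestrels_only <- filter(sightings, species == "kestrel")
kestrel_total <- summarise(kestrels_only, total = sum(count))

# 'Succinct' style: the same two steps woven into one pipe
kestrel_total_piped <- sightings %>%
  filter(species == "kestrel") %>%
  summarise(total = sum(count))
```

Both routes end at the same one-row result; the expanded one just leaves `kestrels_only` lying around so you can look at it.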

The truth is, for ‘normal’ biologists, it will probably take less time to filter your data in an Excel spreadsheet and apply multiple functions across columns to get what you want compared to debugging your first attempts to code a line using apply().

What are my issues with purrr? Well, I hardly ever use lists, and the double bracket notation is just intimidating. A lack of familiarity with this data structure doesn’t help, although it is seen everywhere in R output (take the output of a GLM, for example). Actually, the simplest part of purrr is the map family of functions (whose variants return a list, a logical, numeric or character vector, or a data frame). These can be used on data frames, pretty much like apply(), except you get ‘standardized’ output, which is apparently why it is cool with those that code on a daily basis rather than trudge around swamps (or deserts) with binoculars (i.e. ornithologists).

Using purrr will require learning yet another package’s coding style. For instance ~.x is used … to get stuff? Well, I wish I could give you an honest simple explanation, but I can’t. Here is an example from the course:
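As far as I can tell, ‘standardized’ means the type of the output is fixed by the function name rather than by whatever the data happens to look like. A small sketch, using a made-up `birds` data frame (not from the course):

```r
library(purrr)

# Made-up measurements, purely for illustration
birds <- data.frame(
  wing_mm = c(70.5, 82, 91.2),
  mass_g  = c(18, 25.5, 31)
)

# On ordinary input the base and purrr versions agree:
# both give a named numeric vector of column means
means_base  <- sapply(birds, mean)
means_purrr <- map_dbl(birds, mean)

# On empty input they part ways: sapply() falls back to a list,
# while map_dbl() still returns a (zero-length) numeric vector
empty_base  <- sapply(list(), mean)
empty_purrr <- map_dbl(list(), mean)
```

So with `map_dbl()` you always know you are getting numbers back (or an error), which matters more once the call is buried inside a longer script.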

map_chr(sw_films, ~.x[["episode_id"]]) # readable? Not a chance.

map_chr(your_data, ~.x[["column_name"]]) # sort of this, then return the data as a character vector.

Okay, I’m losing you, so let’s take a hectic example from the course.

This is the problem question:
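For what it is worth, my current understanding is that `~ .x[[...]]` is just shorthand for a small throwaway function, with `.x` standing for ‘the current element of the list’. A sketch with a two-film toy list standing in for sw_films (the `films` object is my own stand-in, not the real data):

```r
library(purrr)

# Toy stand-in for sw_films: a list of films, each itself a list
films <- list(
  list(title = "A New Hope", episode_id = 4L),
  list(title = "Attack of the Clones", episode_id = 2L)
)

# Three spellings of "pull the title out of every film"
titles_formula  <- map_chr(films, ~ .x[["title"]])                # purrr shorthand
titles_function <- map_chr(films, function(film) film[["title"]]) # the long way
titles_string   <- map_chr(films, "title")                        # name shortcut

# For integer elements, map_int() keeps them as integers
episodes <- map_int(films, "episode_id")
```

All three title versions return the same character vector, which at least explains why the course sometimes uses `~.x[[ ]]` and sometimes just a quoted name.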

What is the distribution of heights of characters in each of the Star Wars films?

That is simple right? Just make a histogram on a vector (or column with numeric data from a data frame).

Using the sw_films data from the ‘repurrrsive’ package:

library(repurrrsive); library(tidyverse)

data(sw_films)

This is the ‘clue’ code provided. WTF!? Hectic tidyverse vocabulary required. But okay: we just need to fill in the blanks so how hard can it be?

# Turn data into correct dataframe format

film_by_character <- tibble(filmtitle = map____(___, ___)) %>%

    mutate(filmtitle, characters = map(___, ___)) %>%

    unnest()

 

Damn hard. Line 1: When do you use ~.x; should you use [[]] or just “variable_name”, not to mention I initially thought map_df was the appropriate map function here.

Then line 2 – I just piped some data, so that should be available right? So should that be . or .x? Neither… need the data assignment spelled out.  And I presume I’ll use the ~.x [[]] again… nope, just “variable_name” this time.

So this is the solution (well, code that doesn’t return a fatal error, we’ll ignore the new warning from unnest for now. Screw you, code evolution):

film_by_character <- tibble(filmtitle = map_chr(sw_films, ~.x[["title"]])) %>%

  mutate(filmtitle, characters = map(sw_films, "characters")) %>%

  unnest()
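In hindsight, the reason neither . nor .x works on line 2 is that the thing being piped is the new tibble, not the original list, so the list has to be named again. And the unnest() warning seems to go away if you say which column to unnest. A sketch under those assumptions, with a toy `films` list and placeholder character IDs instead of the real sw_films URLs:

```r
library(purrr)
library(tibble)
library(dplyr)
library(tidyr)

# Toy stand-in for sw_films; the "people/N" IDs are placeholders
films <- list(
  list(title = "A New Hope",
       characters = c("people/1", "people/2")),
  list(title = "Attack of the Clones",
       characters = c("people/2", "people/3", "people/4"))
)

film_by_character <- tibble(filmtitle = map_chr(films, "title")) %>%
  # map() returns a list, so 'characters' becomes a list-column:
  # one vector of IDs per film
  mutate(characters = map(films, "characters")) %>%
  # naming the column explicitly avoids the warning from newer tidyr
  unnest(cols = characters)
```

The result has one row per film-character combination, which is the shape needed for the join later on.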

 

Geez. That was step 1. We still need to solve this:

# Pull out elements from sw_people. Create a dataframe with the "height", "mass", "name", and "url" elements from sw_people.

sw_characters <- map____(___, `[`, c(___, ___, ___, ___))

 


Thank heavens the ‘[’ is in there, because you’d never have solved that by yourself. This is the solution.

sw_characters <- map_df(sw_people, `[`, c("height", "mass", "name", "url"))
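What the backticked `[` is doing, as I now read it: it is the ordinary subsetting operator used as a function, so each person gets cut down to the named elements, and map_df() then stacks the pieces into rows. A sketch with a toy `people` list (placeholder url values; `gender` is only there to show it gets dropped), assuming dplyr is installed:

```r
library(purrr)
library(dplyr)

# Toy stand-in for sw_people
people <- list(
  list(name = "Luke Skywalker", height = "172", mass = "77",
       gender = "male", url = "people/1"),
  list(name = "C-3PO", height = "167", mass = "75",
       gender = "n/a", url = "people/2")
)

wanted <- c("height", "mass", "name", "url")

# The course's compact form: `[`(person, wanted) for every person
chars_short <- map_df(people, `[`, wanted)

# The same thing spelled out: subset each person, then bind the rows
chars_long <- bind_rows(map(people, function(person) person[wanted]))
```

Note that height and mass come through as character strings, which is why the later step converts them with as.numeric().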

 

 

Step 3: join the data frames. This should be a breeze! I mean, I use the join functions all the time.

But wait, what the hell is this c(“___”) thing? Surely, we could just do a rename on the fly? This screwed me…. I couldn’t do the join. At this stage, was it the join code that was the problem or the initial tibble creation?

# Join the two new objects

character_data <- inner_join(___, ___, by = c("___" = ___)) %>%

   # Make sure the columns are numbers

    mutate(height = as.numeric(height), mass = as.numeric(mass))

 

# My incorrect solution

character_data <- inner_join(sw_characters, film_by_character, by = c("characters" = url)) %>%

    mutate(height = as.numeric(height), mass = as.numeric(mass))

 

# My cheat code to get what they wanted:
character_data <- inner_join(film_by_character, rename(sw_characters, characters=url)) %>%

  mutate(height = as.numeric(height), mass = as.numeric(mass))
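Looking back, I think my attempt failed on two counts: url was not in quotes, and the names were the wrong way round for the order I gave the tables in. In dplyr a named `by` reads as c("column in the first table" = "column in the second table"). A sketch with toy versions of the two tables (placeholder IDs, not the real URLs):

```r
library(dplyr)
library(tibble)

# Toy versions of the two objects built in the earlier steps
film_by_character <- tibble(
  filmtitle  = c("A New Hope", "A New Hope", "Attack of the Clones"),
  characters = c("people/1", "people/2", "people/2")
)
sw_characters <- tibble(
  height = c("172", "167"),
  mass   = c("77", "75"),
  name   = c("Luke Skywalker", "C-3PO"),
  url    = c("people/1", "people/2")
)

# Named 'by': left-hand name is the column in the first table,
# right-hand name is the column in the second, both quoted
joined_named <- inner_join(film_by_character, sw_characters,
                           by = c("characters" = "url"))

# The rename workaround, with 'by' spelled out
joined_renamed <- inner_join(film_by_character,
                             rename(sw_characters, characters = url),
                             by = "characters")

character_data <- joined_named %>%
  mutate(height = as.numeric(height), mass = as.numeric(mass))
```

Both joins give the same table, so the rename trick was not really cheating, just a longer road to the same place.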

 

Thank G. The last step was to make a faceted ggplot chart, which was actually easy.

My horror: that was the foundational course, so I nearly died when, on completing it, I was recommended to do intermediate functional programming with purrr. Oh … my …. G...
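For completeness, the faceted histogram looks something like the sketch below. The `character_data` here is a made-up miniature with invented heights, not the real joined data, and the bin count is an arbitrary choice.

```r
library(ggplot2)

# Made-up miniature of the joined data; heights are invented
character_data <- data.frame(
  filmtitle = rep(c("Film A", "Film B"), each = 4),
  height    = c(172, 167, 96, 202, 183, 167, 96, 170)
)

# One histogram of height per film
height_plot <- ggplot(character_data, aes(x = height)) +
  geom_histogram(bins = 10) +
  facet_wrap(~ filmtitle)
```

Printing `height_plot` draws one panel per film, which answers the original question about the distribution of heights in each film.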

I guess I just don't have a purrrsonality that purrrs. Miaow.

 

Wednesday, 24 March 2021

Britstown Bash

SABAP2 – The Britstown Bash

 

This bash was organized by Karoo birding legend Stefan Theron. Capitalizing on a local contact, a friend of his brother, Stefan managed to arrange free accommodation on the West Front Farm, 3 pentads southeast of Britstown. The purpose of the bash was personal for Stefan: he wanted to see if he could extend the range of Sclater’s Lark east, which we’d recorded in a visit prior to lockdown west of Britstown. Also, he wanted to validate if Karoo Lark really occur in this region (spoiler alert, they don’t). Most importantly, as with many Northern Cape regions, the area is chronically under atlassed, with many virgin pentads.

Given the limited facilities on the farm (we had access to a flat with kitchenette and just one bathroom), the bash was by invitation only to arguably the elite of Karoo’s atlassing community (Japie Claasen excluded): Salome Willemse (of the Namaqualand bird club), Alan Collett (Graaff Reinet), and Rudi Minnie (with special guest Henk Nel of Birdlasser).

The bash officially kicked off with a dinner hosted by Salome on Thursday evening (18th March), which I missed as I only arrived on the Friday: defence of the family’s orchard against raiding baboons my excuse. As such I arrived on Friday evening after casually atlassing up the back roads from Victoria West to a pentad map already well marked off, especially with the likes of Salome, who does 4-5 pentads a day. For the following day I was put down to do 2 ‘inaccessible’ pentads by bicycle.

However, I had a terrible night: while I’d been putting up my tent, the pole broke, tearing a massive hole in the flysheet. Just after midnight, it started to rain, necessitating I use the groundsheet as new flysheet, which required finding string and jury rigging something to keep off the rain.

The next morning I was exhausted, so I headed off in my vehicle instead to try and flesh out some partially covered pentads from the previous day, and see if I could get landowner permission to get into an ‘inaccessible’. I arrived at the house of the landowner, Johan Viljoen, at 6:30, which I reckoned shouldn’t be a problem since farmers normally get up early, right? My arrival was announced by barking dogs, and with a lack of any signs of life from the house, my guilt levels started to rise. After about 10 minutes the stocky, bearded, grumpy farmer emerged. Luckily, armed with a large 1:150 000 coverage map, plus 'Voels van die Karoo' booklets (printing sponsored by Western Cape Birding Forum, organized by Brian van der Walt), I was able to explain my presence in a way that even had Johan enthusiastic about the project and providing details of the farm and where to go to maximise the bird list. Certainly memorable from that pentad were the numbers of Greater Kestrel: these were in large numbers throughout the region, attracted no doubt by the growing locust swarms.

I dawdled through the adjacent pentad, thinking it had already been covered, and took a midday nap in the shade of some pine trees to catch up on the previous night. Then it was off further south to finish a pentad started the previous day close to Victoria West, and refuel: one low point on the versatile Suzuki Jimny is the 40 liter fuel tank, which sinks into the nervous zone too quickly to feel comfortable in this region with long distances between petrol stations. On the way back, passing my siesta spot by the pine trees, I saw a dead bird in the road. I thought I’d better stop to investigate (thinking maybe I’d get a new species for the pentad!). I picked up a headless Ring-necked Dove, breast-meat eaten and blood still fresh. Realising it had just been dropped, I looked left and right into the bushes for a potential perpetrator, and then up into the pine tree above me. I could not believe what I saw: the face of a Verreaux's Eagle Owl stared down at me not more than 3-4 meters above my head! My dash to the vehicle to get my camera unfortunately was less cautious than it could have been, startling the bird away. I was very grateful to Stefan for returning a few days later to get the proof of this special record. Nonetheless my arrival back at camp was rather late, making my camp companions rather nervous, given that Alan Collett had started his day with two punctures. Still, the eagle-owl was nearly as big news as the Black-collared Barbet, way out of range but clearly resident in the farm gardens.

Sunday saw most of the team except myself and Stefan heading back home. Stefan was up for hiking the bike route through the inaccessible pentads to the east. Heading off at 5:30, we got to the start of the first target pentad at 6:30 as the first sun rays hit the top of the dolerite hills we would be navigating for the rest of the day. I dropped off Stefan as he marked off African and Plain-backed Pipit from a set of 3 birds flying silhouetted against the dawn: his call ID skills defy the imagination. I headed to the farm, to do the introductions and get permission to access the inaccessible, where I would spend the rest of the morning, and rack up my highest tally of the trip, with 54 species. Meanwhile, Stefan would be racking up close to 80 species as a reward for his sweat and sore feet, including Cinnamon-breasted Warbler, which I’d tried for and failed. CBWs are a special obsession of mine.

Thus, in the afternoon, I got on my bike to head to the adjacent virgin pentad. The eastern edge of the pentad was bordered by beautiful dolerite cliffs promising African Rock Pipits, Raven, raptors and Cinnamon-breasted Warbler. Indeed, my cycle and hike would reward me with all except the raptors: Verreaux’s Eagle a stark absentee from my final trip list, despite many rumours of their presence from local landowners.

Back at the farm, Stefan got me to confirm that the garden White-eyes were indeed Orange River, which we managed thanks to some great views from below of the distinctive flanks. Interestingly, my only very good view of a White-eye from the previous day had been unmistakably Cape, with grey flanks. There had been an interesting encounter the previous day, with Henk and Rudi identifying different species from opposite sides of the same tree, although in the end they’d marked them off as Cape to be safe. Another interesting species boundary here is Karoo and Black-chested Prinia: while Black-chested was certainly the default for the surveys, I was amazed to get clear views of Karoo together with hybrids in the mountains. The mountains offer thick vegetation, and very different avifaunal communities compared to the barren ‘vlakte’ in between dominated by Rufous-eared Warbler, Large-billed and Spike-heeled Larks, and distantly calling Northern Black or Karoo Korhaan.

On Monday Stefan headed out to confirm the Verreaux’s Eagle Owl and tick off the adjacent virgin pentads en-route back home. My targeted pentad turned out to have already been done, so I turned my attention to another inaccessible. By the time I’d figured out who the owner was and gone through the explanations and booklet delivery procedure, it was a bit later than I’d have liked. Then I ended up going through the wrong gate and getting lost on the wrong pentad. But the rest of the day was spent on the right pentad, rewarded by Secretarybird, a small pan with swarms of Lark-like Bunting and a curious armoured cricket. Knowing that no-one will do these inaccessible pentads again for the foreseeable future, with the sparsity of atlassers and need to conduct bashes in other parts of the country, I do like to put in the best effort and time I can when presented with the opportunity to do so.

I’d spend my last evening alone on the farm, getting up early to do two more virgin pentads along the N12 en-route home, with mediocre lists of the usual Karoo suspects, but with a feeling of satisfaction from having contributed immeasurably to the progress of SABAP2.

The bash ticked off over 40 pentads for the long weekend, including over 30 virgin pentads, a significant effort in the painting of the SABAP2 coverage map. Thanks again to Rikus van der Merwe and the many landowners that accommodated our endeavours, and especially Stefan and the other bash participants. Look forward to seeing you all again at the next one.

 



Young Ant-eating Chat

Young Jackal Buzzard

Young Large-billed Lark








 Before

 


After


 

 

 

 

 

 

 

  

 

 

 

 

  

 

 
