Guest post- About Style

Things have been a bit crazy lately in the bug counting lab. A few people are moving on to new things, and I’m in a rush to get projects wrapped up with them before I’m no longer interacting with them on a daily basis. Last week, it was all about bees, but not the bees I promised you in the previous post. These were urban bees. These bees had a faster-paced lifestyle and were more demanding than the simple, salt-of-the-earth bees I’ve come to love in agricultural systems. Maybe I’m just reading too much into it.

To make up for my lack of gentle cajoling to get you and your data through, I’ve found a few like-minded people to fill in for me. This first guest post is by the lovely and talented Ignasi Bartomeus, who wants you to be consistent with your style. HEED HIS WORDS FOR THEY ARE WISE.

“What was venerated as style was nothing more than an imperfection or flaw that revealed the guilty hand.” – Orhan Pamuk, My Name Is Red

The highest compliment you could pay an Islamic miniaturist in the year 1500 was to say that his work was indistinguishable from that of the old masters. To have a style of one's own was a sign of imperfection. What does this have to do with data management? Well, I could argue that when creating your dataset you should follow the old masters and make its formatting indistinguishable from that of any other dataset (the formatting, not your data!). That would make analyzing data (and especially #otherpeoplesdata) much easier, because everyone would use the same conventions. Unfortunately, we don’t really have old masters to follow: the researchers creating data are too numerous and too heterogeneous to agree on one style to rule them all*. So I am not going to recommend being styleless, but the opposite. Know your style, and make it easy to identify.

When I refer to style I am not talking about the substance (e.g. how you structure data in variables, observations and values**), but about the form (e.g. naming conventions). An inconsistent style doesn’t affect the quality of the data, but a consistent one makes it much easier to reuse code and speeds up cleaning and analysis. For example, be consistent about which file formats you use, how you name variables, and which symbol represents missing data, and apply those choices both within and among data tables. I am biased toward the R world: I personally like comma-separated (csv) tables and variable_name style, I never use CAPS, and I use NA for values with no data. If your data is tab separated and you use variable.name and NULL, I may complain about it once, but I can either a) easily tweak your data to look exactly the way I like it, e.g. colnames(data) <- gsub(".", "_", colnames(data), fixed = TRUE), which I may do when combining it with other data, or, more likely, b) just get used to your notation very quickly. However, if you have no style and randomly combine VarName1, var_name2 and name.var3, I will complain every single time I call one of your variables, for example:

> head(data$Variable.name)
Error: object 'Variable.name' not found

Oh wait, maybe it was:

> head(data$Variable_name)
Error: object 'Variable_name' not found either

Hmm... maybe

> head(data$variable_name)
Error: object 'variable_name' not found, again.

At this point you will be forced to call colnames(data) once again.

> colnames(data)
> head(data$variable.name)
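
When the names really are a mix of styles, a case-insensitive search over the column names can stand in for the guessing. This is only a minimal sketch: the data frame and its column names below are invented for illustration, reusing the mixed-style names mentioned earlier.

```r
# Hypothetical data frame mixing three naming styles (invented for illustration)
data <- data.frame(VarName1 = 1:3, var_name2 = 4:6, name.var3 = 7:9)

# Case-insensitive search for every column whose name contains "name",
# whatever capitalization or separator was used
matches <- grep("name", colnames(data), ignore.case = TRUE, value = TRUE)
print(matches)
```

This lists the candidate columns in one call, so the exact spelling can be copied instead of guessed.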

So we can argue about which style is better: whether using_underscores_increases_readability, or whether a dot is.faster.to.type. You can choose your own set of rules, but be consistent and you will save a lot of time in the long run when working in command-line-style software.
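
As a rough illustration of settling on one style, the sketch below reads a small tab-separated table that uses NULL for missing values and converts it to the conventions I described (variable_name columns, NA for no data). The helper to_snake_case and the example table are hypothetical, made up for this post, and the regular expressions only cover simple cases such as VarName1 and name.var3.

```r
# Hypothetical helper: convert column names to lower-case variable_name style
to_snake_case <- function(x) {
  x <- gsub("([a-z0-9])([A-Z])", "\\1_\\2", x)  # VarName1  -> Var_Name1
  x <- gsub("[. ]+", "_", x)                     # name.var3 -> name_var3
  tolower(x)
}

# Invented tab-separated example that uses NULL for missing values
raw <- "VarName1\tvar_name2\tname.var3\n1\tNULL\ta\n2\t5\tNULL\n"

# na.strings tells R to read the text NULL as NA
data <- read.delim(text = raw, na.strings = "NULL",
                   check.names = FALSE, stringsAsFactors = FALSE)

# Apply one naming style to every column
colnames(data) <- to_snake_case(colnames(data))
print(data)
```

From there, write.csv(data, "clean.csv", row.names = FALSE) would save the table as a comma-separated file (the file name is just a placeholder). The same helper can be reused on every table in a project, which is where the consistency pays off.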

* I don’t have a note for that, but as a guest, I am trying to follow the blog’s style of having lots of footnotes.

** See references about that here, here and here.

*** And no, using colors in Excel is not stylish; it is a sign of having too much free time (unless you are synesthetic).

About cbahlai

Hi! I'm Christie and I'm an applied ecologist and postdoc in the midwestern US. I am an #otherpeoplesdata wrangler, stats enthusiast, and, of course, a bug counter. I cohabitate with five other vertebrates: one spouse, one preschooler, one teeny baby and two cats.
