Getting meta

Friends, your resident bug-counter has been, unh, busy lately. I’ve been travelling a lot over the past two months, between interviews for faculty positions and conferences. And it’s not over: I’ve got two more trips in May. So far, I’ve interviewed at three terrific institutions and met loads of great people, and I’ll be interviewing at a fourth in a few weeks. I’m not going to comment on any specifics beyond that until I know the outcome of this situation, but I will tell you this: faculty interviews take a lot out of you. In my downtime, I’ve been throwing myself into writing R code. Mostly because coding means I’m not talking to anyone.

But! This week, I also attended a Software Carpentry Instructor training workshop at Mozilla’s Toronto offices. The net result is that I’ll be helping to develop Software Carpentry’s data management curriculum. Yes, instead of being a lone wolf, pontificating on the finer points of arranging data in spreadsheets on the twitterz, I’m joining a pack of (fire)foxes*, which will help keep my pontifications in check.**

Of course, all these things are going toward helping me become a better bug counter, and I hope to share everything I’ve learned with all of you in the future. But anyway, that’s enough meta about me; let’s get meta about metadata.

In my forays into coding isolation these past weeks, I’ve started doing something that I’ve found useful: embedding the metadata for my data, specifically the descriptions of each data field, within my actual script file.*** I don’t know if anyone else does this (mostly, people provide metadata with the actual data), but what I’m finding is that, when I’m working on multiple projects at once, having the metadata at the ready (and within the file) really helps me keep track of just what my data is. So here’s a simple example from a project I’m working on where we’re examining how the population responses of Harmonia axyridis, the multicoloured Asian ladybeetle, have changed over time:

#####################################
# variables and descriptions
# Data has five fields
#
# Year - year sample was taken
# Ordinal_date - day of year sample was taken
# Captures - Total number of Harmonia axyridis observed
# Traps - Total number of traps reporting on that sampling date
# Per_trap - Average number of Harmonia per trap
#####################################

This is a simple example, but I know that if I have a lot of variables, and (especially in the case of #otherpeoplesdata) if field naming conventions are irregular in any way, just having some metadata describing what each field is really, really helps you keep track of things. It also gives you something to refer to when calling different variables within your code. Who hasn’t called an object but gotten an error because they forgot that the object’s name was capitalized, abbreviated in an odd way, etc.? A simple tip, but a useful one!
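If you wanted to go one step past comments, here’s a rough sketch of how the same descriptions could live in the script as an R object, so the script itself can complain when a field name doesn’t match. To be clear, this is just one way to do it and not something from my actual project: the object names (`field_descriptions`, `harmonia`) are made up for illustration, and the little data frame holds invented numbers standing in for the real file.

```r
# Field descriptions, copied from the comment header above.
field_descriptions <- c(
  Year         = "year sample was taken",
  Ordinal_date = "day of year sample was taken",
  Captures     = "total number of Harmonia axyridis observed",
  Traps        = "total number of traps reporting on that sampling date",
  Per_trap     = "average number of Harmonia per trap"
)

# Hypothetical toy data standing in for the real file; values are invented.
harmonia <- data.frame(
  Year         = c(2001, 2001),
  Ordinal_date = c(150, 157),
  Captures     = c(12, 30),
  Traps        = c(4, 5)
)
harmonia$Per_trap <- harmonia$Captures / harmonia$Traps

# Fields in the data with no description, and descriptions with no field.
# Comparison is case-sensitive, so "captures" vs "Captures" shows up here.
undocumented      <- setdiff(names(harmonia), names(field_descriptions))
missing_from_data <- setdiff(names(field_descriptions), names(harmonia))

if (length(undocumented) > 0 || length(missing_from_data) > 0) {
  warning("Metadata and data disagree: check field names and capitalization")
}

# Look up what a field means without scrolling back to the header.
field_descriptions[["Per_trap"]]
```

The comment header is still the part I’d keep no matter what; the named vector just gives you something to check against when somebody else’s column names get weird.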

*And yes, I’m working for free. Please do not tell any of the universities I have interviewed at that I’m willing to work for free. My kid has to eat. If you are reading this from one of the universities at which I’ve interviewed, unh…HEY, LOOK OVER THERE!
**Cheque? I don’t even know anymore. I’ve switched between Canadian and American English so many times this week I don’t know my zees from my zeds and my pops from my sodas.
***We can get even more meta and talk about metadata formatting later, but that might actually be too meta for this post.

About cbahlai

Hi! I'm Christie and I'm an applied ecologist and postdoc in the midwestern US. I am an #otherpeoplesdata wrangler, stats enthusiast, and, of course, a bug counter. I cohabitate with five other vertebrates: one spouse, one preschooler, one teeny baby and two cats.

One Response to Getting meta

  1. Fair Miles says:

    Connected to your “pontifications” about how to use Excel, I have recommended using one sheet (tab) in the same document for meta [and not for more data! 😉 ]. When importing that to other software, you can just copy-paste and add that info to your (R) script [or save that sheet as a text file such as ReadMe.txt]
