What happens when you take one varied toolkit for digital methods, a research question, an enthusiastic team, and the equivalent of a collective IV drip of caffeine, and pile them into a room for a week? Last week I had the pleasure of co-facilitating a project at the Digital Methods Initiative Winter School 2018 at the University of Amsterdam. The idea is that rather than trying to take analogue methods and apply them to the internet, we use tools that are only possible because of the unique characteristics of the web. It’s an intense week of experimentation, research, and learning, all rolled into one; if you’ve worked in hackerspaces and startups, the general vibe of the place will be familiar. It’s like an explosion of several tangential whirlwinds that somehow finds a way to coagulate and settle within a week. While my organisation/control-freak tendencies were a little overwhelmed, that one-week sprint probably saved us a good three months of work we would otherwise have spent faffing around. I would like to save you some of the same headaches by making some of the key methodological points and assumptions very clear. (You know how I like to clarify assumptions.)

1. Follow the question, not the tools

With such a glorious collection of tools for digital methods, it is tempting to just throw the tools at the data to see what happens. And, truth be told, this is what is needed a lot of the time. Yet the ‘throw everything at the wall and see what sticks’ approach can only ever be exploration, and does not the foundations of a sound methodology make. Once that’s done, there needs to be a process of reflecting in light of the questions: be led by what is analytically interesting, not by what is technically possible.

2. Be open to an iterative evolution of your methodology

There is a while where it feels like you’re floating headlessly in space, unsure of your footing or where you’re going. After our first day, which I felt had been a complete, chaotic mess, I asked our facilitator how he felt the day went, and his first word was ‘structured – because you have a clear research question’. Just to give you an idea. The long and short of it is that the experimental approach to new tools means you try things out, things break, you fix them, and you try again, and again. It is an incredibly iterative process: there is no clearly linear project plan; instead the plan morphs in response to what happens, and the deviations from the original line of thinking are themselves insightful.

3. You will still need local knowledge to do internet research

Quantitative digital methods are not the be-all and end-all; we need humans, insight, and local knowledge to make meaning of it all, in much the same way as a statistical test just spits out a number and you have to make sense of it. There are several examples of picking a topic, running data scraping on it, and finding absolutely nothing – only to be told later, by somebody with more local knowledge, that the issue in question concerned vulnerable people who wanted to hide their views on the internet rather than expose them, for fear of persecution. Just an example of how context, once again, is everything.

4. Mapping silences requires some serious methodological creativity

In the same vein as above, there are usually good reasons why you aren’t finding what you want to find. The trick, then, becomes whether the tools can *show* those silences, or that noise. The tools and representation have to then be inverted, triangulated, and brought into dialogue with one another – and mapping what isn’t there requires some lateral thinking.

5. You need large datasets; so think global

It’s not just that a small N is not a statistically valid sample. It’s that many of the tools rely on machine learning, which only reaches any accuracy if there is an opportunity for many iterations over a large dataset. For instance, topic modelling on a small sample will produce absolutely useless nonsense (we tried, we really tried). In this sense, trying to adapt these digital methods to a more traditional ethnographic mindset is less helpful, because your sample size is dramatically narrowed from the get-go; for example, searching for opinions on one particular policy during one election year in one particular country is very limited. Instead, think of issues and questions that could, theoretically, span and sweep the entire web.
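To make the point concrete, here is a minimal topic-modelling sketch (purely illustrative on my part – the toy corpus, the vectoriser settings and the number of topics are my own assumptions, not what we actually ran that week). Fed a handful of documents, the ‘topics’ that come out are essentially noise; the same pipeline only starts to say anything once you give it thousands of texts.

```python
# Minimal LDA topic-modelling sketch (illustrative only).
# With a tiny corpus like this the "topics" are essentially noise;
# the same pipeline needs thousands of documents to say anything useful.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

documents = [
    "flood management policy in coastal cities",
    "citizens debate the new data protection regulation",
    "machine learning tools for mapping online debates",
    # ... in practice: thousands of scraped posts, not three lines
]

# Turn raw text into a document-term matrix, dropping common stopwords.
vectoriser = CountVectorizer(stop_words="english")
dtm = vectoriser.fit_transform(documents)

# Fit LDA with an (assumed) number of topics.
lda = LatentDirichletAllocation(n_components=5, random_state=0)
lda.fit(dtm)

# Print the top words per topic to eyeball whether they mean anything.
terms = vectoriser.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[-5:][::-1]]
    print(f"topic {i}: {', '.join(top)}")
```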

6. Cleaning your data is vital, but long and laborious

Statistics codifies the assumptions embedded in your dataset, but an equivalent is still missing as our methods evolve into the digital. Especially in large datasets there will be a lot of things to clean out. Outliers, for one, can be interesting, but do need to be taken out. When doing text mining, a lot of words or phrases that are specific to your text will need to be cleaned out. This means you’ll need to take a look at the data itself, not just the tool’s interface, and keep going back and forth between one and the other. For instance, if you are scraping blog posts, in all likelihood you will have copyright phrases, ‘brought to you by WordPress’, and menu items or advertisement blocks. As you get through the last iterations, there is a certain joy in staring at the screen hoping that this time it’ll pop out something usable.
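As a rough sketch of what that cleaning step can look like in practice – hypothetical on my part, since the boilerplate patterns below are the kind of thing you only discover by actually reading samples of your own scraped data, not a ready-made list:

```python
# Rough sketch of cleaning scraped blog text before text mining (illustrative).
# The boilerplate phrases below are examples you would only find
# by reading samples of your own scraped data.
import re

BOILERPLATE_PATTERNS = [
    r"brought to you by wordpress",
    r"all rights reserved",
    r"copyright \d{4}",
    r"share this[:]?.*",          # social sharing widgets
    r"leave a (comment|reply)",   # comment-section prompts
]

def clean_post(text: str) -> str:
    """Strip known boilerplate and collapse whitespace in one scraped post."""
    text = text.lower()
    for pattern in BOILERPLATE_PATTERNS:
        text = re.sub(pattern, " ", text)
    # Collapse runs of whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()

raw = "Flood policy debates continue. Brought to you by WordPress"
print(clean_post(raw))  # "flood policy debates continue."
```

The back-and-forth comes from adding new patterns every time you spot another chunk of menu text or advertisement block in the output.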

7. To use or not to use a clean research browser/smartphone/laptop

The vast majority of browsers and internet search engines track, and store, your behaviour on the internet to create a profile of you over time, even if you have privacy settings in place. These profiles influence what you are shown – so if you are using the browser for your research, your results will be affected. In some cases, it is recommended to use a ‘clean’ research browser: one that has been unused, has no user profiles, and has no prior ‘life’ on it, so as not to skew results. However, in some cases and depending on the research question, this may prove to be unhelpful – for instance, one group searching for queer narratives could not find them using a ‘clean’ browser, but only when using a browser that had been ‘trained’ (i.e. used over the last year) by a feminist. As always, either is fine as long as you’re aware and explicit.

With thanks to the wonderful team of folks who worked with us for a week – I couldn’t have asked for a more creative, reflective and dedicated group! Also thanks to the Digital Methods Initiative for organising the week. More to follow, probably.

I’ve been working at the Tilburg Institute of Law and Technology for a month and a half now, and several times my jaw has practically dropped through the floor – not as a reflection on this awesome institute, but more from the shock of uncovering how deep some of my own assumptions were. Chatting with lawyers has really challenged several of my assumptions about what it means to do research, to be an academic, and what is important to look at. I firmly believe that asking questions and accepting you don’t know everything is the only way to actually know more. Which requires admitting a certain vulnerability, and generally makes you a nicer person to work with, because you can still know your shit but remain humble.

The shock of disciplinary differences

It’s incredibly exciting to work as a social scientist in a law-dominated environment. I’d expected to learn a lot, and I am, though of course I could not have anticipated learning the things I have. Which, funnily enough, are a lot about methodology. The importance of methodology seems to be on my mind a lot recently – see a previous post on how methodology relates to ethical research, and this tweet for, well, general astonishment. It’s not the first time that I have been confronted with the gaping assumptions of my own discipline. Previously I worked at the department of geography, urban development and international development studies at the University of Amsterdam, which was heavily influenced by human geography and anthropology as disciplines. When I first joined the department as a master’s student, it was a shock to my system, because I’d done my bachelor’s in a positivist psychology department, and talking with cultural anthropologists about the meaning of science and truth was rather a revelation. It can actually make you question everything you think you know – not only because of the assumptions statements are based on, but also because of the ways those truths are arrived at. This is a big part of what motivated my work on flood management and how to get different communities of experts to actually speak the same language. But on to the fun stuff:

What I’ve learnt from lawyers

I am still learning, so this might be totally way off, but these are my first impressions coming into a new field.
1. Research is not necessarily empirical.
From psychology to development studies, my understanding of research was very much about ‘doing’ something: either you create an experiment and calculate the stats (quantitative), or you go out into the world, talk to people and analyse the text (qualitative). Massive oversimplifications, but you get the idea. So imagine my surprise when I realised that there are whole swathes of rigorous researchers who focus on ‘black letter law’, i.e. really looking at what the law says and how to interpret it. Desk research is still research, and does not necessarily require ‘fieldwork’. It’s still a bit of a shock, because so much of my research history has been about how to do/carry out/organise/analyse empirical work. Personally, it feels a bit empty without it.

I can imagine that this must change the dynamics of how you build your network as a scholar.
2. Analytical is not necessarily better than descriptive.
The idea being that analytical work takes an analytical framework – a theory that strings together two concepts – which can then be used as a lens to look at the empirical data. In my master’s, this critical perspective was very much a process of ‘elevating’ the work, and when I finally understood the difference between descriptive and analytical work it was a major step forward in my work.

So recently a colleague commented that they only wanted analytical, not descriptive, work. To me this was incredibly self-evident, and the fact that they’d said it could only benefit up-and-coming researchers trying to understand the difference. What I hadn’t expected was for this to stir quite a discussion among my lawyer colleagues. The response was: ‘If there’s no room for description, what is the point of lawyers?’ This was earth-shatteringly groundbreaking. See, a large part of legal scholarship is seeing how the law is to be interpreted in relation to other parts of the law and to society itself. This requires substantial description, and the ‘analytical framework’ becomes redundant in the exercise. Different purposes, different approaches. It also means that there is a distinctly different ‘flavour’ to styles of writing articles – with an emphasis on logical structure and argumentation, more often than not the legal articles I’ve read are structured in a much more linear way, rather than narratively, as you find more frequently in the social sciences. That said, I still have a lot of legal work to read, so this may be a totally flimsy statement.
3. Not all textual analysis is about revealing the narrative
I guess I shouldn’t be surprised, with my background in cognitive psychology, where we studied a lot about the structure of language and breaking meaning down into specific units to be rearranged as a reflection of the workings of the mind. Still, working in social science, the idea of a discourse is incredibly central to explaining how we make sense of the world. We have visions or stories that we aspire to, and these stories – the narrative – shape how we talk about something and, as a result, what opportunities we see for solving the problem. Personally I find it fascinating to reveal these discourses; there’s something primal about it. Yet rather than focusing on the overarching narratives that shape the meanings of how we talk, lawyers have a tendency to analyse language in a very different, almost microscopic way: with an incredibly detailed focus on the meanings and interpretations of individual words. There is a reason for each specific word, which can be unpacked: why we use that word and not another. And it matters. While you find this in the social sciences too – we should use this word and not that one – using individual words as a unit of analysis almost instinctively has been a completely different approach from what I’m used to.
4. Being an engaged scholar means keeping really up-to-date with the news
Perhaps this is slightly biased, because I am in a department that a) is top-notch, and b) where everybody’s work is being reshaped by the upcoming General Data Protection Regulation, which is changing all the things. However, the lawyers I’ve been speaking to are all incredibly engaged with current affairs: following the new regulations and the debates around them, writing commentaries in public arenas, keeping up to date with the proposed changes from government, and so on. All logical, engaged things. Not everybody is like this. Some people work on historical events. Some people chase the off-the-cuff thing. And that’s fine. But in this case, people are working on something with direct, immediate relevance, and there is something incredibly powerful about that and the role it brings to academia. And it is deemed relevant not just by scholars but also by everyday people. Questions of data governance are all the rage precisely because we can see for ourselves, as citizens and consumers and persons, that the world is changing before our eyes, and we feel like we’re losing our grip just a wee tad. That’s why I’m on Twitter more than ever before – so much to learn. It’s inspiring to have this injection of relevance and urgency. But that’s for another discussion, methinks.

Did I miss something? Dear new colleagues – don’t take it personally, I’m finding the meeting of disciplinary boundaries incredibly fascinating 😉

There is a growing call for ethical oversight of AI research, and rightly so. Problem is, ethical oversight hasn’t always stopped past research with questionable ethical compasses. Part of that, I argue, is that the ethical concerns raised largely by social scientists come from a completely different world view to those from a more technical background. AI research is raising new problems, particularly with regards to correlation vs causation, but the tools we have to solve them haven’t changed that much. With this blog I want to ask: can methodology help social and technical experts speak the same language?

Since my master’s degree I’ve been fascinated by the fact that people working in different disciplines or types of work will have completely different approaches to the same problem. As in this article on flooding in Chennai, I found that ‘the answers’ to solving flooding all already existed on the ground; it’s just that the variety of knowledges weren’t being integrated, because of the different ways they are valued. I was recently speaking with my brilliant colleague and friend, who is a social constructivist scientist working in a very digital-technology-oriented faculty. This orientation is important to note, because the methodology deployed for science and research there, and the questions being asked, are influenced to a large degree by the capacities and possibilities afforded by digital technologies and data. As a result, the space scientists see for answers can be very different. In reviewing student research proposals, she found she was struggling, because some research hypotheses completely ignored the ethical implications of the proposed research. In talking it through, we realised that most of the problems arose from the assumptions made in framing those questions.

To take a classic example, in the field of remote sensing to identify slums, it is relatively common to see the implicit assumptions that what defines a slum is the area’s morphology, and that the definition belongs to the city planners and not the residents – while how locals interpret the area, or the boundaries of the neighbourhood, may differ completely. The ethical problem, beyond epistemology, is what can then be done in terms of policy based on the answers that research provides. Or go back to that paper that caused the controversy about identifying people’s sexual orientation from profile pictures downloaded from a dating site. It’s based on a pre-natal hormone theory of sexual orientation, which is a massive assumption in and of itself. Even the responses to the article have basically boiled down to ‘AI can predict sexuality’, even though that’s a blatant generalisation and doesn’t look at who was actually in the dataset (every single non-straight person? Only white people?). That, and the fact that they basically ‘built the bomb to warn us of the dangers’, rests on a lot of assumptions about your view of ethics in the first place.

Like my 10th grade history teacher used to say, to assume makes an ASS of U and ME. (Thanks, Mr. Desmarais.) More precisely, to assume without making the assumptions explicit. Not clearly articulating what your assumptions are is a *methodological* problem for empirical research, with ethical *implications*. Unexamined assumptions mean bad science. Confounding variables and all that. For reference, in statistics there is an entire elaborate, standardized system of dealing with assumptions by codifying them into different tests.
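As a tiny, made-up sketch of what that codification looks like in practice – the data, the threshold and the choice of tests here are my own illustrative assumptions, nothing more – you check the assumption first, and the assumption determines which named test you are allowed to run:

```python
# Illustrative sketch: the assumption you make (normality) determines
# which named statistical test you apply. Data and threshold are made up.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(loc=5.0, scale=1.0, size=40)
group_b = rng.normal(loc=5.5, scale=1.0, size=40)

# Check the normality assumption for each group (Shapiro-Wilk).
normal_a = stats.shapiro(group_a).pvalue > 0.05
normal_b = stats.shapiro(group_b).pvalue > 0.05

if normal_a and normal_b:
    # Assumption holds: a parametric test (independent-samples t-test).
    result = stats.ttest_ind(group_a, group_b)
else:
    # Assumption violated: fall back to a non-parametric test.
    result = stats.mannwhitneyu(group_a, group_b)

print(result)
```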
You apply one statistical test, which has a particular name, because of the assumptions you have – i.e. ‘I assume this data has a normal distribution’. If you’re using mixed methods, it becomes much harder to have a coherent system to talk about assumptions, because the questions asked may not yield data that is amenable to statistical analysis, and therefore cannot be interpreted in terms of statistical significance. All the more important, then, to make assumptions explicit so they can be discussed and scrutinized. Some ethical concerns can be dealt with more easily when we remember methodological scrutiny and transparency, bringing research back to the possibility of constructive criticism and not only fast publication potential.

How this process is currently dealt with in academia is ethical review, hence the call for ‘ethical watchdogs’. Thing is, in terms of the process of doing science in academic settings, ethical review is often the final check before approval to carry out the research. When I did my BSc in psychology, sending the proposal to the ethics review board felt like an annoyingly mandatory tick-box affair. The problem with this end-of-the-line ethical review is:
  • It’s not clear why the ethics matters for actually carrying out the research
  • If the ethics board declines, you’re essentially back to the drawing board and have to start again.
Particularly under the pressure for fast publication, there aren’t many incentives to do good ethics unless you’re concerned about it from the outset.

What if we shifted the focus from ethics as an evaluation to ethics as methodology? Rather than having an ethics review at the end of the process of formulating hypotheses and research proposals, could there be a way to incorporate an ethics review in the middle of the ‘research life cycle’? One would then get feedback not only on the ethics, but it would also provide the opportunity to examine the research’s unexamined assumptions, which ultimately makes for better science. I understand this ideal situation implies quite a significant shift in institutional processes, which are notorious for moving about as fast as stale syrup. Perhaps instead there could be a list of questions researchers could ask themselves as a self-evaluation? In this way, you could open an entryway to an ethical discussion as a question of methodology, rather than ontology or ethics per se, which are far too easily just troubled waters in interdisciplinary discussions. Do you know of any examples of structurally incorporating these ideas as a way towards effective multidisciplinary dialogue?

My thanks go to my colleague who sparked this discussion and thought it through with me, and who, for reasons of their position, will remain anonymous.