Developing well-constructed data gathering tools, or methods, for your study

I spent the better part of last week working with emerging researchers who are at the stage of their PhD work where they are either working out what data they will need and how to get it, or sitting with all their data and working out how to make sense of it. So, we are talking theory, literature, methodology, analysis, meaning making, and also planning. In this post I want to focus on planning your data gathering phase, specifically developing ‘instruments’, such as questionnaires, interview schedules and so on.

Whether your proposed study is quantitative, qualitative or mixed methods, you will need some kind of data on which to base your thesis argument. Examples may include data gathered from documents in the media, in archives, or from official sources; interviews and/or focus groups; statistical datasets; or surveys. Whatever data your research question requires you to generate in order to find an answer, you need to think very carefully about how your theory and literature can be drawn into developing the instruments you will use to generate or gather this data.

In a lot of the postgraduate writing I have read and given feedback on, there are two main trends I have noticed in the development of research methods. The first is what I would call ‘too much theory’, and the other ‘not quite enough’. In the first instance, this is seen in researchers putting technical or conceptual terms into their interview questions, and actually asking the research questions in the survey form or interview schedule. For example: ‘Do you think that X political party believes in the principle of non-racialism?’ Firstly, this was the overall research question, more or less. Secondly, this researcher wanted to interview students on campus, and needed to think seriously about whether this question would yield any useful data – would her participants know what she meant by ‘the principle of non-racialism’ as she understood it theoretically, or even have the relevant contextual knowledge? Let’s unpack this a bit, before moving on to trend #2.

The first issue here is that you are not a reporter, you are a researcher. This means you are theorising and abstracting from your data to find an answer that has significance beyond your case study or examples. Your research questions are thus developed out of a deep engagement with relevant research and theory in your field that enables you to see both the ‘bigger picture’ as well as your specific piece of it. If you ask people to answer your research question, without a shared understanding of the technical/conceptual/theoretical terms and their meanings, you may well end up conflating their versions of these with your own, reporting on what they say as being a kind of ‘truth’, rather than trying to elicit, through theorising, valid, robust and substantiated answers to your research questions, using their input.

This connects to the second issue: it is your job to answer your research question, and it is your participants’ job to tell you what they know about relevant or related issues that bear on your research question. For example, if you want to know what kinds of knowledge need to be part of an inclusive curriculum, you don’t ask this exact question in interviews with lecturers. Rather, you need to try and find out the answer by asking them to share their curriculum design process with you, talk you through how they decide what to include and exclude, and ask them about their views on student learning, university culture, and the role of the curriculum, and knowledge, in education. This rich data will give you far more with which to find an answer to that question than asking it outright could. You ask around your research questions, using theory and literature to help you devise sensible, accessible and research-relevant questions. This also goes for criteria for selecting and collating documents to research, should you be doing a study that does not involve people directly.

The second trend is ‘not enough theory’. This tends to take the form of having theory that indicates a certain approach to generating data, yet not using or evidencing this theory in your research instruments. For example, structuralist theories would require you to find out what kinds of structures lie beneath the surface of everyday life and events, and also perhaps how they shape people, events and so on. Disconnected interview questions might ask people whether they enjoy working in their university, whether there are any issues they feel could be addressed and why, and what their ideal job conditions would be, rather than using the theoretical insights to focus, for example, on how they experience doing research and teaching, what kinds of support they get from their department, what kinds of support they feel they need, and where that support does and should come from. You need to come back to the theory to make sense of your data through analysis, so you need to ensure that you use the theory to help you create clear, unambiguous, focused questions that will get your participants, or documents, talking to you about what matters to your study. Disconnecting the research instruments from your theory, and from the point of the research, may lead to a frustrating analysis process where the data is too ‘thin’ or off the point to really enable a rich analysis.

Data gathering tools, or methods for getting the data you need to answer your research questions, are a crucial part of a postgraduate research study. Our data gives us a slice of the bigger research body we are connecting our study to, and enables us to say something about a larger phenomenon or set of meanings that can push collective knowledge forward, or challenge existing knowledge. This is where we make a significant part of our overall contribution to knowledge, so it is really important to see these instruments, or methods, not as technical or arbitrary requirements for some ethics committee. Rather, we need to conceptualise them as tools for putting our methodology into action, informed and guided both by the literature our study is situated within and by what counts as our theoretical or principled knowledge. Taking the time to do this step well will ensure that your golden thread is pulled more clearly through the earlier sections of your argument, through your data and into your analysis and findings.

Paper writing IV: analysing data

One of the trickiest areas for researchers working with data – either primary or secondary (data you have generated in ‘the field’, or data gleaned from texts and other existing sources) – is the analysis of that data. It can be a significant challenge to move from redescribing findings, observations or results to showing the reader what these mean in the context of the argument being made, and the field into which the research fits. There are a few moves that need to be made in constructing an analysis, and these will be unpacked in this post.

Often, in empirical research, we make our contribution to knowledge in our field through the data we generate, and analyse. Especially in the social sciences, we take well-known theories and established methodologies and use these to look at new cases – adding incrementally to the body of knowledge in our field. Thus, analysis is a really important thing to get right: if all we do is describe our data, without indicating how it adds to knowledge in useful ways, what kind of contribution will we be making? How will our research really benefit peers and fellow researchers? After all, we don’t write papers just to get published. We conduct research and publish it so that our work can influence and shape the work of others, even in small ways. We write and publish to join a productive conversation about the research we are doing, and to connect our research with other research, and knowledge.

How to make a contribution to knowledge that really counts, though?

First things first: you can’t use all your data in one paper (or even in one thesis). You will need to choose the most relevant data and use it to illustrate and consolidate your argument. But how do you make this choice – what data should you use, and why? The key tool for making all the choices in a paper (or thesis) – from relevant literature, to methodology and methods, to data for analysis – is the argument you are making. You need to have, in one or two sentences, a very clear argument (sometimes referred to as a problem statement, or a main claim). In essence, whatever you call it, this is the central point of your paper. To make this point, succinctly and persuasively, you need to craft, section by section, support for this argument, so that your reader believes it to be valid and worth engaging with.

So, you have worked out your argument in succinct form, and have chosen the relevant sections of data that you feel best make or illustrate that argument. Now what? In the analysis section, you are making your data mean something quite specific: you are not just telling us what the data says (we can probably work that out from reading the quotes or excerpts you are including in the paper). To make meaning through analysis, you need to connect the specific with the general. By this I mean that your data is specific – to your research problem and your consequent choice of case study, or experiment, or archival search and so on. It tells us something about a small slice of the world. But, if all we did in our papers was describe small slices of the world, we would all be doing rather isolated or disconnected research. This would defeat the aim of research: to build knowledge, and forge connections between fields, countries, studies and so on. Thus, we have to use our specific data to speak back to a more general or broader phenomenon or conversation.

The best, and most accepted, way of making meaning of your data is through theorising. To begin theorising your data, you need to start by asking yourself: What does this data mean? Are these meanings valid, and why? There are different kinds of theory, of course, and too many to go into here, but the main thing to consider in ‘theorising’ your data is that you need a point of reference against which to critically think about and discuss your data: you need to be able to connect the specifics of your data with a relevant general phenomenon, explanation, frame of reference, etc. You don’t necessarily need a big theory, like constructivism or social realism; you could simply have a few connected concepts, like ‘reflection’, ‘learning’ and ‘practice’, for example; but you do need a way of lifting your discussion out of the common-sense, descriptive realm into the critical, analytical realm that shows the reader why and how the data support your argument, and add knowledge to your field.

Analysing and theorising data is an iterative process, whether you are working qualitatively or quantitatively. It can be difficult, confusing, and take time. This is par for the course: a strong, well-supported analysis should take time. Don’t worry if you can’t make the chosen data make sense on the first go: you may well need to read and re-read your data, and write several drafts of this section of the paper (preferably with critical feedback) before you can be confident of your analysis. But don’t settle for the quick-fix, thin analysis that draft one might produce. Keep at it, and strive for a stronger, more influential contribution to your field. In the long run, it’ll be worth more to you, to your peers, and to your field.

‘Retrofitting’ your PhD: when you get your data before your theory

I gave a workshop recently to two different groups of students at the same university on building a theoretical framework for a PhD. The two groups of students comprised scholars at very different points in their PhDs, some just starting to think about theory, some sitting with data and trying to get the theory to talk to the data, and others trying to rethink the theory after having analysed their data. One interesting question emerged: what if you have your data before you really have a theoretical framework in place? How do you build a theoretical framework in that case?

I started my PhD with theory, and spent a year working out what my ‘gaze’ was. I believed, and was told, that this was the best way to go about it: to get my gaze and then get my data. In my field, and with my study, this really seemed like the only way to progress. All I had starting out was my own anecdotal issues, problems and questions I wanted answers to, and I needed to try and understand not just what the rest of my field had already done to try and find answers, but what I could do to find my own answers. I needed to have a sense of what kinds of research were possible and what these might entail. I had no idea what data to generate or what to do with it, and could not have started there with my PhD. So I moved from reading the field, to reading the theory, to building an internal language of description, to generating data, to organising and analysing it using the theory to guide me, to reaching conclusions that spoke back to the theory and the field – a closed circle if you will. This seems, to me certainly, the most logical way to do a PhD.

But I have colleagues and friends who haven’t necessarily followed this path. In their line of work, they have had opportunities to amass small mountains of data: interview transcripts, documents, observation field notes, student essays, exam transcripts and so forth. They have gathered and collected all of these data, and have then tried to find a PhD in the midst of all of it. They are, in other words, trying to ‘retrofit’ a PhD by looking to the data to suggest a question or questions and, through these, a path towards a theoretical framework and methodology.

Many people start their doctoral study in my field – education studies – to find answers to very practical or practice-based questions. Like: ‘What kinds of teaching practice would better enable students to learn cumulatively?’ (a version of my own research question) Or: ‘What kinds of feedback practices better enable students to grow as writers in the Sciences?’ And so on. If you are working as a lecturer, facilitator, tutor, writing-respondent, staff advisor or similar, you may have many opportunities to generate or gather data: workshop inputs, feedback questionnaires, your own field notes and reports, student essays and exam submissions, and so on. After a while, you may look at this mountain of data and wonder: ‘Could there be a thesis in all of this? Maybe I need to start thinking about making some order and sense out of all of this’. You may then register for a PhD, searching for and finding a research question in your data, and then begin the process of retrofitting your PhD with substantive theory and a theoretical framework to help you work back again towards the data, so as to tell its story in a coherent way that adds something to your field’s understanding or knowledge of the issues you are concerned with.

The question that emerged in these workshops was: ‘Can you create a theoretical framework if you have worked so far like this, and if so, how?’ I think the answer must be ‘yes’, but the how is the challenging thing. How do you ask your data the right kinds of questions? A good starting point might be to map out your data in some kind of order. Create mind-maps or visual pictures of what data you have and what interests you in that data. Do a basic thematic analysis – what keeps coming up or emerging for you that is a ‘conceptual itch’ or something you really feel you want or need to answer or explore further? Follow this ‘itch’ – can you formulate a question that could be honed into a research question? Once you have a basic research question, you can then move towards reading: what research is being or has been done on this one issue that you have pulled from your data? What methodologies and what theory are the authors doing this research using? What tools have they found helpful? Then, much as you would in a more ‘traditional’ way, you can begin to move from more substantive research and theory towards an ontological or more meta-theoretical level that will enable you to build a holding structure and fit lenses to your theory glasses, such that you have a way of looking at your data and questions that will enable you to see possible answers.
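
If your data is already digitised, even a very rough, mechanical first pass can help you see what keeps coming up before you commit to a question. The sketch below is not from the original post and is purely illustrative: the excerpts, theme labels and keywords are invented, and a real thematic analysis depends on careful reading and interpretation rather than keyword matching. It simply shows one way you might tally how often provisional themes surface across a pile of text.

```python
from collections import Counter

# Hypothetical excerpts pulled from interview transcripts or field notes.
excerpts = [
    "Students struggle to connect feedback to the next assignment.",
    "Feedback arrives too late to shape the next draft.",
    "The curriculum leaves little room for revising work after feedback.",
]

# Provisional themes and the keywords tentatively attached to each one.
themes = {
    "feedback timing": ["too late", "arrives"],
    "feedback use": ["connect feedback", "revising"],
    "curriculum constraints": ["curriculum", "room"],
}

# Tally how often each provisional theme surfaces across the excerpts.
counts = Counter()
for text in excerpts:
    lowered = text.lower()
    for theme, keywords in themes.items():
        if any(keyword in lowered for keyword in keywords):
            counts[theme] += 1

for theme, n in counts.most_common():
    print(f"{theme}: {n} excerpt(s)")
```

A crude count like this is only a prompt for your own reading: the ‘conceptual itch’ still has to come from you, not from the tally.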

Then you can go back to your data with a fresh pair of eyes, using your theory glasses, and re-look at your data, finding perhaps things you expect to see, but also hopefully being surprised and seeing new things that you missed or overlooked before you had the additional dimension or gaze offered by your theoretical or conceptual framing. But working in this ‘retrofitted’ way is potentially tricky: if you have been looking and looking at this data without a firm(ish) theoretically informed or shaped gaze, can you be surprised by it? Can you approach your research with the curious, tentative ‘I don’t know the answers, but let’s explore this issue to find out’ kind of attitude that a PhD requires? I think, if you do decide to do, or are doing, a PhD in what I would regard as a middle-to-front sort of way, with data at the middle, then you need to be aware of your own already-established ideas of what is or isn’t ‘real’ or ‘true’, and your own biases informed by your own experience and immersion in your field and your data. You may need to work harder at pulling yourself back, so that you can look at your data afresh, and consider things you may have been blind to, or overlooked, before; so that you can create a useful and illuminating conversation between your data and your theory that contributes something to your field.

Retrofitting a PhD is not impossible – there is usually more than one path to take in reaching a goal (especially if you are a social scientist!) – but I would posit that this way has challenges that need to be carefully considered, not least in terms of the extra time the PhD may take, and the additional need to create critical distance from data and ‘findings’ you may already be very attached to.

Iterativity in data analysis: part 1

This post is a 2-parter and follows on from last week’s post about generating data.

The one thing I did not know, at all, during my PhD was that qualitative data analysis is a lot more complex, messy and difficult than it looks. I had never done a study of this magnitude or duration before, so I had never worked with this much data before. I had written papers, and done some analysis of much smaller and less messy data sets, so I was not a complete novice, but I must say I was quite taken aback by the mountain of data I found I had once the data generation was complete. What to do now? Where to start? Help!

The first thing I did, on my supervisor’s advice, was get a licence for NVivo 10 and upload all my documents, interview and video recordings and field notes into its clever little software brain, so that I could organise the data into folders, and so that I could start reading and coding it. This was invaluable. Software that enables you to store, organise and code your data is a must, I think, for a study as large and long as a PhD. This is not an advert for NVivo, so I won’t get into all its features, and I am sure that other free and paid-for qualitative data analysis packages, like ATLAS.ti or the Coding Analysis Toolkit from UMass, would do the job just as well. However, I will say that being able to keep everything in one place, and being able to put similar chunks of text into different folders without mixing koki colours or scribbling all over paper to the point of confusion, was really useful. I felt organised, and that made a big difference to my mental ability to cope with the data analysis and sense-making process.
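
For readers who like to see the idea concretely: at its simplest, what this kind of software does is let you file the same chunk of text under one or more codes and then retrieve everything filed under a code later. The toy sketch below is my own illustration, not an NVivo feature or anything from the original study; the file names, codes and excerpts are invented for the example.

```python
from collections import defaultdict

# A minimal stand-in for filing chunks of text under codes ("folders"/nodes).
# Everything here is illustrative: the sources, code names and excerpts are
# invented, not taken from any real study.
coded_data = defaultdict(list)

def code_chunk(code, source, excerpt):
    """File an excerpt under a code, remembering which document it came from."""
    coded_data[code].append({"source": source, "excerpt": excerpt})

code_chunk("supervision", "interview_01.txt",
           "My supervisor reads drafts quickly but rarely comments on structure.")
code_chunk("supervision", "interview_02.txt",
           "We meet once a month, which is not enough at the writing-up stage.")
code_chunk("writing support", "interview_02.txt",
           "The writing centre helped me see my argument more clearly.")

# Review everything filed under one code, with its source, in one place.
for item in coded_data["supervision"]:
    print(item["source"], "->", item["excerpt"])
```

The point is not the tool itself, but being able to gather everything coded the same way in one place, which is exactly what made the software so useful to me.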

The second thing I did was keep very detailed notes in my research journal on my process as it unfolded. This was essential, as I needed to narrate my analysis process to my readers in as much detail as possible in my methodology chapter. If a researcher cannot tell you how they ended up with the insights and conclusions they did, it is much harder to trust their research or believe what they are asking you to believe. I wanted to be believable and convincing – I think all researchers do. Bernstein (2000) wrote about needing two ‘languages of description’ (LoD) in research: the internal (InLoD), which is essentially where you create a theoretical framework for your study that coheres and explains how you are going to understand your problem in a more abstract way; and the external (ExLoD), where you analyse and explain the data using that framework, outlining clearly the process of bringing theory to data and discovering answers to your questions. The stronger and clearer the InLoD and ExLoD, the greater chance other researchers then have of using, adapting, learning from your study, and building on it in their own work. When too much of your process of organising, coding, recoding, reading, analysing and connecting the data is hidden from the reader, or tacit in your writing about it, there is a real risk that your research can become isolated. By this I mean that no one will be able to replicate your study, or adapt your tools or framework to their own study while referencing yours, and therefore your research cannot readily be built on or incorporated into a greater understanding of the problems you are interested in solving (and the possible solutions).

This was the first reason for keeping detailed notes. The second was to trace what I was doing, and what worked and what did not so that I could learn from mistakes and refine my process for future research projects. As I had never worked with a data set this large or varied before, I really didn’t know what to do, and the couple of qualitative research ‘textbooks’ I looked at were quite mechanical or overly instrumental in their approach, which didn’t make complete sense to me. I wanted a more ‘ground-up’ process, which I felt would increase the validity and reliability of my eventual claims. I also wanted to be surprised by my data, as much as I wanted to find what I thought I was looking for. The theory I was using further required that I not just ‘apply’ theory to data (which really can limit your analysis and even lead to erroneous conclusions), but rather engage in an open, multiple and iterative reading of the data in successive stages. Detailed notes were key in keeping track of what I was doing, what confused me, what made sense and so on. Doing this consciously has made me feel more confident in taking on similarly sized research projects in future, and I feel I can keep building and learning from this foundation.

This post is a more conceptual musing about the nature of qualitative data analysis and lays the groundwork for next week’s post, where I’ll get into some of the ‘tools’ or approaches I took in actually doing my analysis. Stay tuned… 🙂