‘Retrofitting’ your PhD: when you get your data before your theory

I recently gave workshops on building a theoretical framework for a PhD to two different groups of students at the same university. The two groups comprised scholars at very different points in their PhDs: some just starting to think about theory, some sitting with data and trying to get the theory to talk to the data, and others trying to rethink the theory after having analysed their data. One interesting question emerged: what if you have your data before you really have a theoretical framework in place? How do you build a theoretical framework in that case?

I started my PhD with theory, and spent a year working out what my ‘gaze’ was. I believed, and was told, that this was the best way to go about it: to get my gaze and then get my data. In my field, and with my study, this really seemed like the only way to progress. All I had starting out was my own anecdotal issues, problems and questions I wanted answers to, and I needed to try and understand not just what the rest of my field had already done to try and find answers, but what I could do to find my own answers. I needed to have a sense of what kinds of research were possible and what these might entail. I had no idea what data to generate or what to do with it, and could not have started there with my PhD. So I moved from reading the field, to reading the theory, to building an internal language of description, to generating data, to organising and analysing it using the theory to guide me, to reaching conclusions that spoke back to the theory and the field – a closed circle if you will. This seems, to me certainly, the most logical way to do a PhD.

But I have colleagues and friends who haven’t necessarily followed this path. In their line of work, they have had opportunities to amass small mountains of data: interview transcripts, documents, observation field notes, student essays, exam transcripts and so forth. They have gathered all of these data, and have then tried to find a PhD in the midst of it all. They are, in other words, trying to ‘retrofit’ a PhD: looking to the data to suggest a question or questions and, through these, a path towards a theoryology.

Many people start their doctoral study in my field – education studies – to find answers to very practical or practice-based questions, like: ‘What kinds of teaching practice would better enable students to learn cumulatively?’ (a version of my own research question), or: ‘What kinds of feedback practices better enable students to grow as writers in the Sciences?’ And so on. If you are working as a lecturer, facilitator, tutor, writing-respondent, staff advisor or similar, you may have many opportunities to generate or gather data: workshop inputs, feedback questionnaires, your own field notes and reports, student essays and exam submissions, and so on. After a while, you may look at this mountain of data and wonder: ‘Could there be a thesis in all of this? Maybe I need to start making some order and sense out of it all.’ You may then register for a PhD, search for and find a research question in your data, and begin the process of retrofitting your PhD with substantive theory and a theoryology that will help you work back towards the data, so as to tell its story in a coherent way that adds something to your field’s understanding or knowledge of the issues you are concerned with.

The question that emerged in these workshops was: ‘Can you create a theoretical framework if you have worked like this so far, and if so, how?’ I think the answer must be ‘yes’, but the how is the challenging thing. How do you ask your data the right kinds of questions? A good starting point might be to map out your data in some kind of order. Create mind-maps or visual pictures of what data you have and what interests you in that data. Do a basic thematic analysis – what keeps coming up or emerging for you that is a ‘conceptual itch’, or something you really feel you want or need to answer or explore further? Follow this ‘itch’ – can you formulate a question that could be honed into a research question? Once you have a basic research question, you can then move towards reading: what research is being or has been done on this one issue that you have pulled from your data? What methodologies and theories are the authors of that research using? What tools have they found helpful? Then, much as you would in a more ‘traditional’ way, you can begin to move from more substantive research and theory towards an ontological or more meta-theoretical level that will enable you to build a holding structure and fit lenses to your theory glasses, such that you have a way of looking at your data and questions that will enable you to see possible answers.
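
If it helps to make the ‘what keeps coming up’ step concrete, here is a minimal, purely illustrative sketch in Python – the data sources, theme labels and counts are all hypothetical, and nothing here replaces careful reading. It simply tallies which rough themes recur across your data sources, which is one blunt way of spotting a ‘conceptual itch’ worth honing into a question.

```python
from collections import Counter

# Hypothetical first-pass notes: each data source tagged with the rough
# themes that stood out on an initial read-through.
first_pass_themes = {
    "interview_01_transcript": ["feedback practices", "student confidence"],
    "field_notes_week_3": ["feedback practices", "time pressure"],
    "essay_batch_2019": ["student confidence", "feedback practices"],
}

# Tally how often each rough theme recurs across the sources; themes that
# keep coming up are candidates for the 'conceptual itch' to follow.
theme_counts = Counter(
    theme for themes in first_pass_themes.values() for theme in themes
)

for theme, count in theme_counts.most_common():
    print(f"{theme}: noted in {count} source(s)")
```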

Then you can go back to your data with a fresh pair of eyes, put your theory glasses on and re-look at it, finding perhaps things you expect to see, but also hopefully being surprised and seeing new things that you missed or overlooked before you had the additional dimension or gaze offered by your theoretical or conceptual framing. But working in this ‘retrofitted’ way is potentially tricky: if you have been looking and looking at this data without a firm(ish) theoretically informed or shaped gaze, can you still be surprised by it? Can you approach your research with the curious, tentative ‘I don’t know the answers, but let’s explore this issue to find out’ kind of attitude that a PhD requires? I think, if you do decide to do, or are doing, a PhD in this retrofitted way, with the data coming before the theory, then you need to be aware of your own already-established ideas of what is or isn’t ‘real’ or ‘true’, and of your own biases informed by your experience and immersion in your field and your data. You may need to work harder at pulling yourself back so that you can look at your data afresh, and consider things you may have been blind to, or overlooked, before; so that you can create a useful and illuminating conversation between your data and your theory that contributes something to your field.

Retrofitting a PhD is not impossible – there is usually more than one path to take in reaching a goal (especially if you are a social scientist!) – but I would posit that this way has challenges that need to be carefully considered, not least in terms of the extra time the PhD may take, and the additional need to create critical distance from data and ‘findings’ you may already be very attached to.

Iterativity in data analysis: part 2

This post follows on from last week’s post on the iterative process of doing qualitative data analysis. Last week I wrote a more general musing on the challenges inherent in doing qualitative analysis; this week’s post is focused more on the ‘tools’ or processes I used to think and work my way through my iterative process. I drew quite a lot on Rainbow Chen’s own PhD tools as well as others, and adapted these to suit my research aims and my study (reference at the end).

The first tool was a kind of ‘emergent’ or ‘ground up’ form of organisation, and it really helps you to get to know your data quite well. It’s really just a form of thematic organisation – before you begin to analyse anything, you have to sort, organise and ‘manage’ your mountain of data so that you can see the wood for the trees, as it were. I didn’t want to be overly prescriptive. I knew what I was looking for, broadly, as I had generated specific kinds of data and my methodology and theoryology were very clearly aligned. But I didn’t really know exactly what all my data was trying to tell me, and I really wanted it to tell its story rather than me telling it what it was supposed to be saying. I wanted, in other words, for my data to surprise me as well as to show me what I had broadly hoped to find in terms of my methodology and my theoretical framework. So the ‘tool’ I used organised the data ‘organically’, I suppose – creating very descriptive categories for what I was seeing and not trying to overthink this too much. As I read through my field notes, interview transcripts, video transcripts and documents, I created categories like ‘focusing on correct terminology’ and ‘teacher direction of classroom space’ and ‘focus on specific skills’. The theory is always informing the researcher’s gaze, as Chen notes in her paper (written with Karl Maton), but to rush too soon to theory can be a mistake and can narrow your findings. So my theory was there, underpinning my reading of the data, but I did not want to rush to organise my data into theoretical and analytical ‘codes’ just yet. There was a fair bit of repetition as I did this over a couple of weeks, reading through all my data at least twice for each of my two case studies. I put the same chunks of text into different categories (a big plus of using data software), and I made time to scribble in my research journal at the end of each day during this process, noting emerging patterns or interesting insights that I wanted to come back to in more depth in the analysis.

An example of my first tool in action

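For readers who find a toy example helpful, here is a minimal sketch in Python of the shape this first tool takes. It is purely illustrative – the excerpts are invented, and in my case this filing happened inside the QDA software, not in code – but it shows the key idea: descriptive categories built up from the ground, with the same chunk of text able to sit under more than one of them.

```python
from collections import defaultdict

# Emergent, descriptive coding: category names mapped to the chunks of data
# filed under them. The same chunk can sit in more than one category.
categories = defaultdict(list)

def file_chunk(chunk, category_names):
    """File one chunk of data (e.g. a transcript excerpt) under one or more
    descriptive categories."""
    for name in category_names:
        categories[name].append(chunk)

# Hypothetical excerpts standing in for field notes and transcripts
file_chunk("Lecturer pauses to correct students' use of a key term...",
           ["focusing on correct terminology"])
file_chunk("Lecturer rearranges groups and directs who works where...",
           ["teacher direction of classroom space"])
file_chunk("Class practises one procedure repeatedly before moving on...",
           ["focus on specific skills", "focusing on correct terminology"])

# A quick overview of what is accumulating where
for name, chunks in categories.items():
    print(f"{name}: {len(chunks)} chunk(s)")
```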

The second process was what a quantitative researcher might call ‘cleaning’ the data. There was, as I have noted, repetition in my emergent categories. I needed to sort that out, and also to begin to move closer to my theory by doing what I called ‘super-coding’ – beginning to code my data more clearly in terms of my analytical tools. There were two stages here: the first was to go carefully through all my categories and merge very similar ones, delete unnecessary categories left over after the merging, and make sure that there were no unnecessary or confusing repetitions. I felt like the data was indeed ‘cleaner’ after this first stage. The second stage was then to super-code by creating six overarching categories, named after the analytical tools I had developed from the theory. For example, using LCT gave me ‘Knowers’, ‘Knowledge’, ‘Gravity’ and ‘Density’. I was still not that close to the theory here, so I used looser terms than the theory asks researchers to use (for example, we always write ‘semantic gravity’ rather than just ‘gravity’). I then organised my ‘emergent’ categories under these headings, ending up with two levels of coded data, and coming a step closer to analysis using the theoretical and analytical tools I had developed to guide the study.
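
Again purely as an illustration – the four overarching names come from my study, but which emergent category sits under which heading, and the excerpt labels, are invented here – a minimal sketch of what two levels of coded data can look like once similar categories have been merged and grouped under the analytical ‘super-codes’:

```python
# Two levels of coded data after 'cleaning' and super-coding: overarching
# analytical categories (drawn from the theory) each hold the merged
# emergent categories beneath them. The groupings shown are invented,
# purely for illustration.
super_codes = {
    "Knowers": ["teacher direction of classroom space"],
    "Knowledge": ["focusing on correct terminology", "focus on specific skills"],
    "Gravity": ["linking concepts to everyday examples"],
    "Density": ["packing several ideas into one technical term"],
}

# Emergent categories mapped to (hypothetical) chunk labels filed under them
categories = {
    "teacher direction of classroom space": ["excerpt 03", "excerpt 19"],
    "focusing on correct terminology": ["excerpt 12", "excerpt 40"],
    "focus on specific skills": ["excerpt 07"],
}

def chunks_under(super_code):
    """Gather every chunk filed under any emergent category that now sits
    beneath the given overarching ('super') code."""
    return [chunk
            for emergent in super_codes.get(super_code, [])
            for chunk in categories.get(emergent, [])]

print(chunks_under("Knowledge"))  # ['excerpt 12', 'excerpt 40', 'excerpt 07']
```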

By this stage, you really do know your data quite well, and clearer themes, patterns and even answers to your questions begin to bubble up and show themselves. However, it was too much of a leap for me to go from this coding process straight into writing the chapter; I needed a bridge. So I went back to my research journal for the third ‘tool’, and started drawing webs, maps and plans for parts of my chapters. I planned to write chunks, and then connect these together later into a more coherent whole. This felt easier than sitting myself down to write Chapter Four or Chapter Five all in one go. I could just write the bit about the classroom environment, or the bit about the specialisation code, and that felt a lot less overwhelming. I spent a couple of days thinking through these maps, drawing and redrawing them until I felt I could begin to write with a clearer sense of where I was trying to end up. I did then start writing and working on the chapters, and found myself (to my surprise, actually) doing what looked like, felt like, and really was analysis. It was exciting, and so interesting – after being in the salt mines of data generation, and enduring what was often quite a tedious process of sitting in classrooms, making endless notes and transcribing everything, to see beautiful and relevant shapes, answers and insights emerging from the pile of salt was very gratifying. I really enjoyed this part of the PhD journey – it made me feel like a real researcher, and not a pretender to the title.

One of my ‘maps’

Another ‘map’ for chapter writing

A different ‘map’ for writing

A ‘map’ for writing

This part of the PhD is often where we can make a more noticeable contribution to the development, critique and generation of new knowledge in and of our fields of study. We can tell a different or new part of a story others are also busy telling, and join a scholarly conversation and community. It’s important to really connect your data, and your analysis of it, with the theoretical framework and the analytical tools that have emerged from that framework. If these are too disconnected, your dissertation can become a tale of two halves, and can risk not making a contribution to your field, becoming instead an isolated and less relevant piece of research. One way to be more conscious of making these connections clear to yourself and your readers is to think carefully about and develop a series of connected steps in your data analysis process that bring you from your data towards your theory in an iterative and rich, rather than linear and overly simplistic, way. Following, and trying to trust, a conscious process is tough, but it should take you forward towards your goal. Good luck!

Reference: Chen, T-S. and Maton, K. (2014) ‘LCT and Qualitative Research: Creating a language of description to study constructivist pedagogy’. Draft chapter (forthcoming).

Iterativity in data analysis: part 1

This post is a 2-parter and follows on from last week’s post about generating data.

The one thing I did not know, at all, during my PhD was that qualitative data analysis is a lot more complex, messy and difficult than it looks. I had never done a study of this magnitude or duration before, so I had never worked with this much data before. I had written papers, and done some analysis of much smaller and less messy data sets, so I was not a complete novice, but I must say I was quite taken aback by the mountain of data I found I had once the data generation was complete. What to do now? Where to start? Help!

The first thing I did, on my supervisor’s advice, was get a licence for NVivo 10 and upload all my documents, interview and video recordings and field notes into its clever little software brain, so that I could organise the data into folders, and so that I could start reading and coding it. This was invaluable. Software that enables you to store, organise and code your data is a must, I think, for a study as large and long as a PhD. This is not an advert for NVivo, so I won’t get into all its features, and I am sure that other free and paid-for qualitative data analysis packages, like Atlas.ti or the Coding Analysis Toolkit from UMass, would do the job just as well. However, I will say that being able to keep everything in one place, and being able to put similar chunks of text into different folders without mixing koki colours or scribbling all over paper to the point of confusion, was really useful. I felt organised, and that made a big difference to my mental ability to cope with the data analysis and sense-making process.

The second thing I did was keep very detailed notes in my research journal on my process as it unfolded. This was essential, as I needed to narrate my analysis process to my readers in as much detail as possible in my methodology chapter. If a researcher cannot tell you how they ended up with the insights and conclusions they did, it is much harder to trust their research or believe the claims they are asking you to accept. I wanted to be believable and convincing – I think all researchers do. Bernstein (2000) wrote about the need for two ‘languages of description’ (LoD) in research: the internal (InLoD), which is essentially where you create a theoretical framework for your study that coheres and explains how you are going to understand your problem in a more abstract way; and the external (ExLoD), where you analyse and explain the data using that framework, outlining clearly the process of bringing theory to data and discovering answers to your questions. The stronger and clearer the InLoD and ExLoD, the greater chance other researchers then have of using, adapting and learning from your study, and building on it in their own work. When too much of your process of organising, coding, recoding, reading, analysing and connecting the data is hidden from the reader, or left tacit in your writing about it, there is a real risk that your research can become isolated. By this I mean that no one will be able to replicate your study, or adapt your tools or framework to their own study while referencing yours, and therefore your research cannot readily be built on or incorporated into a greater understanding of the problems you are interested in solving (and the possible solutions).

This was the first reason for keeping detailed notes. The second was to trace what I was doing, and what worked and what did not so that I could learn from mistakes and refine my process for future research projects. As I had never worked with a data set this large or varied before, I really didn’t know what to do, and the couple of qualitative research ‘textbooks’ I looked at were quite mechanical or overly instrumental in their approach, which didn’t make complete sense to me. I wanted a more ‘ground-up’ process, which I felt would increase the validity and reliability of my eventual claims. I also wanted to be surprised by my data, as much as I wanted to find what I thought I was looking for. The theory I was using further required that I not just ‘apply’ theory to data (which really can limit your analysis and even lead to erroneous conclusions), but rather engage in an open, multiple and iterative reading of the data in successive stages. Detailed notes were key in keeping track of what I was doing, what confused me, what made sense and so on. Doing this consciously has made me feel more confident in taking on similarly sized research projects in future, and I feel I can keep building and learning from this foundation.

This post is a more conceptual musing about the nature of qualitative data analysis and lays the groundwork for next week’s post, where I’ll get into some of the ‘tools’ or approaches I took in actually doing my analysis. Stay tuned… 🙂