How to compile a book index – my trials and tribulations

Just some of many scribbled drafts

I’ve just finished compiling the index for Technobiophilia: nature and cyberspace. It’s the first time I’ve done this because it’s the first of my books ever to need an index, and since it’s been an interesting journey I thought I’d share something of it here for others thinking about doing the same. Before I started, I took lots of advice and read lots of websites about indexing, and I can confirm that all of it was useful but it wasn’t until I actually immersed myself in the job that I understood why.

Pay for it or do it yourself?

My first task was to crowdsource this question on Facebook. It was pretty confusing because the morning began with a rush of people saying ‘pay for it, it’s a nightmare!’. I’d almost decided to do that when the second rush began, and this time everyone said ‘do it yourself, it’s a pleasure!’. So, ever the optimist, I went with the second lot, and I don’t regret it. Here’s what I learned:

Tell a story with your index

Although I’d been told this, I didn’t understand until I was nearly at the end of the task that an index is a great vehicle for telling the story of what’s in the book. I often check the index of a book to see whether it covers the kind of topics I want to read about, so I tried to imagine prospective readers doing the same with mine. (Incidentally, these days when you download a Kindle sample you get to see the Contents page but not the Index, and I think that’s a shame since it tells you a whole lot more.)

I’d been told that the index should contain concepts, and it was in the construction of concept headers and sub-headers that I began to grasp the essence of my own research. It also helped me clarify for myself how I wanted the reader to understand it. There are of course a million ways to configure words and concepts, so the decisions made during this process are closely related to the synergies you want people to understand. For example, my book is about nature and cyberspace. The advice is generally to exclude title words, or very closely related words such as computer or biophilia, but I couldn’t really do that since so much of it is about the connections between them. So I decided I would go against the advice and use the word cyberspace as a header, even though the term occurs numerous times in the book. The hard work then lay in figuring out the important concepts I wanted to flag up. This is what I ended up with (I don’t know if I’ve set it out correctly – I’ll wait for the editor’s verdict on that):

and biophilic design, 138
and California, 86
and consciousness, 60, 65, 110, 118, 171
and nature, 17, 40, 42, 61, 84, 138-139, 182, 201
and the body, 158, 161
and water metaphors, 151
as a frontier, 97
as a natural phenomenon, 44
as a sensory experience, 113
as a wilderness, 95, 112, 122
as an ecosystem, 21, 27, 109-110, 126-128, 172
as habitat, 68, 87, 118, 128-129, 138, 169
as home, 129
concept of, 22-23
Declaration of the Independence of, 97
geography of, 64, 87, 112, 120, 123, 201
lifeworld of, 61-62, 168
mapping, 19, 31, 97, 110-111, 120, 125-126
origin of the term, 22

I’m hoping this approach tells some kind of a story about the book’s preoccupations without being overly detailed. But I have to admit I came to this part of the process last, because before that I went up some very dark blind alleys.
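For anyone who keeps their draft index in a script or spreadsheet rather than on paper, the layout above can be sketched in a few lines of Python. This is only an illustration: the function names `collapse_pages` and `render_entry` are made up for this example, the indentation is an arbitrary choice, and the sample uses three of the subheaders and page numbers from the list above. A publisher's house style may well differ.

```python
def collapse_pages(pages):
    """Turn a list of page numbers into index-style text.

    Consecutive pages are merged into ranges, so
    [126, 127, 128, 172] becomes '126-128, 172'.
    """
    pages = sorted(set(pages))
    parts = []
    start = prev = None
    for page in pages:
        if start is None:
            start = prev = page
        elif page == prev + 1:
            prev = page  # still inside a consecutive run
        else:
            parts.append(str(start) if start == prev else f"{start}-{prev}")
            start = prev = page
    if start is not None:
        parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ", ".join(parts)


def render_entry(header, subentries):
    """Render one header with its subheaders in alphabetical order."""
    lines = [header]
    for sub in sorted(subentries, key=str.lower):
        lines.append(f"    {sub}, {collapse_pages(subentries[sub])}")
    return "\n".join(lines)


# Three subheaders taken from the cyberspace entry above.
cyberspace = {
    "as an ecosystem": [21, 27, 109, 110, 126, 127, 128, 172],
    "as home": [129],
    "concept of": [22, 23],
}
rendered = render_entry("cyberspace", cyberspace)
print(rendered)
```

Keeping the raw page numbers as a plain list and only collapsing them into ranges at the very end makes it easier to add or remove a page later without retyping the whole line.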

Indexing every single word

On the advice of a friend I purchased PDF Index Generator for the PC ($49.95). (He’s a Mac user so his experience may have been different from mine.) I found it very efficient in some regards and extremely frustrating in others. For example, it would be useful to be able to add flags and comments on items, and to be able to count up selected groups of words instead of all of them. It would also be handy to be able to view and group header families together. And so on. But in terms of collecting all the words in your book and all the pages they refer to, it’s brilliant. Soon I had about 4,000 words. (The book itself totals around 90,000 words but PDF Index Generator aggregates matched words and removes common ones such as ‘it’, ‘and’ etc., hence the 4,000.) Quite a few of the items were proper names divided into two because the software cannot recognise them together so, for example, for Californian author and activist Stewart Brand you end up with one entry for Stewart and another for Brand, and they’re at opposite ends of a very long list. This is where I made my first mistake – I decided to begin by matching up all the names. And there are a lot of them. By the time I got to about the letter C, and it had taken me a whole day to get that far, I realised this had been a mistake.
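To make the word-gathering step concrete, here is a minimal Python sketch of the general idea: collect every word, drop the common ones, and record the pages each word appears on. It is not how PDF Index Generator works internally, which isn't documented here. The stop-word list, the function names and the two sample pages are all invented for illustration. The second function shows one way around the split-name problem: searching for the whole phrase instead of single words.

```python
import re
from collections import defaultdict

# Hypothetical, very short stop-word list; a real tool would use a longer one.
STOP_WORDS = {"it", "and", "the", "a", "of", "is", "in", "to"}


def build_word_index(pages):
    """Map each remaining word to the sorted pages it appears on.

    `pages` is a dict of {page_number: page_text}.
    """
    index = defaultdict(set)
    for page_number, text in pages.items():
        for word in re.findall(r"[A-Za-z']+", text.lower()):
            if word not in STOP_WORDS:
                index[word].add(page_number)
    return {word: sorted(numbers) for word, numbers in index.items()}


def pages_for_phrase(pages, phrase):
    """Find the pages containing a whole phrase, such as a full name."""
    needle = phrase.lower()
    return sorted(n for n, text in pages.items() if needle in text.lower())


# Invented sample text, standing in for two pages of a book.
sample_pages = {
    86: "Stewart Brand and the California counterculture",
    97: "Brand is in the story of the frontier",
}
word_index = build_word_index(sample_pages)
full_name_pages = pages_for_phrase(sample_pages, "Stewart Brand")
```

In this sample, "stewart" and "brand" end up as two separate entries with different page lists, which is exactly the problem described above, while the phrase search only returns the page where the full name appears.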

At this point I also contacted my editor to ask how many entries I could have, and discovered it would be around 360. Even if I matched up all the names manually, it wouldn’t reduce the list by very much. By this time I’d already printed out several lists of all the words in the belief that staring at them on paper might provide insights not achieved by staring at them on the screen. It didn’t, or at least not much.

So I set the names aside and began working on the concepts. It was logical to try to decide on headers then file existing words beneath them. But the index was getting longer and longer and I was getting more and more confused. What exactly were the concepts I was trying to impart? I seemed to have lost my grip on what the book was about. Finally I realised that I couldn’t just move around the words that PDF Index Generator had spat out for me. Instead, I needed to reverse-engineer the book to think through the ideas. It was interesting, also, to think about which terms a reader might expect to see and how to demonstrate that they are in there but perhaps with an unusual twist. For example, what I want to say about the internet covers very specific areas, some of which cross over with the cyberspace entry and some of which don’t, and some of which may be a little provocative:

as a self-organised system, 147
as an evolving biological organism, 146-147
as a new kind of space, 44, 110
being logged onto the, 113
birth of the, 87
body metaphors, 161
as different from the web, 21
foundational story of the, 27-28, 96, 122
If the internet were a landscape…, 18, 68, 89, 108, 151, 166
navigation of the, 119-124
wireless, 69, 110, 117-118, 152, 187, 189, 199

Of course, by the time I’d finished this massive conceptual exercise, I realised there was no room for most of the proper names I’d painstakingly started to match up. Maybe I’d have to have an index without any people in it at all? I grabbed some books from my shelves and checked – many of them indeed had very few names, but some had lots. And if they do have to be rationed, who are the most important people to include? I took nearly all of them out.

The next day, after I’d pruned down the concepts and important terms really tightly, enough room had magically appeared to put quite a few of the names back. Phew. It had felt wrong not to have them there.

So now it’s pretty well done, and sent off to my editor for checking. Five minutes after emailing it, I remembered someone I’d forgotten. Half an hour later, I remembered someone else. Writing this post, I discovered a mistake in the index. Oh well. I’ll wait for it to come back to me before I make the changes. Yet another correction is bound to pop into my head when I wake up tomorrow morning….

So do I have any tips?

My tips are much the same as all of the tips I read and forgot about before I started:

  • concept headers and subheaders are important. Some say you should start that list while you’re still writing the book. I wish I had.
  • it’s not really about the words. The list of 4,000 words really sent me down a rabbit-hole. Pruning a list of 4,000+ down to 300+ is not fun and it’s not efficient. However…
  • software that generates page numbers from key words is incredibly useful. I don’t regret buying PDF Index Generator but I wish I’d used it differently from the start.
  • begin with the concepts and work from there to the key words, not the other way around
  • allow a sustained block of time to attack the job because your head will be in a very strange place, and get plenty of sleep in between bouts so that your brain can generate new and exciting synergies you didn’t think of yesterday.
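The "concepts first, key words second" tip can be sketched in Python as follows. This is a rough illustration under stated assumptions, not a description of any particular tool: the function names are invented, the key words chosen for each subheader are hypothetical, and the small word list stands in for the kind of word-to-pages list that indexing software produces. The default limit of 360 is the figure my editor gave me; yours will differ.

```python
def pages_for_keywords(keywords, word_index):
    """Collect every page on which any of the key words appears."""
    pages = set()
    for keyword in keywords:
        pages.update(word_index.get(keyword.lower(), []))
    return sorted(pages)


def build_concept_index(concepts, word_index, max_entries=360):
    """Start from concept headers and look up pages via their key words.

    `concepts` maps header -> {subheader: [key words]}.
    Subheaders with no matching pages are left out.
    """
    entries = {}
    for header, subheaders in concepts.items():
        for subheader, keywords in subheaders.items():
            pages = pages_for_keywords(keywords, word_index)
            if pages:
                entries[(header, subheader)] = pages
    if len(entries) > max_entries:
        raise ValueError(
            f"{len(entries)} entries, but the limit is {max_entries}"
        )
    return entries


# Invented stand-in for a software-generated word list.
sample_word_index = {
    "frontier": [97],
    "wilderness": [95, 112, 122],
    "wild": [112],
}
# Hypothetical key words for each subheader.
sample_concepts = {
    "cyberspace": {
        "as a frontier": ["frontier"],
        "as a wilderness": ["wilderness", "wild"],
        "as a garden": ["garden"],  # no matches, so it is dropped
    }
}
concept_index = build_concept_index(sample_concepts, sample_word_index)
```

The page lists this produces are only candidates. A key word appearing on a page doesn't mean the concept is really discussed there, so each one still needs checking against the book by a human being.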

I’m sure there are many more but after my mammoth week of indexing my brain is fried. Please feel free to add your own thoughts below!

2 thoughts on “How to compile a book index – my trials and tribulations”

  1. oh no!

    ‘While working on the index repagination we have noticed that the page numbers inserted in the index are the page numbers of the PDF and not of the book.’

    Guess what I’ll be doing this weekend :(

