Holly Moir’s Weblog

Digitization: Comparing E-Texts of Anna Karenin, Google Books versus Project Gutenberg, Results: Google Very Easy to Use with Fewer Options NOT FULL TEXT, versus Gutenberg Clear Format with More Options for Translations, Audio Books: FULL TEXT always preferable for me!

Posted by: hmoir on: October 27, 2008

I have been ill with vertigo Sunday and Monday, so apologies if my blog seems odd today. I found this week’s assignment on digital books rather intriguing. It hadn’t occurred to me that not all digital books are created equal. As a bibliophile, I find that hard-copy books are generally all the same to me: as long as the text is legible and hasn’t been scratched out, I am happy with the book. But now I realize that although the digital format has many advantages, it also has disadvantages, including the fact that many digital books might not be a perfect copy. At present, it appears that the advantages of digital books either outweigh the disadvantages or roughly balance them. And of course, as a grad student (and perhaps like anyone in this economy), I much appreciate the price advantage of a free digital book over a hard-copy book. So I went into this assignment expecting digital books to be far from perfect, considering that they are free and that many of the sites used to access them appear to be in the Beta phase.

I first visited Google Books because the professor had recommended going there first. I searched for my favorite work of fiction, Anna Karenin by Leo Tolstoy, and quickly received a page of 12 results: 6 were copies of the text in English, and 6 were related texts, such as books of commentary on the novel. I liked the layout of the results page, which was clear, easy to use, and deliberately modeled after the results page of the Google search engine, which most people already know how to use, so this is a plus. I really appreciated the images of the front covers next to each search result. I think most people would appreciate this feature, and I did especially, as I was searching for the version recommended by my professor when I was an undergraduate. I remember him saying that he didn’t like the other versions as much, because they didn’t balance the demands of clarity and precision of expression with those of the beauty of the language; they were too literal in the translation and lost the spirit of the text in an attempt to mimic the letter of the text. I quickly found the edition I wanted, the one translated by Rosemary Edmonds. I selected that search result and was pleased to be able to read the first 12 pages in a Preview; I think this is a very useful and helpful option.

Next I visited the Internet Archive, where I had trouble finding the text. I had never used the Internet Archive books section, and I found it difficult to use. The fact that I was having trouble finding such a widely read text also made me skeptical of the quality of the site; this book should be easy to find! After 10-15 frustrating minutes, I decided to move on to a site I remembered from my undergrad years, Project Gutenberg, hosted by ibiblio.org. Thankfully, this site was very easy to use, and I immediately found the text, with the option to read a translation into French, Dutch, or English, which was not an option with Google Books. I selected English, yet I found that the first version of the text was not the edition I was hoping for: it was the edition translated by Constance Garnett. I did appreciate the mention of an editor behind the digital version, as it said: e-text prepared by David Brannan. I think the inclusion of a name might make the reader more comfortable, as then someone specific is accountable for the e-book; it certainly made me more comfortable. I was then able to access a preview of the text, as in Google Books. I went back to the original search page to try to find an English copy of my favorite version of the book, the one translated by Rosemary Edmonds, but it turns out that the only English version offered by Project Gutenberg is the one translated by Constance Garnett, so I would have to settle for that.

I then returned to Google Books to search for the Constance Garnett translation so I could compare the exact same version across the two sites. When I clicked on the search result for that version, I had to log into my Google account, which could be a deterrent for many people; I don’t see why one has to log in for this version but not for the Rosemary Edmonds version. But then I found the Constance Garnett version, which had been reissued as a Barnes and Noble Classics edition. I had a bit of trouble with both sites when I wanted to progress from simply previewing the books to downloading them; I don’t know if this is a design flaw or if I am just tired. I found it bizarre that I had to give Google Books a “nickname” before I could download the full book, but most young people will probably not find this odd.

Once I finally found the same version on Google Books as on Gutenberg and downloaded the texts, things went well. The download from Google seemed quicker, perhaps because the Google book isn’t full text, but all in all the two experiences were about balanced. Counting the Introduction and the supplementary material, including the much-appreciated scans of the front and back covers, the Google Books version weighs in at 800 pages, as it consists of images of the Barnes and Noble Classics edition. The Gutenberg e-book consists only of text, without the explanatory material offered by Google Books, such as the Chronology, the new updated Introduction, and the Endnotes, but then again the Gutenberg e-book is full text while the Google book is not.

I read the first and last paragraphs of each chapter in each version of the book, and I could not find any typos or any instances in which words had been transposed or misplaced. I have to say I was very surprised at this high degree of accuracy in both texts, and I am not sure if this is always the case. My impression is that since this is such an old text, one that has been studied since it was first published in book format in 1878, scholars may have typed up all or part of the book as part of their work even before the advent of e-books. Electronic versions of the book would then have been around for a while and have had time to go through multiple revisions, improving each time. I think I remember reading a version of the book on Project Gutenberg when I was in high school, which would have been 2001-2002, and I think that was before the masses knew about e-books, although I could be incorrect. So I assume, though it is just a feeling, that Anna Karenin has been around as an e-text longer than other books, and this longevity could have contributed to such an unblemished e-text. From my browsing through Google Books and Project Gutenberg, the newer texts tend to have more errors, although generally not huge ones, and this seems especially true of the books that have been released this year, 2008.
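
For anyone who would rather not do this spot-check by eye, here is a minimal Python sketch of the same idea: take the first and last paragraph of each chapter in two copies of a book and flag any pair that is not identical. It rests on several assumptions that are not part of the comparison above: both copies are available as plain text (the Google Books copy is made of page images, so it would first need to be converted to text somehow), chapter headings sit on their own line in the form "Chapter 1", and paragraphs are separated by blank lines. The function names are made up for illustration.

```python
import difflib
import re

# Assumption: chapter headings look like "Chapter 1" or "CHAPTER XII" on their own line.
CHAPTER_HEADING = re.compile(r"^[ \t]*chapter[ \t]+\w+\.?[ \t]*$",
                             re.IGNORECASE | re.MULTILINE)


def split_chapters(text):
    """Split a plain-text book into chapter bodies, dropping front matter."""
    parts = CHAPTER_HEADING.split(text)
    return [part.strip() for part in parts[1:] if part.strip()]


def first_and_last_paragraphs(chapter):
    """Return the first and last paragraph of a chapter, with line wraps removed."""
    paragraphs = [" ".join(p.split())
                  for p in re.split(r"\n\s*\n", chapter) if p.strip()]
    return paragraphs[0], paragraphs[-1]


def compare_editions(text_a, text_b):
    """List (chapter number, 'first' or 'last', similarity) for paragraphs that differ."""
    mismatches = []
    chapters = zip(split_chapters(text_a), split_chapters(text_b))
    for number, (chapter_a, chapter_b) in enumerate(chapters, start=1):
        pairs = zip(("first", "last"),
                    first_and_last_paragraphs(chapter_a),
                    first_and_last_paragraphs(chapter_b))
        for position, para_a, para_b in pairs:
            ratio = difflib.SequenceMatcher(None, para_a, para_b).ratio()
            if ratio < 1.0:  # anything short of an exact match gets reported
                mismatches.append((number, position, round(ratio, 3)))
    return mismatches
```

Calling `compare_editions(gutenberg_text, other_text)` on two strings returns an empty list when every sampled paragraph matches. Chapters are paired in order, so if one copy is missing a whole chapter the later pairs will be misaligned; like the manual check, this only samples the edges of each chapter and would not catch pages missing from the middle.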

 

So in sum, the first and last paragraphs of each chapter all matched up and did not suffer from typographical or other transcription errors. (They were mostly all there; even though Google Books is not a Full Text account, the missing pages appear to come generally from within the chapters.) Overall, I was very pleased with the experience of finding and downloading e-books on both sites (well, at least e-books of the novel Anna Karenin), and very pleasantly surprised by how well the process went. However, at first I was a bit puzzled that the books end slightly differently, which should not be the case, as they both claim to be translated by the same person, Constance Garnett. Then I realized the reason: Gutenberg is a Full Text account whereas Google Books is not. In Google Books the last page of the text proper is 754, while pages 752-753 are missing from the text. This is insanely frustrating, and it really ruins the text as a work of art and as a useful e-book; what is the point of an e-book if sections of the text are missing??? So I started out really happy with both sites, and I ended up very unhappy with Google Books, as many of the alleged e-books are NOT FULL TEXT and this is NO GOOD. Honestly, why bother to run the Google Books website if the books aren’t all full text? I assume Google Books appeals to a non-scholarly audience; perhaps these people want to read a few pages to decide whether to buy a copy of the book? I don’t really know; I don’t know why anyone would want a text with holes in it. Certainly for academia, there is no contest: Project Gutenberg is vastly preferable, and Google Books is not useful at all for me!

I want to include the final section of the text of Anna Karenin, as I adore this book. This is the final scene between Kitty and her husband Levin, in which Levin finally realizes he does love his new son. Yet the birth of his first child has not changed him completely into a mild-mannered man; rather, he realizes he will always have a feisty personality, although his soul does feel changed…

Final Section of Text of Anna Karenin from Project Gutenberg:

“What is it? you’re not worried about anything?” she said, looking intently at his face in the starlight.

But she could not have seen his face if a flash of lightning had not hidden the stars and revealed it. In that flash she saw his face distinctly, and seeing him calm and happy, she smiled at him.

“She understands,” he thought; “she knows what I’m thinking about. Shall I tell her or not? Yes, I’ll tell her.” But at the moment he was about to speak, she began speaking.

“Kostya! do something for me,” she said; “go into the corner room and see if they’ve made it all right for Sergey Ivanovitch. I can’t very well. See if they’ve put the new wash stand in it.”

“Very well, I’ll go directly,” said Levin, standing up and kissing her.

“No, I’d better not speak of it,” he thought, when she had gone in before him. “It is a secret for me alone, of vital importance for me, and not to be put into words.

“This new feeling has not changed me, has not made me happy and enlightened all of a sudden, as I had dreamed, just like the feeling for my child. There was no surprise in this either. Faith–or not faith–I don’t know what it is–but this feeling has come just as imperceptibly through suffering, and has taken firm root in my soul.

“I shall go on in the same way, losing my temper with Ivan the coachman, falling into angry discussions, expressing my opinions tactlessly; there will be still the same wall between the holy of holies of my soul and other people, even my wife; I shall still go on scolding her for my own terror, and being remorseful for it; I shall still be as unable to understand with my reason why I pray, and I shall still go on praying; but my life now, my whole life apart from anything that can happen to me, every minute of it is no more meaningless, as it was before, but it has the positive meaning of goodness, which I have the power to put into it.”

Transcendent. This is widely considered one of the greatest novels ever written.

Format: I found the format of Google Books to be slightly nicer than that of Project Gutenberg. Google Books consists of images of scanned pages from the book; I assume these were made on a flatbed scanner. I think they are PDFs, which means I was not able to copy, cut, and paste text into my blog. This was frustrating, although perhaps necessary, as the Google copy is a scan of an edition produced by Barnes and Noble Classics and reissued in 2003 with supplementary material, so that specific edition of the book is under copyright to Barnes and Noble.

The Project Gutenberg version, by contrast, is a TXT file, not a PDF, so I was able to copy, cut, and paste text from it, which was very nice and also very useful for all users, from amateurs to academics. I think it is very important to be able to take perfect quotations from a text without having to re-type them manually into a term paper or the like; why else would one use an e-book, if not to search and to copy, cut, and paste from the text easily? The Gutenberg format is a bit stripped-down, but it serves the purpose well. It consists of full-screen pages of typed text in a very no-nonsense font, which reminds me a bit of the font used when typing commands into the ancient DOS operating system. But the font is sans-serif and easy to read, and the project has inserted gaps between paragraphs and between lines of dialogue, which also helps. So does the fact that the text does not extend fully from one side of the screen to the other, but rather has a nice right-hand margin. This makes reading on screen much less arduous, and it is also very useful if one prints out sections of the text, as one can then add lengthy annotations and notes in the margin.

In sum, not all digitizations are equal, and not all formats are equal. Although the PDF format of Google Books is more aesthetically pleasing, Gutenberg is overall a much better experience, because one can manipulate the document and because it consists only of Full-Text editions, and this Full-Text digitization is vastly preferable to a partial digitization like the one found on Google Books.

Holly

1 Response

That was very interesting to read – as a long-term Gutenberg user, I used Google Books for the first time today, and it was interesting to read your thoughts.

As for Anna Karenina being more accurate because it’s an older book: you might want to check out pgdp.net (or the post I wrote about it at http://booktrash.wordpress.com/2008/12/06/proofreading/). This site is the source of many PG books, and all the books that go through it are checked many, many times, as you’ll see from my explanation. I think how correct an ebook is does depend on its source (and such a well-known book has probably been made into an ebook several times, so PG could most likely include the best version).
