Wednesday, November 20, 2013

Some people talk about “orphaned” works… I prefer to think of them as “liberated”

I have many feelings about the three assigned articles this week, but I’m going to not talk about them. Instead, let’s tackle the equally abundant amount of thoughts I have. While reading Vaidhyanathan’s article (which the author has since expanded into a book) the first time through, I was puzzled as to their stance in regards to Google’s scanning project and fair use. The author seems to want the digitization done but not by Google. Even in the abstract, they claim that “Google’s Library Project threatens to unravel everything that is good stable about the copyright system” (2007) while truly, if the copyright system were “good and stable” it would not be threatened by the further distribution of information.

Google Books does not own the books it is digitizing and placing up on the web, but rather has partnered with libraries to scan the items in their collections which are in the public domain. Robert Frost’s A Boy’s Will is a staple of universities everywhere. In fact, WorldCat shows that every library in the WRLC owns a copy. Strangely enough, a search within our own WRLC catalog only shows that 2/10 libraries in the WRLC own the title, although if you search for the title within different universities it does indeed show that it is available (*head explodes*). This highly unscientific example shows that if Google Books digitization were done within our own consortium, some books would be purchased up to ten times! Given this overlap, it is safe to assume that items included in the Google digitization process will have been purchased multiple times, and that’s only in the print format. Publishers are still making money off items in the public domain by republishing, editing, and adding forewords to these same poems.

The non-public domain books as made available by Google will have about 10% of the content withheld from views (not from searches) in order to prevent unlimited access and reckless behavior by curious individuals. These curious individuals would only have used the available text in ways such as paper-writing and reading for the pure joy of it, anyway. All public domain items will be available, 100% online. As surveys have shown that readers with access to free books are less likely to purchase books (see page 2), this project will indeed be the end of the world of publishing and copyright as we know it. Thank goodness.

The Google Books digitization will bring more public domain books into the light and make them more convenient to search and use as data than ever before. Projects such as American Women’s Dime Novel Project ( are already using public domain items to research changing gender and class in the late 19th century and intends to digitize more items. Some of the libraries where are partners are making their own strides toward open access. University of California, which is permitting their entire library to be scanned (Vaidhyanathan, 2007) is now allowing all research articles published after November 1st, 2014 to be made available (free!) to the public (USC, 2013). First digitization, now open access to their scholarly articles! What’s next?

Now that we've established that people like access to things they like, let's move on.

Okay. I'm going to take a moment, dial it down a few, and change the perspective before we get all excited. Google is not the savior, hero, or even the reliable narrator in this story. They are a business providing a service from which both they and users benefit. Libraries are partners in this in order to make more information available to everyone. Content as data is one of the next large steps in research and digital libraries, and the participating libraries are helping to further this. The copyrighted materials which have been scanned are not free ebooks, they are not fully available to patrons, we cannot add them to catalogs or use them fully in library programs. It’s not open access, it’s fair use.

Vaidhyanathan has made an excellent point by asking the question that I have been afraid to consider: is Google the right agent to do this? Google is a business. A very powerful business, who recently shut down some very popular services (coughcoughREADERcough) and made some untested changes to a very popular video-viewing site which have yet to cause any improvement. Google is anything but transparent or talkative with their users, and they continue to make changes on their own terms.

So, where does this leave us? The Google Books scanning project is continuing on the side of the law, researchers have more data than ever, and hundreds of thousands of books will continue to resurface after being forgotten in dusty corners, and eventually our format standards will change and everything will have to be converted again. This lawsuit was a win for fair use, but the instrument with which the blow was struck was not wielded by a freedom-fighter.

We have our cake, but it was made by the slightly creepy coworker who works near the front door and knows everyone’s allergies.

Works Cited and Mentioned:
Vaidhyanathan, S. (2007). The Googlization of everything and the future of copyright. University of California, Davis, 40(3). 1207-1231.

University of California Open Access Policy (2013).

Monday, November 4, 2013

Alright. Everyone have a cup of coffee? Glass of wine? Great. This is going to be an interesting post. While reading these articles, I had a lot of tangenting thoughts for this post. I’d love to rant yet again about accessibility and how Web 2.0 is alienating an entire population while libraries are struggling to cross the digital divide on dwindling resources because reasons. However. I think that sentence does my ranting for me. Instead, let’s talk about the importance of shirking the restraints of Amazon's algorithms and how the web allows us to collaborate and make our own, better, products.

O’Reilly and Battelle make the bold statement that “data is the “Intel Inside” of the next generation of computer applications” (2009). Essentially, what lends power to the next generation (or this generation) of applications is the content generated by its users (Holmberg et. al, 2008). One of the main features of buying from Amazon is the “Customers who bought this item also bought” line which follows each item record [aside: does anyone else want to start referring to these as “bib records” since starting library school?]. Some of those books are really good, and they have been on our reading list for a long time, and hey, someone who has similarly impeccable taste thought it was good enough to buy! In short, the value of a customer’s purchase reaches far beyond that individual purchase. Amazon's partner GoodReads has a similar feature, enriched by the ability to add all the books you’ve ever read and thereby eliminate possible rereads. And oh look, a link to purchase the book on Amazon!

So. If data generated by users is so valuable, then why don’t libraries use it, too? Wouldn’t it be to our benefit to use these tools to connect our patrons to potential material, therefore increasing our circ statistics and overall patron satisfaction? Well, in a word, privacy. Using patron data for any purpose isn’t what we do — most libraries don’t even keep a patron’s checkout history in order to respect their right to privacy. While some may not be bothered by the NSA’s collection of metadata, anyone who understands how rich metadata has the potential to be may be appalled. Given a library’s respect for patron privacy, how can we provide similar services without compromising our own values?

The answer, I’m happy to say, is out there. One of the best parts of the Internet is that there are highly-skilled people in the world who want to share and build and collaborate. It is my belief that these people are the ones who will move us beyond the closed networks of Facebook and into full APIs by voluntarily sharing rather than by the obligatory social network connection. [note: for those who don’t know, APIs are essentially the creators going, “Hey! You! Come play with this toy that I made and we can play together!”] While Web 3.0 is still in its proverbial prenatal stages, the Internet as referred to by O’Reilly and Battelle is well on its way to being a temperamental teenager. There will be always be alternatives to the popular kids - the LibraryThing to the GoodReads, the to the Delicious Bookmarks.

In my searching for examples of library hacks, I found this beauty:

The Fortune Teller
Made by

It is, in essence, a small computer which produces randomized book recommendations on a small slip of receipt paper. There are instructions available for free online, and the GitHub code is up. I make the argument that, given a wifi connection and a small UPC/RFID reader, it wouldn’t be difficult to create a program which would scan the reader’s book, search LibraryThing for the title, and return a randomized recommendation from the book’s page. LibraryThing has an API which allows us to access its rich collected data, why not use it?

How, you may ask, is this any different from using people’s metadata to add value to items? LibraryThing’s recommendation feature uses only member-added tags and LC subject headings, so it pulls only from the crowd’s tags, which are opt-in and not default. It does not crawl through users’ accounts to compare books and purchases, only takes information which has been offered (yes, it’s possible to make a book/library/account completely private). Oh, and something else - LibraryThing also supports “work-to-work” relationships, adding the possibility to recommend the next book in the series or world as well as create richer recommendations.

While right now we live in a fascinating phase of the Internet, we are currently in the throes of growing pains. Discussions of personal privacy, terms & conditions, copyright law, DRM, and so-much-more occupy our digital airwaves. Oh, and that’s another funny thing — people talk about the teenage years as if they’re the most tumultuous of our lives, but really we just mature and become more accustomed to what life throws at us. Sure our hormones settle down, but really we just get better (read: more cynical) at dealing with all the crazy. The same is true in technology. Issues and controversies aren’t going to stop or settle down or even slow down, but we will get better at dealing with them. I hope. This community of collaboration is growing, evolving, and expanding, and I never want it to stop.

Wanna hear the coolest part of writing this post?
At first, I remembered the fortune teller somewhere in the back of my head and decided to search for it. All I could find, though, was the picture of it on Pinterest. Despite my ninja Googling skills, the only explanation of the fortune teller I could find was the original picture, posted by the creator. After a little more digging in their blog I managed to find that they posted directions! DIRECTIONS! Someone made it and then put up a DIY post. For free. Because they wanted to. As I reblogged the original post with directions, I put up a note about how neat it was. Within thirty minutes the creator messaged me and offered their help.

That’s the spirit of the web that I love.

Works Cited
Holmberg, K., et. al. (2009). What is Library 2.0? Journal of Documentation, 65(4), 668-681.
O'Reilly, T., Battelle, J. (2009). Web Squared: Web 2.0 Five Years On. Retrieved from