Managing electronic references

chrisfromparis, 123RF.com

Attribute!

Technical writers often find that attributing and linking back to their sources is a pain. Fortunately, several applications are available to make that task easier.

Technical writers largely belong to a group fondly called perfectionists. Ideally, their contributions are well crafted, to the point, and easy to understand, and they aim to awaken the reader's curiosity. However, without good research, nothing else matters.

Various software tools have been designed to keep authors from getting bogged down in a sea of references and to help them give credit where credit is due. In this article, I will take a look at what these tools provide and how successful they are.

One rule you must adhere to as a writer is: Credit your sources. Correct referencing is of great importance. References are made up of metadata that includes the author, title, publisher, location, and date of publication, and in the case of online material, the date and time of access.
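As a rough illustration of the metadata just listed, a reference can be modeled as a small record. The following Python sketch is only one possible structure; the class name `Reference`, its field names, and the completeness rule are assumptions made for illustration and do not come from any of the tools discussed in this article.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Reference:
    """One bibliographic record; all field names are illustrative."""
    author: str
    title: str
    publisher: str
    location: str
    year: int
    url: Optional[str] = None            # only set for online material
    accessed: Optional[datetime] = None  # date and time of access

    def is_complete(self) -> bool:
        """Online sources additionally need an access timestamp."""
        if self.url is not None and self.accessed is None:
            return False
        return all([self.author, self.title, self.publisher, self.location])
```

A record for a website would set `url` and `accessed`; for a printed book, both can stay empty.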

The form of citations and references used depends on the subject matter, publisher, and common practices (see the "Reference Versions and Formats" box to see how it's done at Linux New Media).

Reference Versions and Formats

In this publication, sources are referenced in the text with Arabic numerals enclosed in square brackets, [N], following the practice used in computer science. A combination of author names and publication dates is also used.

For details on various reference formats, see the Wikipedia entry [2].

Many authors have disgraced themselves over the years by not citing their sources correctly [1]. Expertise, talent, accuracy, intellectual curiosity, and enthusiasm must come together for a publication, and copying content is certainly not an option.

Concepts and Variations

The type of data storage that you use while writing often depends on your work environment – that is, whether you are working at your home desk or on a mobile device – the availability of the data, and your working style. Electronic card catalogs can be a great help in many situations [3].

User management and permissions play less of a role for single authors. However, the situation can get more complicated when multiple authors collaborate, share data, and try to agree on a single text or source component. In this case, open data formats and version control are what you need.

The following approaches can be divided into local and central storage. For local storage, I will consider text files, graphical programs for the desktop, and bookmarks in a web browser. For centralized, cloud-based storage, the distinction is between access through a special desktop client and access through a web browser.

Everything Is Text

A simple example is a text file with numbered entries. Ideally, the file is sorted thematically and alphabetically by title or author, which helps in managing it and further research.

On the one hand, all standard UNIX/Linux tools can search through and edit such a file; on the other hand, it works well with version control systems like SVN and Git. If you use a Markdown-like style [4][5] such as AsciiDoc [6] to structure your text file, the text remains easily readable for you as well as for others (Listing 1). You can also edit it without any difficulty, adapt it, or move it into a wiki or Etherpad [7] for collaboration with co-authors.

Listing 1

AsciiDoc Example

== Markup languages and tools ==
* AsciiDoc, text-based document generation, http://www.methods.co.nz/asciidoc/
== Package management with APT and Aptitude ==
* Bruce Byfield: Apt Mastery. Installing software the Debian way,
  Linux Magazine, 108/2009, p. 76, http://www.linux-magazine.com/
  Issues/2009/108/Command-Line-Apt-get/%28language%29/eng-US
* Axel Beckert, Frank Hofmann: Dynamic Duo. Comparing the apt-get and aptitude
  package tools, Ubuntu User 18/2013, p. 46, http://www.ubuntu-user.com/
  Magazine/Archive/2013/18/Comparing-the-apt-get-and-aptitude-
  package-tools/%28language%29/eng-GB
== Package management with RPM ==
* Edward C. Bailey: Maximum RPM. Taking the Red Hat Package Manager to the
  Limit, Red Hat Inc., 2000, ISBN 1-888172-78-9, http://www.rpm.org/max-rpm/
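Because the list is plain text, a few lines of script are enough to search it. The following Python sketch assumes the layout of Listing 1 (headings wrapped in `==`, entries starting with `* `, continuation lines indented); the function names `parse_reference_list` and `search` are made up for this example.

```python
import re
from collections import defaultdict

SAMPLE = """\
== Markup languages and tools ==
* AsciiDoc, text-based document generation, http://www.methods.co.nz/asciidoc/
== Package management with RPM ==
* Edward C. Bailey: Maximum RPM. Taking the Red Hat Package Manager to the
  Limit, Red Hat Inc., 2000, ISBN 1-888172-78-9, http://www.rpm.org/max-rpm/
"""


def parse_reference_list(text):
    """Group '* ' bullet entries under their '== Heading ==' lines."""
    sections = defaultdict(list)
    current = None
    for line in text.splitlines():
        heading = re.fullmatch(r"==\s*(.+?)\s*==", line.strip())
        if heading:
            current = heading.group(1)
        elif line.startswith("* ") and current is not None:
            sections[current].append(line[2:].strip())
        elif line.startswith(" ") and current is not None and sections[current]:
            # Indented line: continuation of the previous bullet.
            sections[current][-1] += " " + line.strip()
    return dict(sections)


def search(sections, keyword):
    """Return (heading, entry) pairs whose entry mentions the keyword."""
    keyword = keyword.lower()
    return [(heading, entry)
            for heading, entries in sections.items()
            for entry in entries
            if keyword in entry.lower()]


if __name__ == "__main__":
    for heading, entry in search(parse_reference_list(SAMPLE), "rpm"):
        print(heading, "->", entry)
```

Note that continuation lines are joined with a space, so a URL that was wrapped across lines would need to be rejoined by hand.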

Markdown, as a simplified markup language, is similar to email notation and aims to stay readable without further conversion steps, so that a simple viewer (a text editor or even a web browser) is enough to read it. If you need to convert your file to an HTML, PDF, or ePub document, you can do so using AsciiDoc or Pandoc [8].

Figure 1 shows the result of converting to HTML after executing the following command:

$ asciidoc -a toc source.txt

Somewhat more complicated are the two text formats BibTeX [9] (LaTeX) and DocBook (XML) [10].

Figure 1: Reference list created with AsciiDoc based on Listing 1.

BibTeX assigns each entry a publication type and thus differentiates among articles, reports in books and collections, brochures, manuals, scientific dissertations, and conference results. Depending on the publication type, you will be required to fill in different fields for each specific entry.

Listing 2 shows an entry for Edward C. Bailey's book Maximum RPM from Listing 1, for which you need the fields author or editor, title, publisher, and year. The ISBN field is optional and is displayed when you use biblatex instead of the default BibTeX style [11]. Figure 2 shows the result after converting Listing 2 to PDF with pdflatex and bibtex.

Listing 2

BibTeX Entry for Book

@book{ BaileyRPM,
        author          = {Bailey, Edward C.},
        title           = {Maximum RPM. Taking the Red Hat Package Manager to the Limit},
        publisher       = {Red Hat Inc.},
        year            = {2000},
        ISBN            = {1-888172-78-9}}

Figure 2: Book entry created with BibTeX.
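The required fields for a book entry can also be checked by a script before the entry is written to the .bib file. The following Python sketch is a minimal illustration under that assumption; the function name `format_bibtex_book` is hypothetical, and the code does not escape special LaTeX characters in the values.

```python
REQUIRED_BOOK_FIELDS = ("title", "publisher", "year")


def format_bibtex_book(key, fields):
    """Render a dict as a BibTeX @book entry after checking required fields."""
    # A book needs an author or an editor, plus title, publisher, and year.
    if not (fields.get("author") or fields.get("editor")):
        raise ValueError("a @book entry needs an author or an editor")
    missing = [name for name in REQUIRED_BOOK_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError("missing required field(s): " + ", ".join(missing))
    body = ["  {} = {{{}}}".format(name, value) for name, value in fields.items()]
    return "@book{" + key + ",\n" + ",\n".join(body) + "\n}"


if __name__ == "__main__":
    print(format_bibtex_book("BaileyRPM", {
        "author": "Bailey, Edward C.",
        "title": "Maximum RPM. Taking the Red Hat Package Manager to the Limit",
        "publisher": "Red Hat Inc.",
        "year": "2000",
        "ISBN": "1-888172-78-9",
    }))
```

Optional fields such as ISBN are passed through unchanged, so the same function covers both the default style and biblatex.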

The XML coding for DocBook is much more extensive and the data is already heavily pre-structured. Getting over the learning curve, however, is very helpful later when generating additional formats from your text for different output media with DocBook, XSLT, or XSL-FO combined with cascading style sheets (CSS) and putting the data into a version control system such as SVN or Git. Listing 3 shows the example of an entry with two authors taken from Listing 1.

Listing 3

DocBook Entry for an Article with Two Authors

<biblioentry>
  <abbrev>BeckertHofmannApt</abbrev>
  <biblioset relation="article">
    <title>Dynamic Duo</title>
    <subtitle>Apt-get and Aptitude (Part 1)</subtitle>
    <authorgroup>
      <author>
        <firstname>Axel</firstname>
        <surname>Beckert</surname>
      </author>
      <author>
        <firstname>Frank</firstname>
        <surname>Hofmann</surname>
      </author>
    </authorgroup>
    <artpagenums>90ff.</artpagenums>
  </biblioset>
  <biblioset relation="journal">
    <title>Ubuntu User</title>
    <pubdate>2013</pubdate>
    <issuenum>18</issuenum>
  </biblioset>
  <copyright>
    <year>2013</year>
    <holder>LinuxNewMedia, Lawrence</holder>
  </copyright>
  <issn>2040-8080</issn>
</biblioentry>
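The payoff of the heavily structured XML is that individual fields are easy to extract by program. The following Python sketch reads a shortened copy of the entry from Listing 3 with the standard library's `xml.etree.ElementTree`; the function name `summarize` and the output format are assumptions, and the code expects elements without an XML namespace, as in the listing.

```python
import xml.etree.ElementTree as ET

ENTRY = """\
<biblioentry>
  <abbrev>BeckertHofmannApt</abbrev>
  <biblioset relation="article">
    <title>Dynamic Duo</title>
    <authorgroup>
      <author><firstname>Axel</firstname><surname>Beckert</surname></author>
      <author><firstname>Frank</firstname><surname>Hofmann</surname></author>
    </authorgroup>
  </biblioset>
  <biblioset relation="journal">
    <title>Ubuntu User</title>
    <pubdate>2013</pubdate>
    <issuenum>18</issuenum>
  </biblioset>
</biblioentry>
"""


def summarize(entry_xml):
    """Return a one-line citation string from a <biblioentry> fragment."""
    root = ET.fromstring(entry_xml)
    article = root.find("biblioset[@relation='article']")
    journal = root.find("biblioset[@relation='journal']")
    authors = [
        "{} {}".format(a.findtext("firstname"), a.findtext("surname"))
        for a in article.iter("author")
    ]
    return "{}: {}, {} {}/{}".format(
        ", ".join(authors),
        article.findtext("title"),
        journal.findtext("title"),
        journal.findtext("issuenum"),
        journal.findtext("pubdate"),
    )


if __name__ == "__main__":
    print(summarize(ENTRY))
```

For production use, an XSLT stylesheet from the DocBook toolchain would be the more usual route; the script only shows how accessible the data is.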

Graphical Programs

With graphical apps, you can manage not only your book inventory but also, within certain limits, the associated bibliographic data sets. Calibre [12] is especially good for e-books. Unfortunately, Calibre does not yet offer integration with standard metadata, which may be a reason to look for alternatives.

JabRef

JabRef [13][14] is written in Java, uses BibTeX references, and is available for Linux, Mac OS X, and MS Windows. With its well-developed user interface, you can not only edit BibTeX entries (see Figure 3) but also search various online scientific databases, such as CiteSeer [15], the IEEE Xplore digital library [16], and the digital library of the Association for Computing Machinery (ACM) [17].

Figure 3: An entry for an article managed by JabRef.

JabRef interacts with various programs and formats; it can import and export SQL, DocBook/XML, worksheets for OpenOffice Calc, and CSV data. The project page [13] also features a plugin named jabref-plugin-oo that serves as a direct interface between JabRef and OpenOffice or LibreOffice, so you can use JabRef datasets directly in your office documents.

Docear

The Docear academic literature suite [18] is in a similar league to JabRef but tries to use a much more comprehensive approach to managing research data. Docear combines document management with an index database and a drawing tool for mindmaps. It relies on already installed tools for formatting PDF documents and other formats.

The result is a rather complex program that requires a large screen and has a correspondingly steep learning curve (Figure 4). The project-oriented working method pays off over time and provides a good overview of the research status of your project.

Figure 4: Reference management using Docear and the JabRef plugin.

This GPL software is also based on Java and is available for Linux, Mac OS X, and MS Windows from the project page. Its development started from the Freeplane [19] mind mapping and knowledge management software, extended with direct support for JabRef. Docear also benefits from support from the Otto-von-Guericke University Magdeburg and the University of California, Berkeley.

KBibTeX

KBibTeX is a BibTeX tool with a graphical interface that is produced by the KDE team [20][21]. You can use it to modify BibTeX resources, add new ones, delete them, or sort them (Figure 5).

Figure 5: Adjusting an entry in KBibTeX.

The learning curve for the user interface is a bit gentler than for Docear. A notable feature is the thumbnails (called previews) in the lower left corner of the editing window, which give you an idea of how your references will look in the target document if you use the appropriate style.

Zotero

Zotero [22] is very popular, but goes in a totally different direction. Developed by the Center for History and New Media at George Mason University, it manages references as so-called bookmarks that you create in your web browser.

Zotero has split into two different branches. One is Zotero for Firefox [23], which extends the Firefox/Iceweasel web browser, and the other is called Zotero Standalone. Both branches are free software and are available on a PPA for Ubuntu. To install, do the following:

$ sudo add-apt-repository ppa:smathot/cogscinl
$ sudo apt-get update
$ sudo apt-get install zotero-standalone

Zotero scans web pages for bibliographic metadata such as COinS (see below). When it finds some, the browser shows an extra symbol next to the address field; in Figure 6, you can see a little book symbol. When you click on this symbol, Zotero extracts the metadata embedded in the web page and adds it to the local reference database (see Figure 7).

Figure 6: Firefox/Iceweasel with an activated Zotero add-on.
Figure 7: Managing the stored metadata in Firefox.

To extract scientifically accurate, correct, and complete references from the stored metadata, open the Zotero Standalone program. In the three-column format, your collection is shown on the left, in the middle are the references, and on the right are the details for the selected reference.

Figure 8 shows the dialog for selecting the format Zotero uses to convert the metadata. You get the selection by right-clicking for the context menu and using the Export Selected Entry option. For BibTeX format, you get output as in Figure 9.

Figure 8: Exporting the references from Zotero.
Figure 9: The exported reference data for a Wikipedia entry.

Web and Cloud-Based Solutions

Several commercial solutions are available for capturing and maintaining your references in an integrated platform. Some of the most interesting are those that provide a central data repository, a complete text processing tool, and tools for knowledge organization, scheduling, and planning.

Depending on the version, platform, and license, they may also provide collaborative functions for joint document editing. The top dogs among the research platforms in certain languages include Citavi [24] and Mendeley [25], with Refeus [26] providing some competition.

Citavi is based on the .NET framework and, therefore, native to the Windows platform. Mendeley sees itself as a content manager for an academic, social network where users exchange information and are apprised of the latest research results.

The software, called Mendeley Desktop, is available for Ubuntu (32-bit and 64-bit) from the project webpage. Requirements include registering on the website. The user interface is similar to that of JabRef.

Figure 10 shows a list of recently added documents and their references. You can export this data very easily, for example, into BibTeX format or XML for EndNote [27]. Mobile access is possible through Droideley [28], a free app for Android smartphones.

Figure 10: Details about a selected document.

Refeus is developed by a startup from Brandenburg and is marketed as a tool supporting "the entire process of collecting and managing content and sources all the way to writing and publishing scientific documents and other material" [26]. Refeus, therefore, combines a text editor based on OpenOffice and LibreOffice with a document and source management in a single, unified user interface.

Refeus is available from its website as a free Refeus Basic version and a paid Refeus Plus version. You can find software packages for Ubuntu 12.04, 14.04 and up, as well as for Mac OS X and Microsoft Windows.

All your data is stored using the free object-based ODABA [29] database. The package includes two very interesting features: an app for your smartphone and a plugin for the web browser. The app is called Refeus Mobile and makes it easier to grab sources on the go (Figure 11).

Figure 11: Refeus Mobile in operation.

The plugin, called Refeus WebCollect, is similar to Zotero and gathers the collected data in an inventory. (See the "Automating Source Collections" box for additional information.)

Automating Source Collections

Typing entries is a thing of the past. Nowadays, besides capturing them with QR codes, other technologies, such as COinS and DOI, are available for capturing sources automatically. These tools are based on the growing proliferation of mobile devices with daily Internet access.

The Context Objects in Spans (COinS) convention describes how to integrate bibliographic metadata into (static) webpages [30] and is based on the ContextObject of OpenURL 1.0. COinS are defined in <span> elements that have no effect on browser output [31]. Listing 4 shows the definition for a span of class Z3988 with an ISSN of 1045-4438, which identifies a journal or a publishing series.

You can read this information using a browser plugin like Zotero (for Firefox) or Citavi (for Firefox, IE, and Google Chrome). Several Internet providers already support COinS, including various library catalogs and collaborative platforms such as Citebase [32], WorldCat [33], Mendeley, and ResearchGate [34]. A few additional plugins are available for WordPress, so that you can enhance your publications and use them with Zotero [35].

Digital Object Identifiers (DOIs) are identifiers with which you can unambiguously identify physical, digital, or abstract objects. They are similar to ISBN and ISSN but also integrate a function for locating the object. So far, they have been used mainly for online articles in professional scientific journals. A few tools can process DOIs, among them JabRef, Zotero, Refeus, and the Citavi Picker.
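The locating function works through a central resolver: Prefixing a DOI with https://doi.org/ yields a link that redirects to the object's current location. The following Python sketch shows the idea; the function name `doi_to_url` is hypothetical, the pattern is only a loose plausibility check rather than a full validation, and the DOI in the usage line is an invented placeholder.

```python
import re
from urllib.parse import quote

# Loose sanity check only: "10.", a registrant code, a slash, and a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi):
    """Turn a bare DOI into a link to the doi.org resolver."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    if not DOI_PATTERN.match(doi):
        raise ValueError("does not look like a DOI: " + repr(doi))
    # Percent-encode characters that are not safe in a URL path.
    return "https://doi.org/" + quote(doi, safe="/")


if __name__ == "__main__":
    print(doi_to_url("doi:10.1000/example-123"))  # placeholder DOI
```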

Listing 4

COinS in Span Elements

<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.issn=1045-4438"></span>
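To see what a tool like Zotero finds in such an element, you can decode the title attribute yourself. The following Python sketch uses only the standard library; the class name `CoinsParser` is made up for this example, and the code is a simplified illustration, not the way any of the tools mentioned actually implement COinS support.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs

PAGE = (
    '<p>Some article text.</p>'
    '<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;'
    'rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;'
    'rft.issn=1045-4438"></span>'
)


class CoinsParser(HTMLParser):
    """Collect the decoded key/value pairs of every <span class="Z3988">."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and "Z3988" in (attrs.get("class") or "").split():
            # HTMLParser has already turned &amp; back into &;
            # parse_qs splits the pairs and undoes the percent-encoding.
            self.records.append(parse_qs(attrs.get("title") or ""))


if __name__ == "__main__":
    parser = CoinsParser()
    parser.feed(PAGE)
    for record in parser.records:
        for key, values in record.items():
            print(key, "=", values[0])
```

Each value is returned as a list because a key such as an author field may legitimately occur more than once in a COinS title.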

Conclusion

This article provided an overview of free tools available for creating and maintaining your literature collection. Each tool addresses different working styles and habits. The technical implementation varies between modular building blocks and complex all-inclusive solutions. All the tools shown here are based on established data formats and provide corresponding import and export functions. In this way, you're not trapped in proprietary formats, and your research findings are always accessible, even when you change tools.

If these tools don't offer what you're looking for, you can take a look at other options, such as Bibus [36], TextCite [37], RefBase [38], and CiteULike [39]. These tools are also very interesting but outside the scope of this article.

Acknowledgments

The author thanks Dirk Deimeke, Wendi Wuestemann, Joerg Dolz, and Christian Bagdahn for their insight and comments while preparing this article.

Infos

  1. The Vroniplag project: http://en.wikipedia.org/wiki/VroniPlag_Wiki
  2. Reference Wikipedia entry: http://en.wikipedia.org/wiki/Reference
  3. "Using the Leo Editor" by Andreas Reitmaier, Ubuntu User, Issue 23, 2014: http://www.ubuntu-user.com/Magazine/Archive/2014/23/Using-the-Leo-editor/%28language%29/eng-GB
  4. Markdown Wikipedia entry: http://en.wikipedia.org/wiki/Markdown
  5. Markdown project page: https://daringfireball.net/projects/markdown/
  6. AsciiDoc: http://www.methods.co.nz/asciidoc/
  7. Etherpad: http://etherpad.org/
  8. Pandoc: http://johnmacfarlane.net/pandoc/
  9. BibTeX: http://www.bibtex.org/
  10. DocBook: http://docbook.org/
  11. BibTeX – Show ISBN number?: http://tex.stackexchange.com/questions/52040/bibtex-show-isbn-number
  12. Calibre: http://calibre-ebook.com/
  13. JabRef: http://jabref.sourceforge.net/
  14. JabRef Debian package: https://packages.debian.org/wheezy/jabref
  15. CiteSeer Scientific Literature Digital Library: http://citeseer.ist.psu.edu/
  16. IEEE Xplore Digital Library: http://ieeexplore.ieee.org/Xplore/home.jsp
  17. Digital Library of the Association for Computing Machinery (ACM): http://dl.acm.org/
  18. Docear: http://www.docear.org/
  19. Freeplane: http://www.freeplane.org/wiki/index.php/Main_Page
  20. KBibTeX: http://home.gna.org/kbibtex/
  21. KBibTeX Debian package: https://packages.debian.org/wheezy/kbibtex
  22. Zotero: https://www.zotero.org/
  23. Zotero Debian package: https://packages.debian.org/wheezy-backports/zotero-standalone
  24. Citavi: http://www.citavi.com/
  25. Mendeley: http://www.mendeley.com/
  26. Refeus: https://refeus.de/ (in German)
  27. EndNote: http://www.endnote.com/
  28. Droideley: https://github.com/petrvolny/Droideley
  29. ODABA: http://odaba.com/
  30. COinS Wikipedia entry: http://en.wikipedia.org/wiki/COinS
  31. HTML <span> tag at w3schools: http://www.w3schools.com/tags/tag_span.asp
  32. Citebase: http://www.citebase.org/
  33. WorldCat: http://www.worldcat.org/
  34. ResearchGate: http://www.researchgate.net/
  35. WordPress plugin for COinS (Zotero): https://www.zotero.org/support/plugins
  36. Bibus: http://sourceforge.net/projects/bibus-biblio/
  37. TextCite: http://textcite.sourceforge.net/
  38. RefBase: http://www.refbase.net/
  39. CiteULike: http://www.citeulike.org/