History & Background

This page provides information about the origin of the Heurist project, including its methodology and intended audience.

Origin & Development

Heurist was originally designed by Dr Ian Johnson, founding director of Arts eResearch (formerly the Archaeological Computing Laboratory), University of Sydney, and developed there from September 2005 to December 2013. Heurist continues to be developed and supported within the Faculty of Arts and Social Sciences and the University’s Research Portfolio, under the direction of Dr Johnson.

Artem Osmakov has principal responsibility for Heurist maintenance and development; Vincent Sheehan develops the Heurist Network web site and the Heurist documentation.

Heurist is in rapid and ongoing development thanks to a number of major funded projects which use and support its infrastructure, including:

  • HuNI : Humanities Networked Infrastructure, NeCTAR funding 2013–2014.
  • FAIMS : Federated Archaeological Information Management System, NeCTAR funding 2013–2014, ARC (Australian Research Council) LIEF 2014.
  • The Dictionary of Sydney : ARC Linkage grants 2005–2010 & 2010–2013.
  • Digital Harlem : ARC Discovery grants 2003–2007 & 2010–2015.
  • Book of Remembrance Online : University of Sydney Chancellor’s Award 2013–2014, online Sept 2014.
  • Many smaller projects, with generous long-term infrastructure funding from the University of Sydney.

(See also infrastructure and exemplars.)

Methodology

Heurist is written in PHP and Javascript, on top of a fixed MySQL data structure (all Heurist databases have the same underlying structure, as the logical structure of the database is encoded directly in the data). Entities/record types, fields and terms are defined within the database rather than being hardcoded in the software or database structure.

Heurist uses a key-value pair approach linked to a primary data table instantiating typed entities, allowing variant data structures and repeating value fields. Relationships between entities are implemented as a relationship record which uses the same storage methodology as other record types, allowing it to be enriched with additional data as required.
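The storage approach described above can be illustrated with a minimal sketch. It assumes an in-memory SQLite database in place of MySQL, and every table, column and field name in it is hypothetical, chosen only for illustration; it is not Heurist's actual schema. It shows typed records in a primary table, values held as key-value rows (so fields may repeat or be absent), and a relationship stored as an ordinary record that can carry extra data.

```python
import sqlite3

# Hypothetical table and column names, for illustration only;
# this is not Heurist's actual MySQL schema.
SCHEMA = """
CREATE TABLE record_types (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE field_types  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE records      (id INTEGER PRIMARY KEY,
                           type_id INTEGER NOT NULL REFERENCES record_types(id));
CREATE TABLE details      (record_id INTEGER NOT NULL REFERENCES records(id),
                           field_id  INTEGER NOT NULL REFERENCES field_types(id),
                           value     TEXT NOT NULL);
"""


def open_db():
    """Create the fixed structure; record and field types are just rows of data."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany("INSERT INTO record_types VALUES (?, ?)",
                   [(1, "Person"), (2, "Organisation"), (3, "Relationship")])
    db.executemany("INSERT INTO field_types VALUES (?, ?)",
                   [(1, "name"), (2, "alias"), (3, "source"),
                    (4, "target"), (5, "relation"), (6, "note")])
    return db


def add_record(db, type_id, values):
    """Insert one typed record plus its (field_id, value) pairs; fields may repeat."""
    rec_id = db.execute("INSERT INTO records (type_id) VALUES (?)",
                        (type_id,)).lastrowid
    db.executemany("INSERT INTO details VALUES (?, ?, ?)",
                   [(rec_id, field_id, str(value)) for field_id, value in values])
    return rec_id


def get_values(db, rec_id, field_name):
    """Return every value of the named field for a record, in insertion order."""
    rows = db.execute(
        "SELECT d.value FROM details d "
        "JOIN field_types f ON f.id = d.field_id "
        "WHERE d.record_id = ? AND f.name = ? ORDER BY d.rowid",
        (rec_id, field_name))
    return [row[0] for row in rows]


db = open_db()
person = add_record(db, 1, [(1, "Ada Example"), (2, "A. Example"), (2, "Ada E.")])
org = add_record(db, 2, [(1, "Example Society")])
# The relationship is itself a record, so it can be enriched with a note.
link = add_record(db, 3, [(3, person), (4, org),
                          (5, "member of"), (6, "joined 1901")])
```

In this sketch, adding a new record type or field is an INSERT into a definition table rather than a change to the table structure, which is the property the paragraph above describes.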

Heurist has the following field types:

  • Numeric (integer or decimal).
  • Text (single line or memo).
  • Term lists (values from a controlled hierarchically organised list).
  • Date/time fields (including fuzzy dates and several alternative calendars).
  • Geographic (point, line, polygon).
  • Pointer fields allowing lookup of another record in the database, constrained to specific type(s) or unconstrained. Pointer fields support master/detail relationships allowing master records to access data within detail records.
  • Relationship marker fields allowing the creation of typed, directional, dated and annotated relationships between records (relationship marker fields do not themselves contain data; they constrain the relationships that can be created from within data entry forms).
  • File attachments; this field type also allows remote files to be referenced through a URL.
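Two of the field types listed above, hierarchical term lists and constrained pointer fields, can be sketched as follows. This is an illustrative model only: the class names, method names and sample terms are assumptions made for the example and do not correspond to Heurist's own code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class TermList:
    """Controlled vocabulary stored as child -> parent links (None = top level)."""
    parents: Dict[str, Optional[str]] = field(default_factory=dict)

    def add(self, term: str, parent: Optional[str] = None) -> None:
        # A child term may only be attached to a term that already exists.
        if parent is not None and parent not in self.parents:
            raise KeyError(f"unknown parent term: {parent}")
        self.parents[term] = parent

    def path(self, term: str) -> List[str]:
        """Return the term's ancestry, from the top level down to the term."""
        chain = []
        current: Optional[str] = term
        while current is not None:
            chain.append(current)
            current = self.parents[current]
        return list(reversed(chain))


@dataclass
class PointerField:
    """Pointer to another record; allowed_types=None means unconstrained."""
    name: str
    allowed_types: Optional[Set[str]] = None

    def accepts(self, target_type: str) -> bool:
        return self.allowed_types is None or target_type in self.allowed_types


# Hypothetical sample data.
materials = TermList()
materials.add("Ceramic")
materials.add("Stoneware", parent="Ceramic")

author = PointerField("author", allowed_types={"Person", "Organisation"})
related = PointerField("related record")  # unconstrained pointer
```

Here `author.accepts("Place")` is false because the pointer is constrained to two record types, while the unconstrained `related` field accepts any type.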

Heurist uses Smarty templates for user-defined reporting, and generates maps and timelines directly in the interface for any items which have geographic or time fields; embedding code is provided to generate the same reports / maps / timelines in a web page using Javascript or within an iframe.

Other functions include a bookmarklet for capturing web references, WYSIWYG formatted text and threaded discussions within records, user and workgroup tags, personal and shared saved searches, workgroup ownership of records, group notifications, and blogging. There is a Zotero bibliography synchronisation function.

For developers there is a Javascript programming API—HAPI—allowing direct read and write access to Heurist records independent of internal storage structure, and functions for transforming XML output to other forms using XSLT stored in records within the database. Heurist source code is available under GNU GPL from the Google Code repository and can be installed on any LAMP server, including virtual servers in the NeCTAR Research cloud.

Applicability & Audience

Heurist is aimed primarily at an audience of digitally-oriented Humanities researchers (‘Digital Humanists’) for managing heterogeneous and relatively unstructured data, in small to medium collections of (often textual) data such as those typically found in the Arts and Humanities, and in personal research spaces.

However, it is also suitable for a wide range of small database applications: for eResearch support staff who need to build databases rapidly for their clients, for developing administrative applications, and as a teaching tool.

It is not suitable for the large, structured, homogeneous, numerical datasets typical of the Sciences. The largest Heurist database so far held 3M records, and we freely admit it was unusable (although this was probably due to the limitations of an aging server as much as to Heurist). Current databases with good performance hold up to 200K records, and developments now under way will improve performance for larger databases.

A Brief History

Ian Johnson, Senior Research Fellow

I started designing Heurist in 2005 in the context of the SHSSERI partnership (Sydney Humanities & Social Sciences e-Research Initiative), of which I was a founding member and key contributor. However, its origins go much further back, to work I did on the Minark archaeological DBMS (1980 – 1987) and TimeMap (1997 onwards).

Context

Humanities data is characterised by high heterogeneity, small data volumes, qualitative and textual information, and the importance of the connections between entities, rather than the large volumes of repetitive, standardised, quantitative data characteristic of the Sciences. Humanists need complex databases but have extremely limited resources for building them.

My vision for Heurist – originally styled the SHSSERI Collaborative Knowledge System – was therefore to build a user-configurable online database which would handle a wide variety of Humanities data and allow researchers to manipulate and share these data in ways adapted to the needs of Humanists. My vision was perhaps overly ambitious – although we have largely made good on it – prioritising methods which address the needs of researchers rather than item inventories and business use (the primary drivers of consumer-oriented databases).

At the time I was critically aware, from over three decades developing Humanities computing applications since 1972 (Johnson 1976), that a major issue was the fragmentation of information into format-specific software silos (bibliography, text files, images, spreadsheets, GIS, bookmarking, notes, blogs etc.) and the lack of integrated tools to manage heterogeneous and often complex interlinked data. The key need I saw, therefore, was a way for Humanists to create and maintain complex collaborative databases without the need to either cobble together solutions from many disparate tools or become programmers and reinvent the wheel. Web applications were the obvious medium for this development, an area in which I had prior experience through work on TimeMap and the ECAI Clearinghouse (see later).

History

I established the Archaeological Computing Laboratory (ACL), now Arts eResearch (AeR), in 1992 through a joint Large Equipment grant with Roland Fletcher. Initially with a staff of 0.5 FTE (myself), it currently has 5.5 FTE, of whom two are programmers. Heurist is the culmination of the last two decades of work, drawing on the prior applications mentioned above (and prior work back to the early 1980s).

I built the prototype of Heurist, known as TMBookmarker, in August 2005 using the T1000 templating system (written by Tom Murtagh) and a web-based system for creating and structuring a database and writing T1000 templates to manage it (written by myself). From this prototype I started developing Heurist through specification of additional functions and requirements, which were implemented by the ACL programming staff.

However, the origins of Heurist go back much further, to Minark (Johnson 1984), a microcomputer database I designed and wrote between 1980 and 1987. Minark was quite widely used for state site registers and excavation databases in Australia, as well as overseas, in the 1980s and early 1990s, before the rise of FileMaker and Microsoft Access made its text-based interface obsolete. Many of the core ideas of Heurist date back to this system:

  • database structure stored within the database itself;
  • database structure configurable by the user through the interface;
  • change of record structure on the fly without loss of data;
  • flexible record structure allowing missing and repeating fields;
  • dynamic construction of data entry forms from structure definitions;
  • variable length free text;
  • enumerated and lookup fields;
  • enumerated lists extended on the fly;
  • import and export of CSV and self-documenting database dumps;
  • simple built-in user-configurable report formatting to screen, printer or disk;
  • simple built-in mapping.

The emphasis in Minark was on empowering end-users to build their own databases, and many did. In contrast, the contemporary market leader, dBase II and later III, required extensive technical programming to build an equivalent system. This philosophy of user-configured databases and user empowerment is central to the design of Heurist, and particularly to the restructuring of the database configuration and user tables which I undertook in moving Heurist to Version 3, starting in 2009.

Additional research which has contributed to Heurist includes:

  • TimeMap (developed in collaboration with Artem Osmakov and his team from 1997 to 2005);
  • the Electronic Cultural Atlas Initiative (ECAI) Clearinghouse (developed in-house from 1998 to 2001);
  • FieldHelper (initially prototyped in-house (Vsn 0.1 – 0.4), but eventually contracted out and developed collaboratively by Artem Osmakov into a practical system (Vsn 2.0), which remains unfinished due to lack of funding).
