It is important to consider what format will be best for managing, sharing, and preserving your data. How you choose to represent your data is a primary factor in someone else's ability to use your data in the future.
Formats that are likely to be accessible in the future are:
Examples of preferred format choices include:
Discouraged formats include:
UK Libraries recommends applying a consistent file naming convention to your research data using the following guidelines.
First and foremost:
• Keep file names short but descriptive
• Be consistent with established conventions
• Denote dates in YYYYMMDD format
• Use a unique identifier, e.g., project name or grant number
• Identify document content, e.g., questionnaire or grant proposal
• Use underscores or hyphens as delimiters; avoid spaces and special characters such as &, *, and #
• Keep track of document version either sequentially or with a unique date and time, e.g., v01, v02, 20140403_1800, etc.
• Avoid complex folder hierarchies
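As an illustration only, the naming guidelines above could be applied with a small helper like the following Python sketch. The function name build_filename, the order of the name parts, and the choice of underscores between parts and hyphens within a part are assumptions made for this example, not a UK Libraries requirement.

```python
import re
from datetime import date


def build_filename(project, content, day, version, extension):
    """Assemble a file name from an identifier, content label, date, and version.

    Hypothetical helper: the part order and delimiters are illustrative choices.
    """
    def clean(part):
        # Replace runs of spaces and special characters (&, *, #, ...) with a hyphen.
        return re.sub(r"[^A-Za-z0-9]+", "-", part).strip("-")

    stamp = day.strftime("%Y%m%d")   # date in YYYYMMDD format
    ver = f"v{version:02d}"          # sequential version: v01, v02, ...
    return f"{clean(project)}_{clean(content)}_{stamp}_{ver}.{extension}"


print(build_filename("Grant #123", "questionnaire", date(2014, 4, 3), 1, "csv"))
# prints: Grant-123_questionnaire_20140403_v01.csv
```

Generating names from one function, rather than typing them by hand, is one way to keep the convention consistent across a project.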
UK Libraries recommends considering multiple factors when choosing a file format for your research data.
• Proprietary and non-proprietary (open) formats
Proprietary formats are limited by software patents, lack of format specification details, or built-in encryption to prevent open usage by the public. As a result, using a proprietary format often requires specific software provided by one vendor. In contrast, an open format is a file format that is freely available for everyone to use. Because the specifications are released, open source developers can write software that reads and writes the format even if a particular vendor no longer supports it. This reduces the chance that technological developments will make the file format obsolete.
• Industry format adoption
In some cases, an industry or profession may treat specific file formats as a de facto standard even if the formats are proprietary and rely on expensive software. In those cases, it may be more convenient to use the same proprietary file format.
• Technical dependencies
Technical dependency is the degree to which a particular format depends on particular hardware, operating systems, or software, and how that dependence might affect future use of the files. Using non-proprietary file formats may decrease the risk of technical obsolescence by reducing the dependency on the underlying technology.
• File quality and file size
Each file type, such as text, images, or sound, has many file formats available. File quality, meaning how faithfully the file represents the characteristics of the original item, is a large part of the file format decision. Files encoded at high resolution will be larger than files in lower-quality formats. The trade-off for higher quality is therefore more storage space and less convenience in disseminating the file to others.
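To make the quality and size trade-off concrete, the following Python sketch estimates the uncompressed size of an image at two resolutions. The function name raw_image_bytes and the example dimensions are assumptions chosen for illustration; real file sizes depend on the format and any compression it applies.

```python
def raw_image_bytes(width, height, channels=3, bits_per_channel=8):
    """Estimate the uncompressed size of an image in bytes (illustrative only)."""
    return width * height * channels * bits_per_channel // 8


full = raw_image_bytes(3000, 2000)   # 24-bit RGB image at full resolution
half = raw_image_bytes(1500, 1000)   # same image with width and height halved

print(full, half, full // half)
# prints: 18000000 4500000 4
```

Under these assumptions, halving the width and height cuts the raw size to one quarter, which shows how quickly storage needs grow as resolution increases.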