File Formats in SURFACE
SURFACE accepts a wide range of file formats. See the list of formats housed in the SURFACE institutional repository below.
Files must be submitted in an accessible format, such as accessible PDF or EPUB.
When a file is submitted to SURFACE, we assign it one of the following categories:
- Supported: we support this format. There is a high likelihood that its content, appearance, and functions will be preserved over time.
- Known: we recognize this format but cannot guarantee its support over time. Among file types in this category, Microsoft Word (doc and docx) and Rich Text Format (rtf) files will be automatically converted to PDF. The original submitted document will be preserved as a backup copy.
- Not Supported: we do not recognize this format. At best, the bit stream will be preserved but not appearance or functionality. SURFACE staff will work with the author(s)/contributor(s) to determine if the document can be hosted.
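As a rough illustration of the three categories above, the following Python sketch looks up a file's category from its extension. It is not SURFACE's actual implementation: the names `FORMAT_LEVELS`, `CONVERTED_TO_PDF`, `classify`, and `needs_pdf_conversion` are hypothetical, and the mapping holds only a small sample of rows from the format table in this document.

```python
from pathlib import Path

# Partial, illustrative sample of the format table in this document.
# Extensions missing from this sample fall through to "not supported",
# so a real lookup would need every row of the table.
FORMAT_LEVELS = {
    "pdf": "supported",
    "xml": "supported",
    "txt": "supported",
    "tif": "supported",
    "tiff": "supported",
    "doc": "known",
    "docx": "known",
    "xls": "known",
    "wav": "known",
}

# Extensions the document says are automatically converted to PDF.
CONVERTED_TO_PDF = {"doc", "docx", "rtf"}


def extension_of(filename: str) -> str:
    """Return the lowercase extension without the leading dot."""
    return Path(filename).suffix.lower().lstrip(".")


def classify(filename: str) -> str:
    """Return 'supported', 'known', or 'not supported' for a file name."""
    return FORMAT_LEVELS.get(extension_of(filename), "not supported")


def needs_pdf_conversion(filename: str) -> bool:
    """Return True if the file would be converted to PDF on submission."""
    return extension_of(filename) in CONVERTED_TO_PDF
```

For example, `classify("thesis.PDF")` gives `"supported"`, `classify("notes.docx")` gives `"known"`, and an unrecognized extension gives `"not supported"`. Matching on the extension alone is a simplification; a production check would more likely inspect the file contents or its MIME type.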
File formats that exhibit all or many of the following characteristics -- open documentation; support across a range of software platforms; wide adoption; no compression (or lossless compression); no embedded files or embedded programs/scripts; and non-proprietary format -- have the greatest likelihood of preservation into the future.
For supported formats, such as PDF or TIFF, we might choose to bulk-transform files from a current format version to a future one. SURFACE staff will continually monitor formats and techniques to ensure we can accommodate needs as they arise.
All computer file formats depend on the availability of the appropriate software to render the functions and appearance intended by the file’s creator. Over time, older software applications may no longer function on new computer platforms, leaving the files created with those applications inoperable. As migration paths become available, SURFACE will provide support for converting files so that they may remain easily accessible. Extremely popular but proprietary formats (such as Microsoft .doc, .xls, and .ppt) are more likely to remain accessible into the future simply because their prevalence makes it likely tools will be available. However, the proprietary nature of many specific file types makes it impossible to make preservation guarantees.
File Formats
The following list is neither exhaustive nor exclusive, but is meant to give a sense of the variety of formats that might be housed in SURFACE. The SURFACE team will partner with SU researchers to explore ways to support file formats not included on this list.
File Format | Extensions | MIME type | Level |
---|---|---|---|
Adobe PDF | pdf | application/pdf | supported |
XML | xml | text/xml | supported |
Text | txt, asc | text/plain | supported |
HTML | htm, html | text/html | supported |
OpenDocument Text | odt | application/vnd.oasis.opendocument.text | supported |
OpenDocument Presentation | odp | application/vnd.oasis.opendocument.presentation | supported |
OpenDocument Spreadsheet | ods | application/vnd.oasis.opendocument.spreadsheet | supported |
Rich Text Format | rtf, rtx | text/richtext | supported |
MARC | marc, mrc | application/marc | supported |
JPEG | jpeg, jpg | image/jpeg | supported |
GIF | gif | image/gif | supported |
PNG | png | image/png | supported |
TIFF | tiff, tif | image/tiff | supported |
AIFF | aiff, aif, aifc, iff | audio/x-aiff | supported |
PostScript | ps, eps | application/postscript | supported |
Microsoft Word | doc, docx | application/msword | known |
Microsoft PowerPoint | ppt, pptx | application/vnd.ms-powerpoint | known |
Microsoft Excel | xls | application/vnd.ms-excel | known |
WordPerfect | wpd | application/wordperfect5.1 | known |
audio/basic | au, snd | audio/basic | known |
WAV | wav | audio/x-wav | known |
MPEG | mpeg, mpg, mpe | video/mpeg | known |
FMP3 | fm | application/x-filemaker | known |
BMP | bmp | image/x-ms-bmp | known |
Photoshop | psd, pdd | application/x-photoshop | known |
QuickTime Video | mov, qt | video/quicktime | known |
MPEG Audio | mpa, abs, mpega | audio/x-mpeg | known |
Microsoft Project | mpp, mpx, mpd | application/vnd.ms-project | known |
Mathematica | ma | application/mathematica | known |
LaTeX | latex | application/x-latex | known |
TeX | tex | application/x-tex | known |
TeX dvi | dvi | application/x-dvi | known |
SGML | sgm, sgml | application/sgml | known |
RealAudio | ra, ram | audio/x-pn-realaudio | known |
AutoCAD | dwg | | known |
AutoCAD Exchange Format | dxf | | known |
AutoCAD Internet Files | dwf | | known |
DjVu | djv | | known |
RealVideo | ra, ram | video/x-pn-realvideo | known |
Unknown | | application/octet-stream | unknown |
Notes:
- Microsoft Word and Rich Text Format files will be automatically converted to PDF and distributed in that format as part of the submission process. The original document will be preserved but will not be distributed via SURFACE unless the author makes arrangements with SURFACE to release it as a supplemental version.
- XML file submissions should, ideally, be accompanied by a validation schema and a stylesheet.
- Consider providing a text-based format, such as tab- or comma-delimited text, in addition to Excel or OpenOffice files, especially if the file includes calculations, formulas, or other special attributes.
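The suggestion to supply a delimited text copy alongside a spreadsheet can be sketched in Python with the standard `csv` module. This is only one possible approach: the helper name `to_delimited` and the sample rows are hypothetical, and the computed column stands in for a spreadsheet formula whose result is stored as a plain value.

```python
import csv
import io


def to_delimited(rows, delimiter=","):
    """Serialize rows (lists of values) as delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical data. The last column would be a formula in a spreadsheet;
# here the computed result is written out so it survives without the
# original application.
rows = [
    ["sample", "mass_g", "volume_ml", "density_g_per_ml"],
    ["A", 12.0, 4.0, 12.0 / 4.0],
    ["B", 9.0, 6.0, 9.0 / 6.0],
]

csv_text = to_delimited(rows)                  # comma-delimited
tsv_text = to_delimited(rows, delimiter="\t")  # tab-delimited
```

Writing `csv_text` to a `.csv` file (or `tsv_text` to a `.txt` file) yields a plain-text companion that can be read without spreadsheet software. Formulas themselves are not carried over, only their results, so it may be worth describing the calculations in accompanying documentation.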