Difference between revisions of "ARK2/Files"


Revision as of 18:32, 16 November 2016

File Management

File and Media management have seen major changes in ARK2, with a more flexible system implemented.

  • Files are now a core Module instead of a dataclass and look-up table, allowing for flexible metadata based on file type, and using files as either data properties or relationships.
  • The file system is abstracted using Flysystem, allowing files to be transparently saved locally or to cloud providers.
  • File versioning is supported.
  • Image management uses Imagine.
  • Thumbnails are stored separately from the master files to make backups, archiving, and regeneration easier.

The Media Types and File Types supported are configurable, with standard types provided by default:

  • Image
    • jpg, png, etc
  • Audio
    • mp3, etc
  • Video
    • mkv, etc
  • Text
    • txt, etc
  • Document / Application
    • pdf, odt, ods, etc
  • Other
    • Anything else
  • Spatial?
    • Special case for shapefiles, etc?
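
As a rough illustration of how a configurable media type list like the one above might be consulted, here is a minimal Python sketch. It is not ARK's actual code or configuration format: the `MEDIA_TYPES` mapping and the `media_type_for` helper are hypothetical names, and the mapping only contains the suffixes named in the list. The undecided Spatial type is left out.

```python
# Hypothetical default mapping of media types to file suffixes.
# Only the suffixes named in the list above are included; a real
# configuration would be expected to list more.
MEDIA_TYPES = {
    "image": ["jpg", "png"],
    "audio": ["mp3"],
    "video": ["mkv"],
    "text": ["txt"],
    "document": ["pdf", "odt", "ods"],
}


def media_type_for(filename, media_types=MEDIA_TYPES):
    """Return the media type for a file name, or "other" if unknown."""
    # Use the text after the last dot as the suffix, ignoring case.
    _, dot, suffix = filename.rpartition(".")
    if not dot:
        # No dot at all, so there is no suffix to look up.
        return "other"
    suffix = suffix.lower()
    for media_type, suffixes in media_types.items():
        if suffix in suffixes:
            return media_type
    # Anything not listed falls through to the catch-all type.
    return "other"
```

Passing a different mapping as `media_types` stands in for the configurable part; anything not matched falls into "other", mirroring the "Anything else" entry.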

In multi-tenant mode, the files for each ARK instance are stored separately under the 'sites' folder, e.g. 'sites/mysite/files'.

The directory structure for each instance is as follows:

- files
|-download
|-upload
|-tmp
|-data
 |- <mediatype>
  |- <tokens>
|-thumbs
 |- <mediatype>
  |- <tokens>

where <mediatype> is as defined in the configuration, and <token> (shown as <tokens> in the tree above) is a configurable token that splits the folders into manageable sizes. By default, files are grouped by file id in blocks of 1000, giving subfolders 0, 1000, 2000, 3000, etc.

The individual file names will be in a standard format. Flysystem abstracts folders and file names as a relative path, which will take the form:

files/data/<mediatype>/<token>/<id>.<revision>.<suffix>
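
The default token rule and the path format can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ARK's implementation: the function names `file_token` and `data_path` are hypothetical, and the token is assumed to be the file id rounded down to the nearest multiple of the group size.

```python
def file_token(file_id, group_size=1000):
    """Return the subfolder token for a file id (0, 1000, 2000, ...)."""
    # Integer division drops the remainder, so ids 0-999 map to 0,
    # ids 1000-1999 map to 1000, and so on.
    return (file_id // group_size) * group_size


def data_path(media_type, file_id, revision, suffix, group_size=1000):
    """Build the relative path files/data/<mediatype>/<token>/<id>.<revision>.<suffix>."""
    token = file_token(file_id, group_size)
    return f"files/data/{media_type}/{token}/{file_id}.{revision}.{suffix}"
```

For example, `data_path("image", 2345, 1, "jpg")` gives `files/data/image/2000/2345.1.jpg`. Keeping the group size as a parameter reflects the statement that the token is configurable.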

Design Work

File management needs to be flexible and fast, and has to cover:

  • File attachments to data items
  • Data downloads / exports
  • Documentation
  • Temp files
  • Mapping files
  • Image management, generated thumbnails, etc
  • Metadata, mimetypes, etc
  • Cloud storage
  • Efficiency
  • Security

Current Structure:

- data
-- downloads
-- files
-- mapping
-- tmp
-- uploads

Problems with current structure:

  • All files in one directory, can be slow for large volumes
  • Thumbnails in same directory, slows performance, harder to maintain
  • No versioning
  • No expiry

A full document management and versioning workflow is a stretch goal needed for Avalon. Try to use the CMIS standard, as used in LibreOffice.

We need to plan for moving to an advanced model while keeping things simple for the first release.

  • Files are a system module, with modtype for core file type and attributes as required
  • Required module properties for title and description, all else optional, with some core fields replicated on the item table for performance
  • Use Flysystem for file system abstraction, extended to allow subpaths to be transparently mounted
  • Use Imagine for image handling
  • Split files into subdirs with max number of files per dir
  • Thumbnails into separate dir, managed by image code (on-the-fly creation, etc)
  • Support versioning
  • Support expiry date for tmp files
  • Support mimetypes + core types (image, video, audio, document, text)
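
One item from the list above, expiry dates for tmp files, could work roughly as in the following Python sketch. It is only a possible shape for the idea: the `expired_tmp_files` helper and the path-to-expiry mapping are hypothetical, and how ARK would actually record expiry dates is not specified here.

```python
from datetime import datetime, timezone


def expired_tmp_files(expiry_by_path, now=None):
    """Return the paths whose expiry time has passed, sorted by path.

    expiry_by_path is a hypothetical mapping of relative tmp file paths
    to timezone-aware expiry datetimes.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # A file counts as expired once its expiry time is at or before now.
    return sorted(
        path for path, expires in expiry_by_path.items() if expires <= now
    )
```

A periodic clean-up task could call this and delete the returned paths through the file system abstraction.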

Very similar to http://documentation.concrete5.org/developers/working-with-files-and-the-file-manager/overview